How Metaplane, Snowflake, and dbt help Vendr run a lean and adaptable data team
"Without Metaplane our data team wouldn't be able to proactively prevent data outages and swiftly troubleshoot data errors."
The Challenge: Staying lean and adapting in a fast-paced, ever-changing startup environment
Vendr is a high-growth startup that is revolutionizing the SaaS buying experience. Data has always been at the center of the organization: not only is it being operationalized to support the marketing and sales teams, but data also powers insights directly in the product.
Over the past few years, Vendr has raised over $200M and has grown the team exponentially. With this growth has come new challenges for the data team like onboarding new data consumers, centralizing and exposing more data, and the need to create trust in the data.
As Vendr has grown, Schylar Brock, a Senior Analytics Engineering Manager, has been responsible for bringing insights to the entire organization, including the 80% of teammates who use data in their jobs on a daily basis. In this rapidly changing startup environment, it’s hard to scale data and remain flexible because metrics and business strategies can change on a weekly basis.
To keep up with this pace, Schylar’s team needed a data stack that allowed them to move quickly without breaking things. As new teammates joined and started using data, governance policies needed to be added in minutes, not weeks. When definitions or metrics changed, the team needed a way to adjust models in hours rather than months.
And when they made these changes, the team needed more confidence about the downstream impact on BI reports across both Looker and Metabase. Lastly, because the team depended on several upstream sources, such as production databases and ETL tools, the data team had to be the first to know about any issues in order to maintain trust in the data. With a lean team of four wearing many hats, it was difficult to consistently keep an eye on all of these systems.
Solution: Snowflake, dbt, and Metaplane
Snowflake as a scalable data cloud
Schylar and her team love Snowflake because it solves their use cases today while also being a tool that will grow with the team. It integrates with the rest of the data ecosystem, which helps them adopt tools that create leverage for the team.
❝Snowflake is very easy to get started with and scale with our team. It has integrations with everything, so we always know that every other SaaS tool that we use on our data team will fit right in.
This has allowed Schylar’s team to remain lean while still supporting the larger team. As the company has grown, Snowflake’s performance and governance functionality have helped the data team keep up with increased data consumption. It takes no time and minimal effort to scale up Snowflake when necessary, and the governance tools have made onboarding teammates simple by ensuring that the right data access is given in minutes.
❝Because we’re so small, only four brains working on tens of different things at the same time, not having to worry about the governance is important because it means you have more time to devote to what’s most important to the business.
For example, when Schylar’s team needed to protect sensitive information being stored in the data cloud, they were able to create specific masking policies. Because of Snowflake’s established community, she was able to point her team to resources online that outlined how to solve these specific problems.
With Snowflake, things like performance and governance are a non-issue for Vendr. Rather than needing to hire additional headcount just to manage database performance and governance, their team is able to implement scaling and masking policies with ease.
dbt for building maintainable and extensible data models
Since Vendr is a rapidly growing startup, the business is constantly changing. By adopting dbt, which Schylar considers the gold standard for transformation, the team has been able to adjust effortlessly.
For example, by using dbt, the data team has centralized data from fourteen different sources by referencing production databases, Salesforce, Airtable databases, HubSpot, and others in a single model.
❝Most people who are using our data don’t want to know where all of the data is coming from, they just want to make sure it’s correct. That’s crucial so they don’t need to do it themselves. dbt makes it simple to reference multiple sources.
In addition to abstracting the complexities of centralizing data across several sources, dbt also serves as a layer of data consistency across the organization. Schylar’s team is able to define common models in one place, removing ambiguity and creating a set of trusted data models. In addition, leveraging metadata like table and column descriptions and syncing these definitions to Metabase has helped create data awareness across the entire organization.
With dbt, the data team at Vendr has remained lean while still creating centralized data models that the entire organization depends on. Because dbt is flexible and builds on practices like version control, the team can make adjustments in hours when the business changes.
❝At my last company, we didn’t use dbt and it was a huge headache. Being able to align on things, have version control, is really crucial. I hope to never go back to a place where dbt is not being used. It’s a game changer.
Metaplane for plug-and-play data observability
By implementing Metaplane early on, the Vendr team built a culture of trust in data from day one. Snowflake and dbt made it simple to centralize important data models for the entire organization, and Metaplane quickly became the one place to visit to understand the health and quality of the data.
Metaplane seamlessly integrated with Vendr’s Snowflake and dbt Cloud instances to provide another set of eyes on important processes: dbt job run durations, data freshness and row counts on critical tables, upstream schema changes, and end-to-end data lineage. It’s as if they have another full-time data engineer constantly monitoring their most important data systems.
❝Our use cases have snowballed as the product has grown. We thought about it as a data observability testing tool, but all of the lineage work and detecting schema changes are all things we didn’t necessarily expect from the beginning.
With Metaplane, Schylar and her team have become more proactive about data quality issues. Despite using multiple ETL tools, they are able to add freshness and volume monitoring in one place. Her team receives a Slack notification when data quality issues arise, rather than being notified by a teammate trying to use incorrect data.
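To make the idea of freshness and volume monitoring concrete, here is a minimal Python sketch of what such checks could look like. It is not Metaplane's implementation: the function names, the z-score threshold, and the `orders` table are hypothetical, and a real monitor would read timestamps and row counts from warehouse metadata and send alerts to a channel such as Slack.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean, stdev


def is_stale(last_loaded_at, max_age, now=None):
    """Return True if the table was last updated more than max_age ago."""
    now = now if now is not None else datetime.now(timezone.utc)
    return now - last_loaded_at > max_age


def is_volume_anomaly(history, latest, z_threshold=3.0):
    """Return True if the latest row count is far from the historical mean.

    Uses a simple z-score, which is an assumption made for illustration only.
    """
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > z_threshold


def check_table(name, last_loaded_at, max_age, history, latest, now=None):
    """Collect human-readable alerts for one table (hypothetical helper)."""
    alerts = []
    if is_stale(last_loaded_at, max_age, now=now):
        alerts.append(f"{name}: data is stale")
    if is_volume_anomaly(history, latest):
        alerts.append(f"{name}: unexpected row count {latest}")
    return alerts


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, tzinfo=timezone.utc)
    alerts = check_table(
        "orders",  # hypothetical table name
        last_loaded_at=datetime(2024, 1, 1, 0, tzinfo=timezone.utc),
        max_age=timedelta(hours=24),
        history=[1000, 1010, 990, 1005, 995],
        latest=200,
        now=now,
    )
    for alert in alerts:
        print(alert)
```

In this sketch, a table that was last loaded 36 hours ago and suddenly drops from roughly 1,000 rows to 200 produces both a freshness alert and a volume alert, which is the kind of signal the team would want before a teammate runs into the bad data.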
❝We’re not waiting for incorrect data to find its way to a downstream report and then have one of our senior leaders asking why a number is crazy. Metaplane saves us time and helps create a sense of trust. It saves everyone hours of Slack messages and Zoom calls.
To get ahead of changes made to production databases by the software engineering team, Schylar’s team connected those databases to Metaplane and added schema change monitoring. Her team can proactively reach out to the software engineering team and find ways of operationalizing that data more productively. Before refactoring or removing important fields used in downstream reports, her team uses Metaplane’s warehouse-to-BI lineage to figure out the impact of a change before code is merged.
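Schema change monitoring can be pictured as comparing two snapshots of a table's columns. The Python sketch below illustrates the general idea under stated assumptions; it is not how Metaplane detects changes, and the `diff_schema` helper and the column names are hypothetical.

```python
def diff_schema(previous, current):
    """Compare two {column_name: data_type} snapshots of one table.

    Returns the columns that were added, removed, or changed type.
    """
    added = sorted(col for col in current if col not in previous)
    removed = sorted(col for col in previous if col not in current)
    retyped = {
        col: (previous[col], current[col])
        for col in previous
        if col in current and previous[col] != current[col]
    }
    return {"added": added, "removed": removed, "retyped": retyped}


if __name__ == "__main__":
    # Hypothetical snapshots taken before and after an upstream deploy.
    yesterday = {"id": "NUMBER", "email": "VARCHAR", "plan": "VARCHAR"}
    today = {"id": "NUMBER", "email": "TEXT", "signup_date": "DATE"}
    print(diff_schema(yesterday, today))
```

A removed column such as `plan` is the case that matters most for downstream reports: combined with lineage, it tells the team which dashboards to check before the change reaches consumers.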
Results
- With Snowflake, dbt, and Metaplane, Vendr has built a lean data team and a highly leveraged data stack that supports hundreds of data consumers.
- Using Snowflake’s auto-scaling capabilities and easy-to-use governance tools, the data team saves around four hours per week and doesn't need to hire database administrators to tune database performance and maintain governance policies.
- By building scalable and flexible models using dbt, Schylar and her team are able to adapt quickly to changing business requirements on a monthly basis.
- Using Metaplane, the Vendr data team saves around eight hours per week by proactively fixing data quality issues before they affect downstream data consumers.
- With Snowflake, dbt, and Metaplane, the Vendr data team has created a culture of trust in data from day one.