How SpotOn reduced time to actionable data by 6x and increased data engineering contribution by 8.5x using Snowflake, dbt Cloud, and Metaplane
“We’ve grown incredibly in the last 2.5 years and there’s no way that growth would have been possible without bringing in Snowflake, dbt, Metaplane and the modern data stack.”
SpotOn is a rapidly growing business that offers mobile payment processing and management software for restaurants and small businesses. As the company has scaled, data has increasingly become a differentiator to drive the business forward. More than 500 team members rely on data on a daily basis to make decisions and SpotOn’s customers rely on data for merchant reporting and a recommendation engine to power better online ordering experiences.
With this widespread integration of data across the business came new challenges for the data team. Ben Cohen, the Data Engineering Team Lead at SpotOn, and his team were running into bottlenecks in the performance, accessibility, and engineering workflows for iterating on data.
Performance of their Postgres database was routinely slow, causing delayed ETL jobs and degraded BI reporting experiences. The database was undersized and tuning was difficult; “ingestion jobs would fail because of whatever the new Postgres error was,” explained Ben. Data was either missing or delayed.
These performance issues had a ripple effect—data was not accessible because it was often slow or broken. Users were unable to run queries because resources were being consumed by upstream ETL jobs for several hours every morning. New and advanced analytics use cases were impossible to create on top of the existing warehouse because Postgres couldn't handle reprocessing large scale data aggregations.
When the data team needed to address incoming requests or improve models, only a small subset of the team had the skills necessary to deploy changes in a timely manner. Ben and his team couldn’t keep up with the number of data requests, and they wanted to start using data for new use cases that could unlock more growth for the entire company. The data stack became a bottleneck, and the team needed to move more quickly as the company scaled.
With these challenges in mind, Ben decided to implement Snowflake, dbt, and Metaplane to scale the analytics capabilities and create a new team culture, all without adding undue complexity or cost.
Solution: Scaling analytics capabilities with Snowflake, dbt Cloud, and Metaplane
How Snowflake improved performance, made data more accessible, and improved engineering efficiency
When Ben and his team migrated from Postgres to Snowflake, new advanced analytics use cases were immediately possible. For example, the recommendation engine for online ordering platforms is powered by large scale pre-aggregated data that is augmented from multiple sources. Whereas Postgres could not support these types of aggregations, Snowflake was able to handle this gracefully because of the scaling capabilities provided by the separation of storage and compute.
❝With the scale of data and infrastructure we had, we couldn’t even approach to solve these problems using Postgres. We knew we had to upgrade to a data cloud like Snowflake.
Other work that used to take up much of the team’s time, like tuning Postgres instance sizes, re-indexing data, or designing physical table structures, simply went away with the power of the Snowflake data cloud.
Snowflake’s easy-to-use integrations with tools like Snowpipe made ETL’ing and operationalizing data fast and simple. It wasn’t necessary to build out complex pipelines that needed to be maintained and scaled. For example, the SpotOn data team used Snowpipe to ingest a weather data set from OpenWeather to augment order data. They were also able to seamlessly integrate with their product analytics platform, Heap, and use data sharing to access that data without needing to build new pipelines.
❝There are pretty expansive, geospatial cross-joins that we need to do on a regular basis that Postgres would never be able to handle.
Snowflake also became a key part of data integration after SpotOn acquired Appetize, which was already a Snowflake customer. The data team was able to securely and instantaneously share data between SpotOn’s and Appetize’s Snowflake instances.
Migrating from Postgres to Snowflake was a game changer for not only the data team, but the entire company. The performance improvements led to faster reporting load times and more advanced analytics use cases. To make this data even more accessible and improve engineering workflows, Ben and his team adopted dbt Cloud.
How dbt Cloud made data accessible and improved engineering contribution by 8.5x
When Ben joined SpotOn, there were only two engineers who could consistently contribute to their ETL project. Adding data sources took months, data models were built in silos, and the testing and deployment process was painful. Their engineering workflows were centered around custom Python jobs in Airflow, which required advanced engineering skills.
If data wasn’t already in Postgres, the team needed to update Python scripts to ingest this new data. The Airflow instance needed to be tested and deployed, and if historical data was needed, they would need to backfill Postgres which could take days or a week depending on the size of the dataset. Only then would they start the modeling process which meant another Airflow deployment. After all of this, there was no guarantee that the data was modeled in a way that the team needed. If this model was delivered, and then an additional piece of data was needed, the entire process would start over again.
The cumbersome workflow limited the team’s velocity and capacity. The only way to scale was to hire more people, which was not a reasonable path forward.
❝Before dbt, only two engineers could contribute to modeling. Bringing in dbt changed all of that for us. The number of contributors has grown by 750% and there is no limit to scale.
After Ben and his team adopted Snowflake to improve performance and scalability, they migrated all existing models to dbt Cloud: financial, operational, sales, and product analytics data were all transformed using dbt. With this core business logic in dbt, the data team’s workflow became much easier due to greater collaboration and accessibility.
❝dbt Cloud enables more people to build models, self-serve, and feel empowered. That translates to people being more engaged all around.
After SpotOn switched to dbt Cloud, creating data models went from taking days to hours. dbt quickly became the workhorse that powered the entire SpotOn warehouse, internal BI, and analytics. Ben and his team also operationalized the data to be used in the SpotOn product. By using dbt to pre-aggregate data, the product teams can power merchant facing reporting as well as a recommendation engine. For example, they can augment ordering data with weather data to make better recommendations in the online ordering platform.
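To make the idea of augmenting order data with weather data concrete, here is a minimal sketch in plain Python. It is not SpotOn’s dbt model (which would be written in SQL and is not shown in this case study); the store names, dates, items, and the `aggregate_by_weather` helper are all hypothetical, chosen only to illustrate the pre-aggregation step.

```python
from collections import defaultdict

# Hypothetical order rows: (location, date, item, quantity). Invented sample data.
orders = [
    ("store_a", "2023-07-01", "iced coffee", 3),
    ("store_a", "2023-07-01", "soup", 1),
    ("store_a", "2023-07-02", "soup", 4),
    ("store_b", "2023-07-02", "iced coffee", 2),
]

# Hypothetical weather lookup keyed by (location, date). Invented sample data.
weather = {
    ("store_a", "2023-07-01"): "hot",
    ("store_a", "2023-07-02"): "rainy",
}


def aggregate_by_weather(orders, weather):
    """Sum quantities per (weather condition, item).

    Orders with no matching weather record are grouped under "unknown".
    """
    totals = defaultdict(int)
    for location, day, item, quantity in orders:
        condition = weather.get((location, day), "unknown")
        totals[(condition, item)] += quantity
    return dict(totals)


totals = aggregate_by_weather(orders, weather)
for (condition, item), quantity in sorted(totals.items()):
    print(condition, item, quantity)
```

A table pre-aggregated like this is cheap to query at request time, which is presumably why the heavy join happens in the warehouse rather than in the product itself.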
With dbt Cloud’s analytics workflow in place, SpotOn went from two to 17 contributors to their ETL code. Product teams, engineers, and finance all contribute to their dbt project, including documentation.
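The 8.5x figure and the 750% growth Ben cites describe the same change in two ways. A short check of the arithmetic, using only the contributor counts stated above:

```python
# Contributor counts as stated in the case study.
contributors_before = 2
contributors_after = 17

# Multiplier: how many times larger the contributor base is now.
multiplier = contributors_after / contributors_before

# Percentage growth: the increase relative to the starting count.
percent_growth = (contributors_after - contributors_before) / contributors_before * 100

print(f"{multiplier}x the original count, or {percent_growth:.0f}% growth")
```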
❝We use existing dbt packages to push all of our definitions directly to Snowflake and Metabase in an effort to make our documentation available where our internal stakeholders work every day. We even have some people on the finance team contributing to YAML files and building out documentation.
Not only are more people contributing, but dbt Cloud bakes in software engineering best practices, challenging the team to learn new skills and implement testing and version control as a default.
“An analyst can work with the pipeline and can feel engaged and empowered. Stakeholders are happier, and ask for help in creating new data use cases. It’s a virtuous cycle of improvement and empowerment,” said Ben.
How Metaplane increased trust in data and reduced time to identify issues from days to seconds
Ben and his team quickly noticed that as they improved performance and made the data more accessible, they created a data feedback loop: more capabilities allowed the entire organization to move more quickly, which resulted in more teammates asking for data and analytics. On the one hand, this feedback loop was driving SpotOn’s entire organization to leverage data and make more informed decisions. But with this came more attention and scrutiny to the data team’s work, and trust in data became top of mind for the data team.
❝As stakeholders use more data and have new capabilities, they ask more from your team, and you need to move quickly. You can’t test everything yourself. The other way to scale is to hire more people that add data quality checks, but that doesn’t scale well from a cost and efficiency standpoint.
Metaplane was able to help the SpotOn data team scale this feedback loop. By providing observability across their data stack, Metaplane helped the team build and retain trust so the feedback loop could continue. With Metaplane’s machine-learning-based testing approach and ability to automatically add hundreds of tests, they saved engineering time and always received context about potential root causes and downstream impact when data incidents arose.
In one scenario, the data team was ingesting transactional data from payment processor partners. This data is critical in that it helps the company determine and report on KPIs. The source systems can be legacy databases and don’t always deliver data on a regular schedule. At times the data was incomplete or contained values outside the expected data contract. All of this fed into reports that executives used every morning. By proactively catching data incidents like these, Metaplane helped Ben’s team get in front of any issues that would impact downstream stakeholders, helping the data team retain trust in the data. After receiving data incident alerts, Ben and his team could pull back scheduled reports until they were able to verify that the data had been fixed.
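The kinds of incidents described here (late deliveries, incomplete records, values outside the expected contract) can be illustrated with a minimal sketch. This is not Metaplane’s approach, which the case study describes as machine learning based; the sketch uses fixed thresholds instead. The `Transaction` shape, the field names, the thresholds, and the sample batch are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Transaction:
    # Hypothetical record shape; field names are illustrative only.
    transaction_id: str
    amount_cents: Optional[int]
    processed_at: datetime


def find_contract_violations(rows, now,
                             max_staleness=timedelta(hours=24),
                             max_amount_cents=10_000_000):
    """Return human-readable problems found in a batch (thresholds are assumptions)."""
    if not rows:
        return ["batch is empty"]

    problems = []

    # Freshness: the newest record must be recent enough.
    newest = max(row.processed_at for row in rows)
    if now - newest > max_staleness:
        problems.append(f"stale data: newest record is from {newest.isoformat()}")

    # Completeness and range: every amount must be present and plausible.
    for row in rows:
        if row.amount_cents is None:
            problems.append(f"{row.transaction_id}: missing amount")
        elif not 0 <= row.amount_cents <= max_amount_cents:
            problems.append(
                f"{row.transaction_id}: amount {row.amount_cents} outside expected range"
            )
    return problems


def should_hold_reports(problems):
    """Hold scheduled reports whenever any violation was found."""
    return len(problems) > 0


# Invented sample batch: one valid row, one missing amount, one negative amount.
now = datetime(2023, 1, 2, 8, 0)
batch = [
    Transaction("t1", 1250, datetime(2023, 1, 2, 6, 0)),
    Transaction("t2", None, datetime(2023, 1, 2, 6, 5)),
    Transaction("t3", -500, datetime(2023, 1, 2, 6, 10)),
]
problems = find_contract_violations(batch, now)
for problem in problems:
    print(problem)
print("hold reports:", should_hold_reports(problems))
```

Hand-written rules like these have to be maintained for every table, which is the scaling problem Ben describes; an observability tool that adds tests automatically removes that burden.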
❝We were always behind the 8 ball in terms of communicating with the executive team when there was an issue. We were starting to lose trust and they weren’t going to use the reports. If they can’t move forward on using this data, that’s bad not only for our team but also our business. Metaplane helped us get in front of those issues.
Data quality issues don’t just impact executive reporting. SpotOn is core to their customers’ businesses because they process all of their transactions. When SpotOn experiences data quality issues, it affects their customers and potentially impacts how they earn money and serve their own customers. Being the first to know about data quality issues allows the data team to quickly triage and fix issues, preventing them from ever impacting downstream customers.
Ben’s team went from chasing down data bugs and data anomalies to proactively finding out about them and spending more time actually fixing the issues. Time-to-identify data quality issues went from hours or days to seconds.
❝Metaplane is key to preserving trust in our data. You’ve spent so much time to move to this great modern stack, but if the end result is you lost people’s trust and they won’t use it, that work is for nothing.
Results
- After migrating from Postgres to Snowflake, Ben and his team don’t need to plan database changes like tuning and migration, nor do they have to build complex ETL pipelines for product analytics and clickstream data, saving the team at least 10 hours every week.
- With dbt Cloud, the SpotOn data team increased engineering contribution by 8.5x, and building models went from taking days or weeks to hours.
- After SpotOn adopted Metaplane, the time to identify data quality problems went from hours or days to seconds. The organization went from losing trust in the data to asking for more data to power more advanced analytical use cases.
- With Snowflake, dbt Cloud, and Metaplane, SpotOn was able to quickly integrate with Appetize’s data stack after the acquisition. Zero-second-latency data sharing between databases, a shared dbt skillset, documentation, and data quality monitoring with Metaplane helped the teams work together.
What’s next for SpotOn
Over the next year, Ben and his team have plans to continue to invest in Snowflake, dbt, and Metaplane.
In order to create more consistency, the data team plans on migrating some legacy pipelines that ingest and cleanse important analytical and operational data. In addition to this migration, Ben and his team will be looking into ways to continue to use dbt to power modeling, but decrease latency so the data can be provided closer to real time.
In an effort to reduce the number of data quality issues introduced by code changes, the SpotOn data team is adopting Metaplane’s CI/CD tooling to automate impact analysis and data test previews.
Lastly, the company’s organic growth and the Appetize acquisition mean a large number of new team members need to be incorporated into the larger data culture. Ben is focused on leveraging the dbt metrics layer to help bridge this gap and make analytical data more accessible and consistent across the company.