Announcing Data Test Previews in Pull Requests
Today we're releasing Data Test Previews — a way to give busy data teams a hand by automatically testing any changes made to their data in a development branch. Data teams can find out if they’re about to introduce breaking data changes to their environment in minutes, rather than hours.
In modern companies, the data team acts as a switchboard operator for critical business insights, helping the marketing, sales, finance, product (and other) teams make smarter decisions. But data practitioners don’t just operate the switchboard — they’re building the switchboard themselves, out of tools and systems they’ve often cobbled together over time.
Despite the explosion of tools in the marketplace, data practitioners have long lacked easy access to engineering best practices, such as automatic CI checks that give teams confidence when merging their code. That confidence matters because, as many data practitioners know, code changes to data models can cause any number of data quality issues. Downstream models can break. Dashboards can look funky. Bugs can be inadvertently introduced in transformation logic, wreaking havoc on your metrics.
Imagine that the CMO of Howl’s Moving Castles wants a slightly different report for her dashboard, which requires adjusting how the marketing model is transformed, and she needs it soon. How can the data team understand not just how that change will impact their data’s dependencies, but how it will impact the data itself? And how can the data team confirm in minutes, rather than hours or days, that they’re not introducing breaking changes into their data?
Many teams don’t have the time or resources to answer those questions. And the ones who do often have to spend hours or days writing and running manual scripts to verify changes. In the example above, the data team at Howl’s might need to run several SQL queries on a development branch of their data, then manually compare the results, such as row counts or grouped aggregates, to understand how the proposed changes differ from production data. When the change is large, or there are many downstream dependencies, this process can eat up enormous amounts of the team’s time and effort.
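To make that manual work concrete, here is a minimal sketch of the kind of script a team might write by hand. It uses an in-memory SQLite database as a stand-in for a warehouse, and the table names (prod_marketing, dev_marketing), the columns, and the sample rows are all hypothetical, chosen only for illustration.

```python
import sqlite3


def table_summary(conn, table, group_col):
    """Return the total row count and per-group row counts for one table."""
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    groups = dict(
        conn.execute(
            f"SELECT {group_col}, COUNT(*) FROM {table} GROUP BY {group_col}"
        ).fetchall()
    )
    return total, groups


# Hypothetical stand-ins for a production table and its development-branch copy.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE prod_marketing (channel TEXT, spend REAL);
    CREATE TABLE dev_marketing (channel TEXT, spend REAL);
    INSERT INTO prod_marketing VALUES ('email', 10), ('email', 20), ('ads', 30);
    INSERT INTO dev_marketing VALUES ('email', 10), ('ads', 30);
    """
)

prod_total, prod_groups = table_summary(conn, "prod_marketing", "channel")
dev_total, dev_groups = table_summary(conn, "dev_marketing", "channel")

# Compare the two summaries the way a reviewer would by eye.
row_count_diff = dev_total - prod_total
group_diffs = {
    key: dev_groups.get(key, 0) - prod_groups.get(key, 0)
    for key in set(prod_groups) | set(dev_groups)
}

print("row count difference:", row_count_diff)
print("per-channel difference:", group_diffs)
```

Even this toy version covers only one table and one grouping column. Repeating it for every changed model and every downstream dependency is where the hours go.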
With Data Test Previews, Metaplane now automatically tests any changes made to your data in a development branch. Your team can find out in minutes, rather than hours, whether a change is about to break data in your environment.
Test your data in the pull request
Whenever your data team opens a pull request, in addition to running lineage impact analysis, Metaplane will run a suite of data tests against your PR and your production warehouse and compare the results. Metaplane then lets you know about any large shifts in your data’s mean, uniqueness, nullness, cardinality, and row count that the change would introduce. This helps data teams incorporate best practices from the software development world, and inspires confidence and peace of mind when reviewing and merging pull requests.
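As a rough illustration of what comparing those five measures can look like, here is a small sketch in plain Python. The metric definitions below (for example, uniqueness as the share of distinct non-null values) are assumptions made for this example, not Metaplane’s actual definitions or implementation, and the sample columns are invented.

```python
from statistics import fmean


def column_metrics(values):
    """Compute five summary metrics for one column (None means null).

    These definitions are illustrative assumptions, not a vendor's.
    """
    non_null = [v for v in values if v is not None]
    distinct = len(set(non_null))
    return {
        "row_count": len(values),
        "nullness": (len(values) - len(non_null)) / len(values) if values else 0.0,
        "cardinality": distinct,
        "uniqueness": distinct / len(non_null) if non_null else 0.0,
        "mean": fmean(non_null) if non_null else 0.0,
    }


def relative_changes(prod, dev):
    """Relative change of each metric from production to development."""
    changes = {}
    for name, prod_value in prod.items():
        dev_value = dev[name]
        if prod_value == 0:
            # A metric that moves away from zero has no finite relative change.
            changes[name] = 0.0 if dev_value == 0 else float("inf")
        else:
            changes[name] = (dev_value - prod_value) / abs(prod_value)
    return changes


# Hypothetical values of one numeric column in each environment.
prod_spend = [10.0, 20.0, 30.0, 40.0]
dev_spend = [10.0, 20.0, None, 40.0, 40.0]

prod_metrics = column_metrics(prod_spend)
dev_metrics = column_metrics(dev_spend)
changes = relative_changes(prod_metrics, dev_metrics)

for name, change in changes.items():
    print(f"{name}: {prod_metrics[name]} -> {dev_metrics[name]} ({change:+.0%})")
```

In this invented example the development column gains a row, picks up a null, and loses a distinct value, so every metric moves. A real check would run the equivalent aggregations in the warehouse rather than in application memory.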
Easily configure tests
One of Metaplane’s design philosophies is to offer sensible defaults and flexibility so that data teams can customize Metaplane to their unique setups and workflows. That’s why we’ve made data test previews configurable. You can choose which tests run, as well as fine-tune the testing thresholds so that the sensitivity is right where you want it.
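To show the idea of choosing tests and tuning thresholds, here is a hypothetical sketch. The configuration shape, the default values, and the function names (build_config, flag_changes) are invented for illustration and do not reflect Metaplane’s actual settings or API.

```python
# Hypothetical defaults: the maximum relative change tolerated per test.
DEFAULT_THRESHOLDS = {
    "row_count": 0.10,
    "mean": 0.10,
    "nullness": 0.05,
    "uniqueness": 0.05,
    "cardinality": 0.10,
}


def build_config(enabled=None, overrides=None):
    """Pick which tests run and override thresholds where needed."""
    overrides = overrides or {}
    enabled = set(DEFAULT_THRESHOLDS) if enabled is None else set(enabled)
    unknown = (enabled | set(overrides)) - set(DEFAULT_THRESHOLDS)
    if unknown:
        raise ValueError(f"unknown tests: {sorted(unknown)}")
    config = {name: DEFAULT_THRESHOLDS[name] for name in enabled}
    for name, threshold in overrides.items():
        if name in config:  # overrides for disabled tests are ignored
            config[name] = threshold
    return config


def flag_changes(changes, config):
    """Return the enabled tests whose observed change exceeds its threshold."""
    return sorted(
        name
        for name, threshold in config.items()
        if abs(changes.get(name, 0.0)) > threshold
    )


# Hypothetical relative changes observed between production and a PR.
observed = {"row_count": 0.25, "mean": 0.08, "cardinality": -0.25}

default_flags = flag_changes(observed, build_config())
custom_flags = flag_changes(
    observed,
    build_config(enabled=["row_count", "mean"], overrides={"mean": 0.05}),
)

print("defaults:", default_flags)
print("custom:", custom_flags)
```

With the defaults, the 8% shift in the mean passes quietly while the row count and cardinality changes are flagged. Tightening the mean threshold to 5% and disabling the cardinality test changes which results surface, which is the kind of sensitivity tuning described above.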
Share data test reports with teammates
When data teams make changes, those changes frequently affect several other people and teams in the organization. That’s why we’ve built in the ability to share both the pull request and a Data Test Preview report with stakeholders, so they can see the changes being made and the impact of those changes. You can raise awareness of the work your team is doing and of the importance of data quality at the same time. Win-win!
The future of Data CI/CD automation
When used alongside Metaplane’s Data Impact Previews, Data Test Previews can help instill confidence that changes to data won’t break downstream dependencies or the data itself.
Data engineers deserve tooling and automation for their development lifecycles that are just as functional and powerful as the tools that exist for software engineers, and Data Test Previews and Data Impact Previews are just the beginning of that toolset.
Interested in Metaplane’s CI/CD automation? Try Metaplane free today.