New feature: fine-tuned regression testing for proactive incident prevention

Learn more about configuration options for dbt CI checks.

March 26, 2024

Sr. Software Engineer

Founding Engineer

If you use dbt, you’re probably used to data analysts, business stakeholders, and even other data engineers building on all of your hard work. A new analytics initiative means new dashboards and reports. New business needs mean shiny new models derived from yours. 

As a result, it can become daunting to update your dbt models, for fear of unintentionally breaking downstream dependencies. To help your teams merge those changes with confidence, we built a native GitHub application that automates CI checks alongside your pull requests. The checks include:

  • Impact analyses: an outline of the downstream tables, BI dashboards, workbooks, and other objects that your change may affect
  • Test previews: a report showing how the data in your tables, dashboards, reports, and other dbt models will change as a result
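
To make the idea of an impact analysis concrete, here is a minimal sketch of how downstream objects can be collected from a lineage graph with a breadth-first walk. It is an illustration only, not Metaplane's implementation: the `LINEAGE` mapping, the object names, and the `downstream_of` helper are all hypothetical.

```python
from collections import deque

# Hypothetical lineage: each key feeds the objects listed as its value.
LINEAGE = {
    "stg_orders": ["fct_orders"],
    "fct_orders": ["rpt_revenue", "dashboard:sales_overview"],
    "rpt_revenue": ["dashboard:finance_weekly"],
}


def downstream_of(changed, lineage):
    """Return every object reachable from `changed`, in breadth-first order."""
    seen = set()
    order = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


if __name__ == "__main__":
    # Lists everything that a change to the staging model could touch.
    for name in downstream_of("stg_orders", LINEAGE):
        print(name)
```

In this toy graph, a change to `stg_orders` reaches one model, one report, and two dashboards, which is the kind of list an impact analysis would surface on a pull request.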

Configure regression testing 

One challenge with automated CI checks, especially for large codebases, is the execution time required to run them. For Metaplane CI tests in particular, regression tests are run against your warehouse. The more data you have, the longer it takes for Metaplane to query it and let you know about any possible downstream impacts that could trigger a data quality incident.

To strike the right balance between speed and power, we’ve recently refined impact and test previews so you can test a more targeted set of objects and cut down on potential noise. The configuration options include:

  • Filter by Metaplane tags. Specify exactly which objects Metaplane should regression test, such as your most critical “p0” tables or tables upstream of important dashboards.
  • Turn off checks. For example, if you’re aware that you’ll be making breaking changes to downstream tables that you’re no longer using, you may not need Metaplane’s CI checks to run.
  • Ignore draft PRs. You may not want to run tests on PRs that aren’t ready for review.
  • Specify where these tests should run. If you have a lot of data you’d like to test, you may decide to run the checks on a bigger warehouse.
  • And more, including fine-tuned timeout options and the ability to specify the % threshold that constitutes a failure for the regression tests.
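
The options above boil down to a handful of decisions, which the following sketch models in plain Python. This is not Metaplane's configuration format or API: the `CheckConfig` class, its field names, and the helper functions are hypothetical, and the failure threshold is assumed to mean the relative change in a metric such as a row count.

```python
from dataclasses import dataclass, field


@dataclass
class CheckConfig:
    # All field names are hypothetical, chosen to mirror the options above.
    enabled: bool = True
    ignore_draft_prs: bool = True
    required_tags: set = field(default_factory=set)
    failure_threshold_pct: float = 5.0


def should_run(config, pr_is_draft):
    """Skip checks when they are turned off or the PR is an ignored draft."""
    if not config.enabled:
        return False
    if pr_is_draft and config.ignore_draft_prs:
        return False
    return True


def tables_to_test(config, table_tags):
    """Keep tables carrying at least one required tag; no tags means test all."""
    if not config.required_tags:
        return sorted(table_tags)
    return sorted(
        name for name, tags in table_tags.items() if tags & config.required_tags
    )


def is_failure(config, before, after):
    """Flag a metric whose relative change exceeds the configured threshold."""
    if before == 0:
        return after != 0
    change_pct = abs(after - before) / abs(before) * 100
    return change_pct > config.failure_threshold_pct


if __name__ == "__main__":
    config = CheckConfig(required_tags={"p0"})
    tags = {"fct_orders": {"p0"}, "tmp_scratch": set()}
    if should_run(config, pr_is_draft=False):
        print(tables_to_test(config, tags))
        print(is_failure(config, before=100, after=110))
```

Narrowing the tested set with tags and skipping draft PRs both reduce how many warehouse queries run per pull request, which is where most of the execution time goes.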

How to get started with Metaplane for data regression testing

If you already have a Metaplane account, first connect your dbt instance and the GitHub repo that hosts your dbt code. Then navigate to your dbt connection and click “Edit GitHub integration” in the top-right corner.

Otherwise, to get started with Metaplane, you can create an account or pick a time to learn more about data observability best practices from the team. New users can implement data quality monitoring, including these regression tests, within the hour!
