Get the essential data observability guide
Download this guide to learn:
What is data observability?
4 pillars of data observability
How to evaluate platforms
Common mistakes to avoid
The ROI of data observability
Unlock now
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
Sign up for a free data observability workshop today.
Assess your company's data health and learn how to start monitoring your entire data stack.
Book free workshop
Sign up for news, updates, and events
Subscribe for free
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
Getting started with Data Observability Guide

Make a plan to implement data observability across your company’s entire data stack

Download for free
Book a data observability workshop with an expert.

Assess your company's data health and learn how to start monitoring your entire data stack.

Book free workshop

Deep Dive: Data Observability and Transactional Databases

Transactional databases are an important part of the modern data stack—and they shouldn't be left out of your data observability journey.

August 30, 2023

Product, Design

Hooked on Data

August 30, 2023
Deep Dive: Data Observability and Transactional Databases

In many modern data stacks, important data starts in transactional databases like MySQL, SQL Server, or PostgreSQL. Part of what makes transactional databases so attractive is being purpose-built for handling everything from user events from your software apps and activities from website visitors, to logs of manufacturing processes and records of e-commerce sales, not to mention, free for PostgreSQL and MySQL. The data frequently gets replicated in a data warehouse and flows into the BI tool for analysis, but the transactional database is often the bedrock that the rest of the data stack is built upon.

Traditionally, data observability focused mostly on the warehouse and (sometimes) on the BI tool. But when issues occur, understanding how the data flows and changes in its journey through the stack can be invaluable to catching and resolving data anomalies. That’s why Metaplane has deep integrations with MySQL, SQL Server, and PostgreSQL to help give you peace of mind that your data replication process is working as designed, and to give you the full picture of the data quality of your whole data stack.

Schema changes

Data observability, as a field, is mostly concerned with the quality or status of the data itself—the volume or freshness or distribution. But the structure of the database is just as important. And in many companies, the transactional database is frequently the domain of another team entirely, meaning that data engineers and analysts can sometimes be out of the loop when it comes to important developments in the database. If a column’s type changes, your downstream queries might have issues, particularly for math functions. If a column gets deleted or re-named, the dashboards and workbooks that depend on it might break. And, on a happier note, if a new table gets created, one that’s filled with useful data, you probably want to know as soon as possible so you can start using it in your queries, your BI software, and whatever other tools you power with data.

That’s where Metaplane’s schema change feature comes in. By connecting Metaplane to your transactional database, Metaplane will send you an alert about any changes to the database: new tables, deleted columns, type changes, re-named entities, and so on. This way, data engineers can make sure they’re never out of the loop when it comes to knowing what’s happing upstream of the warehouse, and they can react to any changes before they cause issues.

Common database replication issues

Database replication can go awry for any number of reasons, whether due to connection interruptions, unexpected schema drift, and/or a need for more atomicity in last_modified timestamps being used to page extract queries. When issues do come up, you’ll usually find them in the form of incomplete or duplicate data.

Metaplane’s monitors can help catch issues like this when they occur: uniqueness monitors can catch duplicates in ID fields, and row count monitors can catch unusual spikes in the number of records or missing data.

End-to-end lineage

As we mentioned above, data flows between several tools in modern data stacks. Each tool has its own metadata about the data it handles, and all hold pieces of the story as your data moves from system to system. But this also means that understanding the lineage of your data can feel like assembling a puzzle where the pieces are scattered across different systems.

Metaplane helps make tracing that lineage simple. You can look at a database or report and trace the lineage of a column upwards as it moves through the data warehouse, all the way to the Fivetran connector that replicates the data. From there it’s an easy task to check where that Fivetran connector is pulling data from in your transactional database. This kind of analysis is incredibly useful when you’re trying to trace the root cause of an issue, and makes refactoring your data a much simpler task.

No matter your company’s unique data challenges, understanding the full picture of your data stack is critical for improving or maintaining the health of your data—and Metaplane can help. If you’d like to try out Metaplane’s transactional database integration, you can sign up for free, and get started right away, or chat with one of our product experts to learn more.

Table of contents
    Tags

    We’re hard at work helping you improve trust in your data in less time than ever. We promise to send a maximum of 1 update email per week.

    Your email
    No items found.
    Ensure trust in data

    Start monitoring your data in minutes.

    Connect your warehouse and start generating a baseline in less than 10 minutes. Start for free, no credit-card required.