What is Data Consistency? Definition and Examples

September 27, 2022

Co-founder / Data and ML

Head of Marketing


If you care about whether your business succeeds or fails, you should care about data consistency. Consistent data is important because it has a huge impact on your bottom line. Unfortunately, that impact often goes undetected—until it’s too late.

Say your business uses data for operational purposes, and your data is inconsistent. You could inadvertently send an automated renewal email to a customer who’s at risk of churning because your “account status” wasn’t identical across Gainsight and Salesforce. 

If your business uses data for decision-making purposes, on the other hand, inconsistent data can lead you to the wrong decision. As an example, imagine deciding to double down on digital advertising because your return on ad spend appeared high in one source, when according to a second source you’re barely breaking even.

Now that you know why data consistency matters, let’s dive into exactly what it means. In this blog post, you’ll find a definition, examples, and four methods for measuring data consistency.

What is data consistency?

Data consistency is one of ten dimensions of data quality. Data is considered consistent if two or more values in different locations are identical. Ask yourself: Is the data internally consistent? If there are redundant data values, do they have the same value? Or, if values are aggregations of each other, are the values consistent with each other? 

What are some examples of inconsistent data?

Imagine you’re a lead analytics engineer at Rainforest, an ecommerce company that sells hydroponic aquariums to high-end restaurants. Your data would be considered inconsistent if the engineering team records aquarium models that don’t match the models recorded by the sales team. Another example would be if the monthly profit number is not consistent with the monthly revenue and cost numbers.
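Both Rainforest examples can be expressed as simple checks. The sketch below is illustrative only: the model names, the monthly figures, and the helper functions are hypothetical, and it assumes profit is defined as revenue minus cost.

```python
from math import isclose

# Hypothetical sample data: model names are invented for illustration.
engineering_models = {"AquaLeaf 100", "AquaLeaf 200", "ReefGarden Pro"}
sales_models = {"AquaLeaf 100", "AquaLeaf 200", "Reef Garden Pro"}


def model_mismatches(engineering, sales):
    """Return the models each team records that the other team does not."""
    return {
        "only_in_engineering": sorted(engineering - sales),
        "only_in_sales": sorted(sales - engineering),
    }


def profit_is_consistent(revenue, cost, profit, tolerance=0.01):
    """Check that reported profit equals revenue minus cost, within a tolerance."""
    return isclose(revenue - cost, profit, abs_tol=tolerance)


# The two teams spell one model differently, so each side has an unmatched entry.
mismatches = model_mismatches(engineering_models, sales_models)

# Hypothetical monthly figures: the first set agrees, the second does not.
monthly_ok = profit_is_consistent(revenue=120_000.0, cost=85_000.0, profit=35_000.0)
monthly_bad = profit_is_consistent(revenue=120_000.0, cost=85_000.0, profit=40_000.0)
```

An empty result from `model_mismatches` would mean the two teams agree. Any entry in either list points to a value that exists in one location but not the other.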

How do you measure data consistency?

To test any data quality dimension, you must measure, track, and assess a relevant data quality metric. In the case of data consistency, you can measure the number of passed checks that track the uniqueness of values, the uniqueness of entities, corroboration within the system, or whether referential integrity is maintained. Codd’s referential integrity constraint is one example of a consistency check.
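As a rough sketch of what counting passed checks could look like, the example below runs two uniqueness checks and one referential integrity check over small in-memory tables, then reports the share that passed. The tables, column names, and helper functions are hypothetical. In practice these checks would more likely run as queries against your warehouse.

```python
# Hypothetical tables: names and rows are invented for illustration.
customers = [
    {"customer_id": 1, "name": "Bistro Verde"},
    {"customer_id": 2, "name": "Chez Lumiere"},
    {"customer_id": 2, "name": "Chez Lumiere Inc."},  # duplicate key
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 2},
    {"order_id": 12, "customer_id": 3},  # no matching customer
]


def is_unique(rows, key):
    """True if no two rows share the same value for `key`."""
    values = [row[key] for row in rows]
    return len(values) == len(set(values))


def has_referential_integrity(child_rows, parent_rows, key):
    """True if every `key` value in the child table exists in the parent table."""
    parent_keys = {row[key] for row in parent_rows}
    return all(row[key] in parent_keys for row in child_rows)


checks = {
    "customers.customer_id is unique": is_unique(customers, "customer_id"),
    "orders.order_id is unique": is_unique(orders, "order_id"),
    "orders.customer_id references customers": has_referential_integrity(
        orders, customers, "customer_id"
    ),
}

# The metric to track over time: how many checks passed, and what share.
passed = sum(checks.values())
pass_rate = passed / len(checks)
```

Tracking `pass_rate` over time, rather than looking at a single run, shows whether consistency is improving or degrading.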

How to ensure data consistency

One way to ensure data consistency is through anomaly detection, sometimes called outlier analysis, which helps you to identify unexpected values or events in a data set. 

Using the example of two numbers that are inconsistent with one another, anomaly detection software would notify you instantly when data you expect to match doesn’t. The software knows it’s unusual because its machine learning model learns from your historical metadata. 
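To make the idea concrete, here is a deliberately simple stand-in for that kind of model: a z-score test that flags a new value when it falls far outside what history suggests. This is not how any particular anomaly detection product works. The function name, the threshold, and the sample history are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, new_value, threshold=3.0):
    """Flag new_value if it lies more than `threshold` sample standard
    deviations from the mean of `history` (needs at least two points)."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # History never varied, so any different value is unexpected.
        return new_value != mu
    return abs(new_value - mu) / sigma > threshold


# Hypothetical history: daily difference between the same revenue figure
# reported by two sources. Small differences are normal here.
daily_difference = [0.0, 1.5, -2.0, 0.5, -1.0, 2.0, -0.5]

typical_day = is_anomalous(daily_difference, 1.0)
broken_day = is_anomalous(daily_difference, 250.0)
```

Here `typical_day` stays within the historical range, while `broken_day` is flagged because two numbers that normally agree have suddenly diverged.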

Here’s how anomaly detection helps Andrew Mackenzie, Business Intelligence Architect at Appcues, perform his role:

“The important thing is that when things break, I know immediately—and I can usually fix them before any of my stakeholders find out.”

In other words, you can say goodbye to the dreaded WTF message from your stakeholders. In that way, automated, real-time anomaly detection is like a friend who has always got your back.

To take anomaly detection for a spin and put an end to poor data quality, sign up for Metaplane’s free-forever plan or test our most advanced features with a 14-day free trial. Implementation takes under 30 minutes.
