What is Data Freshness? Definition and Examples
If you care about whether your business succeeds or fails, you should care about data freshness. Fresh data is important because it has a huge impact on your bottom line. Unfortunately, that impact often goes undetected—until it’s too late.
Say your business uses data for operational purposes, and your data is stale, you could inadvertently send a discount code to a cohort of customers who already purchased your solution, inviting them to demand the same deal terms, costing you thousands of dollars.
If your business uses data for decision-making purposes, on the other hand, and your data is stale, you could underreport your return on ad spend, causing you to withdraw an investment that is actually paying dividends in reality.
Now that you know why data freshness matters, let’s dive into exactly what it means. In this blog post, you’ll find a definition, examples, and four methods for measuring data freshness.
What is data freshness?
Data freshness, sometimes called data up-to-dateness, is one of ten dimensions of data quality. Data is considered fresh if it describes the real world right now. This data quality dimension is closely related to the timeliness of the data but is compared against the present moment, rather than the time of a task.
What are some examples of stale data?
Imagine you’re a lead analytics engineer at Rainforest, an ecommerce company that sells hydroponic aquariums to high-end restaurants. Your data would be considered stale if the reported number of products sold were one week behind because of a Salesforce data ingestion issue. The same would be true if your list of sales reps was not up-to-date because of a problem with your HRIS software.
How do you measure data freshness?
To test your any data quality dimension, you must measure, track, and assess a relevant data quality metric. In the case of data freshness, you can measure the difference between latest timestamps against the present moment, the difference between a destination and a source system, verification against an expected rate of change, or corroboration against other pieces of data.
How to ensure data freshness
One way to ensure data freshness is through anomaly detection, sometimes called outlier analysis, which helps you to identify unexpected values or events in a data set.
Using the example of a stale number of products sold, anomaly detection software would notify you instantly if the frequency at which the table was updated was outside of the normal range. The software knows it’s an outlier because its machine learning model learns from your historical metadata.
Here’s how anomaly detection helps Andrew Mackenzie, Business Intelligence Architect at Appcues, perform his role:
“The important thing is that when things break, I know immediately—and I can usually fix them before any of my stakeholders find out.”
In other words, you can say goodbye to the dreaded WTF message from your stakeholders. In that way, automated, real-time anomaly detection is like a friend who has always got your back.
To take anomaly detection for a spin and put an end to poor data quality, sign up for Metaplane’s free-forever plan or test our most advanced features with a 14-day free trial. Implementation takes under 30 minutes.