Metaplane propels LogRocket's data quality forward
"Metaplane’s helped us to maintain values in key tables with 99.99% accuracy."
LogRocket identifies technical and UX issues in your application with AI. You can then quantify impact with analytics and watch session replays to see exactly what went wrong. As you can imagine, a company that relies so heavily on product data treats its own internal analytics with a high degree of rigor. We spoke with Elise Eagan, Lead Analytics Engineer, and Dave Fitzgerald, Sr. Analytics Engineer, who form the data team at LogRocket.
The data team at LogRocket today oversees a data stack of Google BigQuery, dbt, Fivetran, Hightouch, Metabase, and Metaplane. That stack runs the data pipelines that deliver insights across the business, for example:
- Product: refining the LogRocket customer experience with insights from LogRocket
- Customer success: finding customers who would benefit from adding related features to their existing usage
- Finance: reporting revenue to the board and calculating sales commissions
- Sales: feeding data directly into Salesforce to provide additional context for every account
- Marketing: improving the efficiency of paid ad and other marketing channels
Over the past two years, Elise and Dave have scaled the data program from a few events fed from Segment into BigQuery to a full-scale analytics program that includes Metabase dashboards, models, and full coverage of the critical data sources for every business unit.
Data Observability for Analytics
Evaluating data quality tools
The quest to find a data quality solution began with a particularly significant data source: Stripe. From its very start, the data team has partnered closely with its finance counterparts, who were immediately helpful in using their domain knowledge to identify scenarios where data looked slightly “off”. By this point, drawing on past experience, the team had already put several unit testing safeguards in place, including:
- Out of the box dbt tests - to check whether values were unique and not null
- Open source solutions - to alert on anomalous dbt job runtimes
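To make the first safeguard concrete: dbt ships generic tests named `unique` and `not_null`, normally declared in a model's YAML file. The sketch below is not dbt itself, only a plain-Python illustration of what those two checks assert. The table, column names, and sample rows are hypothetical.

```python
from collections import Counter


def check_not_null(rows, column):
    """Return the indexes of rows whose value in `column` is missing (None)."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]


def check_unique(rows, column):
    """Return the non-null values that appear more than once in `column`."""
    counts = Counter(row[column] for row in rows if row.get(column) is not None)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical sample rows standing in for a Stripe charges table.
charges = [
    {"charge_id": "ch_1", "amount": 100},
    {"charge_id": "ch_2", "amount": None},
    {"charge_id": "ch_2", "amount": 250},
]

null_amount_rows = check_not_null(charges, "amount")   # rows failing not_null
duplicate_ids = check_unique(charges, "charge_id")     # values failing unique
```

Checks like these catch structural problems, but they say nothing about whether values drift over time, which is the gap the team set out to close.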
The team jumped into action to triage the issue and discovered that there was data drift over time, with the suspicion that updates to dbt models were the cause. With that in mind, their search for a new solution included evaluation criteria such as:
- Understanding the impact on their data profile when updating a dbt model
- Preventing data drift and other issues stemming from updates to dbt models
That focus on pull requests for changes to dbt models led to the first of several proof points for the need for Metaplane.
❝At this point, we started to suspect an errant pull request (PR) led to Stripe data drift. While combing through PRs didn’t reveal any data quality issues, it raised the search for a solution that would find issues prior to merging a PR.
You can probably infer that a team led by Elise, who set up dbt on her first day on the job, moves fast. That extended to putting Metaplane through the wringer: the team integrated a few key tools, starting with BigQuery, dbt, and Metabase, to understand how they could avoid unintentional issues caused by updates to existing models.
With the desire to proactively prevent issues, the team began testing out Metaplane’s column-level lineage feature to understand how a change to one table might impact Metabase dashboards downstream.
❝Before that (column level lineage + regression testing features in Metaplane), for every model change, we’d have to manually identify downstream dashboards of models. Those relationships between Metabase and models are constantly shifting with the velocity and volume of stakeholder requests, forcing us to do that manual scoping each time.
The team especially wanted to keep data quality issues from affecting the finance team, and described the stakes as:
❝If we presented slightly different revenue numbers each month, we would naturally begin losing trust from our finance and executive teams. Providing something solid that they can trust is important for us to encourage use of this data.
Those lineage graphs only continued to become more valuable as the team began to discover incidents from their new data quality monitors.
LogRocket’s data quality monitoring strategy
Shifting the paradigm from firefighting to fire prevention, the data team knew that they needed to focus on business-critical data sources such as Salesforce, their customer relationship management platform, and Stripe, which helped them recognize revenue. Their key data quality metrics include:
- Row Counts: to understand whether data has materially changed in the source
- Freshness: to understand if ingestion and transformation pipelines have successfully run
- SUM of revenue in monthly reporting tables: continuing the partnership with the finance team to ensure that revenue for each month is within an expected, calculated range
While Metaplane’s machine learning automatically calculated acceptable thresholds for each of these data quality metrics, tailored to each monitored table, LogRocket’s data team also used monitor configurations including:
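The three metrics above are simple to compute. The following is a minimal sketch of each in plain Python, assuming an in-memory list of rows. The column names (`created_at`, `amount`) and the sample values are hypothetical, and a real monitor would run the equivalent aggregate queries against the warehouse.

```python
from datetime import datetime


def row_count(rows):
    """Number of rows currently in the table."""
    return len(rows)


def freshness(rows, timestamp_column, now):
    """Time elapsed since the most recent timestamp in the table."""
    latest = max(row[timestamp_column] for row in rows)
    return now - latest


def monthly_revenue(rows, date_column, amount_column):
    """SUM of `amount_column`, keyed by the YYYY-MM of `date_column`."""
    totals = {}
    for row in rows:
        month = row[date_column].strftime("%Y-%m")
        totals[month] = totals.get(month, 0) + row[amount_column]
    return totals


# Hypothetical revenue rows; amounts are in cents.
revenue_rows = [
    {"created_at": datetime(2024, 1, 15, 9, 0), "amount": 1000},
    {"created_at": datetime(2024, 1, 20, 9, 0), "amount": 2500},
    {"created_at": datetime(2024, 2, 3, 12, 0), "amount": 4000},
]
now = datetime(2024, 2, 3, 18, 0)

count = row_count(revenue_rows)
age = freshness(revenue_rows, "created_at", now)
by_month = monthly_revenue(revenue_rows, "created_at", "amount")
```

Each value becomes one point in a time series, and the monitoring question is whether the newest point falls inside the range expected from the earlier ones.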
- Group by Monitors - Their table used for revenue reporting has a date column that Metaplane’s monitors are grouped by, to train machine learning monitors on distinct logical groups within the given table(s).
- Manual thresholds - Stripe table(s) with revenue values have manually set thresholds based on the finance team’s calculations for each month, since the data can change retroactively due to source-side inputs.
- Sensitivity settings - For key tables such as feature usage metrics relied upon by Customer Success Managers for renewals and upsells, the data team raised monitor sensitivity to alert on any potential issues.
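How these three configurations interact can be sketched in a few lines. The source does not describe Metaplane's models, so this example swaps in a deliberately simple stand-in: a band of the mean plus or minus a few standard deviations, computed per group. The function names, the `3 / sensitivity` rule, and all sample numbers are assumptions for illustration only.

```python
from statistics import mean, stdev


def learned_bounds(history, sensitivity=1.0):
    """Stand-in for a learned threshold: mean +/- (3 / sensitivity) std devs.

    A higher sensitivity narrows the band, so smaller deviations alert.
    """
    center = mean(history)
    width = (3.0 / sensitivity) * stdev(history)
    return center - width, center + width


def check_groups(history_by_group, latest_by_group, manual_bounds=None, sensitivity=1.0):
    """Return the groups whose latest value falls outside their bounds.

    Manual bounds, when given for a group, take priority over learned ones.
    """
    manual_bounds = manual_bounds or {}
    alerts = []
    for group, value in latest_by_group.items():
        if group in manual_bounds:
            low, high = manual_bounds[group]
        else:
            low, high = learned_bounds(history_by_group[group], sensitivity)
        if not low <= value <= high:
            alerts.append(group)
    return alerts


# Hypothetical history of observed revenue sums, grouped by reporting month.
history = {
    "2024-01": [100, 102, 98, 101, 99],
    "2024-02": [200, 205, 195, 202, 198],
}
latest = {"2024-01": 104, "2024-02": 201}

default_alerts = check_groups(history, latest)
sensitive_alerts = check_groups(history, latest, sensitivity=2.0)
manual_alerts = check_groups(history, latest, manual_bounds={"2024-02": (199.5, 200.5)})
```

With these numbers the default band raises no alerts, doubling the sensitivity flags `2024-01`, and the tight manual range flags `2024-02`. Grouping matters because a single band across all months would be too wide to catch a change within any one of them.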
After deploying more than 50 data quality monitors, the team was able to relax (as much as a team of 2 supporting over 100 people can, anyway), knowing that they were able to centralize data quality incident alerts for their key objects. For example, in the Stripe data drift issue that kickstarted the search for a data observability solution, they were able to discover that the issue wasn’t related to dbt model changes, but rather, changes in values (e.g. statuses) in the Stripe application itself, which led to gaps in previously accurate conditional statements found in modeling queries.
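The failure mode described here, a new source value slipping past conditional logic written for the old set of values, can be illustrated with a small sketch. The status names and revenue buckets below are hypothetical and are not taken from LogRocket's models. The point is only that a mapping should surface values it does not recognize instead of silently dropping them.

```python
# Hypothetical mapping, playing the role of a CASE WHEN in a modeling query.
KNOWN_STATUS_TO_BUCKET = {
    "active": "recurring",
    "past_due": "recurring",
    "canceled": "churned",
}


def bucket_revenue(rows):
    """Sum amounts per bucket and report any status the mapping does not cover."""
    totals = {}
    unmapped = set()
    for row in rows:
        bucket = KNOWN_STATUS_TO_BUCKET.get(row["status"])
        if bucket is None:
            # A status introduced on the source side after the query was written.
            unmapped.add(row["status"])
            continue
        totals[bucket] = totals.get(bucket, 0) + row["amount"]
    return totals, sorted(unmapped)


subscriptions = [
    {"status": "active", "amount": 200},
    {"status": "past_due", "amount": 100},
    {"status": "canceled", "amount": 50},
    {"status": "paused", "amount": 75},  # not covered by the mapping
]

totals, unmapped_statuses = bucket_revenue(subscriptions)
```

Here the 75 attached to the unrecognized status is missing from every bucket. A monitor on the revenue sum would notice that shortfall even though no model code had changed.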
Stripe data recently came under scrutiny again, this time because of an outage in the Fivetran connector that loads Stripe data into their BigQuery instance.
❝Metaplane was the first to alert on an issue when metrics began failing due to a Fivetran connector outage. We were able to use the downstream lineage to find the impact on all of the BigQuery tables and Metabase dashboards that needed cleanup.
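Finding everything downstream of a failed connector is a graph traversal. The sketch below assumes lineage is available as a simple mapping from each asset to its direct dependents, which is a simplification and not Metaplane's data model. Every asset name is hypothetical.

```python
from collections import deque


def downstream(lineage, start):
    """Breadth-first walk returning every asset reachable from `start`."""
    seen = {start}
    queue = deque([start])
    impacted = []
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                impacted.append(child)
                queue.append(child)
    return impacted


# Hypothetical lineage: connector -> raw tables -> model -> dashboard.
lineage = {
    "fivetran.stripe_connector": [
        "bigquery.stripe.charges",
        "bigquery.stripe.subscriptions",
    ],
    "bigquery.stripe.charges": ["dbt.fct_revenue"],
    "bigquery.stripe.subscriptions": ["dbt.fct_revenue"],
    "dbt.fct_revenue": ["metabase.board_revenue_dashboard"],
}

needs_cleanup = downstream(lineage, "fivetran.stripe_connector")
```

The `seen` set keeps a shared dependency, such as a model fed by two raw tables, from being listed twice.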
As for their initial evaluation criterion, avoiding incidents caused by model updates, the team continues to use the Metaplane GitHub App to forecast the tables, models, and dashboards affected by a change, along with the actual variances in values, shown through regression testing.
The data team at LogRocket continues to use Metaplane as additional insurance for their data. They’ve been able to keep key values accurate to within 0.01%, one of their service level agreements (SLAs), and had this to say:
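The idea behind regression testing a model change can be sketched as a before-and-after comparison of summary metrics. This is not how the Metaplane app is implemented. The function name, metric names, and the 0.0001 relative tolerance are illustrative assumptions.

```python
def compare_metrics(before, after, tolerance=0.0001):
    """Return metrics whose relative change exceeds `tolerance`.

    A metric missing from `after` is reported with a value of None.
    """
    changed = {}
    for name, old in before.items():
        new = after.get(name)
        if new is None:
            changed[name] = None
            continue
        denominator = abs(old) if old != 0 else 1.0
        relative_change = abs(new - old) / denominator
        if relative_change > tolerance:
            changed[name] = relative_change
    return changed


# Hypothetical metrics computed on the main branch and on the pull request branch.
before = {"revenue_2024_01": 100000.0, "row_count": 5000}
after = {"revenue_2024_01": 100005.0, "row_count": 5200}

variances = compare_metrics(before, after)
```

In this example the revenue figure moves by 0.005%, inside the tolerance, while the row count moves by 4% and is reported for review before the change is merged.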
❝As we add more important data sets that people are relying on, we definitely want to incorporate Metaplane into those plans to make sure that those tables can be trusted, and the data team can fix every issue before anyone is negatively impacted.
Future plans for Metaplane usage include deployment of monitors on objects related to future product releases and Custom SQL monitors representing unique business logic.
Among the recent product releases is LogRocket’s Streaming Data Export (SDE), which creates a direct, near-real-time connection between LogRocket and your preferred data warehouse, delivering session data in a ready-to-analyze format. You can now export dozens of data points covering user behavior, traits, telemetry, errors, and more. This lets you immediately inspect and understand your LogRocket data through the lens of your entire BI stack. If this sounds interesting to you, sign up for a free account or get a personalized demo today!