How Appcues reduced data quality issues by 77% with automated, real-time anomaly detection
“The important thing is that when things break, I know immediately—and I can usually fix them before any of my stakeholders find out.”
“I’m the only one doing what I do.”
As Business Intelligence Architect at Appcues, Andrew “Andy” Mackenzie is responsible for the entire spectrum of data engineering, from data collection to visualization.
“I’ve been able to handle everything that’s thrown at me so far, but that definitely won’t last if we keep growing at this pace,” Andy said.
Andy serves a variety of stakeholders, from business leaders to individual contributors, across both customer-facing and non-customer-facing teams.
“My goal is to make sure that everyone has the data they need to do their work efficiently and effectively,” Andy said.
While business leaders prefer to consume data via sophisticated business intelligence tools like Looker, individual contributors often need data extracted from unique sources and pushed into third-party operational tools.
“I try to minimize the number of tools employees need to manage,” Andy said. “Otherwise, it interrupts their workflows.”
Feeling nervous about silent data bugs
Before Metaplane, data quality issues were flagged by Andy’s stakeholders at least once a week—and sometimes two, or even three, times in a single week.
“There were weeks in which I spent much of my time dealing with data quality issues that my colleagues identified,” Andy said.
Being the only “data person” on the Appcues team means everyone assumes Andy’s responsible when data quality issues arise. That made him nervous before he brought on Metaplane.
“I wouldn’t even know about an issue until someone pinged me,” Andy said. “It reflected poorly on me.”
As an example, Andy shared the time he wiped a table, loaded it fresh, and only half the records showed up. Because he didn’t know, he pushed the incomplete table into a third-party system. Only after a stakeholder flagged it did he discover the error.
“I kept experiencing events like this,” Andy said. “It made others question whether they could trust our data—a data engineer’s worst nightmare.”
Andy knew he needed to take action.
“I kept saying to myself that I should really build some data quality checks into my ETL jobs,” Andy said.
If he took that route, he would need to conceptualize how to solve the problem, implement his solution, and maintain the code base. Building in-house would take a greater time investment up front than continuing to react to issues as they happened.
“The time it would take held me back,” Andy said. “Everything on my to-do list was a higher priority.”
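To make the idea of a check built into an ETL job concrete, here is a minimal sketch in Python. It is not Andy's code and not how Metaplane detects anomalies. The function name, the 30% tolerance, and the row counts are all hypothetical. It simply compares the newest row count for a table against the median of recent loads, which is enough to catch both a half-loaded table and a double load.

```python
from statistics import median


def row_count_anomaly(history, latest, tolerance=0.3):
    """Return True when `latest` deviates from the median of `history`
    by more than `tolerance` (a fraction: 0.3 means 30%).

    Hypothetical helper for illustration only.
    """
    if not history:
        return False  # nothing to compare against yet
    baseline = median(history)
    if baseline == 0:
        return latest != 0
    return abs(latest - baseline) / baseline > tolerance


# Hypothetical row counts from the last four loads of one table.
recent_counts = [10_000, 10_120, 10_250, 10_310]

half_loaded = row_count_anomaly(recent_counts, 5_100)     # only about half the records arrived
double_loaded = row_count_anomaly(recent_counts, 20_600)  # the data was loaded twice
normal_load = row_count_anomaly(recent_counts, 10_400)    # ordinary growth

print(half_loaded, double_loaded, normal_load)
```

Even a check this small has to be wired into every job, tuned per table, and maintained over time, which is the up-front cost Andy describes.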
Finding a fast solve to a frustrating problem
When Andy discovered Metaplane, he felt relieved.
“I finally found something that does all of my checks for me,” Andy said.
Implementation was quick and smooth, and instantly led to fewer data quality issues.
“It took minutes to plug into my data stack,” Andy said. “Now, when I break something, or something breaks on its own, I know immediately because Metaplane flags it on Slack.”
For example, four days before our interview, Andy received an alert that his row counts on a specific table had doubled.
“I loaded all the data in twice, so everything was duplicated,” Andy said.
His sales team would have received a flood of notifications about upgrade opportunities—if it hadn’t been for Metaplane.
“I was instantly notified, which allowed me to resolve the problem before it impacted my stakeholders,” Andy said.
Streamlining productivity for the entire team
Data quality issues now arise just once a month for Andy, down from at least once a week (roughly 4.3 times a month). In other words, he experiences about 77% fewer issues now that he has Metaplane.
The best part for him is how reliable the data is.
“My confidence is a lot higher now, because I can trust that if I haven’t seen an alert, everything is in good shape,” Andy said. “Now, most of the time when people raise issues with me, it’s far more likely that they’re just confused or it was a user error, not that anything’s wrong with their data.”
He estimates that he’s five times more confident than before and receives ten times fewer messages from stakeholders.
“Metaplane has streamlined the entire team’s productivity, because stakeholders spend an average of 20 to 40 hours fewer per month investigating issues with their data,” Andy said.
That’s in addition to the five hours per month it saves Andy.
Andy’s advice for you (and his younger self)
If Andy could go back in time, he would do things differently. Not only would he implement Metaplane sooner, but he would also put it on as many tables as possible.
“I don’t get too many alerts because the model isn’t overly sensitive,” Andy said. “So, I’d fully leverage the tool.”
Andy would also take advantage of the custom SQL tool.
“I have injured myself with data changes and might have known about them if I used custom SQL,” Andy said. “I definitely could have saved myself some pain.”
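As an illustration of the kind of rule a custom SQL check can express, the sketch below flags keys that appear more than once in a table, the symptom of a double load. It is an assumption-laden example, not Metaplane's syntax or configuration: the table name `upgrade_opportunities`, the column `account_id`, and the sample rows are hypothetical, and the query runs against an in-memory SQLite database purely so the example is self-contained.

```python
import sqlite3

# Hypothetical check: any account_id appearing more than once is a duplicate.
DUPLICATE_CHECK_SQL = """
SELECT account_id, COUNT(*) AS copies
FROM upgrade_opportunities
GROUP BY account_id
HAVING COUNT(*) > 1
ORDER BY account_id
"""


def find_duplicates(conn):
    """Return (account_id, copies) for every duplicated key."""
    return conn.execute(DUPLICATE_CHECK_SQL).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE upgrade_opportunities (account_id INTEGER, plan TEXT)")

rows = [(1, "growth"), (2, "enterprise"), (3, "growth")]
conn.executemany("INSERT INTO upgrade_opportunities VALUES (?, ?)", rows)
clean_result = find_duplicates(conn)       # no duplicates after a single load

conn.executemany("INSERT INTO upgrade_opportunities VALUES (?, ?)", rows)
duplicated_result = find_duplicates(conn)  # every key now appears twice

print(clean_result)
print(duplicated_result)
```

A check like this passes when the query returns no rows and fails as soon as it returns any.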
Finally, Andy recommends other data teams use the Slack integration.
“It’s the best because it pings my phone,” Andy said. “I know the second something gets detected.”