What does data quality even mean?
One of the less discussed, but equally important, reasons data quality is so hard to own is deceptively simple: data quality (and ownership of it) means different things to different people.
Data quality in a technical sense is often boiled down to a checklist of dimensions: freshness, completeness, consistency, distribution, validity, and uniqueness. These six pillars are frequently cited in data quality frameworks, and while they're a useful starting point, they describe only part of the picture. That's where the disconnect starts.
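To make the checklist concrete, here's a minimal sketch of what those six dimensions can look like as automated checks. It assumes a hypothetical `orders` table with `loaded_at` (a UTC timestamp), `order_id`, `customer_id`, `amount`, `quantity`, `unit_price`, and `currency` columns; the thresholds are illustrative, not prescriptive.

```python
import pandas as pd

# Hypothetical orders table; column names and thresholds are illustrative.
orders = pd.read_parquet("orders.parquet")

checks = {
    # Freshness: did the most recent row land within the last hour?
    # (Assumes loaded_at is stored as a UTC timestamp.)
    "freshness": pd.Timestamp.now(tz="UTC") - orders["loaded_at"].max() < pd.Timedelta(hours=1),
    # Completeness: key fields are never null.
    "completeness": orders[["order_id", "customer_id", "amount"]].notna().all().all(),
    # Consistency: amount agrees with quantity * unit_price (within a cent).
    "consistency": (orders["amount"] - orders["quantity"] * orders["unit_price"]).abs().max() < 0.01,
    # Distribution: the vast majority of amounts fall in the expected range.
    "distribution": orders["amount"].between(0, 10_000).mean() > 0.99,
    # Validity: currency codes match the expected set.
    "validity": orders["currency"].isin({"USD", "EUR", "GBP"}).all(),
    # Uniqueness: no duplicate order IDs.
    "uniqueness": orders["order_id"].is_unique,
}

failed = [name for name, passed in checks.items() if not passed]
print("all checks passed" if not failed else f"failed checks: {failed}")
```

Even a toy example like this makes the limits obvious: every check asserts something about one table in isolation, which is exactly where the disconnect begins.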
Data issues can emerge anywhere in the pipeline: from collection and ingestion, through transformation and transportation, to consumption. Each stage has its own stakeholders, with different needs and different definitions of what quality looks like:
- At the ingestion stage, data engineers tend to focus on freshness (is the data arriving on time?), validity (does it match the expected schema?), and completeness (are all expected records present?). If those checks pass, the data is labeled “high quality.”
- But as the data moves downstream into the transformation layer, where it’s joined with other sources and reshaped into business-friendly tables, new expectations surface. Distribution suddenly matters. Outliers might suggest broken logic or faulty assumptions. And consistency becomes critical: are fields aligned across datasets? Does a customer ID mean the same thing across the board? (See the sketch after this list for what checks at this layer can look like.)
- Finally, in the consumption layer, where dashboards, metrics, and machine learning outputs are created, quality takes on yet another form. Business users typically assume that the data is complete and fresh by default. Their concern is with accuracy in the metrics, reliability of trends, and whether the data makes sense. They’re operating at a level where even subtle shifts in data distribution can undermine trust, but they don’t have visibility into upstream issues.
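As a rough sketch of the transformation-layer lens, the checks below look across datasets rather than at a single feed: referential consistency after a join, join fan-out, and a crude comparison of the latest distribution against history. The table and column names (`fct_orders`, `dim_customers`, `order_date`, `amount`) are hypothetical, and the thresholds are arbitrary.

```python
import pandas as pd

# Hypothetical modeled tables; names and thresholds are illustrative.
fct_orders = pd.read_parquet("fct_orders.parquet")
dim_customers = pd.read_parquet("dim_customers.parquet")

# Consistency: every customer_id in the fact table exists in the dimension.
orphaned = ~fct_orders["customer_id"].isin(dim_customers["customer_id"])
assert not orphaned.any(), f"{orphaned.sum()} orders reference unknown customers"

# Consistency: the dimension has one row per customer, so joins won't fan out
# and silently multiply revenue.
assert dim_customers["customer_id"].is_unique, "duplicate customer rows detected"

# Distribution: the latest day's order amounts shouldn't drift far from history.
latest_day = fct_orders["order_date"].max()
recent = fct_orders.loc[fct_orders["order_date"] == latest_day, "amount"]
baseline = fct_orders.loc[fct_orders["order_date"] < latest_day, "amount"]
drift = abs(recent.median() - baseline.median()) / baseline.median()
assert drift < 0.5, f"median order amount shifted by {drift:.0%} vs. historical baseline"
```

None of these assertions would fire at ingestion time, because they only make sense once multiple sources have been joined and reshaped.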
This mismatch in expectations leads to a classic situation: a data engineer sees the data as high quality because all ingestion checks passed. Meanwhile, a business stakeholder opens their dashboard and sees a 30% drop in monthly revenue — and immediately declares the data “broken.”
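One way to narrow that gap is a guardrail that lives at the consumption layer, right next to the metric the stakeholder actually looks at. A minimal sketch, assuming a hypothetical `fct_orders` table with `order_date` and `amount` columns; the 30% threshold mirrors the scenario above and is otherwise arbitrary.

```python
import pandas as pd

# Hypothetical revenue fact table; names are illustrative.
fct_orders = pd.read_parquet("fct_orders.parquet")

# Aggregate to the same grain the dashboard shows: revenue per month.
monthly_revenue = (
    fct_orders
    .assign(month=fct_orders["order_date"].dt.to_period("M"))
    .groupby("month")["amount"]
    .sum()
    .sort_index()
)

# Flag a month-over-month drop large enough to erode stakeholder trust,
# even when every upstream ingestion check passed.
change = monthly_revenue.iloc[-1] / monthly_revenue.iloc[-2] - 1
if change < -0.30:
    print(f"monthly revenue dropped {abs(change):.0%}; investigate before it reaches the dashboard")
```

A check like this doesn't say whose fault the drop is; it only ensures the people closest to the metric aren't the last to find out.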
Both perspectives are valid. And that’s the problem.
Without a shared understanding of what data quality really means — across the full pipeline — it becomes nearly impossible to define ownership or enforce accountability. Everyone’s measuring quality through their own lens. No one’s aligned on what matters most.