Digital Transformation Doesn’t Solve “Garbage in/Garbage out”

But COVID-19 adaptations can improve your data discipline

Tom Hirata
May 12, 2020
(Image: Rufino / CC BY-SA-2.0)

When you shake hands, you’re “touching” everything the other person touched after their last sanitizing. That’s always been true, but the COVID-19 pandemic forced us to rethink this ancient custom. Now we realize that common everyday actions can increase or decrease spread of the virus. Now it’s not rude to avoid a handshake. Facemasks are common items, even fashion statements. These are rapid adaptations to the pandemic upheaval.

Digital transformation (DX) is a partial solution to today’s essential challenge: Creating insights and innovation from today’s data wave. Data come from smartphones, social media, connected devices, and new technologies every year. As a result, the volume, velocity, and variety of data explode. Technology helps to gather, transport, and store data more efficiently, and it creates new data at each step. Just like handshakes and COVID-19, the DX upheaval requires attention to “hygiene” risk in your data.

DX doesn’t eliminate “garbage in/garbage out”

(Image: Jerzy Górecki/Pixabay)

Digital transformation is about changing your organization through the application of “big data” tools to advance decision-making, efficiency, or customer experience. Transformations solve some problems, and create new ones. Data hygiene is an old problem with new urgency under DX.

The increased volume, velocity, and variety of data sources increase the risk of “garbage in/garbage out” (GIGO). DX adds complicated layers of data manipulation and black boxes, making the GIGO problem worse. The COVID-19 upheaval reminds us how small actions, like handshakes, can add up to large-scale risks in your environment.

Do you trust your data?

Data of doubtful quality and questionable meaning risk inefficiency and poor decisions. In data management, the “handshake risk” occurs when people enter, retrieve, transform, output, or represent data. Here are some examples:

Examples of data risk: [a] numeric values can be misheard or mistyped; [b] date values can be converted across different formats; [c] a days-in-month denominator depends upon what you’re measuring; [d] code that imports data must read the correct columns in the file; [e] data labels need to match the underlying metric

I’ve seen “hygiene” problems (that is, data risk) in every step of the data supply chain. For example: An agent hears “fifty” as “fifteen”, or enters 500. To save production hours, a text column is “repurposed” to hold a date. An ETL process is coded to read the wrong data columns. A staffing report divides by calendar days instead of production days. A dashboard report mislabels a value. The common element is people. The solution is discipline and process.
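One of the risks above, an import that reads the wrong columns, can be reduced by reading columns by header name rather than by position and failing loudly when a header is missing. Here is a minimal sketch in Python, assuming a CSV source; the column names and the `read_rows` helper are hypothetical and only illustrate the idea.

```python
import csv
import io

# Hypothetical column names for illustration; they are not from any real system.
REQUIRED_COLUMNS = ("agent_id", "call_date", "amount")


def read_rows(csv_text):
    """Read rows by header name and raise if a required column is missing."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = set(REQUIRED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    # Keep only the required columns, whatever order they arrived in.
    return [{name: row[name] for name in REQUIRED_COLUMNS} for row in reader]


# The columns arrive in an unexpected order, with an extra "notes" column.
sample = "amount,notes,agent_id,call_date\n50,ok,A17,2020-05-12\n"
rows = read_rows(sample)
print(rows[0]["amount"])
```

A check like this does not catch a misheard “fifty”, but it can turn a silent column mix-up into a visible error at import time.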

The cost of untrustworthy data multiplies with the complexity in your data environment. Data volume doubles roughly every three years as more technologies create more data. Artificial intelligence (AI) and business intelligence (BI) tools add layers and black boxes between data and people: your decision makers and consumers. “Garbage in/garbage out” will impair the value of DX investments. And emerging privacy regulations impose new obligations, such as data deletion for a single customer, that rely upon trustworthy data.

Ask the right questions to uncover hidden complexity — and costs

“Repurposing” a text field for a date value is necessary sometimes. Text fields are flexible, but the flexibility comes at a cost. It adds complexity in hidden places. Inconsistent date formats might be used when importing data. Invalid dates (“2/31/2020”) could be entered manually. Report code needs additional transformation steps to convert the text to a date. These workarounds create hidden costs in error correction, additional quality reviews, and operational complexity.
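To make the extra transformation step concrete, here is a minimal sketch in Python of the kind of check a repurposed text field ends up needing. The list of accepted formats and the `parse_date_text` helper are assumptions for illustration; a real system would define its own formats and should avoid accepting ambiguous pairs such as month/day and day/month together.

```python
from datetime import datetime

# Assumed set of accepted input formats; a real system would define its own.
ACCEPTED_FORMATS = ("%m/%d/%Y", "%Y-%m-%d")


def parse_date_text(text):
    """Return a date for a text value, or None if no accepted format matches."""
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            # Wrong format or an impossible date such as 2/31/2020.
            continue
    return None


for value in ["5/12/2020", "2020-05-12", "2/31/2020", "May 12"]:
    print(value, "->", parse_date_text(value))
```

Values that come back as `None` still have to be routed to someone for correction, which is the hidden operational cost described above.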

This hidden complexity increases change management risk. People have to ask the right questions. Did you engage the operations and training teams? Can technology deliver an input control to validate the format? What other quality processes are in place to support the people? If the answer is “no”, then poor data quality becomes part of the implementation cost. You may save development costs, but the full costs are visible when you redefine essential workers.

“Only the paranoid survive”

Andrew Grove’s paranoia dictum is about adapting to environmental changes. It applies equally to the trustworthiness of data. Do you trust the quality? Do your dashboards inform decisions? If not, you aren’t getting full value from your data or your big data tools.

I’ve helped businesses create data quality programs through platform conversions and regulatory changes. In my experience, maintaining trustworthy data requires “appropriate vigilance” if not paranoia. Trustworthiness is the result of creative energy focused on asking the right questions and designing repeatable processes.

Start small, but get started!

(Image: Mohamed Hassan & Pexels)

Avoiding data risk can start anywhere in your organization, and with almost anyone. Floor supervisors know when effective closers are quitting because of unnecessary data entry steps. Quality analysts see where errors pile up. The people who care the most will bring passion and energy to the effort. Find them, and they’ll spread a good “viral contagion” to jump-start your transformation.

Delivering value from DX, that is, creating a “data-centric organization”, is about creating appropriate processes to bridge technology, analytics, and business teams. The cross-functional teamwork and processes aren’t “off-the-shelf” widgets; they endure when they’re “home grown” from within your organization.

Technology doesn’t create insights and innovation. People do. People interpret meaning from data, choose how to organize data, find insights in data, and innovate from the insights. The growing complexity of technology and organizations means that these “people-driven activities” aren’t effective without appropriate processes to support them.

My next two posts will provide practical examples of how to improve data quality and business metadata. Then we’ll circle back to data stewardship more broadly as a business capability that enables digital transformation.

This article is part of a series about lessons from the COVID-19 pandemic. Check out my earlier posts here.

Tom Hirata is the Founder of Data Mandala and developer of the More Meaning and Less Cleaning framework for small-medium businesses to level-up their data environments for digital transformation and privacy compliance.

To schedule a Privacy Compliance Check-up call with me, where you’ll learn 3 capabilities you must have for consumer privacy regulations like the CCPA and GDPR, click here. You can also message me on LinkedIn.

When you’re ready to think beyond COVID-19, I’m here to accelerate your transformation.

Tom Hirata

I write about keeping people and purpose at the center of technology transformation.