Tuesday, March 23, 2010

Data Quality is a DATA issue

In the blogosphere, on Twitter and at conferences I often come across mentions and discussions of whether data quality is a business issue or whether it is a technical/technological issue. It's normally an either/or question, with no other options considered.

And when I come across such discussions and statements I stick my fingers in my ears, sing la la la and try to think happy thoughts whilst I click on to something more to my liking. But sometimes I feel an overwhelming need to comment ...

Not that data quality CAN'T be a business issue - it can ... in businesses; and it can certainly be a technical issue. What gets me is the tunnel vision that surrounds this discussion. One would think that the only place data exists is in businesses (and large ones at that), that no data quality professionals exist outside them, and that data only exists on computers.

We are surrounded by data (and its cousins, information and intelligence). It is in businesses, but it is also found and used in huge quantities outside them: in government, public utilities and health services. It is found in large businesses, but also in huge quantities in small businesses where there is no talk of warehousing, OLAP, management buy-in or any other such expression we can think of.

When a patient goes into surgery, for example, the purpose of data quality is not to make money but firstly to prevent a death (for example by transfusing blood of the wrong type) and secondly to achieve a health improvement for the patient.

Obviously, for those wrestling every day with data quality and company politics within large corporations, data quality can appear to be a business issue. But we need a much more generic outlook on data quality.

Maybe we just need to be more careful in our use of language. "Poor data quality has an effect on the economics of a business": absolutely!

Data quality CAN be a business issue.
Data quality CAN be a technical issue.
Data quality CAN be a customer issue.
Data quality CAN be a health issue.

Data quality is ALWAYS a DATA issue.

And that's how I think we should regard it.