The year is ending and two calendars for 2010 arrive from "data quality" companies, each with strange ideas about my postal address.
The first is from a UK data quality exhibition organiser, who addresses me as:
Mr Other Graham Rhind
I can't imagine what is supposed to have been in the field which output as "Other", and I celebrate my individuality, but not in this particular way.
The address block (for my Dutch address) ends with a British postal code and "GB", struck out by the postal services to allow the mailing to reach me - eventually. I recognise the postal code as the one I use when I am presented with a web form which does not allow me to add my Dutch postal code. This company is happy to invite foreigners to its exhibitions; it just won't allow them to register without providing false data.
The second is from a Spanish company, which puts my name below the final address line, a placement guaranteed to delay mail because it's where the sorting machines expect to find the country name or the postal code. I am addressed as "D. Graham Rhind" - that's D for Don - and the postal code line reads:
52000 1018 VV Amsterdam
I guess that this company uses a Spanish CRM system that only allows a Spanish postal code to be entered. To allow my (not at all Spanish) details to be entered they have added the least used Spanish code in the postal code field (for Melilla) and then put my Dutch code into the place name field.
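To make the point concrete, here is a minimal sketch of postal code checking that depends on the country rather than assuming Spain. The names (`POSTAL_CODE_PATTERNS`, `is_valid_postal_code`) and the two patterns are my own illustration, not anything taken from the CRM system in question, and a real system would need a maintained rule set for every country it serves.

```python
import re

# Illustrative patterns for two countries only (an assumption for this sketch).
POSTAL_CODE_PATTERNS = {
    # Spain: five digits, the first two being a province prefix from 01 to 52.
    "ES": re.compile(r"(0[1-9]|[1-4][0-9]|5[0-2])[0-9]{3}"),
    # Netherlands: four digits (not starting with 0), optional space, two letters.
    "NL": re.compile(r"[1-9][0-9]{3} ?[A-Z]{2}"),
}


def is_valid_postal_code(country_code: str, postal_code: str) -> bool:
    """Check a postal code against the pattern for its own country."""
    pattern = POSTAL_CODE_PATTERNS.get(country_code)
    if pattern is None:
        # No rule known for this country: accept the value rather than
        # forcing the customer to invent data that fits another country.
        return True
    return pattern.fullmatch(postal_code.strip().upper()) is not None
```

With rules like these, "1018 VV" passes for the Netherlands and fails for Spain, so there is no reason to park it in the place name field behind a borrowed Melilla code.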
Remember, these are both data quality companies. I can see that I still have plenty of work to do to bring the message of our cultural diversity to all in 2010.
Thursday, December 31, 2009
Wednesday, December 2, 2009
Informatica - a step forward in web form quality
Yet another e-mail (from Informatica Netherlands this time) with news of a new white paper. Wearily (why wearily? check this blog entry to find out), I click to check the web form ...

Hey! Hang on a minute. Informatica have actually had the nous (British slang approximately meaning common sense, intelligence ...) to have pre-filled the form with my data (which they indeed already have in their system). And what's this? No state field? And yes, there's Montenegro in the country list, back in all its glory.
Could this be a result in my crusade for better Internet data collection? I gingerly change the country to Canada and yes! The province field appears! Not in a sensible place, unfortunately (if you're going to change fields, don't change them where the customer has already been in the form - the country should be asked beforehand).
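A rough sketch of the ordering I have in mind, with the country asked first so that the later fields can adapt to it. The field names and the `SUBDIVISION_LABELS` table are hypothetical and cover only the two countries mentioned here; this is not how Informatica's form is built.

```python
# Hypothetical labels; only the United States and Canada are covered here.
SUBDIVISION_LABELS = {"US": "State", "CA": "Province or territory"}


def address_fields(country_code: str) -> list[str]:
    """Return the address fields to show, in display order, for a country.

    Country comes first so that fields which depend on it never change
    in a part of the form the customer has already filled in.
    """
    fields = ["Country", "Street address", "City"]
    label = SUBDIVISION_LABELS.get(country_code)
    if label is not None:
        # Only ask for a state or province where the country has one.
        fields.append(label)
    fields.append("Postal code")
    return fields
```

For the Netherlands this yields no state field at all; for Canada the province field appears after the city, below the country selection that triggered it.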

But what can I say? RESULT!!

Tuesday, November 24, 2009
B-Eye Network - another web form of shame
Another invitation to download a white paper, this time from the B-Eye Network, another organisation with data quality at its heart - except when it comes to its web form.

What hit me first with this form is that there is no indication of which fields are required, though you can be sure that they are there. In fact, to find out which fields I should fill in (according to B-Eye) I have to fill in the fields I can complete, hit the send button, and then hope for the best (or pray, depending on your personal preference). Only at that point will the form tell you what is required. And what is required shows a grave lack of understanding of the world out there.

I don't have to fill in a state (or a province), and I am very grateful for small mercies. But I do have to fill in a "last" name (that's a culturally loaded field label, by the way), even if I don't have one, and fill in a Zip/Postal Code, again, even if I live in one of the 60 or so countries or territories without one.
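One way to avoid the submit-and-pray approach is to derive the required fields from a single set of rules, which can be used both to mark the labels before submission and to validate afterwards. The sketch below is only an illustration under my own assumptions: the field names are hypothetical, the name is a single field rather than "first" and "last", and `NO_POSTAL_CODE` lists just two examples of territories without postal codes rather than all 60 or so.

```python
# Illustrative and deliberately incomplete: Hong Kong and the
# United Arab Emirates are two examples of places without postal codes.
NO_POSTAL_CODE = {"HK", "AE"}


def required_fields(country_code: str) -> set[str]:
    """Fields that must be filled in; can be used to mark labels up front."""
    required = {"name", "country"}
    if country_code not in NO_POSTAL_CODE:
        required.add("postal_code")
    return required


def missing_fields(form: dict[str, str]) -> set[str]:
    """Return the required fields left empty, using the same rules shown to the user."""
    needed = required_fields(form.get("country", ""))
    return {field for field in needed if not form.get(field, "").strip()}
```

Because the same function drives both the labels and the validation, the customer is never surprised by a requirement that appears only after pressing the send button.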
And then we look at the country drop down, always very instructive. I'll gloss over the "Falkland Islands (Islas Malvinas)" ... touchy subject ... and land instead on Serbia and Montenegro, a country that hasn't existed since 2006.

I await eagerly the next invitation to download a white paper. Who dares?

Monday, November 23, 2009
Chapeau Talend!
Talend have corrected their web forms! Thanks to them for listening and reacting! Chapeau!
Saturday, November 21, 2009
Broken business processes
On the day that I posted about the poor web form design at Informatica I received an e-mail from Talend (another data quality company) inviting me to download a white paper. Inevitably, this wasn't free - I had to provide them with information. And on their drop down for country name they forgot Faeroe Islands but, more damagingly, included "Yugoslavia". Yugoslavia hasn't existed since 2003, and our memories shouldn't be so short that we forget the bloodshed which accompanied its disintegration.
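A country list can be checked for this kind of staleness automatically. The sketch below is a hedged illustration, not a description of anybody's actual process: it compares the codes offered in a drop down against ISO 3166-1 alpha-2 codes that have been withdrawn (YU for Yugoslavia, CS for Serbia and Montenegro) and against a list of codes the form is expected to offer. The function name and the sample lists are my own assumptions.

```python
# ISO 3166-1 alpha-2 codes that have been withdrawn from the standard.
WITHDRAWN = {
    "YU": "Yugoslavia",
    "CS": "Serbia and Montenegro",
}


def audit_country_list(offered: set[str], expected: set[str]) -> dict[str, list[str]]:
    """Report withdrawn codes still on offer and expected codes that are missing."""
    return {
        "obsolete": sorted(offered & set(WITHDRAWN)),
        "missing": sorted(expected - offered),
    }


# Hypothetical drop down contents, for illustration only.
report = audit_country_list(
    offered={"NL", "YU", "RS"},
    expected={"NL", "FO", "ME", "RS"},
)
```

Here `report` flags YU as obsolete and FO (Faroe Islands) and ME (Montenegro) as missing, which is exactly the kind of check that could run before a form goes online.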

Look, I know how difficult it is for companies to ensure that everything is known to all people in all departments. Information doesn't flow well - there are barriers everywhere, and though there are people at Talend and Informatica who know better than to make mistakes like this, they can't be everywhere checking everything before it gets posted.
But what really gets me down about these examples is that in both cases the companies concerned contacted me and let it be known that they saw the problems and would correct them. And yet in both cases the forms are still online and are still unchanged.
So how broken do your company processes have to be to allow such obvious embarrassments (people, you purport to be DATA QUALITY companies!) to remain online? What is standing in the way of actually correcting these forms? How many dissatisfied customers do you have to lose before anything changes? How bruised does my forehead have to become from bashing my head against these brick walls?
Somebody took the time to point out the errors. Do yourselves a favour Informatica and Talend - correct your forms!

Friday, October 30, 2009
Informatica and form rage
Dear reader,
you'll know by now that the best way to achieve data quality is not by cleansing data after it has been collected - that's just an expensive way of mitigating the effects of poor data quality. You'll know that preventing data quality issues at source is a more effective and ultimately more cost effective way to manage data quality.
There are hundreds of thousands of companies who continue to attempt reactive data quality cleansing rather than instigating preventative data quality, and that won't change in the short (or even middle) term; but when you're a company working within the data quality sphere you have to be very careful about how you manage your own data quality, because if you don't, ne'er-do-wells such as myself will be quick to pick up on it.
Informatica, a data quality company, posted a white paper - about data quality - here, and included with it a web form guaranteed to collect the worst quality data imaginable. (To me, a white paper is not free if I am expected to provide my information (which has value) in exchange for it - but that's another blog post ...). Now, don't get me wrong - I have nothing against Informatica as a company - they just seem (on this evidence) to have reached the size and structure which has stopped them being able to concentrate on data quality in all parts of their company, and with too many employees not understanding, or being part of, the data quality focus.

A quick look at the form and we can see some of the issues. Though I am allowed to enter data from my address in The Netherlands, I am forced to add an American state (or a Canadian province or territory) with which to pollute Informatica's data. The field labels suggest that my name is written in the same way as that of most Americans, that is given name first, family name last, and if I don't have a family name I'll have to make one up, because I can't leave that empty. I must add a postal code, even if my country doesn't have one, and though they have managed correctly to lose "Serbia and Montenegro" from their country list, they have lost Montenegro in the process.

When I pointed these errors out to Informatica they promised to recreate their forms, and they may be so doing; but it shouldn't take 15 days to stop a web form "State" field being a required field, one of the most obvious and widespread errors any web form can make, and the cause of more form rage than anything else. I hope they manage to get it sorted before next Tuesday, when I am presenting about web forms to the DDMA in Amsterdam - I'd like to be able to show a success story.
So, dear reader, do yourself a favour. Prevent your data becoming polluted at source. Look at your web forms. I'll make it easy for you - download my free e-book "Better data quality from your web form - Effective international name and address Internet data collection" and learn how to avoid those common errors. And I don't even ask you to fill in a form to get it ...
Friday, October 2, 2009
Data quality definitions: fit for purpose?
As data quality professionals, some of us spend far too much time philosophising, particularly about how to define the term "Data quality".
Some professionals, particularly those in a business environment, define data quality as data which is fit for purpose. To me, far from clarifying, this definition throws up far too many new questions. Fit in what way, and for which purpose? Fit for my purpose or for his? Or both? Fit for the purpose I have now or those I may have in the future?
I don't like chain definitions, phrases that become defined by new phrases which themselves have to be defined - for example data quality = information quality = fit for purpose = ... This simply obfuscates the issues and moves us away from their core.
I also think we should avoid trying to bring definitions under umbrella terms when we are blessed with thousands of languages, each containing thousands of words, which can be used to define each issue. Instead of
"This data has no quality, because it doesn't help me do what I want to do"
wouldn't it be great if we said:
"The way this data has been provided to me is not fit for the purpose of calling all our customers as the telephone area code is not shown on the interface/printout"
without feeling we needed to park this under one or other defining phrase or buzz word?
After a couple of decades of intensive work with data, I firmly believe that data quality is an inherent property of the data itself and is not definable by what can be achieved with that data. But while I juggle with this issue in my head, I am open to other input. For me, data has quality if it is a true representation of the real world constructs to which it refers; being accurate, relevant, complete and up-to-date.
To me, if your data fulfils those criteria, there's nothing that can't be done with it and it could, if used properly, be fit for each and every purpose. In all my years working with data I've not found a case when this was not true.
Do YOU know of a case, real or imaginary, in which data that is accurate, relevant, complete and up-to-date would not be fit for each and every purpose? Note: we're talking about the data here, not information. If the data is not represented on your screen with the telephone area code, that's an information quality problem; but if the data is complete, relevant and up-to-date, the data will include the telephone area code which could therefore be used to make the information fit for purpose.
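The distinction can be sketched in a few lines. In this illustration the stored record is complete, while two presentations of it differ in whether they are fit for the purpose of calling a customer. The `PhoneNumber` class, the two view functions and the sample number are all hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhoneNumber:
    """The data: complete, including the area code."""
    country_code: str
    area_code: str
    subscriber: str


def local_view(number: PhoneNumber) -> str:
    """A presentation that omits the area code (an information quality problem)."""
    return number.subscriber


def dialable_view(number: PhoneNumber) -> str:
    """A presentation that uses all the data and is fit for calling."""
    return f"+{number.country_code} {number.area_code} {number.subscriber}"


# Made-up number, for illustration only.
number = PhoneNumber(country_code="31", area_code="20", subscriber="5550100")
```

The record is the same in both cases; only the view changes, and because the data is complete the second view can always be produced from it.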
I'd love to hear of any examples! Leave a comment!