Thursday, May 21, 2009

How Microsoft Access can damage your data quality

OK, so I've bitten the bullet. Much against my better judgement I'm making a concerted effort to learn Microsoft Access 2007, to allow me to produce some data tables with Unicode characters.

Plugging away, I found an interesting aspect of field input masks which is guaranteed to produce data quality issues. When adding a field mask to Access 2007, it "helpfully" provides a number of ready-made options:



Having a UK edition of the software, Microsoft have helpfully provided a telephone number mask and a postal code mask that it thinks covers the UK. Looking at the postal code mask itself:



and we see how it is made up. '0' indicates a required digit, 'L' a required letter, and '>' forces upper case.

Now, in the UK, AA99 9AA is indeed a valid postal code format. It is, however, only one of seven valid formats:

A9 9AA
A99 9AA
A9A 9AA
AA9 9AA
AA99 9AA
AA9A 9AA
AAA 9AA

Thus, whilst this mask can take OX19 6RY, it can't take OX9 6RY or SW1A 4WW or S1 1AA or N45 1AP or .... well, most of them.

Microsoft may feel that they are being helpful by adding this sample mask, but we all know that programmers, like most of us, will take any route that make their life easy, and are unlikely to make any attempt to alter this input mask to make it valid for the UK, let alone valid for postal codes in every other country. And we know that this happens - my God, don't we all suffer regularly at the hands of forms designed like this? Many programmers would not even be aware that the mask is not valid for most UK postal codes - they trust that the software provider has done their homework.

Back to the drawing board, please, Microsoft. This is not helping in the fight for better data quality.

No comments: