The same themes recur in each of these four areas for establishing competitive advantage: the accuracy, completeness, timeliness, and currency of the underlying data. Examining the quality dimensions of customer and location data, understanding what types of errors can occur, and instituting methods for remediation can all contribute to your company’s efforts to establish competitive advantage.
What is data quality?
What is high-quality data? If you were to presume that all data values were entered, computed, or imported in a “pristine state,” you’d probably be disappointed. It is rare for every record in every dataset to meet all user expectations. Most datasets originate through data entry processes in which real (and fallible) people transcribe data into systems that have little or no validation at entry. A lot of data also exists in forms that are not immediately suited to typical business applications, such as text documents, presentations, spreadsheets, and web pages.
Disparate sources, input channels, and workflows all open up the potential for data errors. Examples include:
- Mistyping letters, typing addresses or emails into the wrong fields, or inserting superfluous characters into form fields
- Non-adherence to formats, or flexibility in the data entry process, that allows errors such as missing street numbers or flaws in other demographic or location data to pass
- Transcription errors, variations in spellings, abbreviations, or phonetic similarities
- “Misfielded” or floating data, introduced as a result of typing values into the wrong fields on a form
- Systems that automatically “tab over” from a full field to its neighbor on the screen
- Attribute overloading (i.e., using data fields that are generally sparsely populated to hold, under certain circumstances, other types of data, such as storing an email address in a “Country” field)
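
Several of the error types listed above can be caught with simple rule-based checks at or after data entry. The following Python sketch is illustrative only: the field names (`street`, `email`, `country`), the issue labels, and the regular expressions are assumptions made for this example, not part of any particular system. The patterns are deliberately loose; for instance, not every valid address begins with a street number, so a flag here means "worth reviewing," not "definitely wrong."

```python
import re

# Loose, illustrative patterns (assumptions for this sketch, not a standard).
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
LEADING_NUMBER_RE = re.compile(r"\d+")
# Anything other than letters, digits, whitespace, and common address punctuation.
SUPERFLUOUS_RE = re.compile(r"[^\w\s.,#'/-]")


def check_record(record):
    """Return a list of issue labels for one customer/location record."""
    issues = []
    street = record.get("street", "").strip()
    email = record.get("email", "").strip()
    country = record.get("country", "").strip()

    # Missing street number: the street value does not start with digits.
    if street and not LEADING_NUMBER_RE.match(street):
        issues.append("street_missing_number")

    # Superfluous characters inserted into a form field.
    if SUPERFLUOUS_RE.search(street):
        issues.append("street_superfluous_characters")

    # Misfielded data: something other than an email sits in the email field.
    if email and not EMAIL_RE.fullmatch(email):
        issues.append("email_invalid_format")

    # Attribute overloading: an email address stored in the "Country" field.
    if EMAIL_RE.fullmatch(country):
        issues.append("country_contains_email")

    return issues


clean = {"street": "123 Main St.", "email": "pat@example.com", "country": "USA"}
suspect = {"street": "Main St **", "email": "123 Main St", "country": "pat@example.com"}

print(check_record(clean))
print(check_record(suspect))
```

Checks like these only flag candidates for review. Spelling variations, abbreviations, and phonetic similarities generally need fuzzier matching than fixed patterns can provide.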