DataTune was one of the first data cleansing systems, developed between 2000 and 2003 by Target Eye LTD, UK. The idea behind DataTune came from the importance of keeping information systems serviceable, since they are among an organization's most important assets. Because most organizations maintain more than one database or information system, DataTune also addressed the need for mutual synchronization and for updating items of information across the various databases. A large number of errors can arise in such environments, which may cause duplication and loss of information. For example, a single record might be stored as two separate records, or two records stored as identical (in two different data sources) might in fact be different. As a result, items of information become difficult to locate at a later stage, systems are overloaded, mail is returned, and malfunctions related to customers or suppliers occur. Such problems are common even today, but were a much larger problem in the early 2000s, when DataTune was in operation.

DataTune was discontinued around 2003, and the DataTune website was closed.

Notable features

DataTune had error correction and data cleansing mechanisms based on several sources and algorithms that were new at the time they were introduced. It was developed in Microsoft Visual Basic 6.0 and consisted of the software itself and several MS Access databases.

DataTune was developed under the assumption that data cleansing should be performed partially automatically, after which all Rejects and Exceptions must be reviewed manually. DataTune had automatic processes that performed the data cleansing and estimated the level of accuracy of each cleansed record. Records with low expected accuracy were defined as "Rejects" and sent to a queue for human processing.

The cleansing mechanisms included:

* Comparing records with an Error Bank, which held typical errors and was expanded during each data cleansing project. For example, if one of the records accessed held the city name "Ney York", the DataTune system replaced it with "New York City".
* Comparing records with public domain data sources. Such data sources may contain official lists of countries, cities, addresses, locations, etc.
* Uniting records using a pre-defined format, such as House Number + Street Name for a street address, and modifying all records to meet the same structure.
* Using a Soundex-based algorithm for all searches, ensuring that typical replacements between common characters (such as 'c' instead of 's' and vice versa) were tracked and fixed.
* Using databases provided by the customers. These databases were used for cross-checking the various fields and records.
* Separation of fields. In several databases, different fields were kept as a single field. For example, a customer's address stored as "350 Elm Street. Suite #35" was replaced with separate fields: Street Name: "Elm Street", House Number: "350", Additional Address Data: "Suite #35".
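
The mechanisms listed above can be illustrated with a minimal sketch. It is written in Python for readability rather than in the Visual Basic 6.0 that DataTune actually used, and it is not a reconstruction of DataTune's code. The names `ERROR_BANK`, `REJECT_THRESHOLD`, `cleanse_city`, `split_address` and `route`, the accuracy scores, and the address pattern are all hypothetical. The phonetic step uses classic American Soundex as a stand-in for DataTune's "Soundex based" algorithm, whose details the description does not give.

```python
import re

# Hypothetical Error Bank: known misspellings mapped to corrected values.
ERROR_BANK = {"ney york": "New York City"}

# Illustrative threshold; records scoring below it are treated as Rejects.
REJECT_THRESHOLD = 0.5

_SOUNDEX_CODES = {
    ch: digit
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6"))
    for ch in letters
}


def soundex(word):
    """Return the classic four-character Soundex code, or "" if no letters."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "HW":  # H and W do not separate letters with equal codes
            prev = code
    return (result + "000")[:4]


def cleanse_city(city, reference_cities):
    """Return (cleansed value, estimated accuracy) for a city field."""
    key = city.strip().lower()
    if key in ERROR_BANK:                      # known typical error
        return ERROR_BANK[key], 0.9
    for ref in reference_cities:               # exact match in reference list
        if ref.lower() == key:
            return ref, 1.0
    matches = [r for r in reference_cities if soundex(r) == soundex(city)]
    if len(matches) == 1:                      # unambiguous phonetic match
        return matches[0], 0.7
    return city, 0.0                           # unresolved


# Assumed layout: house number, street name, optional part after "." or ",".
ADDRESS_RE = re.compile(
    r"^\s*(?P<house>\d+)\s+(?P<street>[^.,]+)(?:[.,]\s*(?P<extra>.+))?$"
)


def split_address(address):
    """Split a combined address into separate fields, or return None."""
    m = ADDRESS_RE.match(address)
    if m is None:
        return None
    return {
        "House Number": m.group("house"),
        "Street Name": m.group("street").strip(),
        "Additional Address Data": (m.group("extra") or "").strip(),
    }


def route(city, reference_cities, reject_queue):
    """Cleanse one value; low-accuracy results go to the manual-review queue."""
    value, accuracy = cleanse_city(city, reference_cities)
    if accuracy < REJECT_THRESHOLD:
        reject_queue.append(city)
    return value
```

With a reference list of `["Boston", "Chicago"]`, `route("Ney York", ...)` is corrected through the Error Bank, `route("Bosten", ...)` resolves to "Boston" because both spellings share one Soundex code, and an unrecognized value is returned unchanged and appended to the reject queue for manual review. Note that classic Soundex keeps the first letter as-is, so a 'c'/'s' swap is only caught when it occurs after the first character (for example "Pearce" and "Pearse").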