Many companies and organizations cleanse their data regularly. When you cleanse the data, the data quality rises to a higher level, which is determined by the amount of work invested in the cleansing. As time passes, the data quality deteriorates, and you need to repeat the cleansing process. If you spend the same amount of effort as you did on the previous cleansing, you can expect the same level of data quality as you had after the previous cleansing. Then the data quality deteriorates over time again, and the cycle repeats.

The idea of Data Quality Services (DQS) is to reduce the effort that cleansing requires. The time you need to spend on cleansing decreases with each cycle, while you achieve higher and higher levels of data quality. While cleansing, you learn what types of errors to expect, discover error patterns, find domains of correct values, and so on. You don't throw away this knowledge. You store it and use it to find and correct the same issues automatically during your next cleansing process. The following figure shows this graphically.

The idea of master
data management, which you can perform with Master Data Services (MDS), is to prevent data quality from deteriorating. Once you reach a particular quality level, the MDS application, together with the defined policies, people, and master data management processes, allows you to maintain this level permanently. This idea is shown in the following picture.

OK, now you know what DQS and MDS are about. You can imagine the importance of maintaining data quality. Here are some resources that help you prepare and execute data quality (DQ) and master
data management (MDM) activities.

Books

Dejan Sarka and Davide Mauri: Data Quality and Master Data Management with Microsoft SQL Server 2008 R2 – a general introduction to MDM, MDS, and data profiling. Matching explained in depth.
Dejan Sarka, Matija Lah and Grega Jerkic: MCTS Self-Paced Training Kit (Exam 70-463): Building Data Warehouses with Microsoft SQL Server 2012 – I wrote quite a few chapters about DQ and MDM, and also introduced SQL Server 2012 DQS.
Thomas Redman: Data Quality: The Field Guide – you should start with this book. Thomas Redman is the father of DQ and MDM.
Tyler Graham: Microsoft SQL Server 2012 Master Data Services – MDS in depth from a member of the product team.
Arkady Maydanchik: Data Quality Assessment – data profiling in depth.
Tamraparni Dasu, Theodore Johnson: Exploratory Data Mining and Data Cleaning – advanced data profiling with data mining.

Forthcoming presentations

I am presenting a DQS and MDM seminar at PASS SQL Rally Amsterdam 2013:
Wednesday, November 6th, 2013: Enterprise Information Management with SQL Server 2012 – a good kick start to your first DQ and/or MDM project.

Courses

Data Quality and Master Data Management with SQL Server 2012 – I wrote a 2-day course for SolidQ. If you are interested in this course, which I could also deliver as a shorter seminar, you can contact your closest SolidQ subsidiary or, of course, me directly at [email protected] or dsarka@siol.net. This course could also complement the existing courseware portfolio of training providers, who are welcome to contact me as well.

Start improving the quality of your data now!