Overview
Understanding and mastering the various Microsoft SQL Server tools available for Data Quality, Master Data Management, Fuzzy Matching, and Enterprise Information Management can be daunting and exasperating. In the hope of reducing the anxiety and frustration, I am providing a practical roadmap that uses openly available resources and requires only a few days to complete.
I have reviewed and organized several articles and blog posts that will allow you, within a few days, to master the skills required to use the SQL Server 2012 capabilities mentioned above.
From a business perspective, we will be covering the following:
Basic
Master Data Management (MDS)
Data Quality (DQS)
Integration (SSIS)
Advanced
Fuzzy Matching (Data Deduplication)
Data Profiling (Source analysis against business rules)
Basic EIM
Master Data Services
Nick Barclay: BI-Lingual: MDS Architecture Notes
This article provides an overview of the MDS architecture in support of MDM (Master Data Management).
http://nickbarclay.blogspot.com/2009/12/mds-architecture-notes.html
Nick Barclay: BI-Lingual: Beginning Master Data Services (Part 1 thru 7)
This series teaches the steps for implementing Microsoft MDS as well as the tasks required to develop and load a model. Complete all exercises in order. MDS is the tool Microsoft provides for creating and maintaining reference data sets (lookup and code tables) in support of Master Data Management.
http://nickbarclay.blogspot.com/2009/11/beginning-master-data-services-part-1.html
Data Quality Services
Enterprise Information Management using SSIS, MDS, and DQS Together [Tutorial]
This next article will teach you how to implement and utilize the Data Quality cleansing and matching capabilities.
http://technet.microsoft.com/en-us/library/jj819782.aspx
Fuzzy Matching and Deduplication
Advanced SSIS Fuzzy Matching via Record Linkage Methodology – SQLServerCentral
This article will teach you the concepts and methodology recommended for fuzzy matching or deduplication, in the context of a well-known record linkage methodology.
http://www.sqlservercentral.com/articles/Integration+Services+(SSIS)/71486/
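To make the record linkage idea concrete before reading the article, here is a minimal sketch of the usual stages: blocking (only compare records that share a key), comparison (score each candidate pair), and classification (accept pairs above a threshold). It is not the SSIS implementation from the article. The field names `id`, `name`, and `zip`, the function `link_records`, the 0.85 threshold, and the use of Python's `difflib` as the similarity measure are all assumptions made for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def link_records(records, threshold=0.85):
    """Return (id_a, id_b, score) for likely duplicate pairs.

    Hypothetical record layout: dicts with "id", "name", and "zip" keys.
    """
    # Blocking: group records by zip so we only compare within a block.
    blocks = defaultdict(list)
    for rec in records:
        blocks[rec["zip"]].append(rec)

    pairs = []
    for block in blocks.values():
        # Comparison: score every pair of records inside the block.
        for a, b in combinations(block, 2):
            score = SequenceMatcher(
                None, a["name"].lower(), b["name"].lower()
            ).ratio()
            # Classification: keep pairs at or above the threshold.
            if score >= threshold:
                pairs.append((a["id"], b["id"], round(score, 3)))
    return pairs


customers = [
    {"id": 1, "name": "Jon Smith", "zip": "10001"},
    {"id": 2, "name": "John Smith", "zip": "10001"},
    {"id": 3, "name": "Jon Smith", "zip": "94105"},
    {"id": 4, "name": "Mary Jones", "zip": "10001"},
]
matches = link_records(customers)
```

Here records 1 and 2 are flagged as a candidate pair, while record 3 is never compared with them because it falls in a different block. That trade-off (fewer comparisons, but missed matches across blocks) is the main design decision in choosing a blocking key.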
Advanced Matching and Data Profiling
I have also included the next two articles to further explore the code and capabilities needed to solve complex matching and deduplication problems.
Roll Your Own Fuzzy Match / Grouping (Jaro Winkler) – T-SQL – SQLServerCentral
http://www.sqlservercentral.com/articles/Fuzzy+Match/65702/
Roll Your Own SSIS Fuzzy Matching / Grouping SSIS (Jaro – Winkler) – SQLServerCentral
http://www.sqlservercentral.com/articles/Integration+Services+(SSIS)/65616/
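For readers who want to see what the Jaro-Winkler measure computes before working through the T-SQL and SSIS versions, here is a minimal sketch of the standard formula. It is not the code from either article. The function names `jaro` and `jaro_winkler` are my own, and the prefix scale of 0.1 with a maximum prefix of four characters is the commonly used default, assumed here.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between two strings, in the range 0.0 to 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters match only if they are within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (
        matches / len1
        + matches / len2
        + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


score = jaro_winkler("MARTHA", "MARHTA")  # roughly 0.961
```

The prefix boost is why this measure suits names: typos tend to occur later in a word, so two strings that agree on their first few characters are scored as more similar than the plain Jaro value alone would suggest.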
Creating a Metadata Mart via TSQL – Complete Data Profiling Kit – Download
https://irawarrenwhiteside.com/2014/04/13/creating-a-metadata-mart-via-tsql/
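As a rough idea of what column-level data profiling produces, the sketch below computes a few common statistics for a single column: row count, null or blank count, distinct count, length range, and the most frequent value. It is not the T-SQL from the profiling kit. The function name `profile_column` and the choice of statistics are assumptions made for illustration.

```python
from collections import Counter


def profile_column(values):
    """Return basic profiling statistics for one column of values."""
    # Treat None and blank strings as missing, as a business rule might.
    non_null = [v for v in values if v is not None and str(v).strip() != ""]
    lengths = [len(str(v)) for v in non_null]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "min_length": min(lengths) if lengths else None,
        "max_length": max(lengths) if lengths else None,
        "most_common": counts.most_common(1)[0] if counts else None,
    }


state_profile = profile_column(["NY", "NY", None, "California", ""])
```

For the sample column, the profile reports 5 rows, 2 missing values, and lengths ranging from 2 to 10. That kind of result immediately suggests a rule to check, for example that state values should all be two-letter codes.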
Great list, thanks a lot for posting it. I found this video and the author’s blog very informative as an introduction to data profiling. http://technet.microsoft.com/en-us/sqlserver/ff686909.aspx
Oracle has a good article too. It is of course product-specific and from "the other team", but it didn't hurt to read it, and I learned a lot from it as well.
http://docs.oracle.com/cd/E11882_01/owb.112/e10935/data_profiling.htm
thank you for your work