The Pillars of Master Data Management: Data Profiling, Data Integration and Data Quality
The wave of workgroup and desktop computing in the 1980s led to distributed data management, with line-of-business applications sharing similar requirements yet maintaining variant models, representations and management of information objects. Data replication across mainframes, servers and desktops has introduced ambiguity in the representation and semantics of core business concepts.
Centralization initiatives such as data warehousing aim to consolidate organizational data into an information asset that can be mined for actionable knowledge. Although centralizing information for analysis and reporting holds great promise, a new challenge emerges: as data sets are integrated and transformed, the cleansing and corrections applied at the warehouse mean that analyses and reports may no longer be synchronized with the source data. This suggests the need for a single source of truth for all applications, not just analysis and reporting.
Over the past ten years, data profiling, data cleansing and matching, and data integration tools have matured in concert with a desire to aggregate and consolidate “master data,” but
today’s master data management (MDM) initiatives differ from previous attempts at enterprise data consolidation. An MDM program creates a synchronized, consistent repository of quality master
data to feed enterprise applications. Successful MDM solutions require quality integration of master data from across the enterprise, relying on:
- Inventory and identification of candidate master data objects;
- Resolution of semantics, hierarchies and relationships for master entities;
- Seamless standardized information extraction, sharing and delivery;
- A migration process for consolidating the “best records” for the master repository;
- A service-oriented approach for accessing the consolidated master directory;
- Managing enterprise data integration using a data governance framework.
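The "best record" consolidation step above can be illustrated with a minimal survivorship sketch. The source names, field names, and the recency-based survivorship rule are all hypothetical assumptions for illustration; real MDM tools apply far richer, configurable survivorship policies.

```python
from datetime import date

# Hypothetical candidate records for one customer, drawn from three
# source systems; sources, fields and dates are illustrative only.
candidates = [
    {"source": "crm",     "updated": date(2007, 3, 1),
     "name": "J. Smith",   "phone": "555-0100", "email": None},
    {"source": "billing", "updated": date(2007, 6, 15),
     "name": "John Smith", "phone": None,       "email": "jsmith@example.com"},
    {"source": "orders",  "updated": date(2006, 11, 2),
     "name": "John Smith", "phone": "555-0199", "email": None},
]

def survive(records, fields):
    """Build a 'best record': for each field, take the non-null value
    from the most recently updated source that supplies one."""
    by_recency = sorted(records, key=lambda r: r["updated"], reverse=True)
    return {
        field: next((r[field] for r in by_recency if r[field] is not None), None)
        for field in fields
    }

master = survive(candidates, ["name", "phone", "email"])
# master now holds the most recent non-null value for each field.
```

The design choice here, preferring the freshest non-null value, is only one possible policy; others weight sources by trust or pick the most complete record wholesale.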
These tasks depend on traditional data quality and integration techniques: data profiling for discovery and analysis; parsing; standardization for data cleansing; duplicate analysis/householding
and matching for identity resolution; data integration for information sharing; and data governance, stewardship, and standards oversight to ensure ongoing consistency.
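Two of those techniques, standardization and duplicate matching, can be sketched briefly. The nickname and street-suffix tables and the similarity threshold below are illustrative assumptions; production data quality tools use large curated reference tables and far more sophisticated matching algorithms.

```python
import re
from difflib import SequenceMatcher

# Tiny illustrative variant tables; real tools ship curated reference data.
NICKNAMES = {"bob": "robert", "bill": "william", "liz": "elizabeth"}
SUFFIXES  = {"st": "street", "ave": "avenue", "rd": "road"}

def standardize(text, table):
    """Lowercase, strip punctuation, and expand known variant tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(table.get(t, t) for t in tokens)

def is_match(a, b, threshold=0.85):
    """Approximate duplicate test on already-standardized strings."""
    return SequenceMatcher(None, a, b).ratio() >= threshold

name_a = standardize("Bob Jones", NICKNAMES)      # -> "robert jones"
name_b = standardize("Robert Jones.", NICKNAMES)  # -> "robert jones"
addr_a = standardize("12 Main St", SUFFIXES)      # -> "12 main street"
addr_b = standardize("12 Main Street", SUFFIXES)  # -> "12 main street"

duplicate = is_match(name_a, name_b) and is_match(addr_a, addr_b)  # True
```

Standardizing before matching is what lets "Bob Jones, 12 Main St" and "Robert Jones, 12 Main Street" resolve to the same identity.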
Essentially, data profiling, data integration and data quality tools are the three pillars that support today’s MDM solutions. Vendor and customer analyses indicate that:
- Many master data programs have evolved from customer data quality, product data quality, data assessment and validation, and data integration activities.
- MDM solutions are triggered by the introduction of data quality activities to support technical infrastructure acquired for a specific purpose (e.g., enterprise resource planning or customer relationship management).
- Data governance is a common success theme for MDM.
During conversations and interviews with vendors and their customers, recurring themes led us to several conclusions about the evolution of successful master data management initiatives:
- There is a significant bidirectional influence between data quality and master data management.
- Customer data is still the main focus of MDM activities, but product information management is growing rapidly in importance.
- Formalizing data governance is a critical success factor for MDM.
- Master data management is not always about consolidation of data.
- The need for semantic integration has driven users to adapt existing tools for broader purposes than originally intended.
As organizations increasingly focus on master data integration, their reliance on readily available technologies, couched within an enterprise governance framework, will continue to drive both
analytic and operational productivity improvement for the foreseeable future.