Master Data Management - Information Management: Strategies for Gaining a Competitive Advantage with Data (2014)

Chapter 7. Master Data Management

One Chapter Here, but Ramifications Everywhere

Organizations are struggling to keep the data that drives their businesses synchronized across their many applications. Master data can encompass customers, products, store locations, or anything else that is essential to the business and shared among multiple systems. Managing this data is more than a technical challenge; it is very much an organizational one.

Keywords

master data management; data governance; data origination; data integration; data quality; information management

The No-Reference architecture touted in this book needs to find common ways to address commonalities across the enterprise. We can talk about technical tools like data integration and data virtualization and the softer factors of environment support like data governance, but there is also the data itself. This is where Master Data Management (MDM) plays a role.

I will argue that MDM is much more important than it has been given credit for. The more heterogeneous the environment becomes (and it should become pretty heterogeneous), the more important it becomes to thread common data elements throughout the environment. Ten, twenty, or fifty versions of customer data, for example, are counterproductive. There needs to be a way to share any set of data that is of interest to multiple applications.

Master Data Management is an essential discipline for getting a single, consistent view of an enterprise’s core business entities—customers, products, suppliers, employees, and others. MDM solutions enable enterprise-wide master data synchronization.

Every application needs master data. The question is how well—for both the application and the enterprise—the application will get its master data. It’s HOW, not IF.

Some subject areas require input from across the enterprise. MDM, in a process known as governance (not to be confused with enterprise data governance), also facilitates the origination of master data. Business approval, business process change, and capture of master data at optimal, early points in the data life cycle are essential to achieving true enterprise master data.

A form of master data management is to master it in the data warehouse. However, as information becomes your corporate asset and you wish to control and utilize it as much as possible, this form of master data management is seldom sufficient. Likewise, the enterprise resource planning (ERP) promise of everything in one system leads companies to think master data could be managed there. However, ERP manages just the master data it needs to function and lacks governance and strong real-time enterprise distribution capabilities. I’ve yet to meet the ERP project concerned with the enterprise as opposed to its own functions. True MDM takes an enterprise orientation.

MDM Justification

The most straightforward way to think about the economic payback of master data management is “build once, use often.” Master data must be built for each new system. Systems routinely have up to 50% effort and budget directed toward collecting master data. When master data is built in a scalable, sharable manner, such as within a master data management approach, this will streamline project development time, reducing the time it takes to get new systems up and running. Reducing scope also reduces project risk.
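The "build once, use often" arithmetic can be made concrete with a small Python sketch. The project budgets and the hub cost below are hypothetical figures, and the 50% share is the upper end cited above. The sketch also assumes, optimistically, that a shared hub removes all per-project master data collection, so the result is an upper bound rather than a forecast.

```python
def duplicated_master_data_cost(project_budgets, master_data_share=0.5):
    """Budget spent rebuilding master data when every project collects its own."""
    return sum(budget * master_data_share for budget in project_budgets)


def net_savings(project_budgets, hub_cost, master_data_share=0.5):
    """Upper-bound saving if a shared hub replaces all per-project collection."""
    return duplicated_master_data_cost(project_budgets, master_data_share) - hub_cost


# Hypothetical figures: three projects and a one-time hub investment.
budgets = [400_000, 250_000, 350_000]
print(duplicated_master_data_cost(budgets))      # 500000.0
print(net_savings(budgets, hub_cost=300_000))    # 200000.0
```

Every additional project that draws on the same hub adds to the first figure without adding to the hub cost, which is the TCO case in miniature.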

However, having multiple systems working from the same master data is where the ultimate benefit comes from. This is far greater than the total cost of ownership (TCO) “build once” approach, but more difficult to measure. There are efficiencies that come about from elimination of the contention and correlation of numerous “versions of the truth.” One former pre-MDM client used to spend 80% of their campaign development time poring through competing product lists to determine which one was the true list for the set of products to be promoted. This left little time for the value-added creativity of the campaign. It also elongated development cycles to the point at which time-to-market opportunities would routinely be missed. Clearing up a problem like this is measurable.

I have been speaking of MDM as a support function. That is, MDM is in support of other projects such as campaign management. However, MDM may actually be a prime enabler for many projects such as those centered on customer or product analytics. There is high value to having customer lifetime value calculated. It improves campaigns and customer management. Also, there may not truly be a complete set of master data anywhere in the enterprise today, only bits and pieces here and there. MDM may be the mechanism for most effectively introducing master data into the environment, as well as leveraging it into many systems.

You may have a more nuanced situation, but justification will often have to tie back to one of these total cost of ownership or return on investment approaches. MDM actually addresses such a wide range of information and cultural issues that seldom are two business cases alike. I have done business cases focused on the TCO aspects of MDM that will span several projects, on generating customer hierarchies so that customer organizations are understood for risk aversion, on cleaning up customer lists for marketing purposes, and on generating customer analytics for more effective marketing, among others.

A Subject-Area Culture

Master data management programs focus on the high-quality shared data that the organization needs in numerous systems. This data is grouped into subject areas and consequently the MDM culture is a subject-area culture.

Master data will ultimately comprise a small percentage, perhaps 5%, of the volume of all organizational data. It’s quality data, not quantity data.

Customer and product are two popular subject areas for the MDM treatment. However, depending on your organization, customer and product may be too large to comprise a single subject area and both, and others, may need to be divided further into smaller, more manageable, subject areas. If you cannot master the subject area in 6 months or you cannot locate a single organization responsible for the data stewardship of the subject area, then you should break the subject area into multiple subject areas. For example, I have divided customer into domestic/international, into gold/silver/regular, and by product line.

In the early days of master data management software, customer and product spawned their own software categories and, consequently, there are quite a few constructs in master data management specific to these subject areas. You need to decide if you want to invest time in reengineering your business to the predefined constructs of the subject area intellectual property or if you want MDM to be a palette upon which you can model your business bottom-up.

Other common subject areas that are mastered with master data management are parts, vendors, suppliers, partners, policies, stores/locations, and sales hierarchy. In reality, the list is unbounded and you should let your business needs guide your program’s definition of subject areas and rollout schedule. I’m continually amazed at what constitutes subject areas for MDM. Once you get started with MDM, as long as it is leverageable, you can repeat MDM across many (dozens of) enterprise subject areas over time.

MDM is an iterative program, rolled out across the organization over time. I recommend mastering by subject area, although it is also effective to master by complete systems, taking from them what they publish and providing to them what is available and interesting as a subscriber. Often, a combination is best. Regardless of the rollout strategy, you will want to choose subject areas for MDM that have the following characteristics:

1. High interest to the enterprise

2. High reuse potential across many systems

3. High difficulty assigning ownership—either no one wants it or too many want it

4. Diverse input to its build

5. Scattered pieces of data throughout the enterprise

These may sound like complicating factors, but without these kinds of data problems, MDM would not be needed. However, it is usually not hard to determine numerous subject areas which can benefit from the MDM value proposition. That’s the easy part. The harder part is actually mastering them!

Mastering Data

Master data management is about “build once, use often.” It is about providing the same corporately adjudicated, useful data to applications across the enterprise. I’ve emphasized the data sharing aspect of MDM and even the data origination aspects. However, not to be overlooked is the fact that the high leverage you get with MDM means a little investment goes a long way. That investment may include more complicated attributes than simple, core attributes that are one-to-one with the subject—like a customer’s address. It may get into more complex analytical data like customer lifetime value. Such attributes can be calculated and continually maintained in the master data management hub. Such attributes can utilize whatever data is sent to the hub in the calculations. Data may be sent that is not even stored and made available otherwise. Transaction data, for example, is not master data in the sense of being data that MDM hubs can handle volume-wise for sharing to the enterprise. However, transaction data is key to many analytics. Advanced uses of MDM include transaction data in this way.
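A minimal Python sketch of a derived attribute maintained in the hub follows. The class and method names are hypothetical, and lifetime value is reduced to a running total of spend purely for illustration; a real calculation would be more involved. The point it shows is that transactions feed the calculation without being stored in the hub.

```python
from dataclasses import dataclass


@dataclass
class CustomerMaster:
    customer_id: str
    address: str                  # core attribute, one-to-one with the customer
    lifetime_value: float = 0.0   # derived attribute maintained by the hub
    order_count: int = 0


class CustomerHub:
    def __init__(self):
        self.customers = {}       # master records only; no transaction history

    def upsert(self, customer_id, address):
        record = self.customers.get(customer_id)
        if record is None:
            record = CustomerMaster(customer_id, address)
            self.customers[customer_id] = record
        else:
            record.address = address
        return record

    def receive_transaction(self, customer_id, amount):
        """Fold a transaction into the derived attributes, then discard it."""
        record = self.customers[customer_id]
        record.lifetime_value += amount
        record.order_count += 1
        return record


hub = CustomerHub()
hub.upsert("C-1", "12 Main St")
hub.receive_transaction("C-1", 120.0)
hub.receive_transaction("C-1", 80.0)
print(hub.customers["C-1"].lifetime_value)   # 200.0
```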

Regardless of what you are sharing via MDM, the whole idea is not without its challenges and obstacles. Cultures that are decentralized, and that further lack documentation, cross-departmental working relationships, metadata, and a consistent lexicon have challenges. These are not factors that should keep you from doing MDM. They do, however, dictate about half of the cost and length of projects.

The benefits of master data management are pronounced and well worth the effort for organizations pursuing the No-Reference, heterogeneous architecture. Information is being recognized as a corporate asset, and that data needs to be managed operationally, not just in post-operational data warehouses and the like.

Tangible technical benefits of MDM include (for each subject area):

1. A multi-application data model, scalable to the enterprise

2. Master data publish and subscribe (sources and targets) ability

3. As appropriate, workflow processes that support the origination, verification, matching, etc. of data (automated and with human intervention)

4. Improved data quality (see Chapter 4)

As I’ve said, it’s not IF, it’s HOW each application, and consequently the enterprise, will manage master data. A tool is not necessary, but the approach outlined here is. The disciplines of data modeling, data integration, data quality, and workflow management are all necessary in managing enterprise master data. All are available with a robust MDM tool.

Keep in mind, however, that it’s actually the intangible by-products of these deliverables that may be more impressive. These include the fact that many knowledge workers and their applications will be working from the same set of data. They will not have to “roll their own” haphazardly and with only local control and interest. As with most of the advice in this book, I don’t believe it costs more to do it the “right” way. It does take knowledge and effort directed in a specific fashion.

The Architecture of MDM

MDM data is the organization’s crown jewels. It can be housed in different structures mentioned elsewhere in this book, but one of the most prominent ones is the data warehouse. Often this is inadvertent. Many organizations are pulling together their master data in a data warehouse (perhaps without referring to it as “master data”), which, you will recall from Chapter 6, is batch-loaded and downstream from operations.¹ This approach will mostly be inadequate in the No-Reference architecture.

I support the data warehouse as a key component of the architecture, but it bears repeating here that the data warehouse remains important even though it may not be storing master data. The data warehouse still provides a remedy to the inability to access data in operational environments. It still provides integrated, historical, and high-quality data. With MDM in the mix, the data warehouse will actually receive the master data from MDM.

With regard to the earlier comment about transaction data, the data warehouse could be where the detailed transactions are accessible. Analytics can be generated from these transactions and fed back into MDM, augmenting the base of data that is there.

A second strategy for MDM data involves simply identifying where it exists today in the operational environment and creating pointers to those systems collected in an MDM “hub,” then leveraging the hub when master data is needed. For example, the system of record for the base product data may be the Product system. The product analytics system of record could be the Sales system, and the financials related to the product could be kept in Lawson. These three systems would be joined when a full and complete product master record is needed. Each subject area would have its own strategy similar to this.

This “virtual” or “registry” MDM strategy is the quickest to deploy because it involves no systemic data movement. However, this strategy cannot have workflow components added to it to enrich the data, because there is no separate place to store enriched data. The presumption is that the operational systems are, in the aggregate, generating master data for the enterprise. Data quality is reduced to whatever occurs in the origination systems.

FIGURE 7.1 Getting data into the MDM hub.

The bigger challenge with this approach is performance. Doing cross-system joins on the fly for the product, in the example, can be quite costly. Ultimately, this approach is most effective when limited sets of product data are needed in the enterprise and data does not need to be moved systemically.
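The registry strategy can be sketched in a few lines of Python. The three dictionaries are hypothetical stand-ins for the operational systems in the example, and `RegistryHub` is an illustrative name, not the API of any MDM product. The hub holds only lookups, so every request fans out to every source system, which is where the performance cost comes from.

```python
# Hypothetical stand-ins for three operational systems, keyed by product id.
PRODUCT_SYSTEM = {"P-100": {"name": "Cordless drill", "category": "Tools"}}
SALES_SYSTEM = {"P-100": {"units_sold_ytd": 1840}}
FINANCE_SYSTEM = {"P-100": {"unit_cost": 41.5}}


class RegistryHub:
    """Stores pointers to systems of record, never the master data itself."""

    def __init__(self):
        self._lookups = []          # one lookup function per system of record
        self.calls = 0              # counts source-system reads, to show the cost

    def register(self, lookup):
        self._lookups.append(lookup)

    def get_master_record(self, product_id):
        record = {"product_id": product_id}
        for lookup in self._lookups:
            self.calls += 1         # every request hits every registered system
            record.update(lookup(product_id))
        return record


hub = RegistryHub()
for system in (PRODUCT_SYSTEM, SALES_SYSTEM, FINANCE_SYSTEM):
    # Bind the current system as a default argument so each lambda keeps its own.
    hub.register(lambda key, source=system: source.get(key, {}))

record = hub.get_master_record("P-100")
print(record)
print(hub.calls)   # 3
```

Because nothing is stored in the hub, there is also nowhere to keep enriched or corrected values, which matches the limitation described above.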

A better architectural approach for MDM, one that can support the No-Reference architecture, is to physically replicate the master data into a central hub and disperse it from there to other systems that need it. This separate physical hub exists in the architecture as a relational database, usually isolated, and maintained real-time for receiving published data and sending data that is subscribed to. This approach minimizes network traffic and system complexity. When master data is needed, it will be gathered from the MDM hub. Subscribers do not have to know where the data originated, although it may be communicated in the metadata for those who care.
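As a rough illustration of the physical hub, the following Python sketch keeps master records in one place and pushes each change to subscribers. An in-memory dictionary stands in for the relational database, and the class, method, and `_source` metadata names are assumptions made for the example.

```python
from collections import defaultdict


class MasterDataHub:
    def __init__(self):
        self._store = defaultdict(dict)         # subject area -> key -> record
        self._subscribers = defaultdict(list)   # subject area -> callbacks

    def subscribe(self, subject_area, callback):
        self._subscribers[subject_area].append(callback)

    def publish(self, subject_area, key, attributes, source):
        record = self._store[subject_area].setdefault(key, {})
        record.update(attributes)
        record["_source"] = source              # provenance kept as metadata
        for callback in self._subscribers[subject_area]:
            callback(key, dict(record))         # subscribers get their own copy
        return record

    def get(self, subject_area, key):
        return self._store[subject_area].get(key)


warehouse = {}


def load_into_warehouse(key, record):
    warehouse[key] = record


hub = MasterDataHub()
hub.subscribe("product", load_into_warehouse)
hub.publish("product", "P-100", {"name": "Cordless drill"}, source="product_system")
hub.publish("product", "P-100", {"unit_cost": 41.5}, source="finance_system")
print(warehouse["P-100"])
```

The subscriber never contacts the publishing systems. In this simplified version `_source` records only the most recent publisher; tracking provenance per attribute would need a richer structure.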

Other value-added activity occurring at the MDM hub includes data cleansing, intake of syndicated data, and workflow execution for data origination and changes.

Data quality rules, such as those described in Chapter 4, should be applied to improve the data. Third-party syndicated data could be appended to the data at the hub. Workflows could be used to secure business governance to improve the data. Workflows could even be used to completely generate the master data and take the place of the originating “system-of-record” for the data in the architecture. For many implementations, this workflow/governance is the main value proposition for MDM. I will say more about these workflows in the MDM Governance section, where they are referred to by their more common name of governance.

Third-party Syndicated Data

Data has existed for purchase for a while, but the data has mostly been sourced into a very specific need, such as a marketing list for a promotion. As organizations make the move to the leverageable data store that is MDM, an investment in syndicated data can be leveraged throughout the enterprise. With MDM, organizations have a system in which their efforts in data quality, sourcing syndicated data, and organizing a superset of attributes about important subject areas can be leveraged across the organization.

One big use of the syndicated marketplace is to augment and validate customer data and create prospect lists. Through a process called a reverse append, a small number of fields can go to the vendor and a very large number—of varying quality—can be returned.
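A reverse append can be sketched as follows. The vendor here is a hypothetical stub, since real syndicated data services each have their own interfaces, and the field names are invented for the example. Because the returned attributes are of varying quality, this sketch only fills in attributes the hub does not already hold.

```python
def reverse_append(customer, vendor_lookup, match_fields=("name", "postal_code")):
    """Send a few identifying fields to a vendor and merge what comes back."""
    request = {field: customer[field] for field in match_fields}
    returned = vendor_lookup(request)
    enriched = dict(customer)
    for attribute, value in returned.items():
        # Vendor data is of varying quality, so never overwrite what we hold.
        enriched.setdefault(attribute, value)
    return enriched


def fake_vendor(request):
    """Hypothetical vendor stub; a real service would match on the request."""
    return {"postal_code": "00000", "household_size": 3, "homeowner": True}


customer = {"name": "A. Jones", "postal_code": "75001", "email": "aj@example.com"}
enriched = reverse_append(customer, fake_vendor)
print(enriched["postal_code"], enriched["household_size"])   # 75001 3
```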

The need for syndicated data has started many MDM programs since MDM is a central point for the collection and dissemination of master data.

Master data use tends to be very focused on distribution. However, there is a minor function for MDM data that, in certain shops, can become major. That function is the query of the data.

Master Data Query is also a function provided by MDM. MDM tools provide query “portals” to their databases. For some business functions, all that is needed is query access to the data. Implementations that focus on enriching raw data and turning it into analytics (see Chapter 3) for distribution can be referred to as Analytic MDM. Analytic MDM systems tend to have a higher interest in Master Data Query.

Most MDM is referred to as Operational MDM due to the focus on sharing consistent, clean information in real-time across the enterprise. Returning to the notion that the value proposition for MDM is multifaceted, shops should drop the Analytic vs. Operational labeling and think about doing both. Analytic data is readily shared in mature MDM systems.

MDM Governance

There are frequently manual aspects to the development of master data. We call these elements governance.²,³ Each subject area has different requirements for MDM Governance. Some will need it for the complete build of the record from the very start through to completion. The screens used for the entry often resemble interactive dashboards or portals. Other subject areas will pick up incomplete records from systems and complete the record with governance. Others still will only use governance for a manual verification of the record.

The manual efforts are formed from workflow capabilities in MDM tools. For example, in order for a new product to be accepted into a retail operation, the Purchasing Manager needs cost, the Marketing Manager needs pictures, the Service Manager needs repair and warranty information, and the Training Manager needs features and benefits. Figure 7.2 is an example of a basic workflow.

FIGURE 7.2 Master data management workflow.

Workflow is used to enrich the record, then “pass” the record from one person, department, or other collection of people as a group, to the next, perhaps going back and forth several times until the record is complete. The idea behind using a group is to have escalation and coverage and not have a dependency on a single person.

The flow can have all manner of forked paths, including parallel operations. The building blocks of the workflow are events, states, transitions, and users. The resultant actions from workflow events could be record manipulation or an email or a trigger event for organizational tasks in other systems. In the event of an unwanted delay, MDM workflow can re-prompt for action and even reassign tasks that are not getting executed. Within MDM, you define the escalation path and the time allotted to each person, ensuring (for example) that new products continue to be introduced.
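The building blocks named above (events, states, transitions, and users) map naturally onto a small state machine. The Python sketch below uses the new-product example, with hypothetical state names, owners, and time allotments. It is sequential for brevity, whereas a real workflow could fork and run steps in parallel.

```python
from dataclasses import dataclass, field

# Hypothetical new-product workflow: (state, event) -> next state.
TRANSITIONS = {
    ("needs_cost", "cost_entered"): "needs_pictures",
    ("needs_pictures", "pictures_added"): "needs_warranty",
    ("needs_warranty", "warranty_entered"): "needs_features",
    ("needs_features", "features_entered"): "complete",
    ("needs_features", "cost_disputed"): "needs_cost",   # records can go back
}

# Each open state has an owning group, an escalation contact, and allotted hours.
OWNERS = {
    "needs_cost": ("purchasing", "purchasing_director", 48),
    "needs_pictures": ("marketing", "marketing_director", 72),
    "needs_warranty": ("service", "service_director", 48),
    "needs_features": ("training", "training_director", 24),
}


@dataclass
class WorkItem:
    record_id: str
    state: str = "needs_cost"
    hours_in_state: float = 0.0
    history: list = field(default_factory=list)


def fire(item, event):
    """Apply an event; reject it if no transition is defined from this state."""
    next_state = TRANSITIONS.get((item.state, event))
    if next_state is None:
        raise ValueError(f"{event!r} is not allowed in state {item.state!r}")
    item.history.append((item.state, event, next_state))
    item.state = next_state
    item.hours_in_state = 0.0
    return item


def current_assignee(item):
    """Return who should act now, escalating once the allotted time has passed."""
    if item.state not in OWNERS:
        return None                      # the record is complete
    group, escalation, allotted_hours = OWNERS[item.state]
    return escalation if item.hours_in_state > allotted_hours else group


item = WorkItem("P-100")
fire(item, "cost_entered")
item.hours_in_state = 80                 # marketing has sat on it too long
print(item.state, current_assignee(item))   # needs_pictures marketing_director
```

Assigning each state to a group, with a named escalation contact, reflects the idea of coverage without dependency on a single person.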

This process for getting to good records with governance is considered by many to be MDM’s main value proposition. It is also frequently MDM’s biggest challenge. It is for this reason, and others, that Organizational Change Management (see Chapter 17) is essential for MDM projects. Organizational Change Management’s biggest value proposition to the No-Reference architecture is frequently in the area of MDM.

Data Quality and MDM

You could be moving all kinds of interesting data around the organization with MDM, but if it does not adhere to a high standard of quality, it can all be for naught. Actually, that would be an MDM implementation that would not be worth doing at all. Data quality is the absence of intolerable defects. Those defects are defined by the business and implemented with governance as well as data quality rules. Governance enriches data along the workflow. Data quality rules are applied in the hub to data as it enters, so that all the data disseminated meets the quality standard.
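A minimal Python sketch of rules applied as data enters the hub is shown below. The three rules are hypothetical examples of business-defined "intolerable defects", and the function names are illustrative only. A record with defects is not stored; in practice it would more likely be routed to a data steward than dropped.

```python
import re

# Hypothetical rules; each returns True when the record passes.
RULES = {
    "name is present": lambda r: bool(str(r.get("name", "")).strip()),
    "postal code is five digits": lambda r: re.fullmatch(
        r"\d{5}", str(r.get("postal_code", ""))
    ) is not None,
    "credit limit is not negative": lambda r: r.get("credit_limit", 0) >= 0,
}


def defects(record):
    """Return the names of every rule the record violates."""
    return [name for name, rule in RULES.items() if not rule(record)]


def admit(store, key, record):
    """Store the record only if it has no defects; otherwise report them."""
    found = defects(record)
    if not found:
        store[key] = record
    return found


store = {}
print(admit(store, "C-1", {"name": "A. Jones", "postal_code": "75001", "credit_limit": 500}))
print(admit(store, "C-2", {"name": " ", "postal_code": "7500"}))
```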

MDM Roles and Responsibilities

The business has major roles and responsibilities in MDM. The aforementioned governance process is one major example of this, both in forming the workflow and in being on the spokes of the workflow during its execution. We call both types of participants Data Stewards, along with those from the business side who contribute to the data modeling and data quality rules.

These senior business analysts in the subject areas being built into MDM should actually be considered part of the extended development team for the subject area they are the expert in.

There is also a Project Sponsor. This sponsor is ideally a business executive who understands the importance of information to the business and the importance of MDM to those objectives. The sponsor understands the short- and long-term capabilities of MDM in the organization and actively contributes to shaping them in the form of the iteration plan.

The sponsor will keep MDM out of internal cross-fire and chair recurring executive briefings on the project. They will chair the aforementioned data governance group and be the executive voice of requirements to vendors in the tool selection process.

MDM Technology

While MDM does not necessarily need a vendor-purchased tool, practically speaking it is difficult to pull off MDM without one. MDM tools coordinate all of the aforementioned functions without the need for disparate, discrete tools for each function.

Many vendors are claiming the MDM category and the large software companies have clearly positioned themselves in this space. You should base your shortlist on vendors who support the characteristics of your particular MDM need. Inevitably, given the breadth of MDM, every vendor will be providing many features that you would not consider important. Don’t sell yourself short, though—those features may be prominent in your implementation someday.

Data quality tools may also be needed to address grievous data quality challenges above and beyond MDM data quality functions. Integration technologies such as an enterprise service bus, a data integration tool, or some other manner of moving data will be necessary. Some MDM tools do not provide matching capabilities. Finally, syndicated data, naturally, is a separate purchase entirely.

Action Items

• Divide your organization into its subject areas

• Prioritize the subject areas

• Determine the source(s) of record for each subject area for the new architecture

• Determine which need governance, and to what degree

• Learn what is available in the syndicated data marketplace and develop a value proposition for syndicated data

• Assign value to the various components of MDM—data modeling, data integration, data governance, data quality

www.mcknightcg.com/bookch7


¹ There are some data warehouses out there with selective real-time data, which is great and may alleviate some need for a separate MDM hub, but realistically, pursuing real-time data warehousing is less worthwhile than pursuing operational business intelligence, of which MDM is a part.

² Note this is different from “data governance.”

³ Arguably, all MDM records could be considered “governed” since even those without MDM Governance are formed from confirmed sources.