Businesses have been deploying enterprise data governance (defining what the data should be) and master data management (ensuring the data is as defined) programs for decades. Even if your company doesn’t have a formal master data management program by name, chances are good that it is doing some form of master data management in its data warehouse, CRM or ERP systems. As the trend towards decentralized data analysis continues, we see a few forces in play that make the case for incorporating a master data management capability into your organizational roadmap:
The Power of a Single View
Organizations have acknowledged the benefits of bringing together data from all of their disparate systems to maximize their data-driven problem-solving potential, identify new business opportunities, and increase the accuracy of machine learning models. This single view not only powers more accurate data analysis, but is also flexible enough to drive operational business processes.
Increased Machine Learning Potential
Businesses that are investing in data science and machine learning are quickly realizing that some of their greatest opportunities for optimization are stifled by a lack of usable machine learning training data. This is most often due to insufficient data collection, variable execution of the business process (e.g. human error, governance), or both. An MDM program can benefit your data science initiatives by applying to your data the taxonomies and hierarchies needed to power machine learning.
The Democratization of Data Analysis
Domain and subject matter experts are becoming increasingly responsible for developing their own data value hypotheses (and modeling their own data), and they are doing so autonomously with highly accessible and capable self-service analytical tools. Often the most difficult, time-consuming, and error-prone data wrangling task is entity resolution: standardizing, deduplicating, cleansing and keying data so that it seamlessly resolves down to the unique entity at the center of an analysis (e.g. Customer, Location, Product, Employee). In this context, mastered data is often vastly easier to blend, analyze, interpret and trust, which reduces the time and cost for an individual to derive insight from business data.
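To make the entity resolution steps concrete, here is a minimal Python sketch of standardizing, deduplicating and keying customer records pulled from several systems. Everything in it is an assumption made for illustration: the field names (source, name, email), the rule of matching on a normalized email address with the name as a fallback, and the CUST-0001 style of master ID. Real MDM tools typically use much richer matching (fuzzy comparison, survivorship rules, human review).

```python
import re
from collections import defaultdict


def standardize(name: str, email: str) -> tuple[str, str]:
    """Cleanse and standardize the fields used for matching."""
    clean_name = re.sub(r"[^a-z0-9 ]", "", name.lower())  # drop punctuation
    clean_name = " ".join(clean_name.split())              # collapse whitespace
    return clean_name, email.strip().lower()


def resolve(records: list[dict]) -> dict[str, list[dict]]:
    """Group records that refer to the same entity and key each group."""
    groups = defaultdict(list)
    for rec in records:
        name, email = standardize(rec["name"], rec["email"])
        match_key = email or name  # assumed rule: email first, name as fallback
        groups[match_key].append(rec)
    # Keying: assign a master ID to each group (IDs here are hypothetical).
    return {
        f"CUST-{i:04d}": recs
        for i, (_, recs) in enumerate(sorted(groups.items()), start=1)
    }


# Hypothetical records from three source systems.
records = [
    {"source": "crm", "name": "Jane  Doe", "email": "Jane.Doe@Example.com "},
    {"source": "erp", "name": "DOE, JANE", "email": "jane.doe@example.com"},
    {"source": "web", "name": "John Smith", "email": "john@example.com"},
]

masters = resolve(records)
for master_id, recs in masters.items():
    print(master_id, [r["source"] for r in recs])
# CUST-0001 ['crm', 'erp']
# CUST-0002 ['web']
```

Once every source record carries a master ID like this, an analyst can join data from the CRM, ERP and web systems on that one key instead of repeating the cleanup in every analysis.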
The Importance of Data Privacy
As data privacy regulation is rolled out, organizations will be required to manage how key customer data is used in the business. They will need to manage and track consent and usage across all sources and ensure that information is only being utilized for purposes that were authorized by its owner. In addition, concepts such as data obfuscation and masking could enable broader business innovation through both internal and external crowd-sourcing.
Take, for example, Numerai, a hedge fund that has encrypted sensitive elements of the training data it uses to power its trading algorithms, and then published that data as part of an ongoing Kaggle-style data science competition where anyone can compete to improve the algorithms' performance and earn financial rewards. The fund has successfully used advanced data privacy techniques to both negate potential bias and crowd-source the engine that runs its business without revealing any of its most valuable intellectual property. What could your business do if it viewed data privacy as more than just risk management?
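As a small illustration of the masking idea, the Python sketch below replaces sensitive fields with keyed hashes, so records can still be joined and counted without exposing the underlying values. This is a generic pseudonymization technique and not a description of how Numerai protects its data. The key, the field names and the 16-character token length are all assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would come from a secrets manager.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable token for a value using HMAC-SHA256."""
    normalized = value.strip().lower().encode("utf-8")
    # Truncated to 16 hex characters for readability (an assumption).
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()[:16]


def mask_record(record: dict, sensitive_fields: set[str]) -> dict:
    """Copy a record, replacing sensitive fields with tokens."""
    return {
        field: pseudonymize(str(value)) if field in sensitive_fields else value
        for field, value in record.items()
    }


# Hypothetical customer record.
customer = {"email": "jane.doe@example.com", "region": "EMEA", "orders": 12}
masked = mask_record(customer, {"email"})
print(masked)
```

Because the same input always produces the same token under a given key, masked datasets from different systems can still be blended on the tokenized field, while anyone without the key cannot recompute the tokens from guessed values.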