Data Consistency is Key to Analytics
Whether it’s “simply” the data generated by the business applications we support, data from other sources such as partners and distributors, or even the brave new world of social media, the availability of data typically isn’t an issue for BI applications. The volume of data continues to grow at an unprecedented rate each year.
The quality and ‘usability’ of that data, however, are critical to the success and acceptance of any BI strategy. Inconsistent data leads to misinformation and incorrect decisions.
As the car mechanic in the 1970s Fram oil filter commercial would say, “You can pay me now or pay me later.” Either you address data quality at its source, or you pay the price to correct it later. If correction is left to the consumers of the data during analysis and reporting, inaccurate or misaligned data can become such a burden that it renders the BI system unusable.
From a company perspective, there needs to be a data management policy in place that addresses all sources of data required by the BI repository, both internal and external to the organization. This process goes by many names: Master Data Management, Data Quality/Hygiene, Data Governance, and so on. Each has its own requirements, activities, and complexity. All of them must address two basic items:
- What is the Process? What are the minimum requirements for the data? What is the timing, and how much lead time is required? Are approvals required as part of the process? Is data quality achieved while minimizing the impact on business operations?
- Who is the Owner of the data? Who is responsible for setting up new customer accounts or new products? Is it Customer Service for new customers? Is it the Product Manager or Inventory Manager for new products? It could be a combination, depending on the attributes of the data in question. Regardless of which area takes on ownership, it is that area's responsibility to ensure that the data and its associated attributes meet company requirements.
Like other operational processes, a data management policy is not static. It must be reviewed and updated periodically. As the business changes, so might the data requirements. New product lines, new channels of distribution, sales force reorganizations, and the creation of new business divisions through acquisitions or mergers are some of the activities that force a review of established data management policies. When rationalizing customers that also appear in third-party or acquired-company data, consider maintaining the source value in your BI application so that records can easily be traced back to the originating source system.
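As a rough illustration of keeping the source value during customer rationalization, the sketch below groups customer records from two systems under one master entry while retaining each record's originating system and key. The system names ("ERP", "ACQUIRED_CRM"), the field names, and the simple name-matching rule are all assumptions made for this example; real matching logic is usually far more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceCustomer:
    source_system: str  # hypothetical system name, e.g. "ERP"
    source_id: str      # the customer key as it exists in that system
    name: str


def normalize_name(name: str) -> str:
    """Crude matching key: lowercase, drop punctuation, collapse spaces."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def rationalize(customers):
    """Group source records under one master key, keeping every source value."""
    masters = {}
    for c in customers:
        key = normalize_name(c.name)
        master = masters.setdefault(key, {"master_name": c.name, "sources": []})
        # Retain the originating system and its key so the record can be traced back.
        master["sources"].append((c.source_system, c.source_id))
    return masters


records = [
    SourceCustomer("ERP", "C-1001", "Acme Tools, Inc."),
    SourceCustomer("ACQUIRED_CRM", "88231", "ACME Tools Inc"),
    SourceCustomer("ERP", "C-1002", "Baker Supply"),
]
masters = rationalize(records)
```

Because each master entry carries its list of (source system, source key) pairs, a question about a consolidated customer can be followed back to the exact record in the system that supplied it.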
As changes are made to master data through the data management process, consideration must be given to the data already in the BI data repository. In most cases the historical data will need to be ‘re-aligned’ to match the new requirements. In some instances you might also choose to maintain a ‘historical’ view: for example, align the data to the current sales organization, but also keep the original sales region/territory/rep hierarchy.
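One way to picture re-alignment with a preserved historical view is to add a "current" column next to the original one rather than overwriting it. The following is a minimal sketch under assumed field names (`customer_id`, `original_territory`, `current_territory`, `amount`) and an assumed reorganization mapping; it is not tied to any particular BI product.

```python
from collections import defaultdict


def realign(fact_rows, current_territory):
    """Add a current-alignment column; the original territory is left untouched."""
    aligned_rows = []
    for row in fact_rows:
        aligned = dict(row)
        # Customers not affected by the reorganization keep their original territory.
        aligned["current_territory"] = current_territory.get(
            row["customer_id"], row["original_territory"]
        )
        aligned_rows.append(aligned)
    return aligned_rows


def total_by(rows, column):
    """Sum the amount column grouped by the chosen alignment column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[column]] += row["amount"]
    return dict(totals)


history = [
    {"customer_id": "C1", "original_territory": "East", "amount": 100.0},
    {"customer_id": "C2", "original_territory": "East", "amount": 250.0},
    {"customer_id": "C3", "original_territory": "West", "amount": 75.0},
]
reorg = {"C2": "Central"}  # hypothetical reassignment after a sales reorganization

rows = realign(history, reorg)
as_sold = total_by(rows, "original_territory")
as_organized_now = total_by(rows, "current_territory")
```

The same facts can then be reported either as they were originally booked or as the organization looks today, simply by choosing which column to group on.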
Our own customers take advantage of the BI software that we provide to help with the data management process. As part of this, they can schedule “Actions” that email data owners when key attributes for customer and product dimensions are missing, incomplete, or inaccurate.
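The general idea of routing attribute problems to data owners can be sketched independently of any specific product. The example below is an illustration only, not how the “Actions” feature is implemented: the required attributes, the owner addresses, and the function names are hypothetical, and it checks only for missing or blank values, then builds one message body per owner without actually sending mail.

```python
from collections import defaultdict

# Hypothetical minimum requirements and owners per dimension.
REQUIRED = {
    "customer": ["name", "region", "sales_rep"],
    "product": ["description", "product_line"],
}
OWNERS = {
    "customer": "customer.service@example.com",
    "product": "product.manager@example.com",
}


def missing_attributes(dimension, record):
    """Return the required attributes that are absent or blank in the record."""
    return [
        attr
        for attr in REQUIRED[dimension]
        if not str(record.get(attr) or "").strip()
    ]


def build_notifications(records):
    """Group problems by owner; records are (dimension, key, attributes) tuples."""
    issues = defaultdict(list)
    for dimension, key, record in records:
        missing = missing_attributes(dimension, record)
        if missing:
            issues[OWNERS[dimension]].append(
                f"{dimension} {key}: missing {', '.join(missing)}"
            )
    # One message body per owner, ready to hand to a mailer or scheduler.
    return {owner: "\n".join(lines) for owner, lines in issues.items()}


dimension_records = [
    ("customer", "C1", {"name": "Acme", "region": "", "sales_rep": "Lee"}),
    ("product", "P9", {"description": "Filter", "product_line": "Auto"}),
    ("customer", "C2", {"name": "Baker", "region": "East", "sales_rep": None}),
]
notifications = build_notifications(dimension_records)
```

Running a check like this on a schedule puts the correction back with the owner of the data, rather than leaving it to whoever happens to notice the gap during analysis.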
Establishing and maintaining a data management policy to help ensure data consistency in your BI solution is well worth the time and energy expended up front, compared with continually compensating for inconsistencies in the data during analysis.
On two occasions I have been asked, “Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?”
– Charles Babbage, Passages from the Life of a Philosopher