Data Quality in BI: It’s More Than Putting Lipstick on a Pig!

Data quality is one of the biggest challenges that enterprises face when it comes to business intelligence. If the data isn’t accurate, the result can be inferior reporting and poor business decisions, with potentially serious consequences for the entire organization.

When you first examine the quality of your data while implementing a business intelligence (BI) solution, there are several things to consider and questions to ask yourself. For example, as Paul Dorsett shared in one of his blog posts, “Self-Service BI: Fill It Up!”:

  • If you’re a manufacturer that deals with major accounts, are all Sold-To parties aggregating to the right national account?
  • Did a key customer acquire one of its competitors (to whom you also sold directly), and are you including those sales dollars? And what business processes need to be put into place so this information is applied to the data?
  • And when using POS data, are the product numbers used by partners correctly cross-referencing to your internal product / material numbers? If not, are transactions being rejected and excluded in calculations?
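
The last question in the list can be checked mechanically. The following is a minimal sketch, not a definitive implementation: the cross-reference table, the field names (partner, partner_sku, amount), and the material numbers are all hypothetical. It separates POS rows that map to an internal material number from those that do not, so rejected transactions are counted instead of silently dropping out of the totals.

```python
from collections import Counter

# Hypothetical cross-reference: (partner, partner SKU) -> internal material number.
XREF = {
    ("partner_a", "A-100"): "MAT-0001",
    ("partner_a", "A-200"): "MAT-0002",
    ("partner_b", "77-X"): "MAT-0001",
}


def split_pos_rows(rows, xref):
    """Split POS rows into those with a known internal material number and the rest."""
    matched, rejected = [], []
    for row in rows:
        # Normalize the partner SKU so stray whitespace or casing does not cause a miss.
        key = (row["partner"], row["partner_sku"].strip().upper())
        material = xref.get(key)
        if material is None:
            rejected.append(row)
        else:
            matched.append({**row, "material": material})
    return matched, rejected


def sales_by_material(matched):
    """Total the sales amount per internal material number."""
    totals = Counter()
    for row in matched:
        totals[row["material"]] += row["amount"]
    return dict(totals)


pos_rows = [
    {"partner": "partner_a", "partner_sku": "a-100 ", "amount": 120.0},
    {"partner": "partner_b", "partner_sku": "77-X", "amount": 80.0},
    {"partner": "partner_b", "partner_sku": "99-Z", "amount": 45.0},  # no cross-reference
]

matched, rejected = split_pos_rows(pos_rows, XREF)
totals = sales_by_material(matched)
rejected_amount = sum(row["amount"] for row in rejected)

print("Sales by material:", totals)
print("Rejected rows:", len(rejected), "worth", rejected_amount)
```

Reporting the rejected amount alongside the totals makes it visible how much revenue is being excluded from calculations because of a missing cross-reference.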

While data considerations like these must be thoroughly reviewed when implementing a BI solution, you also need an ongoing data governance plan that is regularly revisited as your business changes due to an acquisition, a product restructuring, and so on. The purpose of such a plan is to filter out bad information by defining and enforcing policies and approval procedures, not to make data “look better” (like putting lipstick on a pig)!

Here are some suggestions on quality control measures that should be included as part of a data governance process:

Perform a Quality Assessment

Before a business begins to improve the quality of its data, it should first know where things stand. Assess current data sets by checking for errors, inconsistencies, repeated data strings, and missing fields, all of which can be hard to find. Bad data can come from a variety of sources, including external applications, and can be buried within the in-house system itself. This is why a quality assessment (even one done by an independent organization) is a great first step: it helps you identify issues and create a strategy to better manage the data.
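
As a rough illustration of what such an assessment looks for, here is a minimal sketch using only the Python standard library. The record layout and the required fields (customer_id, name, country) are assumptions made for the example; a real assessment would run against your own schema and typically use a profiling tool.

```python
from collections import Counter

REQUIRED_FIELDS = ("customer_id", "name", "country")  # hypothetical schema


def assess(records, required=REQUIRED_FIELDS):
    """Count missing fields, repeated rows, and inconsistently spelled countries."""
    # Missing fields: value absent, None, or blank.
    missing = Counter()
    for rec in records:
        for field in required:
            value = rec.get(field)
            if value is None or str(value).strip() == "":
                missing[field] += 1

    # Repeated rows: identical values across all required fields.
    seen = Counter(tuple(rec.get(f) for f in required) for rec in records)
    duplicate_rows = sum(count - 1 for count in seen.values() if count > 1)

    # Inconsistencies: same country once case and whitespace are ignored,
    # but spelled more than one way in the raw data.
    variants = {}
    for rec in records:
        raw = rec.get("country")
        if raw:
            variants.setdefault(raw.strip().lower(), set()).add(raw)
    inconsistent = {k: sorted(v) for k, v in variants.items() if len(v) > 1}

    return {
        "missing": dict(missing),
        "duplicate_rows": duplicate_rows,
        "inconsistent_country": inconsistent,
    }


sample = [
    {"customer_id": 1, "name": "Acme", "country": "US"},
    {"customer_id": 1, "name": "Acme", "country": "US"},    # repeated row
    {"customer_id": 2, "name": "", "country": "us"},        # missing name, different casing
    {"customer_id": 3, "name": "Globex", "country": None},  # missing country
]

report = assess(sample)
print(report)
```

Even simple counts like these give a baseline that later governance efforts can be measured against.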

Create a Virtual Data Quality Firewall

Another way to preserve the quality of your company’s data is to keep it in a protected environment that checks incoming data for invalid or corrupt records. Using pre-defined business rules, a virtual data quality firewall detects and blocks bad data at the point where it enters your BI environment (repository), so that it never pollutes your good data.
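
The idea can be sketched in a few lines. The rules and field names below (customer_id, amount, order_date) are hypothetical and stand in for whatever business rules your organization defines; this is an illustration of the pattern, not a description of any particular product.

```python
from datetime import date

# Hypothetical business rules: each returns True when the record is acceptable.
RULES = {
    "customer_id_present": lambda r: bool(r.get("customer_id")),
    "amount_non_negative": lambda r: isinstance(r.get("amount"), (int, float))
    and r["amount"] >= 0,
    "order_date_not_in_future": lambda r: isinstance(r.get("order_date"), date)
    and r["order_date"] <= date.today(),
}


def firewall(records, rules=RULES):
    """Admit records that pass every rule; quarantine the rest with the rules they failed."""
    accepted, quarantined = [], []
    for rec in records:
        failed = [name for name, check in rules.items() if not check(rec)]
        if failed:
            quarantined.append((rec, failed))
        else:
            accepted.append(rec)
    return accepted, quarantined


incoming = [
    {"customer_id": "C1", "amount": 250.0, "order_date": date(2020, 1, 15)},
    {"customer_id": "", "amount": -5, "order_date": date(2020, 1, 16)},
    {"customer_id": "C3", "amount": 10, "order_date": date(2999, 1, 1)},
]

accepted, quarantined = firewall(incoming)
for rec, failed in quarantined:
    print("Blocked:", rec, "failed:", failed)
```

Quarantining rejected records together with the names of the failed rules, instead of discarding them, gives the people responsible for the data something concrete to fix at the source.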

Determine Which Data Will Be Used

Most organizations have access to such large amounts of data through their enterprise systems that maintaining peak data quality across all of it at all times is virtually impossible. The key is to identify which data sets most need quality management and governance, and to move that data to a repository for cleansing and analysis. Fortunately, BI solutions can help organizations determine which data sets are most likely to be used and should therefore be targeted for quality management and governance, which makes this part of the process a bit more straightforward.

Mitigate Risk With Data Integrity Teams

For larger organizations, it is worthwhile to put data governance in the hands of the users themselves by creating a data integrity team that represents various areas of the business. This team resolves data integrity issues and serves as a liaison with the IT group that supports the organization’s information management environment.

It can also be helpful to create a data governance board of business and IT users who can set data policies, make sure that mechanisms exist for resolving data issues, facilitate data quality improvement efforts, and enforce proactive measures to stop data problems before they occur.

We often get calls from frustrated business and IT users who run into issues with their analytics and reports, and 9 times out of 10 it comes back to bad data. While dirty data will happen time and again, having a governance policy in place, supported by a good data quality management system for profiling, transforming, and standardizing your information, will help mitigate the risks of bad data (and all the finger-pointing that typically comes with it)!

This post was written by Pat Hennel