The importance of data for the success of an organization's advertising and marketing efforts is widely accepted. Data is seen as a valuable resource and can become an advantage in a highly competitive market. Dirty data, on the other hand, may hinder the success of data-driven marketing initiatives.
In a 2019 study by Experian, the authors found that 95% of respondents see impacts on their organization from low data quality.
But what exactly determines data quality (DQ)?
Lee et al. (2006) define data quality as a measure of the condition of data based on the following dimensions:
The assessment of data quality depends on the data requirements and the purpose for which the data is to be used. Thus, the same level of data quality can be sufficient in one case but not in another. For example, the invoicing data for advertising campaigns needs to meet very high requirements on the criteria mentioned above, while there might be a higher tolerance for errors in third-party data like Nielsen Ad Intel.
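To make the idea of purpose-dependent requirements concrete, here is a minimal sketch. It uses completeness as one illustrative criterion, and the function names, field names, and tolerance levels are all assumptions made up for this example, not values from any study or tool.

```python
# Hypothetical example: one data set judged against two different requirements.

def completeness(records, required_fields):
    """Share of records in which every required field is present and non-empty."""
    if not records:
        return 0.0
    complete = sum(
        all(rec.get(field) not in (None, "") for field in required_fields)
        for rec in records
    )
    return complete / len(records)

# Assumed minimum completeness per purpose, for illustration only.
REQUIREMENTS = {
    "invoicing": 1.00,              # every record must be complete
    "third_party_benchmark": 0.75,  # some gaps are tolerated
}

def fit_for_purpose(records, required_fields, purpose):
    """True if the data meets the (assumed) requirement for the given purpose."""
    return completeness(records, required_fields) >= REQUIREMENTS[purpose]

records = [
    {"campaign": "A", "spend": 1200.0},
    {"campaign": "B", "spend": 800.0},
    {"campaign": "C", "spend": None},   # missing value
    {"campaign": "D", "spend": 450.0},
]
fields = ["campaign", "spend"]

print(completeness(records, fields))
print(fit_for_purpose(records, fields, "third_party_benchmark"))
print(fit_for_purpose(records, fields, "invoicing"))
```

The same four records pass the looser requirement and fail the stricter one, which is the point of the paragraph above: quality is judged relative to the intended use.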
All too often, data quality management has no operational priority, and consequently the level of data quality is unknown. When data is dirty, the true picture is distorted and the probability of costly wrong decisions increases. You have probably heard of the "garbage in, garbage out" principle. It states that if you feed a system with inferior data, it will likely produce inferior output. The negative effects of wrong decisions on your marketing efforts can be many. Inefficient targeting might waste budget on scattering losses and thus lower profitability. Unpersonalized content could put consumers off and thereby worsen your customer relationships. Errors in invoicing and biased reports can undermine trust in your company and its credibility. Dirty data will also increase the expenses for data cleansing. Time spent on resolving those issues ties up resources that cannot be used elsewhere.
All in all, poor data quality can put your enterprise at a competitive disadvantage. If your competitors' products and services are significantly better in the long run, the success of your business might be jeopardized.
As more and more firms process terabytes of data, they are facing data quality problems in the context of big data. In their paper published in 2015, Cai and Zhu elaborate on those effects. They note that the diversity of data sources brings about a variety of data types and complex data structures, which further complicates the data integration process. As the term big data already suggests, it is becoming increasingly difficult to assess the DQ of ever-growing amounts of data in a given amount of time. Furthermore, they argue that data is changing at a very fast pace, which places more sophisticated requirements on data processing technology.
Increasing requirements for data-related roles, paired with rising demand for qualified candidates, have led to a skill gap. In a recent study by Experian, the authors found that 87% of respondents see difficulties in hiring for data-related roles in their companies.
All activities related to analyzing, improving, and assuring data quality can be summarized under the term data quality management (DQM).
In general, a distinction is made between preventive and reactive measures. The former aim to avoid errors that have a negative effect on data quality, while the latter try to detect and resolve data quality problems that already exist. Ideally, the goal should be to prevent data deficiencies from entering the data warehouse in the first place.
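The difference between the two kinds of measures can be sketched in a few lines. This is an illustration under assumptions: the validation rules, the field names campaign_id and spend, and the plain Python list standing in for a data warehouse are all hypothetical.

```python
def validate(record):
    """Return a list of problems found in one record (hypothetical rules)."""
    problems = []
    if not record.get("campaign_id"):
        problems.append("missing campaign_id")
    spend = record.get("spend")
    if not isinstance(spend, (int, float)) or spend < 0:
        problems.append("invalid spend")
    return problems

def load_preventive(incoming, warehouse):
    """Preventive: reject faulty records before they enter the warehouse."""
    rejected = []
    for record in incoming:
        if validate(record):
            rejected.append(record)
        else:
            warehouse.append(record)
    return rejected

def audit_reactive(warehouse):
    """Reactive: find faulty records that are already stored."""
    findings = []
    for index, record in enumerate(warehouse):
        problems = validate(record)
        if problems:
            findings.append((index, problems))
    return findings

warehouse = []
incoming = [
    {"campaign_id": "c1", "spend": 100.0},
    {"campaign_id": "", "spend": 50.0},   # fails the first rule
    {"campaign_id": "c3", "spend": -5},   # fails the second rule
]
rejected = load_preventive(incoming, warehouse)
print(len(warehouse), len(rejected))

# Data loaded earlier without checks has to be audited after the fact.
legacy = [{"campaign_id": "c9", "spend": "n/a"}]
print(audit_reactive(legacy))
```

Both paths share the same rules here; what differs is when they run. The preventive check keeps faulty records out, whereas the reactive audit only reports what has already slipped in and still needs to be cleaned.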
However, it is not only dirty data that comes with costs: DQM measures also consume valuable resources. Otto and Österle presented an economic interpretation of the optimal level of data quality. They state that the cost of dirty data decreases with higher data quality.
In contrast, the marginal cost of data quality measures rises with higher data quality.
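A small numerical sketch can illustrate how these two opposing cost curves interact. The two cost functions below are invented purely for illustration (they are not taken from Otto and Österle); they only share the described shape, with one cost falling and the other rising at an increasing rate as quality improves.

```python
def cost_of_dirty_data(q):
    """Assumed cost of dirty data: falls as quality q (0 to 1) rises."""
    return 100.0 * (1.0 - q)

def cost_of_dqm(q):
    """Assumed cost of DQM measures: each extra bit of quality costs more."""
    return 10.0 * q / (1.0 - q)

def total_cost(q):
    return cost_of_dirty_data(q) + cost_of_dqm(q)

# Evaluate quality levels 0.00, 0.01, ..., 0.99 (q = 1 is excluded because
# the assumed DQM cost grows without bound there).
levels = [i / 100 for i in range(100)]
best_q = min(levels, key=total_cost)

print(best_q)              # 0.68 under these assumed curves
print(total_cost(best_q))
print(total_cost(0.99))    # near-perfect quality is far more expensive overall
```

With these made-up curves the cheapest point lies well below perfect quality, and pushing quality towards 100% costs more than the dirty data it removes. Real curves would have to be estimated from your own cost data.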
So the optimal level of data quality is not the complete absence of dirty data; it lies at the minimum of the total cost curve. Data quality management should therefore use a cost-optimal combination of reactive and preventive measures. As these suggestions are rather theoretical, I'd like to provide you with a couple of hands-on tips:
Dirty data is a serious threat to brand success. Recent developments such as the skill gap and big data are making it even harder to manage data quality. Nonetheless, you need to take action and start an initiative against dirty data in your organization. High DQ requires a high-performing competence center that is able to implement carefully designed data pipelines. Furthermore, it is advisable to reduce human error to a minimum and to automate processes wherever possible. By following these tips you will be able to increase and sustain DQ in your organization. While this can take some effort, it can be extremely beneficial for your brand.
If you feel overwhelmed by the duties of data quality management or you fear the consequences of dirty data, then reach out to us!
We look forward to helping you become a data-driven marketing enterprise.