
Traveling Community

Mason Perez

Measuring Data Quality For Ongoing Improvement:...


Listen as Laura Sebastian-Coleman, author of the book Measuring Data Quality for Ongoing Improvement: A Data Quality Assessment Framework, and I discuss how to improve the overall quality of data by bringing together a better understanding of what is represented in data, and how it is represented, with the expectations for its use. Our discussion also covers two common mistakes to avoid when starting a data quality project and defines five dimensions of data quality.







Laura Sebastian-Coleman has worked on data quality in large health care data warehouses since 2003. She has implemented data quality metrics and reporting, launched and facilitated a data quality community, contributed to data consumer training programs, and has led efforts to establish data standards and to manage metadata. In 2009, she led a group of analysts in developing the original Data Quality Assessment Framework (DQAF), which is the basis for her book.


Data quality measures follow directly from the data quality dimensions mentioned above. Put plainly, they measure the quality of the data being reported. Any well-made data quality report will highlight these measures. Data quality metrics come down to what you can control within the data delivery process.


These measures are in place to ensure that reports contain no duplicate, poorly organized, or incomplete data. On the other side of that coin, complete, well-organized, and unique data are the data quality outcomes that every organization should strive for. When quality data is handled correctly, business operations are more efficient and thorough.
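As a rough illustration of how two of these measures might be computed, the sketch below scores completeness and uniqueness for a small list of records. The sample records, the field names, and the helper names `completeness` and `uniqueness` are hypothetical and are not taken from any particular tool.

```python
from typing import Any, Iterable


def completeness(records: list[dict[str, Any]], required: Iterable[str]) -> float:
    """Share of required fields that are populated (not None and not empty)."""
    required = list(required)
    total = len(records) * len(required)
    if total == 0:
        return 1.0  # nothing to check, so nothing is missing
    filled = sum(
        1
        for record in records
        for field in required
        if record.get(field) not in (None, "")
    )
    return filled / total


def uniqueness(records: list[dict[str, Any]], key_fields: Iterable[str]) -> float:
    """Share of records that carry a distinct key."""
    key_fields = list(key_fields)
    if not records:
        return 1.0
    keys = {tuple(record.get(f) for f in key_fields) for record in records}
    return len(keys) / len(records)


# Hypothetical sample data: one duplicated row and a few missing values.
customers = [
    {"id": 1, "email": "a@example.com", "country": "US"},
    {"id": 2, "email": "", "country": "DE"},
    {"id": 2, "email": "", "country": "DE"},
    {"id": 3, "email": "c@example.com", "country": None},
]

print(completeness(customers, ["id", "email", "country"]))  # 9 of 12 fields filled
print(uniqueness(customers, ["id"]))  # 3 distinct ids across 4 rows
```

Both functions return a ratio between 0 and 1, which makes the results easy to compare against a target or to track over time.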


In the context of an interconnected performance management system, data quality is a key component. Data quality standards are the guidelines used to ensure consistent delivery. The standards fall in line with the data quality dimensions mentioned above as well. They can be monitored and met with the help of data quality tools such as Acceldata.


There are a few best practices to keep in mind when aligning your data quality with the standards used industry-wide. Following these data quality standards best practices will help you achieve consistent data quality.


To implement these data quality standards effectively, it may help to build your own methods around existing examples. The Federal Committee on Statistical Methodology offers a very thorough data quality standards PDF, free to use for any organization looking to improve its data handling.


Gauging the quality of your data consistently can be a tall order without a detailed framework to adhere to. Without one, knowing how to measure data quality can feel very abstract. Employing a data quality metrics template can make all the difference, and the right tools can be the key to understanding the metrics and making the most of them.


Data quality measurement tools are effective because of how they lay the information out for your team to understand. Key performance indicators (KPIs) are valuable in the tracking and quality monitoring of data. Finding a tool that offers a user-friendly data quality KPI dashboard will help your team assess and manage data efficiently. Data quality indicators will become a core focus for your team and the overall consistency of data quality will increase.


Tracking data quality metrics is one thing, but the process can be optimized by implementing a data quality metrics scorecard, which can help your team analyze the overall health of your data and build comparisons to past data. Scoring your data quality metrics and reporting those scores regularly will help the organization keep to a level of quality and immediately identify when it is falling short.
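One way to picture such a scorecard is as a small table that holds each metric's current score, its previous score, and a target. The sketch below is a minimal, hypothetical version: the class `ScorecardRow`, the helpers `build_scorecard` and `falling_short`, and all scores and targets are illustrative assumptions rather than a standard format.

```python
from dataclasses import dataclass


@dataclass
class ScorecardRow:
    metric: str
    current: float   # score for the latest reporting period (0 to 1)
    previous: float  # score for the prior reporting period (0 to 1)
    target: float    # minimum acceptable score

    @property
    def change(self) -> float:
        """Difference from the previous period, rounded for reporting."""
        return round(self.current - self.previous, 4)

    @property
    def meets_target(self) -> bool:
        return self.current >= self.target


def build_scorecard(current: dict[str, float],
                    previous: dict[str, float],
                    targets: dict[str, float]) -> list[ScorecardRow]:
    """One row per metric, sorted by name; a missing history counts as no change."""
    return [
        ScorecardRow(m, current[m], previous.get(m, current[m]), targets[m])
        for m in sorted(current)
    ]


def falling_short(rows: list[ScorecardRow]) -> list[str]:
    """Names of metrics whose current score is below target."""
    return [row.metric for row in rows if not row.meets_target]


# Hypothetical scores for two reporting periods.
current = {"completeness": 0.97, "uniqueness": 0.92, "validity": 0.99}
previous = {"completeness": 0.95, "uniqueness": 0.96, "validity": 0.99}
targets = {"completeness": 0.95, "uniqueness": 0.95, "validity": 0.95}

rows = build_scorecard(current, previous, targets)
for row in rows:
    status = "ok" if row.meets_target else "BELOW TARGET"
    print(f"{row.metric:<14}{row.current:>6.2f}{row.change:>+8.2f}  {status}")
```

With these sample numbers, only uniqueness is flagged, since it dropped from 0.96 to 0.92 against a 0.95 target.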


A data governance scorecard is slightly different as it grades the control and shared decision-making surrounding the data assets themselves. While data quality score calculation depends on the dimensions listed earlier, the governance score grades the overall handling and sharing of the data itself.


A data quality health score will provide insight into how well the standards are being met and how often. For building a data quality scorecard template, Excel spreadsheets are a practical choice. Templates can be downloaded online, or you can find a data quality scorecard example online and build your own using it as a model. As long as it aligns with your operational model, the scorecard will be an effective tool for your team.
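There is no single agreed formula for a health score. One simple reading, sketched below, treats it as the percentage of recorded checks that met their standard across recent runs. The function names `health_score` and `pass_rate`, and the sample history, are assumptions made for illustration.

```python
def health_score(history: list[dict[str, bool]]) -> float:
    """Percentage of all recorded checks that met their standard."""
    results = [passed for run in history for passed in run.values()]
    if not results:
        return 0.0  # no evidence yet, so report the lowest score
    return round(100 * sum(results) / len(results), 1)


def pass_rate(history: list[dict[str, bool]], metric: str) -> float:
    """How often a single metric met its standard, as a fraction of runs."""
    outcomes = [run[metric] for run in history if metric in run]
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)


# Hypothetical results from three runs: True means the standard was met.
history = [
    {"completeness": True, "uniqueness": True},
    {"completeness": True, "uniqueness": False},
    {"completeness": True, "uniqueness": True},
]

print(health_score(history))              # 5 of 6 checks passed
print(pass_rate(history, "uniqueness"))   # 2 of 3 runs passed
```

A weighted version, where some dimensions count more than others, would be an equally reasonable design if certain checks matter more to your consumers.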


In terms of data quality measurement, framework implementation is a vital step. The book Measuring Data Quality for Ongoing Improvement: A Data Quality Assessment Framework is a great foundation for building and maintaining frameworks tailored to your data needs. It functions as a data quality framework implementation guide, showing its readers not only the frameworks to use but also the most effective ways to use them.


This book will help in the process of creating a data quality framework template that fits your specific operational strategy. Building a template from scratch allows for the freedom of customization. With this approach, you will have all of the assets on hand to more easily create a data quality framework PPT (PowerPoint Presentation) for company-wide training. The more informed your staff is, the better the framework can be used.


This article is primarily for data practitioners who want to improve the quality of data in their databases and data warehouses, but also for professionals whose work impacts data quality (looking at you, software engineers and salespeople) or is affected by poor data quality.


The goal is to give you a framework for thinking about data quality metrics and a process for identifying which metrics your team should use. By the end of this article, you should leave with a sense of which metrics you should track to improve the quality of your data.


Data quality is a topic as old as data itself. Luckily for us, that means we can draw on decades of written experience from researchers and industry practitioners. Specifically, this piece leans on literature from different fields such as data quality management, data quality measurement, and information quality.


Across these fields, one key concept is the different dimensions of data quality, which are categories along which data quality can be grouped. These dimensions can then be instantiated as metrics of data quality, also referred to as database quality metrics or data warehouse metrics depending on where the data resides, that are specific and measurable. (Sebastian-Coleman 2013)


In contrast, an extrinsic data quality dimension (also called a contextual or task-dependent dimension) depends on the use case. A guiding question is: given how the data is used, what measures of quality are most impactful for the use cases at hand? Viewing data as a product again, these metrics are like measuring the amount of downtime in the product, the number of support tickets filed by a consumer, and the average load time for a customer.


In addition to these main intrinsic data quality dimensions, others include: the integrity of the data, which is often a function of the accuracy, consistency, and completeness of the data; the bias of the data, which is related to accuracy but often tied to skewed outcomes; and conciseness, which describes the amount of redundant data (this is less important in our world of cheap data storage).


While intrinsic data quality dimensions can be reasoned about without talking to a stakeholder, extrinsic data quality metrics depend on knowledge of the stakeholder and their use case. These use cases can be analyzed like product requirements: they have a specific purpose, with informal requirements, trust requirements, and a time constraint.


Reliability: Is the data regarded as true and credible by the stakeholders? Some factors that impact the reliability of data are whether the data is verifiable, whether there is sufficient information about its lineage, whether there are guarantees about its quality, and whether bias is minimized. (Scannapieco and Catarci 2002)


Other extrinsic data quality dimensions include: sufficiency, the degree to which the data is sufficient to complete the given task (related to relevance); consistency, similar to the intrinsic dimension but within user-facing systems; and ease of manipulation, related to the dimension of usability. As with intrinsic dimensions, these dimensions can overlap considerably.


People: how can your teammates help you develop data quality metrics, hold the organization accountable for meeting those metrics, and play their part in ensuring high-quality data? For instance, the engineering organization could implement pull request reviews to minimize breaking changes to upstream systems, which would otherwise reduce the reliability of your downstream data products. You could also hire a data quality engineer, data steward, or data governance lead to directly own the implementation and improvement of data quality metrics.


Process: how can your organization implement business processes for data quality improvement? The data team could perform a one-time data quality assessment. For ongoing improvement, the team could implement quarterly OKRs around data quality metrics and metric scorecards; the marketing and sales teams could introduce training initiatives to improve the data entry practices that affect data accuracy and validity; and the data team could implement playbooks for remediation.


Technology: how can technologies help improve your data quality metrics? For example, tools like Segment Personas and Iteratively could ensure that product analytics data is consistent and reliable. ELT solutions like Fivetran and Airbyte and Reverse ETL solutions like Hightouch and Census could help ensure that data is both up-to-date and timely. Lastly, there are data quality tools meant for data quality measurement and management, as well as data cleansing tools for fixing data values.
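Independent of any particular vendor, a timeliness check can be as simple as comparing a table's last load time with an agreed maximum age. The sketch below assumes a hypothetical `is_fresh` helper and a 24-hour threshold; it does not reflect the API of any tool named above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_fresh(last_loaded_at: datetime,
             max_age: timedelta,
             now: Optional[datetime] = None) -> bool:
    """True if the data was loaded within the allowed window.

    Both timestamps are expected to be timezone-aware.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_loaded_at <= max_age


# Hypothetical check: a table should be refreshed at least once a day.
checked_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
recent_load = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)
stale_load = datetime(2023, 12, 30, 6, 0, tzinfo=timezone.utc)

print(is_fresh(recent_load, timedelta(hours=24), now=checked_at))  # loaded 6 hours ago
print(is_fresh(stale_load, timedelta(hours=24), now=checked_at))   # loaded over 2 days ago
```

Passing `now` explicitly keeps the check deterministic, which makes it easier to test and to replay against historical load times.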

