Data Quality: Understanding the past to improve the future

The value of healthcare quality data in an evolving data-driven environment was discussed in a prior HIMSS Business Edge article. Before we can improve that data, however, we need to understand where we are. This article illustrates challenges in documenting and coding data that must be addressed before the goal of reliable health information can be achieved. The two examples below are based on three years of payer data for all lines of business (approximately $10 billion in charges).

Breast Cancer

Virtually every patient treated for breast cancer has one or more transactions for services delivered. These transactions provide the opportunity to capture data about the nature of these lesions and their location within the breast. Accurate and reliable transactional data could provide an understanding of cost, outcomes, risk and other information on a population-wide basis about the variations of this disease. Historical patterns show that while the ability to capture significant detail exists, the data rarely include these parameters on any consistent basis. Some providers have captured these parameters, but the large majority have captured something far less specific.



The only data we can rely on is at a level that loses a great deal of potential value.  The specificity of the few is lost in the vagueness of the majority.

Cardiac Dysrhythmias

Cardiac dysrhythmias have a significant impact on the population. It would be extremely valuable to have a better understanding of the costs, risks, demographics, outcomes and other parameters of this disease across all healthcare enterprises and populations, based on large transactional data sets. Unfortunately, when we look at this data, the quality and specificity are such that we cannot tell with any reliability what types of rhythm disorders are being treated.



Fig 2

This graph illustrates that while atrial fibrillation appears to be the most common specifically reported type of arrhythmia, nearly 50% of all transactions are at a less specific level[1] of reporting that makes it impossible to tell whether the transaction was related to atrial fibrillation. Any report on incidence, cost, outcomes or other analysis of this transactional data about atrial fibrillation could be off by a large percentage.
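To make the size of that uncertainty concrete, the following is a minimal Python sketch of the arithmetic. It assumes each transaction falls into one of three groups: coded specifically as atrial fibrillation, coded specifically as some other rhythm disorder, or coded at the nonspecific level. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the payer data described here.

```python
def afib_share_bounds(afib_specific, other_specific, unspecified):
    """Return (lower, upper) bounds on the true share of transactions
    that relate to atrial fibrillation.

    lower: none of the nonspecific transactions were atrial fibrillation
    upper: all of the nonspecific transactions were atrial fibrillation
    """
    total = afib_specific + other_specific + unspecified
    if total <= 0:
        raise ValueError("at least one transaction is required")
    lower = afib_specific / total
    upper = (afib_specific + unspecified) / total
    return lower, upper


# Hypothetical counts for illustration only (not from the article's data):
# half of all transactions are coded at the nonspecific level.
lower, upper = afib_share_bounds(
    afib_specific=300, other_specific=200, unspecified=500
)
print(f"Atrial fibrillation share lies between {lower:.0%} and {upper:.0%}")
```

With half of the transactions nonspecific, the gap between the two bounds is itself half of all transactions, so no analysis of the specifically coded records alone can narrow it.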


  • Information is only as good as the quality of data that supports it.
  • Data quality is only as good as the observations, documentation and standardized coding of the details that report the key parameters of the patient condition.
  • A focus on data quality is essential to ensuring that we have the information needed to improve healthcare value.
  • The third part in this series (September issue), "Data quality: Strategies for improving healthcare data," will discuss approaches to improving data quality moving forward.


About the author: Dr. Nichols is a board-certified orthopedic surgeon. After 16 years in active practice, he has spent the past 18 years in healthcare IT. On behalf of CMS, payers, providers and other healthcare entities, Joe presents on healthcare data, ICD-10 and clinical documentation improvement. He is also an AHIMA-approved ICD-10 coding trainer.


[1] Transactions where the primary code is at a level of specificity such that the condition may or may not have been atrial fibrillation