Scaling Up from Data Mart to Data Warehouse
Clinical analytics requires the thoughtful deployment of several business intelligence (BI) technologies. Done well, it transforms diverse clinical data from multiple sources into meaningful information that can be used to take action to improve care. The essential technical components are largely the same as for healthcare financial analytics and for business intelligence in other industries. In fact, the healthcare organizations that are most advanced in using analytics to improve performance are moving toward unified platforms for financial and clinical analytics.
Four-Stage Model
Data Acquisition involves getting relevant data elements out of source systems for transactional and operational functions such as:
- Clinical order entry
- Nursing documentation
- Medical records
- Surgical management
- Claims payment.
The classic approach to data acquisition is called Extract, Transform, Load (ETL). ETL processes are typically run in batches on a periodic schedule. Depending on the purpose of the analysis, the updates may occur on any frequency from once a year to several times a day.
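As a rough illustration of the batch pattern described above, the following sketch runs one extract, transform, load pass between two in-memory SQLite databases. The table names (`orders`, `fact_orders`), the column names, and the cleanup rules are all assumptions made for the example and are not taken from any particular source system.

```python
import sqlite3


def run_batch_etl(source: sqlite3.Connection, warehouse: sqlite3.Connection) -> int:
    """Copy order rows from a source system into a warehouse table in one batch."""
    # Extract: pull raw rows out of the (hypothetical) order-entry table.
    rows = source.execute(
        "SELECT order_id, patient_id, order_code, ordered_at FROM orders"
    ).fetchall()

    # Transform: normalize codes and keep only the date part of the timestamp.
    cleaned = [
        (order_id, patient_id, code.strip().upper(), ordered_at[:10])
        for order_id, patient_id, code, ordered_at in rows
    ]

    # Load: INSERT OR REPLACE keyed on order_id, so re-running the batch
    # does not create duplicate rows.
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders ("
        "order_id INTEGER PRIMARY KEY, patient_id TEXT, "
        "order_code TEXT, order_date TEXT)"
    )
    warehouse.executemany(
        "INSERT OR REPLACE INTO fact_orders VALUES (?, ?, ?, ?)", cleaned
    )
    warehouse.commit()
    return len(cleaned)


# In-memory databases stand in for the real source system and warehouse.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER, patient_id TEXT, order_code TEXT, ordered_at TEXT)"
)
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "P001", " cbc ", "2024-03-01T08:15:00"),
        (2, "P002", "bmp", "2024-03-01T09:40:00"),
    ],
)
warehouse = sqlite3.connect(":memory:")
loaded = run_batch_etl(source, warehouse)
```

In practice a scheduler would invoke a job like `run_batch_etl` at whatever frequency the analysis requires, and the extract step would normally be limited to rows changed since the previous run.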
In recent years, users have required more frequent updates so information can be analyzed and made available in ‘near real-time’. In some situations, health systems have moved beyond the batch ETL process altogether, using modern web service architectures that enable source systems to push individual transaction updates in real-time using XML document formats. Similarly structured XML documents are also being used for data acquisition in multi-stakeholder environments such as Health Information Exchange (HIE). Data acquisition requires fast, flexible tools for networking, ETL, and messaging.
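To make the push model more concrete, here is a minimal sketch of how a receiving service might parse a single pushed transaction. The XML layout (`encounterUpdate`, `patientId`, `eventType`, `timestamp`) is invented for illustration and does not follow any particular standard; real exchanges would use whatever schema the participants have agreed on.

```python
import xml.etree.ElementTree as ET

# Hypothetical message that a source system might push for one transaction.
SAMPLE_MESSAGE = """
<encounterUpdate>
  <patientId>P001</patientId>
  <eventType>ADMIT</eventType>
  <timestamp>2024-03-01T08:15:00</timestamp>
</encounterUpdate>
"""


def parse_update(xml_text: str) -> dict:
    """Turn one pushed XML document into a flat record ready for loading."""
    root = ET.fromstring(xml_text.strip())
    return {
        "patient_id": root.findtext("patientId"),
        "event_type": root.findtext("eventType"),
        "timestamp": root.findtext("timestamp"),
    }


update = parse_update(SAMPLE_MESSAGE)
```

Each message carries one transaction, so the warehouse can be updated as events occur rather than waiting for the next batch window.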
Data Integration involves structuring the data from multiple source systems into a unified data model that relates all data elements to one another in a meaningful way.
For clinical analytics, the main focus is usually on a longitudinal, person-centered data model that ties together different types of medical data about an individual, spanning multiple encounters with multiple providers over several years.
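One way to picture a longitudinal, person-centered model is as a patient record that owns a list of encounters from any provider. The sketch below is deliberately small and all class and field names are assumptions chosen for the example; a production model would cover many more data types and would need Python 3.9 or later for the type hints used here.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Encounter:
    encounter_date: date
    provider: str
    encounter_type: str
    diagnosis_codes: list[str] = field(default_factory=list)


@dataclass
class PatientRecord:
    patient_id: str
    encounters: list[Encounter] = field(default_factory=list)

    def timeline(self) -> list[Encounter]:
        """Return all encounters in chronological order, across providers."""
        return sorted(self.encounters, key=lambda e: e.encounter_date)

    def providers(self) -> set[str]:
        """Return every provider that has seen this patient."""
        return {e.provider for e in self.encounters}


# Encounters arrive from different sources in no particular order.
record = PatientRecord(
    patient_id="P001",
    encounters=[
        Encounter(date(2023, 6, 2), "City Hospital", "inpatient", ["I10"]),
        Encounter(date(2021, 9, 14), "Lakeside Clinic", "office visit", ["E11.9"]),
        Encounter(date(2024, 1, 20), "Lakeside Clinic", "office visit", ["E11.9", "I10"]),
    ],
)
ordered_dates = [e.encounter_date for e in record.timeline()]
```

Keying everything to the person rather than to the encounter or the facility is what lets an analyst follow one individual across settings and years.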
Designing a viable data model for complex healthcare information is one of the most challenging aspects of clinical analytics. Technical solutions, such as a master patient index (MPI) for cleansing and standardizing data elements from different sources, are also important enablers of data integration. Data dictionaries and related meta-data tools are also essential for a unified database to be efficiently accessed for analysis and reporting.
Enterprise solutions will typically integrate clinical data into an expansive data warehouse that includes information on a broad range of subject domains. For individual analytics applications, relevant subsets of the cleansed and organized data may be formed into smaller, more manageable data marts.
Data Enhancement involves adding value to the raw clinical data through classification schemes, risk adjustment formulas, and other processes that add new data elements that are useful in analysis.
Data enhancement can be as simple as an age classification that labels patients 65 and older as ‘seniors’ or as complex as an episode grouper that applies clinical logic to diagnosis codes and ties a claim for rehabilitation services to an orthopedic surgical procedure that occurred months earlier at another facility.
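Both ends of that range can be sketched in a few lines. The first function below implements the age classification from the text; the second is a toy stand-in for an episode grouper that links a claim to the same patient's most recent prior surgery within a time window. The `non-senior` label, the 180-day window, and the dictionary keys are assumptions for the example, and a real grouper would apply clinical logic to diagnosis codes rather than rely on dates alone.

```python
from datetime import date


def age_on(birth_date: date, as_of: date) -> int:
    """Age in completed years on the given date."""
    years = as_of.year - birth_date.year
    if (as_of.month, as_of.day) < (birth_date.month, birth_date.day):
        years -= 1  # birthday has not yet occurred this year
    return years


def age_group(birth_date: date, as_of: date) -> str:
    """Label patients 65 and older as seniors."""
    return "senior" if age_on(birth_date, as_of) >= 65 else "non-senior"


def link_to_episode(claim: dict, surgeries: list, window_days: int = 180):
    """Toy rule: attach a claim to the patient's latest surgery in the window."""
    candidates = [
        s
        for s in surgeries
        if s["patient_id"] == claim["patient_id"]
        and 0 <= (claim["service_date"] - s["surgery_date"]).days <= window_days
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s["surgery_date"])["episode_id"]


surgeries = [
    {"episode_id": "EP-1", "patient_id": "P001", "surgery_date": date(2024, 1, 15)},
    {"episode_id": "EP-2", "patient_id": "P002", "surgery_date": date(2024, 2, 1)},
]
rehab_claim = {"patient_id": "P001", "service_date": date(2024, 4, 20)}
episode = link_to_episode(rehab_claim, surgeries)
```

In both cases the output is a new data element (an age group or an episode identifier) stored alongside the raw record so that later analysis can filter and group on it.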
Other forms of data enhancement involve statistical analysis of data in the aggregate rather than simply classifying an individual patient or encounter.
Specialized web service applications and flexible rules-engines are increasingly valuable technologies for implementing complex data enhancements.
Information Delivery involves presenting information to a person who will use it to make a decision and take action.
Dashboards and scorecards present trends or comparative performance information, as basic reports and charts do, but use visual cues such as stoplight colors to focus the user’s attention on the most important issues. Typically these basic reports, charts and dashboards are designed by an analyst or IT specialist and then delivered to users on a periodic schedule. This traditional approach to ‘push’ out information can work well for accountability reporting in a hierarchical organization.
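The stoplight cue usually comes down to comparing a measure against a target with a tolerance band. The sketch below shows one possible rule; the 5 percent warning margin, the color names, and the `higher_is_better` switch are assumptions made for the example, not a standard.

```python
def stoplight(
    value: float,
    target: float,
    warning_margin: float = 0.05,
    higher_is_better: bool = True,
) -> str:
    """Map a performance measure to green, yellow, or red against a target."""
    # A positive gap means the measure is on the good side of the target.
    gap = (value - target) if higher_is_better else (target - value)
    if gap >= 0:
        return "green"
    # Within the warning margin (a fraction of the target) counts as yellow.
    if gap >= -warning_margin * abs(target):
        return "yellow"
    return "red"


# A compliance rate, where higher is better, against a 90% target.
compliance_status = stoplight(0.87, 0.90)
# A readmission rate, where lower is better, against a 10% target.
readmission_status = stoplight(0.12, 0.10, higher_is_better=False)
```

The thresholds themselves are a business decision, which is one reason these displays are typically configured by an analyst before being pushed out to users.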
Other users have less predictable information needs and perform ad hoc analysis as part of their job. These users need a self-service solution for information delivery that lets them define for themselves what information they want to see and make quick modifications to their information displays.
Recent advances in data visualization software with ‘in-memory’ datasets have made it easier than ever for data-savvy end users to manipulate and analyze large amounts of data without needing an IT specialist to help them ‘pull’ it.
In clinical decision support, where doctors and nurses use information while caring for individual patients, different paradigms for information delivery are needed.
Large-scale clinical business intelligence deployments will usually require several different types of information delivery tools to meet the needs of different types of information consumers.
Scaling Up from Data Mart to Data Warehouse
The four-stage model for clinical business intelligence is relevant for all implementations regardless of size and scope.
A data mart is a small-scale business intelligence solution for a single department or a single subject area. Even a narrowly focused data mart for a department such as a cardiology clinic will need to address all four stages of clinical analytics to get the data in, integrate it, enhance it, and deliver it to the end user.
Integration is often less of a challenge for departmental data marts since only a small number of source systems are usually involved. Stand-alone data marts can be relatively quick to develop. However, a plethora of independent data marts in a large organization can be counterproductive: the separate data marts will provide inconsistent information, narrow data sets are not easily repurposed for new users, and maintenance of multiple incompatible platforms becomes quite expensive.
An enterprise data warehouse (EDW) is a larger-scale solution that includes a wide variety of subject domains and serves a diverse group of users across many organizational departments and locations.
A unified EDW is very complex and takes substantial time to implement, but the benefits include a ‘single source of truth’ for resolving questions about organizational performance and the ability to leverage both information and technical expertise across numerous projects. An EDW may feed its consistent, cleansed data into smaller data marts optimized to meet the needs of select groups of users.
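A simple way to see how an EDW can feed a data mart is to define the mart as a filtered subset of a warehouse table. The sketch below uses an SQLite view for this; the `encounters` table, its columns, and the `cardiology_mart` name are hypothetical, and a real deployment might instead copy the subset into a separate, physically optimized database.

```python
import sqlite3

edw = sqlite3.connect(":memory:")
edw.executescript(
    """
    -- Hypothetical enterprise-wide table covering every department.
    CREATE TABLE encounters (
        encounter_id   INTEGER PRIMARY KEY,
        patient_id     TEXT,
        department     TEXT,
        encounter_date TEXT,
        total_charge   REAL
    );
    INSERT INTO encounters VALUES
        (1, 'P001', 'cardiology',  '2024-01-10', 1200.0),
        (2, 'P002', 'orthopedics', '2024-01-11',  800.0),
        (3, 'P003', 'cardiology',  '2024-01-12',  450.0);

    -- The data mart exposes only the rows and columns one group of users needs.
    CREATE VIEW cardiology_mart AS
        SELECT encounter_id, patient_id, encounter_date, total_charge
        FROM encounters
        WHERE department = 'cardiology';
    """
)

mart_rows = edw.execute(
    "SELECT encounter_id, patient_id FROM cardiology_mart ORDER BY encounter_id"
).fetchall()
```

Because the mart is derived from the warehouse rather than loaded independently, its numbers stay consistent with every other mart fed from the same source.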
Multi-stakeholder data warehouses are those that integrate data from several independent organizations. They share the same four-stage model but have some different challenges.
Standardizing data from multiple organizations can be very difficult. That is one reason that multi-stakeholder clinical business intelligence initiatives will often focus on a narrower subject domain where the number of standardization and nomenclature issues can be limited. Patient registries focused on specific diseases or procedures are examples where narrow scope has contributed to success.
Many HIEs aim to eventually build clinical business intelligence solutions by combining participant information, but they must address issues of data ownership and privacy on their way to identifying specific projects with a strong enough ROI for all participants to justify the effort.
Smaller data mart projects can sometimes be successful with standard hardware and database solutions like those used for transactional systems. Just using an SQL query to extract key data points and implementing a simple data visualization tool can sometimes be enough to yield useful insights. But implementing an enterprise-level or multi-stakeholder data warehouse requires a number of specialized IT tools tuned for business intelligence.
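For a sense of how little is needed at the small end, the following sketch runs a single SQL query that summarizes encounter counts and average length of stay by department. The `admissions` table and its columns are assumptions for the example, and an in-memory SQLite database stands in for a standard transactional database.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE admissions ("
    "department TEXT, admit_date TEXT, discharge_date TEXT)"
)
db.executemany(
    "INSERT INTO admissions VALUES (?, ?, ?)",
    [
        ("cardiology", "2024-01-01", "2024-01-04"),
        ("cardiology", "2024-01-10", "2024-01-15"),
        ("orthopedics", "2024-02-01", "2024-02-03"),
    ],
)

# One query extracts the key data points: volume and average stay in days.
summary = db.execute(
    """
    SELECT department,
           COUNT(*) AS encounters,
           ROUND(AVG(julianday(discharge_date) - julianday(admit_date)), 1)
               AS avg_length_of_stay
    FROM admissions
    GROUP BY department
    ORDER BY department
    """
).fetchall()

for department, encounters, avg_los in summary:
    print(f"{department}: {encounters} encounters, average stay {avg_los} days")
```

The resulting rows could be handed directly to a simple charting or visualization tool.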
Healthcare organizations looking to develop an enterprise analytics capability have traditionally needed to acquire numerous tools from several different vendors and take time to integrate the pieces into a usable platform. Key pieces of a classic IT infrastructure for analytics include:
Healthcare-specific data warehouse products are available from a number of HIT vendors and include many of these technical components packaged into an application tailored for one or more aspects of performance analytics.
These solutions can be deployed relatively rapidly and can deliver solid ROI in the specific subject areas that their data models are designed to address. Larger healthcare organizations looking to develop a truly comprehensive enterprise warehouse may still need to select and deploy their own best-of-breed IT tools to create a unified database. Their EDW may in turn feed cleansed data into one or more focused analytics solutions from their HIT vendors.
Data Appliances deliver an out-of-the-box platform that includes all the hardware and software components needed for optimized analytics.
Data appliance solutions are now available from a number of major IT vendors and provide new options for organizations deploying an EDW. Along with options for cloud-based deployment of analytics solutions, data appliances present an opportunity for healthcare organizations to speed up implementations of clinical analytics and reduce ongoing costs.