Introduction
Clinical decision support (CDS) refers to supplementary decision aids presented to users to facilitate high-quality, safe care delivery while minimizing unforeseen risks and encouraging providers to follow system-determined best practices. For most healthcare workers, the most common exposure to CDS comes in the form of alerts within the electronic health record (EHR).1 Health systems rely on these alerts as decision support tools to augment care across the domains of safety, efficiency, documentation, legal, and business operations. Within a given health system, alerts vary widely in importance and impact on care. When an alert is meant to halt an EHR workflow until a critical patient safety concern is addressed, the clinical decision support tool must function as designed or significant preventable harm may result.2 What is critically missing in most EHRs, and the gap addressed here, is a comprehensive and time-efficient way to determine when CDS alerts are functioning abnormally.
A growing body of literature describes the ways in which clinical decision support can benefit care, reduce errors, and improve patient outcomes.3 Very few studies address how CDS systems are routinely monitored after implementation to ensure they function as designed. Current health system solutions for CDS alert monitoring depend on governance structure and vary based on the EHR vendor, the metrics elected for tracking, and the analytics department responsible for maintaining the data.4 The built-in functionality from several major EHR vendors allows for individual alert review, but not a systematic way of detecting, aggregating, and presenting data from all system alerts to assess atypical function. Without this feature, every alert requires regular individual review to verify that it is working as designed. The time required to perform this task manually, especially given that many systems have hundreds of alerts, presents the most prominent barrier, and alerts need several months of data before trends can be detected.5
To address the need for tracking longitudinal CDS alert behavior, a dashboard with visualizations can facilitate faster detection of atypical alert firing patterns and calculation of normal, expected variation in alert frequency. Previous research has shown that visualizations can communicate findings intuitively and quickly so long as they are appropriately designed for the context.6 In this case, a dashboard could expedite what has traditionally required manual retrospective review by an organization's CDS committee. One method of detecting variation from normal behavior is a statistical process control (SPC) tool called a control chart. Created in the 1920s by Walter Shewhart, SPC charts allow for direct visualization of abnormal events, distinguished by two types of process variation: common cause and special cause.7 Common cause variation naturally exists within a system and is always present, whereas special cause variation results from external influence and suggests that a process is out of statistical control.7 On control charts, the limits of common cause variation are set at three standard deviations above and below the mean and are called the upper and lower control limits. Variation within the threshold defined by the upper and lower control limits is inferred to be predictable and normal.8 This system of detecting process variation has traditionally been used in manufacturing and has been widely adopted by quality improvement initiatives such as Six Sigma.9 SPC has also been used in healthcare for process improvement because it allows non-experts to garner insight from data quickly and easily.8
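As a minimal illustration of the control-limit logic described above (not the production dashboard, which was built in Tableau), the following Python sketch computes three-sigma limits from a trailing 12-month baseline of monthly alert counts and flags months outside them; the sample data are hypothetical.

```python
import pandas as pd

# Hypothetical monthly alert counts for a single alert (illustrative data only).
counts = pd.Series(
    [850, 901, 812, 887, 940, 866, 878, 905, 849, 892, 870, 873, 170],
    index=pd.period_range("2021-06", periods=13, freq="M"),
)

# Baseline statistics from the trailing 12 months (excluding the month under review).
baseline = counts.iloc[:-1]
mean = baseline.mean()
sigma = baseline.std()

# Shewhart-style three-sigma control limits.
ucl = mean + 3 * sigma           # upper control limit
lcl = max(mean - 3 * sigma, 0)   # lower control limit; counts cannot fall below zero

# Any month outside the control limits suggests special cause variation.
out_of_control = counts[(counts > ucl) | (counts < lcl)]
print(f"mean={mean:.1f}, sigma={sigma:.1f}, UCL={ucl:.1f}, LCL={lcl:.1f}")
print(out_of_control)
```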
Beyond manufacturing, SPC charts can be used to visualize CDS alert frequency because alerts are traditionally designed with a single set of trigger criteria and a defined set of outcomes, similar to the defined inputs and outputs of a manufacturing process.10 When designed according to the Five Rights of CDS, alerts should display the right information to the right person in the right format via the right channel and only at the right time.11 Careful consideration of design criteria should allow for precisely controlled function of CDS alerts, yet in one study, 93% of chief medical information officers (CMIOs) reported malfunctions at their organization.3 Without a tool such as SPC charts, clearly distinguishing normal from erratic behavior is extremely challenging. To address the need for more efficient and proactive review of abnormally functioning clinical decision support alerts, we postulate that the introduction of a dashboard with SPC charting will lead to early and effective detection of erratic alert behavior. Further, SPC charts demonstrating special cause variation from external influence will help direct investigation into the underlying reason for alert deviation.
Methods
Prior to creation of the alert frequency dashboard, the local CDS committee at an academic medical center identified the data elements they would like to track. The CDS committee comprised informatics representatives from pharmacy, nursing, and physicians, as well as EHR analysts and students. To facilitate inclusion of the necessary components and clearly depict the desired user interface, we created wireframes for a Tableau dashboard with SPC charts. Wireframe diagrams are basic pictorial outlines of the proposed design of a system, and Tableau is a customizable visual analytics platform designed to help people and organizations use data for informed decision making.12
These wireframes were designed to balance metrics currently used by the CDS committee and those necessary for SPC elements. The specific type of control chart used in our study was a u-chart, since the data tracked were discrete, based on counts, and drawn from an inconsistent sample size.13
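For reference, a u-chart plots the rate of events per unit of opportunity and sets control limits that vary with each sample size. The following Python sketch shows the standard u-chart calculation under hypothetical monthly counts and denominators; the numbers are illustrative only.

```python
import math

# Hypothetical monthly alert counts (c_i) and opportunity denominators (n_i),
# e.g., number of qualifying patient encounters per month.
counts = [210, 195, 240, 188, 260, 175]
units = [1000, 950, 1100, 900, 1200, 880]

# Center line: overall rate u-bar = sum(c_i) / sum(n_i).
u_bar = sum(counts) / sum(units)

for c, n in zip(counts, units):
    u = c / n
    # u-chart limits widen or narrow with each sample size n_i.
    half_width = 3 * math.sqrt(u_bar / n)
    ucl = u_bar + half_width
    lcl = max(u_bar - half_width, 0.0)
    flag = "OUT" if (u > ucl or u < lcl) else "ok"
    print(f"u={u:.3f}  LCL={lcl:.3f}  UCL={ucl:.3f}  {flag}")
```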
The discrete alert count was used for control charts instead of the rate of alerts because alert design differed depending on the specific triggering criteria: some alerts may fire only once per patient, while others may trigger many times for a patient. Since the CDS committee often relies on side-by-side comparison of numeric alert data, the dashboard was designed to compare information for the three previous months. A separate table with data on all alerts was made available as well. An overview of the dashboard design can be seen in Figure 1.

Figure 1. Overview of clinical decision support dashboard with statistical process control chart.
Each sheet on the dashboard included two visualizations for a given alert: 1) the SPC control chart, which showed a time-series record of monthly counts with two horizontal lines depicting the upper and lower thresholds for variation outside the normal range, and 2) a time-series chart that showed the absolute deviation from the 12-month mean. The design of the dashboard and the parameters chosen for this application of SPC to CDS were derived from the theory of Shewhart and Stapenhurst.14 The primary data collected for the dashboard was defined as the number of times an alert was raised in a given month, designated the "alert count". To identify which alerts activated an unusual number of times in a given month, each alert's count for that month was compared with its mean alert count over the previous 12 months. The difference between the month's alert count and the mean count was then divided by that alert's standard deviation to normalize the metric. This metric was called "sigmas away from mean", or simply "the deviation". For each of the three previous months, the dashboard listed the 20 alerts with the highest deviation for that month. Alerts could be explored for further information with a call-out box, which included the 12-month mean and standard deviation, as well as the number of standard deviations from the mean for each of the previous 3 months.
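A minimal pandas sketch of this ranking logic follows (the actual calculations were performed in the enterprise analytic warehouse and Tableau; the `monthly` schema here is a hypothetical stand-in).

```python
import pandas as pd

def deviation_table(monthly: pd.DataFrame, month: str, top_n: int = 20) -> pd.DataFrame:
    """Rank alerts by absolute "sigmas away from mean" for one month.

    `monthly` is assumed to have one row per (alert_id, month) with an
    `alert_count` column; the column names are hypothetical.
    """
    pivot = monthly.pivot(index="month", columns="alert_id", values="alert_count").sort_index()
    current = pivot.loc[month]
    baseline = pivot.loc[:month].iloc[-13:-1]  # the 12 months preceding `month`
    mean = baseline.mean()
    std = baseline.std()
    sigmas = (current - mean) / std  # "sigmas away from mean"
    return (
        pd.DataFrame({"mean_12m": mean, "std_12m": std, "sigmas": sigmas})
        .reindex(sigmas.abs().sort_values(ascending=False).index)
        .head(top_n)
    )
```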
To appropriately populate the dashboard with data, a dedicated IT analyst was assigned to investigate CDS alerts. Alert data was stored in the local enterprise analytic warehouse (EAW), which includes a database collection serving as a repository for all alert information. Since the primary focus of this dashboard was custom alert data, only those alerts were included from the EAW; built-in system alerts for drugs, allergies, and interactions were excluded. Custom alert data was then pre-processed to perform the necessary calculations (such as mean and standard deviation) and staged in Tableau for use. Tableau was used for visualization since it is the platform used throughout the organization and could perform all the necessary tasks of this project. No patient data was transferred from the EAW to the Tableau dashboard, so study investigators did not access any patient information related to alerts, and the study was exempt from institutional review board review. The IT analyst configured the dashboard to closely reflect the structure of the initial wireframes drawn up by the CDS committee. Once created, the dashboard was validated by comparing alert frequencies from the vendor's built-in tool with information extracted from the EAW, confirming that the dashboard reflected equivalent alert counts from both sources. After consensus on the alert count data was reached, the validated dashboard was moved from a testing domain to a production version and shared with all members of the CDS committee.
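As an illustration of the pre-processing step, the following pandas sketch aggregates raw alert firing events into monthly counts per alert while excluding built-in system alerts; the file, table, and column names are hypothetical stand-ins for the EAW schema, which in practice would be queried directly from the warehouse.

```python
import pandas as pd

# Hypothetical extract of raw alert firing events from the EAW.
events = pd.read_csv("alert_events.csv", parse_dates=["fired_at"])

# Keep only custom alerts; drop built-in system alerts (drugs, allergies, interactions).
custom = events[events["alert_type"] == "custom"]

# Aggregate to one row per alert per calendar month for the control charts.
monthly = (
    custom.assign(month=custom["fired_at"].dt.to_period("M").astype(str))
    .groupby(["alert_id", "month"], as_index=False)
    .size()
    .rename(columns={"size": "alert_count"})
)
```

The resulting `monthly` table is exactly the shape assumed by the `deviation_table` sketch above.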
Upon moving into the production domain, the primary outcome tracked was identification of abnormally functioning alerts through special cause variation during the testing period of June-August 2022. A secondary outcome was the ability to intervene and address the underlying cause of alert frequency deviation. When investigation into an alert was needed, stakeholders were engaged to review the clinical content and technical aspects of the alert to determine a possible root cause. A 3-month window was chosen since atypical alert variation was expected to occur frequently enough that detection of abnormalities would be readily apparent. Determination for broader rollout to an additional corpus of alerts was contingent upon demonstration of benefit during the 3-month testing period.
Results
While the health system is working to remove alerts that are not actively used or maintained, there are still over 170 custom alerts meant for review by some clinical role. Of those, 150 are intended for physicians, nurse practitioners, and physician assistants, and 20 are designed for pharmacy. These figures do not include system alerts for medications, background alerts not meant for provider engagement, or non-interruptive alerts, given the difficulty of tracking provider engagement with them. Within the first month of usage, several alerts were identified with deviations close to or beyond the upper and lower control limits, all of which were provider-facing and related to placing an order. The alerts with a high degree of variation included a recommendation to order kidney function testing prior to a CT scan with contrast, a suggestion to order pregnancy testing prior to a CT scan in women of child-bearing age, and a notification that group B strep testing had not been completed or documented for obstetrics (OB) triage patients in the labor and delivery unit. While additional alerts did exceed the threshold, we chose not to focus on them since they were custom designed to function in the background and did not reflect provider choice or behavior. In general, the ignored alerts were necessary for operational tasks such as facilitating single sign-on for ancillary imaging applications and providing easier access to print order requisitions. When alerts were deemed acceptable to ignore, no further investigation was performed. For provider-focused alerts with frequencies outside control limits, stakeholders were engaged for further discussion.
For the abnormalities related to CT scan alerts, the radiology department was contacted to discuss changes in alert frequency and possible causes. Investigation revealed that a practice recommendation had been made to drastically reduce the number of CT scans performed with contrast due to a nationwide contrast shortage. The alert suggesting kidney function tests prior to receiving contrast fired 170 times in June 2022, compared with a mean of 873.8 over the prior year, placing it 3.13 standard deviations below the mean. The dashboard depicting this precipitous drop in alert frequency can be seen in Figure 2. The other identified CT scan alert, suggesting a pregnancy test prior to a CT scan, saw a similar drop in frequency at 2.97 standard deviations below the mean, just shy of the lower control limit. While this alert did not involve contrast, there was an overall decrease in the number of CT scans ordered during this time, which affected all CT scan alerts regardless of contrast use. Thus, both abnormalities reflected recommendations for a change in practice due to nationwide supply chain shortages.

Figure 2. Clinical decision support dashboard with statistical process control chart demonstrating data from an alert recommending kidney function testing prior to a CT scan with contrast.
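For concreteness, the reported deviation implies a 12-month standard deviation of roughly 225 alerts per month for this alert (a back-calculation from the published figures, not a value reported in the source):

```latex
\[
\text{deviation} = \frac{c_{\text{month}} - \bar{c}_{12}}{\sigma_{12}}
  = \frac{170 - 873.8}{\sigma_{12}} = -3.13
  \quad\Longrightarrow\quad \sigma_{12} \approx 224.9
\]
```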
Investigation of the abnormal alert for group B strep test orders yielded similar findings. In July 2022, the alert fired only 51 times compared to the yearly average of 232.9 times, 3.03 standard deviations below the mean. The clinical content subject matter experts in obstetrics and gynecology were consulted to discuss possible reasons for this behavior. After review, it was identified that a change in the workflow and order sets used during intake in the labor and delivery unit had rendered the alert unnecessary. Based on these findings, the alert was scheduled for discontinuation since its evoke criteria would no longer be met and continued maintenance would be unnecessary.
Discussion
In summary, all of the identified alerts were appropriately recognized as potentially problematic due to sudden changes in alert frequency. SPC control charts on the dashboard allowed for rapid review of the alerts with the greatest deviation from their mean, and visualizations with control limit lines allowed for easy and effective determination of which alerts warranted further investigation. None of the typical causes of alert failure (build errors, code changes, accidental disabling) were identified for the three investigated alerts. While the alerts were determined to be functioning appropriately, recommendations for change in clinical practice ultimately explained the erratic alert behavior. The influence of these recommendations was classified as special cause variation, as it was the result of external influence on normal functioning.
During the studied implementation timeframe of June-August 2022, the dashboard did not identify any abnormal alerts that needed to be fixed, but it functioned exactly as intended. The primary goal of the dashboard was to identify abnormally functioning alerts and, secondarily, to direct efficient review of alerts that would otherwise have been overlooked. EHR vendor-provided services can show information about a single alert, but they offer no tools for quickly understanding the significance of what is presented. The SPC dashboard automatically displayed the alerts with the greatest deviation from their mean, and the control chart visualizations allowed for review within a matter of seconds. Within the first 2 months, the three aforementioned alerts had already been identified for further investigation. While these alerts did not need to be corrected, the ability to recognize and intervene within a short period could have a profound impact on clinical care. All of these alerts would have been overlooked had this dashboard not detected them. Since completion of the studied implementation window, alerts with special cause variation have been identified that did require intervention after further investigation. In one instance, stakeholder engagement identified that an uptick in alert volume was due to data not migrating appropriately from echocardiogram imaging equipment to the EHR.
As defined in a 2007 roadmap for action on CDS, the goal of enhancing health and healthcare through CDS relies on effective use and continuous improvement of CDS methods.15 To achieve this goal, current methods for detecting CDS malfunctions need to be improved. SPC charts provide just one method of doing so, and they proved useful shortly after implementation. Numerous best practices exist for testing CDS alerts, including regression testing following major code changes, comparison of alert data stored in data warehouses against expected alert frequency, and validation of an alert after moving from a non-production to a production environment.16 We agree that these are essential, but current vendor solutions are often not capable of achieving these tasks, and manual in-house review is cumbersome and time-intensive. Incorporating review of a dashboard with control charts into routine CDS meetings has the potential to make CDS safer by highlighting when clinical alerts are malfunctioning. While detection of erratic alert behavior does not by itself have a direct impact on clinical safety, identifying, investigating, and addressing the root cause of atypical alert behavior promotes an overall safer and more trustworthy system. Many alert reviews are currently performed post hoc, after a safety event has occurred because an unsafe action was conducted without warning. Regular review of control charts could identify these abnormally functioning alerts and restore desired functionality before they result in a safety event.
As public and private entities emphasize the importance of sharing specifications for CDS alerts, better systems are needed for identifying when these alerts behave abnormally. The electronic clinical quality improvement (eCQI) resource center within the US Department of Health & Human Services supports CDS sharing initiatives and the standards needed to correctly transfer this information, namely Health Level Seven International (HL7), but no such standards exist for CDS monitoring.17 With the vast majority of EHRs implementing some form of CDS functionality, alert monitoring needs to become a priority. Investigations have assessed the processes required for implementing practice recommendations through CDS and the feasibility of doing so, but these do not address what is needed for ongoing monitoring and analysis.18 Collaboration between institutions on specifications for CDS alerts will help all involved parties, and this collaboration needs to expand to sharing methods for maintaining and monitoring alerts after implementation.
Since there is no literature on monitoring CDS with SPC, we used the traditionally defined three-standard-deviation control limits as the threshold at which an alert change was potentially significant and warranted further investigation. There was no clinical backing for this decision, so the actual threshold at which alert frequency changes become significant will require additional examination. Alternative SPC designs worth considering include different chart types, rates of alerts rather than absolute counts, different thresholds for control lines, and methods for handling data pre- and post-intervention. Automatic interpretation of SPC charts could also be performed with built-in logic to explain shifts, trends, and runs in the data without user input.13 While SPC was the method chosen for monitoring CDS in our study, other methods of performing this analysis exist. This work would benefit from machine learning or artificial intelligence to remove the manual aspects of SPC; if allowed to process alert data in real time with a direct connection to data warehouses, health systems could be notified of potentially abnormal alert behavior even more quickly. Lastly, alert frequency was the primary focus of our SPC chart dashboard, though the approach could be applied to alert override percentages or data specific to clinical role and specialty.
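As a sketch of what such built-in interpretation logic might look like, the following Python functions implement two common run rules in the style of the widely used Western Electric/Nelson rules (a sustained shift and a monotonic trend); the exact rule set and window lengths would need to be tuned for CDS data.

```python
from typing import List

def detect_shift(values: List[float], center: float, run: int = 8) -> bool:
    """Flag a shift: `run` consecutive points on the same side of the center line."""
    side = [v > center for v in values if v != center]
    for i in range(len(side) - run + 1):
        window = side[i : i + run]
        if all(window) or not any(window):
            return True
    return False

def detect_trend(values: List[float], run: int = 6) -> bool:
    """Flag a trend: `run` consecutive strictly increasing or decreasing points."""
    for i in range(len(values) - run + 1):
        window = values[i : i + run]
        diffs = [b - a for a, b in zip(window, window[1:])]
        if all(d > 0 for d in diffs) or all(d < 0 for d in diffs):
            return True
    return False
```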
Limitations
This study was conducted post-implementation of the dashboard and was limited to only 3 months. Future studies would benefit from a longer duration with more active adjustment of statistical significance thresholds. All alerts investigated during our studied timeframe were found to be due to recommendations for change in practice, so the full potential of the visualizations, from identification of an alert through code, process, or system revision, was not demonstrated. Since no prior work has applied control charts to CDS, the appropriate thresholds at which a change should be considered significant have not previously been determined. Only three alerts met criteria for being provider-facing and exceeding the control limits, so adjustment of the control thresholds may be necessary. Lastly, only total alert counts were studied in this dashboard; focusing on other metrics such as alert firing rate, alert override percentages, and alert engagement may also yield useful information.
Conclusion
While a substantial amount of research has characterized types of CDS alert malfunctions and provided recommendations for their prevention, very few solutions have been proposed for detecting abnormalities once they have occurred.3 SPC charting is a unique strategy to apply to CDS and has proven effective in other healthcare tasks.19 Within the first 2 months of use, the CDS committee had already used the dashboard to identify and investigate several alerts with drastic shifts in alert frequency. Review of these alerts found that, in each circumstance, recommendations for change in practice created the external influence that resulted in special cause variation. All of these alerts would have been overlooked had the dashboard not clearly depicted the atypical behavior. SPC is only one of many potential methods for addressing the gap in alert monitoring, and future research can advance this method with more specific thresholds, automatic interpretation, and new approaches such as machine learning.
