The Positive Power of Big Data

Author: Philip Loya –  Sr Solution Strategist, Australia IP Strategy, Cerner and Board Director, MSIA

It seems like every day we’re bombarded with another news story about a hack or data breach.  Many Healthcare IT companies have worked especially hard to protect patient data, given how sensitive that data can be.  There have also been articles in the press suggesting that de-identified data could be susceptible to re-identification, given the large data sets and computing power that exist today.  As more and more data gets stored on public and private clouds, a portion of the populace inevitably questions whether there is a benefit to sharing data, given the risk that the data might be hacked or re-identified, and what that might mean for them.

There is no question that each vendor and organisation must do everything possible to avoid data breaches, and there are a number of views on how data should be de-identified, all of which have pros and cons.  However, this blog post isn’t about those issues.  Instead, it covers the real-life benefits that can come from supporting and using Big Data.

Sepsis is often described as, “Your body’s overwhelming and life-threatening response to infection that can lead to tissue damage, organ failure, and death.”  According to the Australian Sepsis Network, over 15,000 new patients are diagnosed with sepsis each year, with more than 3000 patients dying annually because of sepsis.  These figures are larger than the annual national road death toll, and sepsis causes more deaths than some very common cancers (breast, prostate, etc.).  With an average estimated cost per episode of AUD$39,300, the overall cost of treating patients with sepsis is significant.  Unfortunately, the symptoms of sepsis can be quite generic, so other conditions are often considered first before the patient is treated for sepsis.  While research has shown that intervening earlier in a patient who is developing or has developed sepsis leads to a lower mortality rate and lower overall cost, working out which patients should be flagged for sepsis has been extremely difficult.  Over the years, many doctors and researchers across the world have worked on algorithms to identify patients who have developed the early symptoms of sepsis or are at risk of sepsis.  Many of the published algorithms required clinicians to “connect the dots”, identifying trends in real time and weighing qualitative results, without providing tools to let this happen, especially in cases where historical or cross-encounter results were required.  Relying on these original algorithms could delay the identification of patients at risk of sepsis, leading to delays in treatment, higher mortality and higher overall costs to the health system.

Hospitals have been looking for ways to automate sepsis alerting.  One such model, the St. John’s Sepsis Algorithm, has been created at Cerner.  (Disclaimer:  I work for Cerner.)  The algorithm originally started by automating published research on identifying patients at risk of sepsis.  These initial tools utilised the data sitting in the EMR, along with qualitative data, to spot trends which might suggest a deteriorating patient.  Experience with early partners showed that the alert displayed too often, yet still allowed some patients to slip through the cracks due to missing data, differences in assessing the qualitative data points, etc.  As a result, Cerner turned to Big Data to see if a better model could be created.  Using a large database of de-identified patients, algorithms could be run to identify the key data points, such as vital signs, pathology results and other details, that had the most predictive power for patients who were diagnosed with sepsis as part of their discharge record.  Models could then be created and run against the larger database of patients to rapidly determine whether the model had a better prediction rate than current tools while minimising false-positive and false-negative rates.  It took many iterations to come up with the best possible model, which could then be implemented at hospitals to see if it translated to real-life practice.  In fact, Cerner has gone through 15 versions to land on the currently available “best” model, but fully expects that additional versions may need to be created as additional data and/or feedback becomes available.
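To make the idea of testing a candidate model against historical records a little more concrete, here is a minimal sketch in Python.  It is not the St. John’s Sepsis Algorithm and says nothing about how Cerner built or scored its models.  The `Record` type, the `risk_score` field, the thresholds and the ten toy records are all invented for illustration.  It simply shows how false-positive and false-negative rates can be counted for an alert threshold when each de-identified record carries a known outcome.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One de-identified patient record (hypothetical structure)."""
    risk_score: float  # output of some candidate model, between 0 and 1
    had_sepsis: bool   # whether sepsis appears in the discharge record


def error_rates(records, threshold):
    """Return (false_positive_rate, false_negative_rate) when alerting
    on every record whose risk_score is at or above the threshold."""
    false_pos = sum(1 for r in records
                    if r.risk_score >= threshold and not r.had_sepsis)
    false_neg = sum(1 for r in records
                    if r.risk_score < threshold and r.had_sepsis)
    negatives = sum(1 for r in records if not r.had_sepsis)
    positives = sum(1 for r in records if r.had_sepsis)
    return false_pos / negatives, false_neg / positives


# Made-up toy data: 4 patients with sepsis, 6 without.
records = [
    Record(0.9, True), Record(0.7, True), Record(0.4, True), Record(0.2, True),
    Record(0.8, False), Record(0.5, False), Record(0.3, False),
    Record(0.1, False), Record(0.1, False), Record(0.05, False),
]

# Compare two candidate alert thresholds on the same historical data.
for threshold in (0.3, 0.6):
    fpr, fnr = error_rates(records, threshold)
    print(f"threshold={threshold}: "
          f"false-positive rate={fpr:.2f}, false-negative rate={fnr:.2f}")
```

On this toy data the lower threshold alerts more often and misses fewer septic patients, while the higher threshold does the reverse.  That is the same tension between alerts that display too often and patients who slip through the cracks.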

Through work with partner hospitals, the current model has now been used in hundreds of hospitals around the world, including sites in Australia.  One hospital in the US, New York Methodist, established a baseline and then did a post-implementation review after implementing the sepsis algorithm.  Pre-implementation, the sepsis-related mortality was 30% with an average length of stay of 18.79 days.  Post-implementation, the sepsis-related mortality dropped to 23% with an average length of stay of 14.11 days due to earlier notification of septic patients.  This translates to a 23% relative reduction in sepsis-related mortality and a 25% reduction in length of stay.  If Australian hospitals were to get similar results, it would be the equivalent of saving approximately 690 Australians a year (23% of the roughly 3000 annual sepsis deaths).  This would be roughly equivalent to achieving a 50% reduction in road toll deaths!
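For readers who want to check the arithmetic, the short Python sketch below reproduces the percentages from the figures quoted above.  The variable names are mine, and the figure of 3000 annual deaths is the rounded “more than 3000” from the Australian Sepsis Network, so treat the result as a back-of-the-envelope estimate rather than a forecast.

```python
# Figures quoted in this post.
mortality_before = 0.30         # sepsis-related mortality, pre-implementation
mortality_after = 0.23          # sepsis-related mortality, post-implementation
los_before_days = 18.79         # average length of stay, pre-implementation
los_after_days = 14.11          # average length of stay, post-implementation
annual_sepsis_deaths_au = 3000  # rounded figure, an assumption for this estimate


def relative_reduction(before, after):
    """Fraction by which a value fell, relative to where it started."""
    return (before - after) / before


mortality_reduction = relative_reduction(mortality_before, mortality_after)
los_reduction = relative_reduction(los_before_days, los_after_days)

# Apply the rounded 23% reduction to the annual sepsis deaths.
lives_saved = annual_sepsis_deaths_au * round(mortality_reduction, 2)

print(f"Relative reduction in mortality: {mortality_reduction:.1%}")
print(f"Reduction in length of stay:     {los_reduction:.1%}")
print(f"Estimated lives saved per year:  {lives_saved:.0f}")
```

Note that the 23% is a relative reduction (7 percentage points out of 30), which happens to be the same number as the post-implementation mortality rate.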

The use of Big Data to create these types of models allows new information to be generated much faster than previous research methods allowed.  Doing retrospective chart reviews to identify baselines and data points for review takes too much time and is prone to human error.  Utilising Big Data gives biostatisticians the ability to create and test models in a relatively quick timeframe, which in turn gets this information into the electronic record, and in front of the clinician who can actually do something about it, much faster!

On a personal note, I had a family member admitted to hospital in the past year with the flu.  Upon admission, her record was flagged for possible sepsis by the algorithm, and early intervention was started to prevent sepsis.  It gave me great comfort to know that algorithms like this one were working to help my loved ones in their time of need, and it serves as a reminder of the value in sharing data, in an appropriate manner, with the larger health community.