‘Death is not the answer’: the challenge of measuring the impact of early warning systems
  1. Susan M Chapman1,2,
  2. Jo Wray3,
  3. Kate Oulton3,
  4. Mark J Peters2,4
  1. 1 International and Private Patients Division, Great Ormond Street Hospital for Children NHS Foundation Trust, London, UK
  2. 2 Anaesthesia, Critical Care and Respiratory Unit, Infection, Immunity, and Inflammation Programme, UCL Great Ormond Street Institute of Child Health, London, UK
  3. 3 Centre for Outcomes and Experience Research in Children’s Health, Illness and Disability (ORCHID), Great Ormond Street Hospital for Children NHS Foundation Trust, London, UK
  4. 4 Paediatric and Neonatal Intensive Care Unit, Great Ormond Street Hospital for Children NHS Foundation Trust, London, UK
  1. Correspondence to Dr Susan M Chapman, International and Private Patients Division, Great Ormond Street Hospital for Children, London WC1N 3JH, UK; Sue.Chapman{at}

We can all remember individual children in whom a deterioration went unrecognised. Sometimes fatally. Our defences were little more than the pearls offered by senior colleagues of grave warning signs: ‘beware grunting in an infant’ or ‘watch out for a tachycardia after the temperature has fallen’. But this advice was unstructured, and children are so different, and their comorbidities so broad, we failed some of them. Paediatric Early Warning Systems (PEWS) are serious attempts to reduce the unacceptable and dangerous variability in this recognition and response process. Scoring systems should provide age-appropriate thresholds for concern for single parameters or aggregated abnormal physiology and prompt standardised responses. The idea has such natural appeal that PEWS use was soon advocated by a number of national bodies1 2 without evidence. This may have been a mistake. Many of the scores in widespread use were not calibrated or validated. When formally assessed, most had poor predictive performance.3 This is not a trivial problem because staff may choose not to raise an alarm in the absence of a raised score or may choose to ignore a score ‘because it never works for him/her’.

Other than optimism, the main reason for the lack of evidence was the low event rates of critical deterioration or death within individual centres. An adequately powered trial was therefore a huge challenge. Fortunately, Parshuram and colleagues took on this challenge with the study ‘Effect of a PEWS on All-Cause Mortality in Hospitalised Paediatric Patients—EPOCH’.4 This trial has many strengths. It assessed a strong candidate for a score: the BedsidePEWS. The BedsidePEWS is notable for having significant prior validation.5 6 It was one of the highest performing of 18 scores in head-to-head comparisons.3 Further, it addressed a relevant and very large population: 144 539 patients in 21 hospitals in seven countries. The intervention was well designed and formally implemented. The validated severity of illness score was implemented alongside an education programme with interprofessionally designed documentation and structured escalation/de-escalation care recommendations. The cluster-randomised design was efficient and the only feasible approach to avoid contamination. The outcome measures were relevant to patients and families: the primary being all-cause hospital mortality. Secondary measures included occurrence of a significant clinical deterioration event, a composite measure felt to reflect late admission to intensive care (table 1). Ten hospitals were randomised to the BedsidePEWS intervention. The remaining 11 hospitals did not have a PEWS, although four did have a rapid response team in situ.

Table 1. Significant clinical deterioration event component outcomes

Despite all these strengths, EPOCH was a ‘negative trial’. Implementing the BedsidePEWS did not significantly decrease all-cause mortality. Of the 22 secondary outcome measures, the only positive finding was a reduction in significant clinical deterioration events (relative risk 0.77, 95% CI 0.61 to 0.97; p=0.03).
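
As a rough consistency check on the reported secondary outcome, a p value can be back-calculated from a ratio estimate and its confidence interval. The sketch below assumes the interval is symmetric on the log scale and uses a normal approximation. This is a generic textbook approach and not necessarily the model used by the EPOCH investigators, and the function name `p_from_ratio_ci` is illustrative only.

```python
from math import log
from statistics import NormalDist


def p_from_ratio_ci(estimate, lower, upper, level=0.95):
    """Approximate two-sided p value from a ratio estimate and its CI.

    Assumes the CI was built as exp(log(estimate) +/- z * SE), i.e. it is
    symmetric on the log scale (an assumption, not stated in the trial report).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - (1 - level) / 2)
    # Standard error of log(ratio), recovered from the width of the CI.
    se_log = (log(upper) - log(lower)) / (2 * z_crit)
    z = log(estimate) / se_log
    return 2 * (1 - std_normal.cdf(abs(z)))


# Figures quoted in the text: relative risk 0.77, 95% CI 0.61 to 0.97.
p_value = p_from_ratio_ci(0.77, 0.61, 0.97)
print(f"approximate two-sided p = {p_value:.3f}")
```

Under these assumptions the back-calculated value rounds to the reported p=0.03, so the quoted interval and p value are mutually consistent.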

So what is to be made of this? Should governments and organisations no longer recommend PEWS? Should hospitals consider removing a PEWS which is already in place or not go forward with implementing a PEWS in the future?

If we look a little more closely at the main outcome measure of all-cause hospital mortality, we can see that this presents some challenges, particularly as death in childhood remains, thankfully, a relatively rare event. The power calculations were based on 2007–2009 data with baseline all-cause hospital mortality of 5.1 deaths per 1000 hospital discharges and anticipated a reduction of 1/1000 hospital discharges. However, there has been a general reduction in mortality over time7 which invalidated these power calculations. The trial observed fewer than two deaths per 1000 hospital discharges in both groups and roughly half of these deaths were in the context of ‘do not attempt resuscitation’ (DNAR) orders. Exclusion of children with a DNAR order did not alter the findings. The study was therefore underpowered despite the 21 hospitals and 144 539 patients. Assuming mortality fell no further, detecting a 10% relative risk reduction (from 2/1000 to 1.8/1000) in in-hospital deaths in the absence of a DNAR order with 90% power would require a study of >2 million individually randomised paediatric admissions. And individual randomisation is not feasible for an intervention such as this. The necessary cluster randomisation would require many hospitals and a significantly higher number of patients.8 Also, children die for different reasons than they did 10–15 years ago.9 Severe or multiple comorbidities are the dominant causes of in-hospital death and the contribution from severely deranged acute physiology is less than it ever was.10 So, a potential effect size of 10% may be unrealistic. With a smaller true effect, the required sample size increases into many millions.
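
To make the scale of the sample size problem concrete, the sketch below applies the standard normal-approximation formula for comparing two proportions to the figures quoted above (2/1000 versus 1.8/1000, 90% power). The two-sided 5% significance level is an assumption, as the method behind the quoted figure is not stated here. The cluster adjustment uses the usual design effect, 1 + (m − 1) × ICC, with a purely hypothetical intracluster correlation; the cluster size is simply 144 539 patients divided by 21 hospitals.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_control, p_treated, alpha=0.05, power=0.90):
    """Patients per arm for a two-sided test of two proportions
    (normal approximation, unpooled variance, no continuity correction)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


def design_effect(mean_cluster_size, icc):
    """Inflation factor for cluster randomisation: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


# Mortality of 2/1000 reduced by 10% to 1.8/1000, as in the text.
per_arm = n_per_arm(0.002, 0.0018)
total_individual = 2 * per_arm

# Average hospital size in EPOCH, derived from the figures in the text.
mean_cluster_size = round(144_539 / 21)
# HYPOTHETICAL intracluster correlation, chosen only for illustration.
hypothetical_icc = 0.0001
total_cluster = ceil(
    total_individual * design_effect(mean_cluster_size, hypothetical_icc)
)

print(f"individually randomised total: {total_individual:,}")
print(f"cluster randomised total (hypothetical ICC): {total_cluster:,}")
```

With these inputs the individually randomised total comes out at about 2 million, the same order as the figure quoted above; more conservative variants of the formula, such as one with a continuity correction, give somewhat larger numbers. Even a very small assumed intracluster correlation inflates the requirement substantially when each cluster is a whole hospital.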

While no one would argue that mortality is not important, is it the most appropriate outcome to assess PEWS efficacy? Some cases remain resistant to intervention and death will occur regardless of whether a PEWS is truly effective or not. If mortality is not the optimal outcome, what are the alternatives? The EPOCH team examined potentially preventable cardiac arrests. This has significant appeal but has been challenged as subjective. Initial efforts were hampered by low levels of reviewer agreement, prompting amendments to the previously agreed protocol.

Ensuring rapid intensive care access for those who would benefit from it may be a better marker, given the purpose of PEWS. Within EPOCH, the authors used significant clinical deterioration events as a marker of late transfers but closer examination reveals components associated with critical illness that is well established. The thresholds may be inappropriately high and children deemed to be a ‘timely’ transfer may in fact have been inappropriately classified. This may have masked the true effect of PEWS. Some of these critical events, such as cardiopulmonary resuscitation, tracheal intubation and death occurring outside or immediately on arrival to intensive care, may indicate a lost opportunity for preventative action.

As well as challenges relating to the appropriateness of the outcome measures, the EPOCH authors also faced the complexities of conducting a randomised trial in a real-world setting, where organisational culture and human interaction are pivotal. The successful deployment of a PEWS is dependent on the complex interplay between multiple factors such as leadership, culture, teamwork, nurse and family empowerment, safe staffing levels, effective communication and continuity of care,11 which take time and effort to be embedded in practice. Data collection for the EPOCH trial may not have continued long enough, following the introduction of PEWS, for the long-term benefits (or harms) to be identified.

While detecting and managing deterioration in children is complex, the need for systems to deal with avoidable deterioration is not disputed. For this, we need to agree robust, valid and clinically meaningful outcomes to evaluate patient safety efforts, no matter how challenging that might be. The BedsidePEWS is better validated than other systems, performs very well in head-to-head comparisons and with EPOCH has clear 1A evidence that it is not harmful. Determining a mortality effect may have been unrealistic given the many steps between deterioration and death, but we should take serious note of the significant critical deterioration signal. The EPOCH trial remains a valuable study, and no other PEWS has anything approaching this level of evidence to support it. In evaluating PEWS, death is not the answer.


  • Contributors SMC drafted the initial manuscript. MJP undertook the statistical analysis. All authors contributed to the writing and revision of the manuscript.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Provenance and peer review Commissioned; internally peer reviewed.
