Objective To test the predictability of the Melbourne criteria for activation of the medical emergency team (MET) to identify children at risk of developing critical illness.
Design Cohort study.
Setting Admissions to all paediatric wards at the University Hospital of Wales.
Outcome measures Paediatric high dependency unit admission, paediatric intensive care unit admission and death.
Results Data were collected on 1000 patients. A single abnormal observation determined by the Melbourne Activation Criteria (MAC) had a sensitivity of 68.3% (95% CI 57.7 to 77.3), specificity 83.2% (95% CI 83.1 to 83.2), positive predictive value (PPV) 3.6% (95% CI 3.0 to 4.0) and negative predictive value 99.7% (95% CI 99.5 to 99.8) for an adverse outcome. Seven of the 16 children (43.8%) would not have transgressed the MAC prior to the adverse outcomes. Four hundred and sixty-nine of the 984 children (47.7%) who did not have an adverse outcome would have transgressed the MAC at least once during the admission.
Conclusion The MAC has a low PPV and its full implementation would result in a large number of false positive triggers. Further research is required to determine the relative contribution of the components of this complex intervention (Paediatric Early Warning System, education and MET) on patient outcome.
Paediatric Early Warning Systems (PEWS) have gained popularity and widespread use in clinical settings following claims that they improve hospital-wide mortality.1 2 There are two broad types of paediatric early warning tools: trigger scores and early warning scores.3 The indicators used in all types of tools include physiological parameters, such as heart rate, respiratory rate and blood pressure; clinical signs, such as respiratory distress; therapeutic interventions, such as oxygen therapy; and diagnostic criteria, such as seizures. Trigger systems are either single parameter, where one or more abnormal indicators trigger the tool, or multiple parameter, triggered by two or more abnormal indicators. Early warning scores are a composite collection of indicators in which increasing deviation from normal accrues an increasing aggregate score, and a call for assistance is made when a particular threshold score has been reached.3 A systematic review of paediatric alert criteria by Chapman et al4 suggested that the potential of these criteria to aid early identification of those at risk of critical deterioration, and thereby improve outcome, had not been demonstrated. Our initial study,5 judged by Chapman et al4 to be one of only three published studies with appropriate methodology to study the predictability of alert criteria, also suggested that further validation studies were required before these trigger tools were widely implemented. The main issue in our original study,5 which examined the predictability of the Cardiff and Vale Paediatric Early Warning System (C&VPEWS), was the low PPV of the trigger criteria and the large number of false positive triggers. When it operated as a trigger activated by a single abnormal parameter,3 the C&VPEWS was sensitive but had low specificity; as a result the PPV was very low and most activations were false positives. Operating the C&VPEWS as a multiple parameter trigger score3 only marginally improved its performance.
The optimum score cut off of two had a sensitivity of 70% and PPV of 6%. It was not possible to reconcile the difference between the high number of children triggering the C&VPEWS and the low numbers in the four systems evaluated in before and after studies1 2 6–8 without the data to validate their activation criteria. The aim of this study was to use the data obtained from the prospective cohort study that tested the predictability of the C&VPEWS to test the predictability of the Melbourne criteria for activation of a Medical Emergency Team (MET) as described by Tibballs and Kinney.6
What is already known on this topic
▶ Although it has been recommended that Paediatric Early Warning Systems (PEWS) are implemented in UK hospitals, no randomised controlled trials have been performed in a paediatric population and the results from observational studies are inconsistent.
▶ The performance characteristics of activation criteria for PEWS have not been fully established.
▶ Activation criteria are part of a complex intervention designed to reduce paediatric mortality; this complex intervention has not been clearly defined or consistently implemented in the field.
What this study adds
▶ This cohort study demonstrated that the Melbourne Activation Criteria had reasonable sensitivity, but at the cost of low specificity and positive predictive value.
▶ The physiological parameters used for the MAC were more extreme than those in the C&VPEWS, but despite the use of parameters that were well outside what most clinicians would feel was normal they were frequently observed in children who did not have an adverse outcome.
▶ If this complex intervention does reduce mortality, the relative contributions of education, PEWS and Medical Emergency Teams to clinical effectiveness are unknown.
Paediatric (age 0–16 years) admissions to any of the paediatric wards at the University Hospital of Wales were eligible for inclusion in the study. Patients admitted directly to the paediatric intensive care unit (PICU) or the paediatric high dependency unit (PHDU) and those presenting in cardiac or respiratory arrest were excluded. The University Hospital of Wales is a tertiary centre for paediatric care with 50 medical, 34 surgical, 16 oncology, 7 PICU, 4 PHDU, 4 cardiac and 4 renal beds. Data were collected on patients admitted in the 12-month period between 1 December 2005 and 30 November 2006.
The method of data collection is described elsewhere.5 The outcome measures defining an adverse outcome were PHDU admission, PICU admission and death. Data were available from our original study to provide a measure of all nine of the Melbourne Activation Criteria (MAC) required to trigger the MET6 (table 1). Identical measurements were available for six of the nine MAC: nurse or doctor worried about clinical state, airway threat, tachypnoea, tachycardia or bradycardia, hypotension, and cardiac or respiratory arrest. The nurse or doctor worried about clinical state criterion was prospectively recorded as yes or no on the observation chart. Data on pre-existing diagnosis of cyanotic heart disease had not been collected and therefore the hypoxaemia criterion was positive for all patients if the SpO2 was less than 90% in air or any amount of oxygen. The severe respiratory distress, apnoea or cyanosis criterion was positive if there were signs of respiratory distress as per the Advanced Paediatric Life Support9 guidelines on the recognition of potential respiratory distress. These include recessions, inspiratory or expiratory noises, grunting, accessory muscle use and nasal flaring. The acute change in neurological status or convulsion criterion was positive if the conscious level was reduced.
Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were calculated to determine if any single positive MAC in a set of observations collected at a single point in time predicted an adverse outcome. These performance characteristics were also calculated for each of the nine individual MAC. Although the MAC was designed as a single parameter trigger, the National Institute for Health and Clinical Excellence recommends the use of multiple parameter or aggregate scoring systems in adults10 and therefore the MAC was also evaluated as a multiple parameter tool. A MAC score was calculated for each set of observations by summing all individual positive MAC, to produce a MAC score between 0 and 9. Receiver operating characteristic (ROC) analysis was performed to study the ability of the MAC score to discriminate between children who went on to develop an adverse outcome and those who did not. The ROC curve was used to identify the cut off value of the MAC score that maximises the sum of sensitivity and specificity. The area under the curve (AUC) was used as a measure of the overall performance of the ROC curve as it reflects, in this case, the probability that the MAC score will correctly classify children who develop critical illness and those who do not. AUC can take values between 0 and 1, where 1 is a perfect test and 0.5 is a test equal to chance.11 For the purpose of the analysis any missing criteria were assumed to be normal. Data were analysed using Stata 10.0.12
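The scoring and cut off selection described above can be sketched as follows. This is a minimal illustration in Python, not the Stata analysis that was actually run; the function names and the toy observation sets are hypothetical and carry no study data. A missing criterion is treated as normal, in line with the assumption stated above.

```python
from typing import List, Optional, Sequence, Tuple


def mac_score(criteria: Sequence[Optional[bool]]) -> int:
    """Sum the positive criteria; None (missing) is counted as normal."""
    return sum(1 for c in criteria if c is True)


def sens_spec(scores: Sequence[int], outcomes: Sequence[bool],
              cutoff: int) -> Tuple[float, float]:
    """Sensitivity and specificity when 'score >= cutoff' is a positive test."""
    pairs = list(zip(scores, outcomes))
    tp = sum(1 for s, o in pairs if s >= cutoff and o)
    fn = sum(1 for s, o in pairs if s < cutoff and o)
    tn = sum(1 for s, o in pairs if s < cutoff and not o)
    fp = sum(1 for s, o in pairs if s >= cutoff and not o)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores: Sequence[int], outcomes: Sequence[bool],
                max_score: int = 9) -> int:
    """Cut off (1..max_score) that maximises sensitivity + specificity."""
    return max(range(1, max_score + 1),
               key=lambda c: sum(sens_spec(scores, outcomes, c)))


# Hypothetical toy data: (nine criteria, adverse outcome followed?)
toy_observations: List[Tuple[List[Optional[bool]], bool]] = [
    ([False] * 9, False),
    ([True] + [False] * 8, False),
    ([None] + [False] * 8, False),      # missing criterion counted as normal
    ([False] * 9, False),
    ([True, True] + [False] * 7, True),
    ([True, None] + [False] * 7, True),
    ([False] * 9, True),
]
toy_scores = [mac_score(criteria) for criteria, _ in toy_observations]
toy_outcomes = [outcome for _, outcome in toy_observations]
toy_best = best_cutoff(toy_scores, toy_outcomes)
```

With these toy values a cut off of 1 gives the largest sum of sensitivity and specificity, which is the same selection rule applied to the real MAC scores.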
All patient identifiers were removed from the data set prior to analysis. The original study5 was approved by the Trust Research and Development Committee and ethical approval was granted by the South East Wales Local Research Ethics Committee.
Data were available on 1000 patients on whom 9075 sets of observations were performed. The patients were aged 0 months to 16 years; mean age 44 months, SD 58 months, median age 18 months. Sixteen children had an adverse outcome: 13 were admitted from the ward to the high dependency unit (HDU) (4 of these were subsequently transferred from HDU to PICU) and 3 were admitted from the ward to PICU. There were no deaths. Seven of the 16 children (43.8%) would not have transgressed the MAC prior to the adverse outcomes. Four hundred and sixty-nine of the 984 children (47.7%) who did not have an adverse outcome would have transgressed the MAC at least once during the admission (table 2).
There were 8993 observations between admission and discharge in 984 patients without an adverse outcome and 82 observations between admission and the adverse outcome in the 16 patients with adverse outcomes (table 3).
If the MAC is used as designed as a single parameter system it would have a sensitivity of 68.3% (95% CI 57.7 to 77.3), specificity 83.2% (95% CI 83.1 to 83.2), PPV 3.6% (95% CI 3.0 to 4.0) and NPV 99.7% (95% CI 99.5 to 99.8) for predicting PHDU admission, PICU admission or death. Sensitivity of individual MAC ranged from 0% to 45.1% (table 4).
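The low PPV follows directly from the rarity of adverse outcomes. As a rough check, assuming only the figures reported above (sensitivity 68.3%, specificity 83.2%, and 82 of the 9075 observation sets preceding an adverse outcome), the predictive values can be recovered from Bayes' rule. The function name is hypothetical and the calculation is a sketch, not the original analysis.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple:
    """Return (PPV, NPV) from sensitivity, specificity and prevalence."""
    tp = sensitivity * prevalence              # true positive fraction
    fn = (1 - sensitivity) * prevalence        # false negative fraction
    tn = specificity * (1 - prevalence)        # true negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false positive fraction
    return tp / (tp + fp), tn / (tn + fn)


# Observation-level prevalence: sets preceding an adverse outcome / all sets
prevalence = 82 / 9075
ppv, npv = predictive_values(0.683, 0.832, prevalence)
print(f"PPV {ppv:.1%}, NPV {npv:.1%}")  # PPV 3.6%, NPV 99.7%
```

Because fewer than 1% of observation sets preceded an adverse outcome, even a specificity above 80% leaves false positives outnumbering true positives many times over.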
ROC analysis for the identification of adverse outcomes from their MAC score demonstrated acceptable performance of the system (AUC = 0.79 (95% CI 0.73 to 0.84)) (figure 1). The MAC score cut off that maximises the sum of sensitivity and specificity was derived from the ROC analysis (table 5).
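One standard way to read an AUC is as the probability that a randomly chosen observation set preceding an adverse outcome has a higher score than a randomly chosen set that did not, with ties counted as half. The sketch below illustrates that rank-based reading on hypothetical toy scores; it does not reproduce the study data or the Stata routine used.

```python
from typing import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Rank-based AUC: P(positive score > negative score), ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores for sets with and without a subsequent adverse outcome
toy_auc = auc([2, 1, 0], [0, 1, 0, 0])
print(toy_auc)  # 0.75
```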
A score of 1 maximises the sum of sensitivity and specificity, demonstrating that the MAC works best, as designed, as a single parameter tool.
PEWS aim to identify children at risk of sudden deterioration and, by activating a MET (or its equivalent, eg, rapid response team), to ensure that early action is taken to reduce the risk of death or serious morbidity. The performance characteristics of the MAC were comparable to those of the three previously published studies considered to be of adequate quality by Chapman et al.4 For the MAC, C&VPEWS,5 modified Brighton13 and Birmingham14 tools, respectively, sensitivity was 68%, 89%, 90% and 78%; specificity 83%, 64%, 74% and 95%; PPV 3.6%, 2.2%, 5.8% and 4.2%; NPV 99.7%, 99.8%, 99.8% and not available; and AUC (95% CI) 0.79 (0.73 to 0.84), 0.86 (0.82 to 0.91), 0.89 (0.84 to 0.94) and 0.9 (0.83 to 1.0).
There are intrinsic limitations of using data that were collected to evaluate another PEWS to validate the MAC. Data were available that provided identical measurement of six of the nine MAC (table 1). The MAC hypoxaemia indicator was therapeutic as it measured oxygen saturation after the administration of oxygen. The data available measured oxygen saturation in air or any amount of oxygen. An observation that transgressed the hypoxaemia criteria might not have done so if oxygen had been administered in this study. However, observations that did not transgress the hypoxaemia MAC would not be affected by the administration of oxygen. This minor difference could reduce our measurement of the sensitivity or increase the specificity of the MAC. The hypoxaemia criteria were transgressed in 0.9% (80/9075) of observations so any detrimental impact on sensitivity or improvement in specificity would be small.
Transgression of the MAC indicator ‘acute change in neurological state or convulsion’ would result in a reduced level of consciousness. The data used to measure the indicator in this study would therefore detect all transgressions of this criterion. This MAC indicator is not precisely defined: neither the size of the change in neurological state nor the time period over which this change must occur to make it acute is explicitly stated. Therefore the direction of any potential effect on sensitivity or specificity is difficult to predict. However, as the criterion was transgressed in 0.3% (26/9075) of observations, any impact on the estimated sensitivity or specificity is likely to be very small. The criterion ‘severe respiratory distress, apnoea or cyanosis’ is also more subjective than indicators based on clearly defined physiological parameters. The direction of any impact on the measured sensitivity or specificity cannot be predicted but any effect is likely to be small.
Any evaluation of PEWS that is undertaken in a situation where a MET is not in place will tend to produce a lower estimate of specificity. Where a MET is activated, the team may intervene in a way that normalises the activation criteria. The MET might also adjust the thresholds for criteria transgression to prevent repeat triggers in a stable child with abnormal vital signs. In addition, staff may have been reassured and therefore the threshold for the ‘doctor or nurse worried’ indicator would have altered. If abnormal vital signs continue and staff have been reassured, the system then once again relies on clinical judgment and not the PEWS. However, it was the perceived limitations of ward staff clinical judgment that led to the development and popularity of PEWS in the first place.
A number of patients who suffered an adverse outcome, admitted to HDU or ICU, did not transgress any MAC including the ‘nurse or doctor worried about clinical state’ criterion. This criterion was prospectively recorded on the observation chart as part of the original study and therefore there was no contemporaneous documentary evidence of staff concern. In patients who did not transgress the MAC but suffered an adverse outcome staff must eventually have been concerned and transferred the patient to HDU or ICU. However, the purpose of PEWS is to recognise critically ill children at an earlier stage.
A further difficulty in evaluating PEWS in a situation where a MET is not in place is that staff may act differently if recording concerns results in activation of a MET. In a cluster randomised controlled trial of a MET in adult patients, staff in the intervention hospitals were 35 times more likely to trigger an emergency team as a result of ‘staff concerns’.15 The performance characteristics of activation criteria may be different in situations where a MET is in place. In the original study it was intended to collate data on all eligible admissions over a 12-month period. Data extraction was stopped when information on 1000 patients was available. It is possible that these ‘most available’ records are not representative of all admissions during this period of time. Data were incomplete either because observations were not performed or were not recorded. As observations represented normal practice there were variations in the frequency, type and recording of observations.16 If missing data were abnormal, and not normal as assumed, the specificity and PPV would have been lower than measured.
The systematic review of paediatric alert criteria by Chapman et al concluded that the evidence supporting the validity, reliability and utility of paediatric alert criteria is weak.4 A survey of UK hospitals found that just over a fifth had implemented PEWS but there was no consistency in approach.17 None of the systems in use at the time had been validated. Thirty-six different parameters were used in the various systems, those most frequently used were respiratory rate, respiratory effort, heart rate, shock and nurse or doctor concern.17 The PEWS published to date vary in their content and triggering thresholds and no one system has been shown to be clearly superior to the others. In order to work in the field PEWS must have a high sensitivity, specificity and PPV.
No paediatric randomised control trials of the effectiveness of MET have been published and results from before and after studies are inconsistent.4 Tibballs and Kinney reported a statistically significant reduction in total hospital mortality following the introduction of a MET.1 Sharek et al also demonstrated statistically significant reductions in both total hospital mortalities, and respiratory and cardiopulmonary arrest rates outside the PICU.2 However, when the appropriate analysis was performed using a two-sided significance test, Brilli et al found no statistically significant reduction after the introduction of the MET in respiratory and cardiopulmonary arrests outside a PICU.7 Hunt et al reported no change in the rate of cardiopulmonary arrests and a decrease in the rate of respiratory arrests requiring intubation on the wards after the introduction of a MET.8
The four systems evaluated in before and after studies all reported relatively low rates of activation of a MET. The more subjective criteria used by Brilli et al activated the MET during 0.3% (27/9615) of admissions.7 The criteria used by Sharek et al, based on undefined acute changes in physiological observations, resulted in MET activation in 2.0% (143/7287) of admissions.2 Hunt et al reported activation in 1.2% (88/7503) of admissions using broad criteria for triggers rather than specific vital sign parameters.8 Tibballs and Kinney used well-defined, objective and easily reproducible triggering criteria that led to activation of the MET in 0.6% (809/138,424) of inpatient admissions.1
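The reported activation rates can be recomputed from the counts quoted above and set against the proportion of children without an adverse outcome who transgressed the MAC in the present cohort (469/984). The snippet below is only an arithmetic check; the dictionary keys are labels chosen for illustration.

```python
# (MET activations, admissions) as quoted from the before and after studies
activations = {
    "Brilli et al": (27, 9615),
    "Sharek et al": (143, 7287),
    "Hunt et al": (88, 7503),
    "Tibballs and Kinney": (809, 138424),
}

# Percentage of admissions in which the MET was activated, to one decimal place
rates = {study: round(100 * n / total, 1)
         for study, (n, total) in activations.items()}

# Children without an adverse outcome who transgressed the MAC at least once
current_study_rate = round(100 * 469 / 984, 1)

for study, rate in rates.items():
    print(f"{study}: {rate}%")
print(f"Current study (MAC transgressed): {current_study_rate}%")
```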
There are a number of potential explanations for why the MAC were transgressed in 47.7% of admissions in the current study but the same criteria only actually activated the MET in 0.6% of admissions when implemented in the field.1 Differences in the case mix between the two hospitals might explain the difference; however, as both studies were undertaken in tertiary teaching hospitals this is unlikely. Because the data used to determine the performance characteristics of the MAC were originally collected to validate a different tool, it is possible that data for three of the nine criteria do not completely measure the intended criteria. As these three criteria tended to be triggered less frequently this does not explain the high overall activation frequency. The two criteria that triggered most frequently, tachycardia/bradycardia (abnormal in 32% of admissions) and tachypnoea (abnormal in 19% of admissions), are entirely objective and would have been measured and interpreted in the same way in both studies. No data are provided by Tibballs on the frequency and completeness of observations; if either or both of these were low then this might explain the low activation level. The most plausible explanation is that the MET was not always activated when the MAC were transgressed.
To enable clinicians to implement an intervention, a complete and precise description is required. Activation criteria or PEWS are part of a complex intervention designed to reduce paediatric mortality. If this complex intervention does reduce mortality, the relative contributions of education, PEWS and MET to clinical effectiveness are unknown. It appears that the MAC was not applied in practice as described and therefore its effectiveness is unproven. The physiological parameters used for the MAC were more extreme than those in the C&VPEWS (online appendix 1). Despite the use of parameters that were well outside what most clinicians would feel was normal, they were frequently observed in children who did not have an adverse outcome. To date all of the validated PEWS have low PPV and their full implementation would result in a large number of false positive triggers.
The members of the MET in evaluation studies have included existing PICU staff, who will not be available in most hospitals. There are insufficient staff who have, and are able to maintain, these skills even if finance were available. There will be practical difficulties implementing a MET in District General Hospitals that do not have the resources to have a team of additional people seeing children who probably do not need an intervention. Attempting to utilise existing personnel in this setting would mean using the same team who are already criticised for failing to recognise children who go on to develop critical illness.
The recently published document ‘Why Children Die – A Pilot Study 2006’ from the Confidential Enquiry into Maternal and Child Health (CEMACH)18 states that ‘For paediatric care in hospital we recommend a standardised and rational monitoring system with imbedded early identification systems for children developing critical illness – an early warning score’. We believe that at present there is an insufficient evidence base to support this recommendation.
Further research must be based on a clearly defined intervention and determine the relative contribution of the components of this complex intervention on any effect on patient outcome.
The authors would like to thank Ms C Amphlett for the original data collection.
Competing interests None.
Ethical approval The data collection was approved by the Trust Research and Development Committee and ethical approval was granted by the South East Wales Local Research Ethics Committee.
Provenance and peer review Not commissioned; externally peer reviewed.