Objective: To develop and test the predictability of a paediatric early warning score to identify children at risk of developing critical illness.
Design: Prospective cohort study.
Setting: Admissions to all paediatric wards at the University Hospital of Wales.
Outcome measures: Respiratory arrest, cardiac arrest, paediatric high-dependency unit admission, paediatric intensive care unit admission and death.
Results: Data were collected on 1000 patients. A single abnormal observation determined by the Cardiff and Vale paediatric early warning system (C&VPEWS) had an 89.0% sensitivity (95% CI 80.5 to 94.1), 63.9% specificity (95% CI 63.8 to 63.9), 2.2% positive predictive value (95% CI 2.0 to 2.3) and a 99.8% negative predictive value (95% CI 99.7 to 99.9) for identifying children who subsequently had an adverse outcome. The area under the receiver operating characteristic curve for the C&VPEWS score was 0.86 (95% CI 0.82 to 0.91).
Conclusion: Identifying children likely to develop critical illness can be difficult. The assessment tool developed from the advanced paediatric life support guidelines on identifying sick children appears to be sensitive but not specific. If the C&VPEWS was used as a trigger to activate a rapid response team to assess the child, the majority of calls would be unnecessary.
Suboptimal care may contribute to physiological deterioration of patients with major consequences on morbidity, mortality and the requirement for intensive care.1 Studies in adults2 3 have shown that patients in hospital exhibit premonitory signs of cardiac arrest, which may be observed by nursing and medical staff but are frequently not acted upon. Similar findings have been observed in the condition of adult patients before admission to intensive care units (ICU).4 5 Suboptimal care may be due to lack of knowledge regarding the significance of findings relating to dysfunction of the airway, breathing and circulation causing them to be missed, misinterpreted or mismanaged.
To reduce the occurrence of suboptimal care in adults, systems for identifying patients at risk of critical illness have been developed. These include the early warning score and modified early warning score, which were developed to focus attention on worsening physiological parameters and to act sooner. Some hospitals have set up a medical emergency team, a patient at risk team or a rapid response team (RRT), which provide expert advice to manage patients identified by these scoring systems.4 6–9 For the purpose of this paper the term “RRT” is used to describe this intervention.
These concepts could be transposed to the paediatric population, in which symptoms and signs of acute severe illness are often non-specific. The clinical condition of these patients can change rapidly and therapy needs to be initiated immediately if early intervention produces a better outcome.
A retrospective audit was carried out by the Children’s Hospital for Wales paediatric intensive care unit (PICU) looking at the physiological parameters in the 24 h leading up to the intensive care admission. The audit suggested that some of the children may have benefited from earlier intervention because they had abnormal observations for a period of time before admission. As a result it was decided to develop a paediatric early warning system to identify children at risk of adverse outcomes using simple physiological parameters suitable for bedside application. We report the findings of a prospective cohort study undertaken to evaluate the predictability of the Cardiff and Vale paediatric early warning system (C&VPEWS) criteria used to identify paediatric patients at risk of deteriorating critical illness.
All paediatric admissions (aged 0–16 years) to any of the paediatric wards at the University Hospital of Wales were eligible for inclusion in the study. Patients admitted directly to the PICU and the paediatric high dependency unit (PHDU) and those patients presenting in cardiac or respiratory arrest were excluded. The University Hospital of Wales is a tertiary centre for paediatric care with 50 medical, 34 surgical, 16 oncology, seven PICU, four PHDU, four cardiac and four renal beds.
Development of the C&VPEWS
The C&VPEWS was developed using physiological parameters based on the 2005 advanced paediatric life support (APLS) guidelines for recognition of the sick child.10 An expert group was set up that consisted of general paediatricians, a regional nurse educator and a paediatric intensivist. All paediatric medical and nursing staff in the trust are required to attend APLS or paediatric life support courses. The physical signs and physiological parameters taught in these courses were used as a starting point to develop the system. The expert group reviewed other early warning systems to modify the age-related normal ranges and to identify other parameters for inclusion in the score. The group reached a consensus opinion to agree the eight parameters and their trigger criteria. These parameters were based on: airway threat; oxygen required to keep saturations greater than 90%; respiratory rate; respiratory observation; heart rate; blood pressure; level of consciousness and nurse or doctor worried about clinical state. Some criteria were age dependent (see appendix for details). Each parameter scored zero if normal and one if abnormal, and thus each set of observations scored between 0 if all criteria were normal, and 8 if all criteria were abnormal.
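The scoring rule described above can be sketched in a few lines. This is a minimal illustration, not the study's own implementation: the key names are hypothetical, the age-dependent thresholds from the appendix are assumed to have been applied already (each criterion arrives as abnormal or normal), and unrecorded criteria are counted as normal, which mirrors the assumption the authors state for their analysis.

```python
from typing import Mapping, Optional

# The eight C&VPEWS criteria named in the text. The key names are
# illustrative labels chosen for this sketch, not taken from the study.
CRITERIA = (
    "airway_threat",
    "oxygen_required",
    "respiratory_rate",
    "respiratory_observation",
    "heart_rate",
    "blood_pressure",
    "level_of_consciousness",
    "staff_worried",
)


def cvpews_score(observation: Mapping[str, Optional[bool]]) -> int:
    """Return the 0-8 score for one set of observations.

    Each value is True if the criterion is abnormal, False if it is
    normal, and None (or absent) if it was not recorded. Unrecorded
    criteria contribute 0, i.e. they are assumed to be normal.
    """
    return sum(1 for name in CRITERIA if observation.get(name) is True)


if __name__ == "__main__":
    example = {"respiratory_rate": True, "heart_rate": True, "blood_pressure": None}
    print(cvpews_score(example))
```

Keeping the age-related thresholds outside this function means the score itself stays a simple count, which is the property the expert group relied on for bedside use.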
A new paediatric observation chart was developed to incorporate all of the criteria in the C&VPEWS and nursing staff were trained to use the new observation chart before it was introduced. The nursing staff, while performing their routine duties, recorded the observations on the new paediatric observation chart. The new charts were completed for all admissions between 1 December 2005 and 30 November 2006. The frequency of observations was determined by the current clinical care policy and was not altered for the purpose of this study. The data were collated by the research nurse and entered into a database for analysis. The outcome measures defining an adverse outcome were respiratory arrest, cardiac arrest, PHDU admission, PICU admission and death.
The sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were calculated to determine if any abnormal criteria (a score of 1 or more) in a set of observations predicted an adverse outcome. Receiver operating characteristic (ROC) analysis was performed to study the ability of the C&VPEWS score to discriminate between children who went on to develop an adverse outcome and those who did not. The ROC curve was used to identify the cut-off value of the C&VPEWS score that maximises the sum of sensitivity and specificity of the C&VPEWS. The area under the curve (AUC) was used as a measure of the overall performance of the ROC curve as it reflects, in this case, the probability that the C&VPEWS score will correctly classify children who develop critical illness and those who did not. AUC can take values between 0 and 1, where 1 is a perfect test and 0.5 is a test equal to chance.11 For the purpose of the analysis any missing criteria were assumed to be normal. Data were analysed using Stata 10.0.
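The calculations described above can be illustrated with a short sketch. It is not the Stata analysis used in the study: the function names are hypothetical, a set of observations is treated as test positive when its score is at or above the cut-off, and the AUC is computed as the proportion of (adverse, non-adverse) pairs in which the adverse observation has the higher score, with ties counted as one half. That pairwise definition is the usual rank-based interpretation of the AUC and is assumed here.

```python
from typing import Dict, Iterable, Sequence


def diagnostic_metrics(scores: Sequence[int], adverse: Sequence[bool],
                       cutoff: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV for 'score >= cutoff'."""
    pairs = list(zip(scores, adverse))
    tp = sum(1 for s, a in pairs if s >= cutoff and a)
    fp = sum(1 for s, a in pairs if s >= cutoff and not a)
    fn = sum(1 for s, a in pairs if s < cutoff and a)
    tn = sum(1 for s, a in pairs if s < cutoff and not a)

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def best_cutoff(scores: Sequence[int], adverse: Sequence[bool],
                candidates: Iterable[int] = range(1, 9)) -> int:
    """Cut-off that maximises sensitivity + specificity.

    Assumes both outcome groups are present; ties go to the lowest cut-off.
    """
    def total(cutoff: int) -> float:
        m = diagnostic_metrics(scores, adverse, cutoff)
        return m["sensitivity"] + m["specificity"]

    return max(candidates, key=total)


def auc(scores: Sequence[int], adverse: Sequence[bool]) -> float:
    """Pairwise (rank-based) area under the ROC curve."""
    pos = [s for s, a in zip(scores, adverse) if a]
    neg = [s for s, a in zip(scores, adverse) if not a]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up data for illustration only; not the study's observations.
    demo_scores = [0, 0, 1, 1, 2, 3, 4, 2]
    demo_adverse = [False, False, False, False, True, True, True, False]
    cut = best_cutoff(demo_scores, demo_adverse)
    print(cut, diagnostic_metrics(demo_scores, demo_adverse, cut))
    print(auc(demo_scores, demo_adverse))
```

A cut-off of 1 in this sketch corresponds to the "any abnormal criterion" analysis, and higher cut-offs correspond to using the tool as a multiple parameter score.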
The study was approved by the Trust Research and Development Committee and ethics approval was granted by the South East Wales Local Research Ethics Committee.
Data were collected on 1000 patients on whom 9075 sets of observations were performed. Sixteen children had an adverse outcome: 13 were admitted from the ward to the high-dependency unit (four of these subsequently transferred from the high-dependency unit to the PICU) and three were admitted from the ward to the PICU. There were no deaths, cardiac arrests, or respiratory arrests. Three of the 16 children (18.8%) had no abnormal observations before the adverse outcome. Eight hundred and ten of the 984 children (82.3%) who did not have an adverse outcome had at least one abnormal observation during the admission (table 1).
There were 8993 observations between admission and discharge in 984 patients without an adverse outcome and 82 observations between admission and the adverse outcome in the 16 patients with adverse outcomes. Recording of the eight criteria in each set of observations was incomplete and ranged from 87% for heart rate to 8% for airway threat (table 2). The C&VPEWS score is made up of eight criteria; the number of outcomes for each score is shown in table 3. In children without an adverse outcome, 20 sets of observations in nine children scored over 4. In those with an adverse outcome, eight sets of observations in two children scored over 4.
If the C&VPEWS was triggered by a single abnormal criterion in a set of observations, ie, functioning as a single parameter system, it would have a sensitivity of 89.0% (95% CI 80.5 to 94.1), specificity of 63.9% (95% CI 63.8 to 63.9), PPV of 2.2% (95% CI 2.0 to 2.3) and an NPV of 99.8% (95% CI 99.7 to 99.9) for identifying children who subsequently had an adverse outcome.
ROC analysis for the identification of adverse outcomes from their C&VPEWS score demonstrated acceptable performance of the system (AUC 0.86, 95% CI 0.82 to 0.91; fig 1). The C&VPEWS score cut-off that maximises the sum of sensitivity and specificity was derived from the ROC analysis (table 4). This cut-off was 2, giving a sensitivity of 69.5% (95% CI 59.0 to 78.4), a specificity of 89.9% (95% CI 89.8 to 90.0), a PPV of 5.9% (95% CI 5.0 to 6.7) and an NPV of 99.7% (95% CI 99.6 to 99.8).
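The low PPV follows directly from the rarity of the outcome. As a worked check, the sketch below applies Bayes' theorem to the reported sensitivity and specificity, taking as prevalence the 82 sets of observations that preceded an adverse outcome out of 9075 sets in total. The function name is hypothetical and the rounded percentages from the text are used as inputs, so the results are approximate.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Proportion of observation sets that preceded an adverse outcome.
PREVALENCE = 82 / 9075

if __name__ == "__main__":
    # Single abnormal criterion (score of 1 or more).
    ppv1, npv1 = predictive_values(0.890, 0.639, PREVALENCE)
    # Cut-off of 2 derived from the ROC analysis.
    ppv2, npv2 = predictive_values(0.695, 0.899, PREVALENCE)
    print(f"score >= 1: PPV {ppv1:.1%}, NPV {npv1:.1%}")
    print(f"score >= 2: PPV {ppv2:.1%}, NPV {npv2:.1%}")
```

With fewer than 1% of observation sets preceding an adverse outcome, even a specificity of about 90% leaves false positives far outnumbering true positives, which is why the PPV stays in single figures at both cut-offs.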
The purpose of the C&VPEWS is the early identification of clinically deteriorating hospitalised children. When it operated as a trigger activated by a single abnormal parameter12 the tool was sensitive but had low specificity; as a result the PPV was very low and most activations were false positives. Operating the C&VPEWS as a multiple parameter trigger score12 only marginally improved its performance. The optimum score cut-off of 2 had a sensitivity of 70% and PPV of 6%.
The main issue revealed by our study to examine the predictability of the C&VPEWS was the low PPV of the trigger criteria and the large number of false positive triggers. Most patients (823/1000) had one or more abnormal C&VPEWS criteria at some time during their admission.
Some authors have claimed to have demonstrated acceptable performance of trigger criteria;13 however, inappropriate methodology and analysis were used.14 The most robust validation of a paediatric early warning system score published to date used a case–control study design.15 The area under the ROC curve (0.90, 95% CI 0.83 to 1.0) in this study was similar to that of the C&VPEWS (0.86, 95% CI 0.82 to 0.91). The sensitivity (78%) and the specificity (95%) were also broadly similar to those observed for the C&VPEWS, which had a sensitivity of 70% and a specificity of 90% when the optimum cut-off was used. The limitation of a case–control study design is that the PPV and the number of false positives can only be estimated, unlike the cohort study used for validating the C&VPEWS. The activation rate of both the C&VPEWS and the Duncan system was greater than the rate in the four evaluation studies.16–19
What is already known on this topic
It has been recommended that paediatric early warning systems are implemented in UK hospitals, but no randomised controlled trials have been performed in a paediatric population and the results from observational studies are inconsistent.
The performance characteristics of trigger criteria for paediatric early warning systems have not been fully established.
What this study adds
This cohort study demonstrated that the system had reasonable sensitivity, but at the cost of low specificity and PPV.
Further studies to determine the optimum trigger criteria are required before the widespread implementation of an unproved intervention.
Before and after studies in adult populations have suggested that RRT reduce cardiopulmonary arrests and mortality; however, the only randomised controlled trial did not demonstrate any benefit.20 To date no paediatric randomised controlled trials have been published and results from before and after studies are inconsistent. Tibballs et al16 reported no statistically significant reduction in cardiac arrest or death after the introduction of an RRT. Brilli et al17 found no statistically significant reduction in respiratory and cardiopulmonary arrests outside the ICU after the introduction of the RRT when the appropriate analysis was performed using a two-sided significance test. Sharek et al18 demonstrated statistically significant reductions in both hospital-wide mortality rates and respiratory and cardiopulmonary arrest rates outside the ICU. Hunt et al19 reported no change in the rate of cardiopulmonary arrests and a decrease in the rate of respiratory arrests requiring intubation on the wards after the introduction of an RRT.
Persistent issues that require further examination and debate include the establishment of the optimal trigger criteria to activate a paediatric RRT.20 None of the four evaluation studies validated their activation criteria. Tibballs et al16 used well-defined, objective and easily reproducible triggering criteria that led to activation of the RRT on 184 occasions during 35 892 hospital admissions. In contrast, the criteria used by Brilli et al17 were less dependent on physiological parameters and were more subjective and therefore potentially less reliable. The RRT was activated 27 times during 9615 admissions. The criteria used by Sharek et al18 were largely based on acute changes in physiological observations but the actual values or size of change for these parameters were not defined in the published paper. There were 143 triggering episodes in 7287 admissions following the introduction of the RRT. Hunt et al19 created broad criteria for triggers rather than using specific vital sign parameters. These criteria activated the RRT 88 times during 7503 admissions.
It is impossible to reconcile the difference between the higher numbers triggering the C&VPEWS and the low numbers in the four evaluation studies16–19 without the data to validate their activation criteria. If, as claimed, the 143 RRT activations resulted in 33 lives saved, the ill-defined criteria used by Sharek et al18 must be extremely sensitive and specific. An alternative explanation is that the RRT was not always activated when triggers were present.
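To make the size of this difference concrete, the activation counts quoted for the four evaluation studies can be put on a common scale of activations per 1000 admissions. The sketch below uses only the figures given in the text; the C&VPEWS comparator is the 823 of 1000 patients with at least one abnormal criterion during admission, which is a count of patients who would have triggered a single parameter system rather than a count of RRT calls, so the comparison is indicative only.

```python
from typing import Dict, Tuple

# (RRT activations, admissions) as quoted in the text for each study.
ACTIVATIONS: Dict[str, Tuple[int, int]] = {
    "Tibballs et al": (184, 35892),
    "Brilli et al": (27, 9615),
    "Sharek et al": (143, 7287),
    "Hunt et al": (88, 7503),
}

# Patients with one or more abnormal C&VPEWS criteria out of all patients
# studied. Not an RRT activation count; included for scale only.
CVPEWS_TRIGGERED = (823, 1000)


def rate_per_1000(events: int, admissions: int) -> float:
    """Events per 1000 admissions."""
    return 1000.0 * events / admissions


rates = {name: rate_per_1000(*counts) for name, counts in ACTIVATIONS.items()}

if __name__ == "__main__":
    for name, rate in rates.items():
        print(f"{name}: {rate:.1f} activations per 1000 admissions")
    print(f"C&VPEWS (any abnormal criterion): "
          f"{rate_per_1000(*CVPEWS_TRIGGERED):.0f} patients per 1000")
```

On these figures the published activation rates range from roughly 3 to 20 per 1000 admissions, against 823 per 1000 patients with at least one abnormal C&VPEWS criterion.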
The National Institute for Health and Clinical Excellence (NICE) recommend that early warning systems for adults should use multiple-parameter or aggregate-weighted scoring systems.21 The C&VPEWS performed better when used as a multiple parameter system, in which the response algorithm required more than one criterion to be met, rather than a single parameter system. Our study findings were in keeping with the NICE recommendation for adults. The performance of the aggregate weighted scoring system of Duncan et al15 was similar to the C&VPEWS when used as a multiple parameter system. The aggregate system has the advantage of producing a wider range of scores, which facilitates a graded response, but is more complex to use in a general ward setting. There is the potential for both interrater and intrarater variation in the measurement of the physiological variables; single parameter systems in adults have been found to be more reliable.21 Inevitably in our study variability between observers will have occurred, particularly in the more subjective criteria, for example nurse or doctor worried.
Duncan12 observed that the threshold for abnormal values varies considerably between paediatric early warning systems and in all cases these thresholds are based on expert opinion. The thresholds chosen for the C&VPEWS for abnormal heart and respiratory rates were wider than both the APLS10 and other published normal ranges.22 Data collected in hospitalised children might provide more rational thresholds but empirical data comparing observations with outcomes is required to define predictive thresholds for an early warning system. The overlap between the distribution of normal and abnormal physiological parameters is the intrinsic limitation that impedes the development of valid early warning systems.
Initially it was intended to collect data on all eligible admissions between 1 December 2005 and 30 November 2006. Data collection was stopped when information on 1000 patients was available. It is possible that these “most available” records are not representative of all admissions during this period of time. Data on the parameters in the C&VPEWS were incomplete, either because the observations were not performed or because they were not recorded. If missing data were abnormal, and not normal as assumed, the specificity and the PPV are likely to have been lower than measured. Complete data were obtained from only 16% of the potential controls identified in the study of Duncan et al.15 The lack of compliance with recording observations highlights the practical difficulties of implementing any early warning system.
The most objective outcome measure of critical illness, death, is fortunately uncommon in children. The other outcome measures used in this study and other published research are less reliable. The decision to admit a patient to the high-dependency unit or PICU may be based on different criteria in other units or vary in the same unit at different points in time. Decisions to call the arrest team can be subjective. These limitations in the validity and reliability of outcome measures affect both research to validate trigger criteria and studies on the effectiveness of RRT.
The concept of developing objective trigger criteria based on physiological measurements and the activation of an RRT to improve patient outcome is appealing. To date validation studies have not demonstrated trigger criteria that achieve high sensitivity without the cost of low specificity. If the available trigger criteria were implemented completely, RRT would be called frequently to children who would not go on to develop critical illness. High grade evidence on the effectiveness of RRT, in the form of randomised controlled trials, is not available. Even allowing that these two problems can be overcome, there will be practical difficulties implementing RRT in district general hospitals. The members of RRT in evaluation studies have included existing PICU staff, who will not be available in most hospitals. There are insufficient staff who have and are able to maintain these skills, even if finance were available.
The majority of these studies have introduced a period of intense education at the same time as the introduction of the trigger tool. This raises the question as to whether it is the tool or the education package that has led to the improvement. The importance of the recognition of abnormal physiological parameters cannot be overstated; however, the presence of abnormal parameters does not identify those children likely to develop critical illness and it does not tell the team when and how to intervene and whether the intervention would improve the clinical outcome. It seems intuitive to institute and rely on paediatric early warning systems but we must be sure that we are not depending on false reassurance.12
The recently published document “Why children die—a pilot study 2006” from the confidential enquiry into maternal and child health,23 states that “For paediatric care in hospital we recommend a standardised and rational monitoring system with imbedded early identification systems for children developing critical illness—an early warning score”.
There are many situations in which a trigger score or system may be useful: primary care; in the emergency department; on the paediatric hospital ward in a district general hospital where there are no PICU facilities; on a hospital inpatient ward where there may be no paediatricians immediately available or on a hospital ward in a paediatric tertiary hospital where there is a PICU. Each of these situations may need a different score and would have different personnel available to constitute the RRT. There will be implications for staffing and resources.
Further validation studies are required to find the optimum trigger criteria, intra-observer reliability and completeness of documentation for the different components of these scores. Ultimately, a multicentre cluster randomised controlled trial to determine the effectiveness of RRT is required before the widespread implementation of an unproved intervention takes place.
The authors would like to thank Dr R Al-Samsam for her initial contribution to the project and Ms C Amphlett for the data collection.
Appendix: C&VPEWS abnormal criteria
1. Airway threat, eg, stridor
2. Child requiring any amount of oxygen to keep saturations greater than 90%
3. Respiratory rate (outside the range below)
4. Abnormal respiratory observations, ie, recession or accessory muscles used
5. Bradycardia or tachycardia (outside the range below)
6. Blood pressure (outside the range below)
7. Level of consciousness (abnormal if only responding to voice or less)
8. Nurse or doctor worried about clinical state.
Competing interests: None.
Ethics approval: Ethics approval was granted by the South East Wales Local Research Ethics Committee.