
Electronic Letters to:

G A Pearson, J Stickley, and F Shann
Calibration of the paediatric index of mortality in UK paediatric intensive care units
Arch Dis Child 2001; 84: 125-128

Electronic letters published:

Calibration of the paediatric index of mortality (PIM) score for UK Paediatric Intensive Care
Shane Tibby (12 April 2001)
Authors reply
Gale Pearson (8 May 2001)
Calibration of the paediatric index of mortality in UK paediatric intensive care units
Gareth Parry (1 August 2001)
The Need for Appropriate Statistical Analysis of Severity Scores
Murray M Pollack, Kantilal M Patel (23 October 2001)
Authors reply
Frank Shann (19 November 2001)

Calibration of the paediatric index of mortality (PIM) score for UK Paediatric Intensive Care 12 April 2001
Shane Tibby,
Paediatrician
Guy's Hospital, London, UK

Shane.Tibby{at}gstt.sthames.nhs.uk Shane Tibby

Dear Editor,

Pearson and colleagues have presented data highlighting the use of the PIM score as a tool for auditing paediatric intensive care unit (PICU) performance.[1] Whilst we agree with the authors' message that PIM has many advantages over other scoring systems, we feel that urgent recalibration is needed before this tool is adopted as a benchmark of performance in the UK. The PIM variables were developed predominantly from an Australian data set (one British PICU, Birmingham, participated) collected over 1994-1995; the data used in Pearson's validation come from five UK PICUs, including our own, over the period 1998-1999.[1] PIM continues to discriminate between death and survival reasonably well, giving an area under the ROC curve of 0.840 (95% CI 0.819-0.853),[1] marginally less than the figure of 0.90 seen in the original paper.[2] However, in the 4-year period between development and validation the model has come to calibrate poorly, as evidenced by two pieces of information from Pearson's study.[1]

First, the overall standardised mortality ratio (SMR) is 0.87 (95% CI 0.81-0.94); this figure is remarkably concordant across 4 of the 5 PICUs. Second, from table 2,[1] it is possible to calculate the Hosmer-Lemeshow statistic: chi-squared = 37.41, p<0.0001. This implies poor calibration (good calibration is traditionally represented by a p value >0.10).

The reasons for the loss of calibration are unclear. A possible, perhaps over-optimistic, explanation is that the UK units in the later study were all "over-performing", given that individual units demonstrated an SMR of between 0.83 and 0.89. However, it is unlikely that such a quantum leap in the quality of paediatric intensive care delivery occurred over the 4 years between 1994 and 1998, given that no major treatment breakthroughs or radical service reorganisation occurred in this time.

More recent data from our PICU highlight the trend towards poorer calibration: the PIM-derived SMR from 910 patients seen during the 2000 calendar year is 0.54 (95% CI 0.39-0.69). The authors acknowledge the shortcomings and state that a revised version of PIM will soon be available. However, recalibration is only worthwhile if a very broad sample of UK units participates. The UK PICOS study (paediatric intensive care outcome study) will attempt to address this by collecting the data used in the calculation of several scoring systems across the whole of the UK over a one-year period commencing March 2001. From this study it is hoped that an optimal indicator of PICU performance will be derived.

Shane M Tibby
Ian A Murdoch
Paediatric Intensive Care Unit
Guy's Hospital, London, UK

References
(1) Pearson GA, Stickley J, Shann F. Calibration of the paediatric index of mortality in UK paediatric intensive care units. Arch Dis Child 2001;84:125-8.

(2) Shann F, Pearson G, Slater A, Wilkinson K. Paediatric index of mortality (PIM): a mortality prediction model for children in intensive care. Intensive Care Med 1997;23:201-7.

Authors reply 8 May 2001
Gale Pearson,
Consultant in Paediatric Intensive Care
Birmingham Children's Hospital

Gale.Pearson{at}bhamchildrens.wmids.nhs.uk Gale Pearson

Dr Tibby and Dr Murdoch note that, in our study of paediatric intensive care units (PICUs) in the UK [1], PIM discriminated well between children who died and children who survived, with an area under the ROC curve of 0.84. However, they are concerned that PIM had "poor calibration" because the standardised mortality ratio (SMR) in the UK units was 0.87 (95% CI 0.81-0.94) - that is, the actual number of deaths was only 87% of the number predicted by PIM. In fact, this figure is almost identical to the PIM SMR for all PICUs in Australia in 1997-99, where the SMR was also 0.87 (95% CI 0.81-0.92). It is very encouraging that PIM gives such similar results in Australia and the leading PICUs in the UK, as it suggests that standards are comparable between the two groups of units and that PIM performs similarly in Australian and UK children.

It is normal for SMRs to fall with time as intensive care improves, and for mortality prediction models to need recalibration. This has happened with PRISM [2], MPM [3] and APACHE [4], as well as PIM. Despite Dr Tibby and Dr Murdoch's reservations, the fact that the SMR has fallen by a similar amount in both Australia and the UK suggests that standards of care have improved in PICUs in those countries in recent years.

Dr Tibby and Dr Murdoch point out that the Hosmer-Lemeshow test gives a low p value for PIM's performance in the UK data. This test divides the sample into 10 groups, ranging from very low to very high risk of death, and compares the actual number of survivors and non-survivors in each group with the number predicted by PIM. Because PIM predicts too many deaths in the leading units in the UK, it follows that the number of actual deaths differs from the number predicted - so the Hosmer-Lemeshow p value is low. However, Table 2 in our paper shows that the ratio of observed to expected deaths was similar across the 10 groups [1], so that the recalibrated model is likely to fit well. The fact that the Hosmer-Lemeshow test gives a low p value does not necessarily mean that a model (such as PIM) is invalid - it often means only that the standard of care in the test PICUs differs from that in the units in which the model was derived.
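
The comparison described above can be sketched in code. The following Python is a minimal illustration only, not the calculation used in the paper: the function names and the ten-group table are hypothetical, and the closed-form p value applies only to an even number of degrees of freedom (often taken as the number of groups for an external validation sample, and the number of groups minus 2 for the development sample).

```python
import math


def hosmer_lemeshow_statistic(group_sizes, observed_deaths, expected_deaths):
    """Sum, over risk groups, the chi-squared contributions for deaths and survivors."""
    chi2 = 0.0
    for n, obs_d, exp_d in zip(group_sizes, observed_deaths, expected_deaths):
        obs_s = n - obs_d  # observed survivors in this group
        exp_s = n - exp_d  # survivors predicted by the model
        chi2 += (obs_d - exp_d) ** 2 / exp_d + (obs_s - exp_s) ** 2 / exp_s
    return chi2


def chi2_sf_even_df(x, df):
    """Upper-tail chi-squared probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total


# Hypothetical table: ten groups of 700 children, with observed deaths
# set to roughly 87% of the predicted deaths in every group.
sizes = [700] * 10
expected = [2.0, 4.0, 6.0, 9.0, 13.0, 18.0, 27.0, 45.0, 90.0, 250.0]
observed = [2, 3, 5, 8, 11, 16, 23, 39, 78, 218]

statistic = hosmer_lemeshow_statistic(sizes, observed, expected)
p_value = chi2_sf_even_df(statistic, df=len(sizes))  # df = groups: an assumption
print(f"chi-squared = {statistic:.2f}, p = {p_value:.4f}")
```

In a table of this kind the observed-to-expected ratio is nearly constant across groups, so the statistic reflects a uniform shift rather than a change in the shape of the risk gradient.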

The PICUs that contributed the data from which the PIM score was derived were all leading units that deliver a high standard of care, so the score reflects best practice in 1994-1996 when the data were collected. We are recalibrating PIM using data from units in the UK and Australia, and the new model will be available this year. Unfortunately, the quality of paediatric intensive care is not uniform in the UK, and there is evidence that some units do not perform at an optimal standard [5-7]. Surely it would be preferable for the UK to use an international standard based on best practice (such as PIM), rather than the average of good and not-so-good units from the whole of the UK (PICOS). The UK should aim for best practice rather than being content with average practice.

Yours sincerely

Frank Shann Gale Pearson

References
(1) Pearson GA, Stickley J, Shann F. Calibration of the paediatric index of mortality in UK paediatric intensive care units. Arch Dis Child 2001;84:125-8.
(2) Pollack MM, Patel KM, Ruttimann UE. PRISM III: an updated pediatric risk of mortality score. Crit Care Med 1996;24:743-52.
(3) Lemeshow S, Teres D, Klar J, Spitz Avrunin J, Gehlbach SH, Rapoport J. Mortality probability models (MPM II) based on an international cohort of intensive care unit patients. JAMA 1993;270:2478-86.
(4) Knaus WA, Wagner DP, Draper EA, Zimmerman JE et al. The APACHE III prognostic system: risk prediction of hospital mortality for critically ill hospitalized adults. Chest 1991;100:1619-36.
(5) Pearson G, Shann F, Barry P, Vyas J et al. Should paediatric intensive care be centralised? Trent versus Victoria. Lancet 1997;349:1213-7.
(6) Bennett NR. Provision of paediatric intensive care services. Br J Hosp Med 1997;58:368-71.
(7) De Courcy-Golder K. A strategy for development of paediatric intensive care within the United Kingdom. Intens Crit Care Nurs 1996;12:84-9.

Calibration of the paediatric index of mortality in UK paediatric intensive care units 1 August 2001
Gareth Parry,
Senior Research Fellow
University of Sheffield

g.parry{at}sheffield.ac.uk Gareth Parry

Dear Editor,

Pearson et al should be congratulated on successfully collecting the data required for calculating the PIM score on 7253 children admitted to 5 UK PICUs.[1] It is reassuring to note that the authors did not find any systematic differences between these five units in terms of their standardised mortality ratios. Leaving aside the controversies involved in cross-country comparisons, it is further pleasing that they appear to conclude that mortality following admission for paediatric intensive care in 1998-99 is less than it was in 1994-95.[2,3] The current results imply that 78 more children have survived following treatment in these 5 PICUs than were predicted by the 1994-95 PIM derivation model. Before this can be considered a major clinical advance, it is important to consider the health status of the additional survivors. Very different conclusions might be drawn depending on whether the additional children who survived have a very poor or a very good health status.

The United Kingdom Paediatric Intensive Care Outcome Study (UK PICOS) was set up in response to the "Paediatric Intensive Care: A framework for the future" document and a joint United Kingdom Medical Research Council and Department of Health working paper.[4,5] Both these publications recognised that since mortality following paediatric intensive care is less than 10%, morbidity or health status may be a more important outcome of paediatric intensive care than mortality. UK PICOS is currently collecting health status measurements of children who survive following admission for paediatric intensive care in a representative sample of 21 UK paediatric intensive care units. By seeking to differentiate between the survivors of paediatric intensive care, UK PICOS may lead to a risk adjustment method for health status in addition to mortality. Furthermore, UK PICOS has the potential to provide the methodology to enable cost-effectiveness studies to be set up in paediatric intensive care. In the longer term this will allow organisational structures, service management and new interventions in paediatric intensive care to be evaluated in a more rigorous manner than at present. Further details of UK PICOS are available at www.shef.ac.uk/~scharr/ukpicos.

Yours sincerely,

Dr Gareth Parry
Senior Research Fellow

Ms Sam Jones
Project Manager, UK PICOS

References

(1) Pearson GA, Stickley J, Shann F. Calibration of the paediatric index of mortality in UK paediatric intensive care units. Arch Dis Child 2001;84(2):125-128
(2) Pearson G, Shann F, Barry P et al. Should paediatric intensive care be centralised? Trent versus Victoria. Lancet 1997;349:1213-1217
(3) International Neonatal Network and Scottish Neonatal Consultants’ and Nurses Collaborative Study Group. Risk adjusted and population based studies of the outcome for high risk infants in Scotland and Australia. Arch Dis Child 2000;82:F118-F123
(4) National Co-ordinating Group on Paediatric Intensive Care “Paediatric Intensive Care: A framework for the future.” NHS Executive Leeds 1997
(5) MRC/DoH Working Party on Intensive Care: The research needs and opportunities relevant to the NHS Medical Research Council 1997

The Need for Appropriate Statistical Analysis of Severity Scores 23 October 2001
Murray M Pollack,
Professor of Pediatrics
Children's National Medical Center, Washington, DC,
Kantilal M Patel

mpollack{at}cnmc.org Murray M Pollack, et al.

Dear Editor,

We believe in quantitative, severity and case-mix adjusted pediatric intensive care unit (PICU) evaluations as potential indicators of quality of care. As the director and statistician for Pediatric Intensive Care Unit Evaluations, we have discovered PICUs that have performed poorly as indicated by more deaths than predicted using such methods. When poor performance was discovered, almost all PICUs implemented changes and their results improved, saving many lives each year. These evaluations have the potential to save many lives.

The believability of performance evaluations in individual pediatric ICUs is dependent on the reliability of the method of severity and case-mix adjustment. In a recent issue of Archives of Disease in Childhood, the performance of the Paediatric Index of Mortality (PIM) in the UK was reported [1]. The authors reported the performance of the PIM in two ways: discrimination, using the area under the receiver operating characteristic (ROC) curve, and calibration, using the goodness-of-fit test.

Surprisingly, the authors did not report the statistics associated with the calibration method. Lemeshow et al. pointed out the inadequacy of simply observing the data as was done in this manuscript [2]. Sufficient data were presented to enable us to compute the Hosmer-Lemeshow goodness-of-fit statistics [3]. The overall performance (Table 2) in deciles of risk was highly significantly different than predicted (p <0.001). The authors also reported the calibration of PIM in major diagnostic groups (Table 3); this was also significantly different than expected in the following diagnostic categories: respiratory diseases (p<.001), cardiac diseases (p<.001), neonates (p<.001), accidental trauma (p<.021), neurological diseases (p=.03), and miscellaneous conditions (p<.001). The only diagnostic category in which the results were not significantly different than expected was postoperative conditions.

The intent of this letter is to document the missing vital information in the article. We point this out for two reasons. First and most important, severity and case mix adjustment methods need statistical and face validity if they are to be used and believed. Lives can be saved if the PICU evaluations are accurate, reliable, and believable. The readers will have to judge PIM's performance for themselves because they are the ones potentially using it. But, when poor PICU performance can be attributed to the performance of the method, an opportunity to save lives is lost. Second, scientific veracity requires the appropriate statistics be reported.

Sincerely,

Murray M. Pollack, MD
Kantilal M. Patel, PhD

Children’s National Medical Center, Washington, DC.

References
(1) Pearson GA, Stickley J, Shann F. Calibration of the paediatric index of mortality in UK paediatric intensive care units. Arch Dis Child 2001;84:125-128.
(2) Lemeshow S, Le Gall JR. Modeling the severity of illness of ICU Patients: A systems update. JAMA 1994;272:1049-1055.
(3) Hosmer DW, Lemeshow S. Applied Logistic Regression. John Wiley and Sons, New York. 1989; 141.

Authors reply 19 November 2001
Frank Shann,
Director of Intensive Care
Royal Children's Hospital, Melbourne, Australia

shannf{at}cryptic.rch.unimelb.edu.au Frank Shann

Dear Editor

We agree with Parry and Jones that the quality of life after paediatric intensive care is at least as important as the number of survivors. Indeed, the first study of the quality of life 2-3 years after paediatric intensive care was performed in children in Melbourne (before any paediatric mortality prediction models were available) [1]. Another study of a later cohort is about to be submitted for publication.

Pollack and Patel accuse us of omitting vital information from our report – the Hosmer-Lemeshow p value. In fact, as they concede, the information needed to calculate a Hosmer-Lemeshow p value was included in Table 2 of our paper [2]. The Hosmer-Lemeshow procedure divides the sample into 10 roughly equal groups (deciles) ranging from mildly ill to very ill children, and then compares the actual number of children in each group who died and survived with the number predicted by the model (in this case, PIM) [3]. Inspection of this table (Table 2 in our paper) yields very useful information. If, for example, the model predicts too many deaths in not-so-ill children, and too few in very ill children, the model may be missing important variables or be poorly constructed. Under these circumstances, the standardised mortality ratio (SMR) may be close to 1.00 even though the model is poorly calibrated.

In our study, there were more actual survivors and fewer non-survivors than predicted by PIM in all but one of the deciles of risk in the Hosmer-Lemeshow table. Where there is a consistent trend like this, the standardised mortality ratio (SMR) and its 95% confidence interval provide a reliable guide to the calibration of the model. The SMR is the number of actual deaths divided by the number of deaths predicted by PIM. Our report stated clearly in the abstract, the results, and the discussion that the SMR was 0.87 (95% CI 0.81 to 0.94) – that is, PIM predicted 13% too many deaths, with p <0.05 (because the 95% CI did not include 1.00). We stated that a revised version of PIM will be available soon to correct this problem – similar revisions have been required for PRISM, MPM, APACHE and other mortality prediction models. In fact, PRISM III has a similar problem of poor calibration in PICUs in Australia.
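
The SMR defined above, with an approximate confidence interval, can be sketched as follows. This is an illustration under stated assumptions, not the method used in our paper: the interval treats the observed deaths as a Poisson count and uses a log-scale normal approximation, the function names are hypothetical, and the counts are illustrative values chosen only to give an SMR of 0.87.

```python
import math


def smr_with_ci(observed_deaths, expected_deaths, z=1.96):
    """Return (SMR, lower, upper) using a log-scale Poisson approximation."""
    if observed_deaths <= 0 or expected_deaths <= 0:
        raise ValueError("counts must be positive")
    smr = observed_deaths / expected_deaths
    # Var(log O) is roughly 1/O when O is Poisson distributed.
    factor = math.exp(z / math.sqrt(observed_deaths))
    return smr, smr / factor, smr * factor


def interval_excludes_one(lower, upper):
    """True when the interval lies wholly below or wholly above 1.00."""
    return upper < 1.0 or lower > 1.0


# Illustrative counts only (not taken from the paper).
smr, lower, upper = smr_with_ci(observed_deaths=522, expected_deaths=600.0)
print(f"SMR = {smr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
print("differs from 1.00:", interval_excludes_one(lower, upper))
```

Other interval methods (for example exact Poisson limits) would give slightly different bounds; the point of the sketch is only that an interval excluding 1.00 corresponds to a difference at the 5% level.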

We do not share Pollack and Patel’s view that the Hosmer-Lemeshow p value is “vital information” in these circumstances. First, the fact that the 95% confidence interval of the SMR does not include 1.00 tells us that the number of actual deaths differs from the number predicted by PIM – so the Hosmer-Lemeshow p value will inevitably be significant. Inspection of the Hosmer-Lemeshow table (rather than its p value) tells us HOW it differs. Secondly, the Hosmer-Lemeshow p value is highly unstable when the number of covariate patterns is lower than the number of subjects, as is usually the case with PIM and PRISM data. Bertolini et al performed the Hosmer-Lemeshow test using all possible subject dispositions on data from 1393 ICU patients – they obtained about one million different p values, ranging from 0.01 to 0.95 [4]. Thirdly, the Hosmer-Lemeshow p value does not tell us about the clinical importance of a difference between actual and predicted survivors and non-survivors. A small (clinically unimportant) difference in a large sample, and a large (clinically important) difference in a small sample, can both give the same p value. When the Hosmer-Lemeshow table shows that a model predicts too many (or too few) deaths in both high and low risk patients, the best guide to the clinical importance of the finding is the SMR and its 95% confidence interval.
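
The third point can be illustrated with a small sketch. The Python below is a hypothetical example, not an analysis of our data: it takes an invented two-group table, multiplies every count by 10, and shows that the observed-to-expected ratio is unchanged while the Hosmer-Lemeshow statistic grows in proportion to the sample size. The function names and counts are assumptions made for illustration.

```python
def hl_statistic(group_sizes, observed_deaths, expected_deaths):
    """Hosmer-Lemeshow chi-squared statistic from a grouped table."""
    chi2 = 0.0
    for n, obs_d, exp_d in zip(group_sizes, observed_deaths, expected_deaths):
        chi2 += (obs_d - exp_d) ** 2 / exp_d
        chi2 += ((n - obs_d) - (n - exp_d)) ** 2 / (n - exp_d)
    return chi2


def scale_table(group_sizes, observed_deaths, expected_deaths, factor):
    """Multiply every count by the same factor, keeping all ratios fixed."""
    return (
        [n * factor for n in group_sizes],
        [d * factor for d in observed_deaths],
        [e * factor for e in expected_deaths],
    )


def smr(observed_deaths, expected_deaths):
    """Total observed deaths divided by total predicted deaths."""
    return sum(observed_deaths) / sum(expected_deaths)


# Invented counts: (group sizes, observed deaths, predicted deaths).
small = ([100, 100], [4, 26], [5.0, 30.0])
large = scale_table(*small, factor=10)

print("SMR, small sample:", round(smr(small[1], small[2]), 3))
print("SMR, large sample:", round(smr(large[1], large[2]), 3))
print("HL statistic, small sample:", round(hl_statistic(*small), 3))
print("HL statistic, large sample:", round(hl_statistic(*large), 3))
```

Because the statistic scales with the number of patients while the SMR does not, the same proportional miscalibration can look unremarkable in a small unit and highly significant in a large multi-centre data set.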

The International Committee of Medical Journal Editors now requires authors to use confidence intervals rather than merely state p values [5]. Where inspection of the Hosmer-Lemeshow table shows a uniform difference between observed and predicted outcomes, the SMR and its 95% confidence interval are preferable to the Hosmer-Lemeshow p value.

Pollack and Patel conclude by questioning our “scientific veracity” because we did not publish the Hosmer-Lemeshow p value. They failed to declare their pecuniary interest in this matter. Dr Pollack charges US$9500 for the right to use PRISM III (and software that monitors the quality of paediatric intensive care); PIM is in the public domain and can be used without any payment.

Frank Shann
Royal Children's Hospital, Melbourne

Gale Pearson
Birmingham Children's Hospital

References
(1) Butt W, Shann F, Tibballs J, Williams J, Cuddihy L, Blewett L, Farley M. Long term outcome of children after intensive care. Crit Care Med 1990;18:961-5.
(2) Pearson GA, Stickley J, Shann F. Calibration of the paediatric index of mortality in UK paediatric intensive care units. Arch Dis Child 2001;84:125-128.
(3) Hosmer DW, Lemeshow S. Applied Logistic Regression. 2nd ed. New York: John Wiley & Sons, 2000:147-156.
(4) Bertolini G, D’Amico R, Nardi D, Tinazzi A, Apolone G. One model, several results: the paradox of the Hosmer-Lemeshow goodness-of-fit test for the logistic regression model. J Epidemiol Biostatistics 2000;5:251-253.
(5) International Committee of Medical Journal Editors. Uniform Requirements for Manuscripts Submitted to Biomedical Journals, October 2001. http://www.icmje.org/index.html, accessed 14th November 2001.

 

ADC is co-owned by the RCPCH and is the official journal of the European Academy of Paediatrics
