Commentary on the paper by van Buuren et al
Measurement of height is an important component of child health care and has been widely incorporated into paediatric practice. Yet little is known about how it performs in terms of sensitivity and specificity for detecting growth disorders. This lack of information impacts on health care in a number of ways. First, it is difficult to inform public health policy via recommendations for height monitoring, which has resulted in a plethora of statements made about referral for height assessment. One consequence of this has been to opt for a minimum standard for practice as exemplified in Health for all children.1 Second, the lack of information on test performance in the early steps of the short stature evaluation decision tree makes it difficult to interpret subsequent tests and ultimately the likelihood of the presence or absence of disease.2
The Dutch study reported by van Buuren and colleagues in this issue3 addresses for the first time these issues of test performance by quantifying the role of height monitoring in the identification of girls with Turner’s syndrome (TS). TS is the ideal condition to use to show the methodology as it fulfils several important screening criteria—it is common (1 in 2500 live female births), a confirmatory test is available with high sensitivity and specificity (karyotype), and early intervention can appreciably influence outcome (growth, osteoporosis, and management of ovarian dysfunction). One of the problems with TS is that universal karyotype screening is unfeasibly expensive—a pre-karyotype assessment is required. The clinical manifestations of TS are variable whereas the short stature, particularly with respect to parental height, is not, so height monitoring clearly should play an important role.
The Dutch group treats height monitoring as a diagnostic test using two distinct populations, TS girls (cases) and normal girls (controls), which together provide estimates of sensitivity, specificity, and median referral age for a series of distinct screening rules for referral for height assessment. The three basic rules they consider, which are based on Dutch guidelines, are: (1) height standard deviation score (SDS) below a given cut-off; (2) height SDS below a given cut-off based on target height; and (3) height SDS velocity below a given cut-off. The performance of these rules, both separately and in combination, is assessed for a series of distinct cut-offs and age(s) when they apply, and the best performing rules identified.
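The three basic rules, and the way a case population and a control population yield sensitivity and specificity, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `Measurement`, `absolute_rule`, `parental_rule`, and `velocity_rule` are hypothetical, the default cut-offs are placeholders, and reading rule (2) as "height SDS minus target height SDS" is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Measurement:
    """One height measurement (hypothetical structure for illustration)."""
    age_years: float
    height_sds: float


def absolute_rule(m: Measurement, cutoff: float = -3.0) -> bool:
    """Rule 1: refer if height SDS is below the cut-off (placeholder value)."""
    return m.height_sds < cutoff


def parental_rule(m: Measurement, target_height_sds: float,
                  cutoff: float = -2.0) -> bool:
    """Rule 2: refer if height SDS, relative to target height SDS, is below
    the cut-off. The subtraction is an assumed form of the adjustment."""
    return (m.height_sds - target_height_sds) < cutoff


def velocity_rule(earlier: Measurement, later: Measurement,
                  cutoff: float = -0.25) -> bool:
    """Rule 3: refer if the change in height SDS per year is below the
    cut-off (placeholder value)."""
    interval = later.age_years - earlier.age_years
    if interval <= 0:
        raise ValueError("measurements must be in increasing age order")
    return (later.height_sds - earlier.height_sds) / interval < cutoff


def sensitivity_specificity(case_flags: Sequence[bool],
                            control_flags: Sequence[bool]) -> Tuple[float, float]:
    """Sensitivity = share of cases referred;
    specificity = share of controls not referred."""
    sensitivity = sum(case_flags) / len(case_flags)
    specificity = 1 - sum(control_flags) / len(control_flags)
    return sensitivity, specificity
```

Applying one rule to every girl in each population gives the two lists of flags, so different rules and cut-offs can be compared on the same data.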
What should we look for in a screening rule? It needs a high sensitivity, so it identifies most girls with TS, but more importantly it must have a very low false positive rate (that is, very high specificity). A false positive rate exceeding say 1% (specificity <99%) would have serious implications for the workload of specialist growth clinics. With this in mind the British 1990 height reference chart4 includes a 0.4th centile curve which predicts a false positive rate of only about 0.4%,5 corresponding to an absolute height SDS rule with a cut-off of −2.67.
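The link between an SDS cut-off and its theoretical false positive rate can be checked with a short calculation. The sketch below assumes that height SDS in the reference population follows a standard normal distribution; the function names are hypothetical.

```python
from statistics import NormalDist

# Assumption: height SDS in the reference population is standard normal.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def false_positive_rate(cutoff_sds: float) -> float:
    """Expected share of normal children falling below the SDS cut-off."""
    return STANDARD_NORMAL.cdf(cutoff_sds)


def cutoff_for_fpr(fpr: float) -> float:
    """SDS cut-off that yields the requested theoretical false positive rate."""
    return STANDARD_NORMAL.inv_cdf(fpr)


if __name__ == "__main__":
    for cutoff in (-2.0, -2.67, -3.0):
        fpr = false_positive_rate(cutoff)
        print(f"cut-off {cutoff:+.2f}: false positive rate {fpr:.2%}, "
              f"specificity {1 - fpr:.2%}")
```

Under this assumption a cut-off of −2.67 gives a false positive rate of roughly 0.4%, in line with the 0.4th centile.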
Against this benchmark the results of the Dutch study are enlightening. The absolute height rule performs relatively poorly, with a specificity of only 98.1% (sensitivity 41%) with a cut-off of −3.5 up to age 3 and −3.0 afterwards. This is appreciably worse than the 99.9% predicted theoretically, and the reason why it performs so poorly is not obvious. It may be because the Dutch height reference does not adjust birth length for gestation.
The parentally adjusted rule has a much higher specificity, up to 99.4% or better, and its sensitivity is also higher, near 70%. The deflection (velocity) rule gives specificities close to 100% but sensitivities below 60%. The authors propose a combined rule involving these two components with specificity 99.4% and sensitivity 79%.
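To see what these figures mean for clinic workload, the expected numbers of true and false positive referrals can be worked out for a notional population. The sketch below is illustrative only: the population of 100,000 girls is hypothetical, the prevalence of 1 in 2500 is taken from the birth prevalence quoted earlier, and it assumes the case-control estimates of sensitivity and specificity carry over unchanged to the general population.

```python
from typing import Tuple


def expected_referrals(n_girls: float, prevalence: float,
                       sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (true positives, false positives) expected when a screening
    rule is applied to n_girls, given prevalence and test performance."""
    cases = n_girls * prevalence
    non_cases = n_girls - cases
    true_positives = cases * sensitivity
    false_positives = non_cases * (1 - specificity)
    return true_positives, false_positives


if __name__ == "__main__":
    # Combined rule: sensitivity 79%, specificity 99.4% (figures from the text).
    # Population size is a hypothetical round number.
    tp, fp = expected_referrals(100_000, 1 / 2500, 0.79, 0.994)
    print(f"true positives:  {tp:.1f}")
    print(f"false positives: {fp:.1f}")
    print(f"referrals per TS girl detected: {(tp + fp) / tp:.1f}")
```

Because TS is rare, even a false positive rate well under 1% produces many more false than true referrals, which is why specificity dominates the choice of rule.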
Two strengths of the approach are the ability to compare the performance of different screening rules, and the use of pre-existing data. This means that large prospective studies are not required, and that screening rules can be developed for any growth disorder where suitable data exist. For the purists, one slight disadvantage is that the estimates of sensitivity and specificity are potentially biased. This is because the TS population is itself biased, consisting of girls who have had to draw attention to themselves to be identified. We do not know what proportion of TS patients were missed in assembling the TS cohort. If the factor identifying TS girls was short stature, this would tend to inflate the apparent test performance. Also, datasets drawn over a long period of time may tend to incorporate the more severely affected in the earlier years. As a result the sensitivity and specificity results need to be interpreted with caution.
The clinical significance of the findings is intriguing. First, the current UK view is that height velocity does not contribute usefully to growth monitoring,1,6 yet one of the proposed screening rules includes height velocity. Second, the findings confirm the value of parental height adjustment. So how should these results affect the UK recommendations for height assessment? Measuring height velocity involves two sets of costs: the resource cost of having to collect the longitudinal height data, and the delay cost of potential cases having to wait an extra year or more before being diagnosed, rather than relying on their height at presentation. So does the benefit of including height velocity justify the cost? In our view the answer is no. Adding velocity to the parentally adjusted rule with cut-off −2 increases the sensitivity by just 3% for the same specificity. A better approach would be to focus on the parental height rule, which can in theory be improved using formal regression methods, that is, height adjusted for familial height.7,8 The authors’ methodology could quantify the benefit of this approach.
From the epidemiologist’s standpoint these results are valuable in showing how to study the performance of growth assessment techniques “in the field”. The ideal approach would be to compare height measurement performance with a karyotype assessment in all girls born in the UK, but such a study would be very expensive—only 120–150 TS girls are born each year, so the study would need to last several years. However, if one accepts that the sensitivity and specificity may be different in the “field”, then at least the proposed approach allows for a more precise estimate of the role of height monitoring in the population, and provides a methodology which could be applied to other areas of interest such as growth hormone deficiency. The approach and the information it provides are to be welcomed and should now be used to inform height monitoring practice.