Aims: To compare, using a decision model, performance, treatment pathways and effects of different newborn screening strategies for developmental hip dysplasia with no screening.
Methods: Detection rate, radiological absence of subluxation at skeletal maturity and avascular necrosis of the femoral head, as favourable and unfavourable treatment outcomes respectively, were compared for the following strategies: clinical screening alone using the Ortolani and Barlow tests; the addition of static and dynamic ultrasound examination of the hips of all infants (universal ultrasound) or restricted to infants with defined risk factors (selective ultrasound); “no screening” (that is, clinical diagnosis only).
Results: Universal or selective ultrasound detects more affected children (76% and 60% respectively) than clinical screening alone (35%), results in a higher proportion of affected children with favourable treatment outcomes (92% and 88% respectively) than clinical screening alone (78%) or no screening (75%), and achieves the highest proportion of these without recourse to surgery (79% and 64% respectively) compared with clinical screening alone (18%). However, ultrasound based strategies are also associated with the highest number of unfavourable treatment outcomes arising in unaffected children treated following a false positive screening result. The detection rate of clinical screening alone becomes similar to that reported for universal ultrasound when based on studies using experienced examiners (80%) rather than junior medical staff (35%).
Conclusion: From the largely observational data available, ultrasound based screening strategies appear to be most sensitive and effective but are associated with the greatest risk of potential adverse iatrogenic effects arising in unaffected children.
- hip dislocation
- mass screening
- decision trees
- outcome and process assessment (health care)
In 1966 a national screening programme was introduced in the United Kingdom to identify newborn infants who are considered to be at increased risk of subsequent developmental hip dysplasia (DDH; formerly referred to as congenital dislocation of the hip).1 Under this programme, all infants are examined using the Ortolani and Barlow tests to identify hip dislocation or instability respectively. Those in whom these signs persist are treated with abduction splinting to reduce and stabilise the hip and prevent established or partial dislocation (subluxation). This policy was last reviewed in 1986, when the recommendation to perform a clinical screening examination of the hips on all newborn and very young infants was reinforced.2 The policy is currently under further review by the National Screening Committee’s Child Health Group.
The goal of this screening programme is to achieve normal hip function and development by skeletal maturity in those children who would otherwise have presented clinically with DDH, while a subsidiary goal is to achieve this outcome without recourse to surgery. At its inception, the rationale for the screening programme was based on the then generally accepted premise that DDH first diagnosed clinically after walking age is likely to require complex surgical treatment and to have a less successful outcome than if diagnosed earlier. However, as the outcome of clinical screening has never been compared to that of clinical diagnosis in a randomised trial, the effectiveness of this programme remains controversial.3–5 Furthermore, its performance cannot be assessed directly, as there is no confirmatory diagnostic test. In clinical practice, screen positive infants who are truly affected cannot be distinguished from those who are not.6 Thus clinical screening is associated with potential over-treatment of those with false positive screening results,7 as well as with failures of screening, diagnosis, and treatment in those who are affected.8
In view of these uncertainties, there has been increasing interest in alternative ultrasound based screening strategies.5 Static ultrasound images are used to assess the morphology of the largely cartilaginous newborn hip joint, specifically the depth of the acetabulum and the location of the femoral head at rest, while dynamic images, obtained during a modified Barlow test, are used to assess hip stability. Although the role of ultrasound imaging is not defined in the current UK policy, in practice both methods have crept into use in the UK, but largely to assess infants with defined risk factors.9 In contrast, in some European countries all infants receive a static ultrasound examination. This has led to a subsequent reduction in the incidence of surgery, but a marked increase in the incidence of abduction splinting to levels some 40–70 times higher than the prevalence of DDH before screening was introduced.10,11 The long term outcome for those who are treated despite not being truly affected is an important consideration since there are significant iatrogenic risks associated with abduction splinting, notably avascular necrosis of the femoral head.12 This may affect normal as well as initially abnormal hips and, in its severest form, results in premature osteoarthritis of the hip.13,14
In recognition of these concerns, the Department of Health initiated research under the auspices of an MRC Working Party to assess the current programme and the potential role of ultrasound.6 This comprised observational epidemiological studies,3,9 a randomised trial of ultrasound in the management of neonatal clinical hip instability,15,16 and, as the most appropriate approach to evaluating primary screening was unclear, an examination of existing evidence, which we report here. The objective of this study is to identify those factors that most influence the relative performance and effects of the different primary screening strategies in order to inform future policy and research priorities. We report the performance, treatment pathways, and effects of clinical and ultrasound based screening strategies and “no screening”, using decision tree models to synthesise data sources relevant to the United Kingdom. The costs and efficiency of these different strategies are reported in an accompanying paper.
Definition of target condition
The term DDH refers to a spectrum of developmental disorders of the hip,17 and includes dislocated or subluxated hips where the femoral head is completely or partially displaced from the acetabulum, or stable dysplastic hips where the femoral head is stable and not displaced but the acetabulum is dysplastic or shallow. It is unclear whether stable dysplastic hips, which may present symptomatically in early adult life, share the same antecedents as dislocated or subluxated hips, are preceded by dysplasia or instability in infancy, or are modifiable by early treatment.4 We evaluated strategies to prevent hip dislocation or subluxation, which usually presents clinically and requires surgical treatment during early childhood.6,18
Characterisation of strategies to be compared
Three screening strategies were identified (table 1). In a “clinical screening alone” strategy, all infants are screened with the Ortolani and Barlow tests whereby the examiner attempts to reduce a dislocated hip and provoke dislocation or subluxation respectively. Infants in whom one or both hips are dislocated, subluxated, or unstable are referred for further clinical, but not sonographic, assessment. In a universal ultrasound strategy, all infants receive a static and dynamic ultrasound examination in addition to clinical screening; those with sonographic appearances of dislocation or instability and/or a positive Ortolani or Barlow test are referred for further clinical and sonographic assessment.19 In a selective ultrasound strategy, all infants are screened clinically and assessed for the presence of recognised risk factors: those with a positive Ortolani and/or Barlow test and/or recognised risk factors are referred for sonographic assessment.20 Thus, in this strategy ultrasound is not used as a primary screening test. As screening was introduced without clear evidence of benefit, we included a “no screening” strategy, whereby infants are diagnosed only following presentation with clinical signs or symptoms.
In each screening strategy, infants who on referral have persistent clinical or sonographic dislocation, subluxation, or instability are treated with abduction splinting. Infants treated with abduction splinting include those who would develop DDH (true positives, treated early), as well as those who would not (false positives, treated). Infants with a positive screening result but in whom persistent abnormalities are not confirmed at follow up include those who may present clinically at a later stage, when abduction splinting is no longer possible and surgery is required. They are also true positives, but do not benefit from early treatment because of a failure of diagnosis (true positives, treated late).8 Infants with a negative screening result may present later with clinical signs or symptoms (false negatives) or may remain clinically well (true negatives). In the “no screening” option, infants can only present symptomatically, and surgery, but not abduction splinting, is the treatment option.
Development of a decision model
A decision tree was developed to depict the sequence of events experienced by 100 000 liveborn infants along the screening, management, and treatment pathways for each of the screening strategies (fig 1). A “no screening” strategy was included to allow calculation of the incremental effects of screening. Probabilities in the model assumed to vary by screening strategy include: being screened (A), a positive screening result (B), treatment with abduction splinting following a positive screening result (C), treatment following a negative screening result (E), and treatment following a positive screening result that is not confirmed (F). We assumed that surgical treatment would not be required in those infants with false positive screening results. Otherwise the probability of surgical treatment following abduction splinting (G) was assumed constant for all screening strategies. The probability of DDH among those infants who are not screened (D) was assumed to be equivalent to the prevalence of DDH (see below). These probabilities, as indicated by these letters, are shown in fig 1.
Favourable treatment outcomes
We defined a favourable treatment outcome as the radiological absence of hip dislocation or subluxation at skeletal maturity. Hip pain and range of movement during childhood are a poor guide to normal hip development and function in adult life,21 while radiological appearances by skeletal maturity are thought to predict symptomatic osteoarthritis and need for hip replacement in early adult life. The probability of a favourable treatment outcome was assessed from reports of radiological appearances of the hips at 16–24 years of age, using the system devised by Severin.22 Severin hip scores of 4–6 imply hip dislocation or subluxation associated with increasing degrees of joint deformity, while a score of 3 describes dysplastic (shallow) hips without evidence of displacement. In the base case analysis, we defined a favourable outcome as Severin hip scores of 1–3 in both hips by 16 years. The probabilities of a favourable outcome differ according to the treatment given and were only assigned following surgical or abduction splinting treatment of affected children—that is, excluding abduction splinting treatment in those with false positive screening results. We assumed that they were similar in all strategies. The effect of omitting hip dysplasia without displacement (Severin hip score 3) from the favourable outcomes was explored in a sensitivity analysis.
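The outcome definition above can be written as a simple rule. The following is a minimal sketch only: the function name and the flag `count_dysplasia_as_favourable` are illustrative and do not come from the model itself. It takes the Severin score of each hip (1 to 6) and applies either the base case threshold (scores 1 to 3 favourable) or the sensitivity analysis threshold (scores 1 to 2 favourable).

```python
def favourable_outcome(severin_left, severin_right,
                       count_dysplasia_as_favourable=True):
    """Classify a child from the Severin score (1-6) of each hip.

    Base case: scores 1-3 in BOTH hips count as favourable.
    Sensitivity analysis: Severin 3 (dysplasia without displacement)
    is omitted, so only scores 1-2 in both hips count.
    """
    for score in (severin_left, severin_right):
        if score not in range(1, 7):
            raise ValueError("Severin scores run from 1 to 6")
    threshold = 3 if count_dysplasia_as_favourable else 2
    # The worse (higher) of the two hips decides the child's outcome.
    return max(severin_left, severin_right) <= threshold
```

Under this rule a child with scores of 1 and 3 is favourable in the base case but not in the sensitivity analysis, which is why omitting Severin 3 can only lower the proportion of favourable outcomes.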
Unfavourable treatment outcome
We selected avascular necrosis following surgical treatment or abduction splinting as the principal unfavourable treatment outcome. This was derived from published reports of radiological appearances consistent with systems devised by Kalamchi and MacEwen23 or Salter and colleagues,24 and occurring in one or both hips when assessed at least two years following surgical treatment or abduction splinting.
Estimation of probabilities to populate the decision model
Probabilities were obtained from published and unpublished data sources relevant to the United Kingdom, identified through a computerised search of Medline (1966 to August 2001) and Embase (1974 to August 2001) using search strategies modified from those developed for systematic reviews,25 by scanning the reference lists of recent published systematic reviews,26–29 and by contacting experts for unpublished data. With the exception of four randomised trials,20,30–32 the studies reviewed were observational studies. Probability data were summarised adjusting as appropriate to express data using children rather than hips as the denominator (see tables A and B, available on the ADC website; www.archdischild.com/supplemental). Sensitivity analyses were performed using extreme but plausible values as listed in web tables A and B, based when possible on ranges reported in the literature. Details of the sensitivity analyses are presented below and in an accompanying paper33 for those parameters that were most influential or where there were uncertainties that could potentially be addressed by changes in policy.
The probability of abduction splinting in a universal ultrasound strategy was based on data published from the single UK centre operating such a policy19,34 to take account of management practices and treatment thresholds likely to operate in the UK. Estimates of abduction splinting rates derived from other European programmes or from studies incorporating universal ultrasound imaging were examined in a sensitivity analysis.
Estimation of screening programme performance
In clinical practice, children with true positive and false positive screening results cannot be distinguished; screening programme performance cannot therefore be calculated directly. We enumerated the true positive screening results from the decision model by adapting a method devised to estimate the sensitivity and specificity of a screening test where a diagnostic test is lacking.28,35 We assumed that all those with DDH were treated with either surgery or abduction splinting, that all those requiring surgical treatment have DDH, and that the underlying prevalence of DDH has not changed since screening was introduced and is equivalent to the mid-point prevalence estimate for Northern European populations of 120 per 100 000 live births, derived from studies reported before screening was introduced.36 We derived estimates of the rates of surgical treatment with or without prior abduction splinting associated with each screening strategy from the literature. Together with the prevalence estimate above, this allowed calculation of the number treated with abduction splinting who were true and false positives.
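The arithmetic implied by these assumptions can be sketched as follows. This is an illustration, not the model's implementation: the function name is hypothetical, and the example input of 421 children splinted without surgery is back-calculated for illustration from figures reported in the results for clinical screening alone (102 requiring surgery, 403 false positives treated). If all affected children are treated and all surgical cases are affected, the affected children treated by splinting alone are the prevalence minus the surgical cases, and the remainder of those splinted are false positives.

```python
def split_splinted(prevalence, surgery, splinted_alone):
    """Split children splinted without surgery into affected and unaffected.

    All arguments are counts per 100 000 live births.
    Assumes (as stated in the text) that every affected child is treated
    by surgery or splinting, and that every surgical case is affected.
    """
    affected_splinted_alone = prevalence - surgery
    false_positives = splinted_alone - affected_splinted_alone
    if affected_splinted_alone < 0 or false_positives < 0:
        raise ValueError("inputs are inconsistent with the stated assumptions")
    return affected_splinted_alone, false_positives


# Illustrative call: prevalence 120 and surgery 102 are taken from the text;
# 421 is a back-calculated, hypothetical input.
affected, unaffected = split_splinted(prevalence=120, surgery=102,
                                      splinted_alone=421)
```

With these inputs, 18 of the children splinted without surgery are counted as affected and 403 as false positives.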
Modelling options for implementation
The expertise of the primary screener has been identified as an important factor in the effectiveness of the current policy.4,37–40 In the base case, we derived the false negative rate (E in fig 1) for clinical screening alone from reports from UK centres where junior medical staff are responsible for carrying out the Ortolani and Barlow tests (web table A). We investigated the potential impact of using more experienced examiners by deriving a false negative rate from reports where more experienced or specifically trained staff (physiotherapists or orthopaedic specialists) were responsible for clinical screening (web table A). We also investigated the potential impact of using ultrasound to inform the subsequent management of infants with positive Ortolani or Barlow tests, as assessed in the UK Hip Trial,15 by assuming that this avoided failures of diagnosis among those screening positive.8 Finally we investigated a combination of these two modifications to current policy.
Modelling uncertainties relevant to unfavourable outcomes of screening
Uncertainties in specifying the risk factors to be used as indications for ultrasound examination in a selective ultrasound strategy9 were investigated by varying the proportion of children with a positive screening result (B in fig 1). Similarly, uncertainties in ultrasound indications for abduction splinting were investigated by varying the proportion of children treated with abduction splinting in the ultrasound based strategies (C in fig 1).
The performance of each of the different screening strategies,41 estimated from the decision model, is summarised for 100 000 births in table 2. The percentage of infants with positive screening results is highest for strategies based on ultrasound, being 7.7% and 8.1% for universal and selective use of ultrasound respectively, compared with 2.1% for clinical screening alone. Ultrasound strategies are also associated with a higher estimated detection rate: 76% and 60% for universal and selective use of ultrasound respectively, compared with 35% for clinical screening alone. More false positive screening results occur in strategies using ultrasound (table 2). Thus the odds of being affected given a positive result are most favourable in the clinical examination strategy as a consequence of a higher specificity.
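The relation between these figures can be checked with a short calculation. The sketch below is an approximation under stated assumptions rather than a reproduction of table 2: it assumes every infant is screened, uses the prevalence of 120 per 100 000 given in the methods, and derives the predictive value and specificity from the screen positive percentages and detection rates quoted above. All names are illustrative.

```python
BIRTHS = 100_000
PREVALENCE = 120  # affected children per 100 000 live births

# (proportion screen positive, detection rate), as quoted in the text
STRATEGIES = {
    "clinical screening alone": (0.021, 0.35),
    "universal ultrasound": (0.077, 0.76),
    "selective ultrasound": (0.081, 0.60),
}


def performance(screen_positive_rate, detection_rate):
    """Return (positive predictive value, specificity) for one strategy."""
    positives = screen_positive_rate * BIRTHS   # assumes all infants screened
    true_positives = detection_rate * PREVALENCE
    false_positives = positives - true_positives
    ppv = true_positives / positives
    specificity = 1 - false_positives / (BIRTHS - PREVALENCE)
    return ppv, specificity


results = {name: performance(*rates) for name, rates in STRATEGIES.items()}
for name, (ppv, spec) in results.items():
    print(f"{name}: PPV {ppv:.1%}, specificity {spec:.1%}")
```

On these inputs the predictive value of a positive result is about 2% for clinical screening alone and about 1% for either ultrasound strategy, consistent with the higher specificity of clinical screening. Note that these counts refer to all screen positive infants, not only to those who go on to be treated.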
Without screening, all affected children require surgical treatment. This percentage is greatly reduced with screening strategies using ultrasound (table 3). This occurs to a much lesser extent in clinical screening alone, as in this screening strategy there are more children with false negative screening results or with true positive screening results that are unconfirmed.8 By contrast, ultrasound based strategies are associated with higher abduction splinting rates. In the baseline model, the predicted number treated with abduction splinting alone is higher for selective use of ultrasound than for universal ultrasound (table 3), reflecting the more conservative estimate of treatment probabilities used in the model for universal ultrasound.19,34 It is notable that, in all screening strategies, the majority of children treated with abduction splinting alone have a false positive screening result (403, 426, 638 respectively for clinical screening alone, universal, and selective use of ultrasound).
Our findings suggest that the effectiveness of clinical screening, as performed in the UK, is only marginally better than no screening. Without screening, 90 (75%) of the 120 cases of DDH anticipated among 100 000 live births would be expected to have a favourable treatment outcome, compared with 94 (78%) in a clinical screening alone strategy (table 3). This compares with equivalent figures of 110 (92%) and 106 (88%) in universal and selective use of ultrasound respectively. Furthermore, a higher percentage of these favourable outcomes are achieved without recourse to surgery in ultrasound based screening strategies: 79% with universal ultrasound, 64% with selective ultrasound, compared with 18% with clinical screening alone.
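These figures can be reproduced with simple arithmetic. The sketch below uses only the counts and percentages quoted in this paragraph; the variable names are illustrative, and the rounded counts of outcomes achieved without surgery are derived here rather than taken from table 3.

```python
PREVALENCE = 120  # affected children per 100 000 live births

# Favourable treatment outcomes per 100 000 births, as quoted in the text
favourable = {
    "no screening": 90,
    "clinical screening alone": 94,
    "universal ultrasound": 110,
    "selective ultrasound": 106,
}

# Share of favourable outcomes achieved without surgery, as quoted
share_without_surgery = {
    "clinical screening alone": 0.18,
    "universal ultrasound": 0.79,
    "selective ultrasound": 0.64,
}

summary = {}
for name, count in favourable.items():
    percent = round(100 * count / PREVALENCE)
    extra = count - favourable["no screening"]  # gain over no screening
    non_surgical = round(count * share_without_surgery.get(name, 0.0))
    summary[name] = (percent, extra, non_surgical)
    print(f"{name}: {percent}% favourable, {extra} more than no screening, "
          f"about {non_surgical} without surgery")
```

On this reading, clinical screening alone adds 4 favourable outcomes per 100 000 births over no screening, compared with 20 for universal and 16 for selective ultrasound.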
Ultrasound based strategies are associated with the fewest cases of avascular necrosis of the femoral head overall, reflecting the fact that in these strategies fewer affected children require surgery (table 3). Although the number of cases of avascular necrosis which arise in unaffected children treated with abduction splinting as a result of a false positive screening result is relatively similar across the screening strategies, this represents a higher percentage of all avascular necrosis cases in the ultrasound based strategies: 44% in selective ultrasound, 43% in universal ultrasound, and 20% in clinical screening alone.
Uncertainties in estimates of test performance, rates of abduction splinting or surgery, or treatment effectiveness were explored in sensitivity analyses. Omitting hip dysplasia without displacement (Severin 3) from the favourable outcomes (web table B) results in an absolute reduction of 20–25% in the proportion of affected children with a favourable treatment outcome in all strategies. Other relevant sensitivity analyses are reported below.
Modelling options for implementation
The predicted detection rate of 35% for clinical screening alone in the baseline model rises to 80%, comparable to that reported for universal ultrasound, when the false negative rate is derived from centres using more experienced or dedicated screening examiners rather than junior medical staff. As a consequence, the number of children requiring surgery falls from 102 to 50 per 100 000, the percentage of affected children with a favourable outcome rises from 78% to 88%, and the percentage of these outcomes achieved without recourse to surgery rises from 18% to 65%.
In the baseline analysis for clinical screening alone, failure to confirm DDH in affected children who have screened positive (referred to as failures of diagnosis8) may result in nearly 60% of infants correctly identified through screening receiving late (that is, surgical) treatment. If such cases were avoided by using ultrasound to manage infants with a positive clinical screening result,15,20 then the absolute number of favourable treatment outcomes rises slightly from 94 to 99, but the proportion of these achieved without recourse to surgery more than doubles, rising from 18% to 37%. If both these modifications are considered together, then clinical screening alone becomes the most effective option, with 111 favourable treatment outcomes, and 84% achieved without surgery.
Uncertainties relevant to unfavourable outcomes of screening
Uncertainty in specifying the risk factors to be used as indications for ultrasound examination in selective use of ultrasound was investigated by increasing the proportion of infants with a positive screening result from the value of 8% used in the base case to 13% as reported from some UK centres operating such a policy.42 In this scenario, were the probability of abduction splinting to be unchanged, the number of infants with false positive diagnoses who are treated unnecessarily would rise from 638 to 1058.
Uncertainties regarding the ultrasound indications for abduction splinting were investigated by increasing the abduction splinting rate from 518 per 100 000 as reported from the only UK based universal ultrasound programme19 to reflect the experience of non-UK centres10,11,31 where, on average, 4400 per 100 000 are splinted (web table A). Given this scenario, the number of children treated unnecessarily as a consequence of a false positive screening result rises by a factor of 10, from 427 to 4309 per 100 000. Similarly, in selective use of ultrasound, increasing the abduction splinting rate from the baseline of 709 to 1417 per 100 000 as reported from some UK centres, results in a doubling of those with a false positive screening result from 637 to 1346 per 100 000.
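The scale of this effect follows from simple subtraction. The sketch below is a simplified reading of the model, not its implementation: it assumes that the number of affected children detected stays fixed at the detection rate multiplied by a prevalence of 120 per 100 000, that all of them are splinted, and that any additional splinting therefore falls entirely on unaffected children. The function name is illustrative.

```python
PREVALENCE = 120  # affected children per 100 000 live births


def false_positives_treated(splinted, detection_rate):
    """Unaffected children splinted, per 100 000 births.

    Assumes every detected affected child is splinted and that the number
    detected does not change when the splinting rate changes.
    """
    return splinted - detection_rate * PREVALENCE


universal_base = round(false_positives_treated(518, 0.76))
universal_high = round(false_positives_treated(4400, 0.76))
selective_base = round(false_positives_treated(709, 0.60))
selective_high = round(false_positives_treated(1417, 0.60))
```

For universal ultrasound this gives 427 and 4309, matching the figures above. For selective ultrasound it gives 637 and 1345, within one case of the reported values, a difference that may reflect rounding in the published figures.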
The dilemmas arising from the lack of robust evidence to inform screening policies for DDH are well rehearsed.5,29 While screening policies should ideally be based on evidence from randomised trials, in practice this option is often constrained by considerations of cost, duration, and uncertainties in specifying which options and outcomes to compare. Royston has highlighted the complementarity of decision models and trials in appraising screening programmes.43 Our objective in using a decision model to compare policy options based on data relevant to a UK setting was to assess the extent to which existing data can inform these policy decisions without recourse to further primary research and to identify areas in which future empirical research might be most useful for policy.
Although there have been other published evaluations of clinical and ultrasound based screening for DDH,26,29,34,44–46 our approach has two important strengths which have not been addressed previously. Firstly, we have enumerated true and false positive screening results, allowing the performance and potential harms of the different screening strategies to be compared (otherwise only possible in trials comparing screening with no screening). Secondly, we have compared strategies in relation to longer term health outcomes relevant to the goals of screening. This has allowed quantification of the benefits and harms of each screening strategy at a population level and identification of factors with most influence on performance and effects.
One limitation relates to the quality of the literature from which we derived probability estimates.26,29 These were based almost entirely on observational data. Of four randomised or quasi-randomised controlled trials,10,20,30,32 only two provided information of potential relevance to this study.10,20 The lack of randomised evaluations of the effectiveness of abduction splinting is of particular concern, as infants with false positive screening results cannot be identified clinically. We found relatively few observational studies reporting long term outcomes of abduction splinting or surgery relevant to a UK setting, despite more than 30 years experience of clinical screening in the UK.1,29 While recognising that surgical treatment or false negative rates are not reported consistently in the existing literature,29,47 and that outcome data are from selected case series, this model has allowed the available evidence to be examined and subjected to sensitivity analyses.
Of the screening strategies considered, universal ultrasound appears to be associated with the highest number of favourable outcomes as well as the highest proportion of these achieved without requiring surgery. However, it is also associated with the highest risk of potential iatrogenic adverse effects among those with a false positive screening result. The performance of clinical screening alone is poor and only marginally better than no screening when based on the mean false negative rate derived from UK centres where junior medical staff undertake screening.9 This is no longer the case when that estimate is based on the performance of more experienced examiners (physiotherapists or orthopaedic surgeons). Furthermore, these findings concur with previous observations that experience of the screening examiners accounts for differences in the performance of programmes based on clinical screening alone.3,4,26,29,40,48 This finding highlights the importance of strategies to improve training in clinical screening, training which is recognised to be patchy and inconsistent in content in the UK.4,9 A recent prospective study has shown that advanced neonatal nurse practitioners given a structured training in clinical examination are more successful in identifying persistent neonatal hip abnormalities than junior doctors not given such a training.49
There is an almost threefold variation in the percentage of infants referred for ultrasound between UK centres operating selective ultrasound programmes. This reflects uncertainty in the choice of “risk” factors9,29 and highlights the problems of using risk factors with low predictive value as screening tests50 as well as the difficulties in operationalising this screening strategy. Breech presentation at delivery or in the third trimester, female sex, and a family history of DDH are all strongly associated with an increased risk of DDH in the infant.26,51 However, in UK practice, postural foot deformities, oligohydramnios, and clicking hips, which are less strongly associated with DDH, are often included in the definition of risk.9,41,52 In our decision model, we derived estimates of the percentage identified with risk factors and the associated false negative rates from published reports of such programmes, but were not able, from the data available, to estimate the predictive value of individual risk factors in the absence of clinical hip instability. However, the UK Hip Trial findings have shown that the performance of clinical screening can be augmented by using ultrasound to inform the management of infants with clinically unstable hips through a reduction in the risk of unnecessary treatment.15 The role of ultrasound and the value of treatment in those infants without unstable hips but with other risk factors is less clear.32,53
The effects of both ultrasound based strategies are also influenced by rates of treatment with abduction splinting, and this is most marked when these are derived from rates reported from universal ultrasound programmes in other European countries. At these levels, there are important negative consequences for the population screened, as the subsequent increase in unnecessary treatment of infants with false positive diagnoses is likely to result in an increase in those with avascular necrosis. We have not enumerated other risks reported to be associated with abduction splinting, but these are not trivial and include femoral nerve palsies and pressure sores, as well as parental anxiety.16
In conclusion, the decision model presented enumerates the benefits and harms of different screening strategies for DDH, a necessary process in the explicit appraisal of policy options. While ultrasound based strategies may appear to be more effective than clinical screening or no screening, significant uncertainties remain. These include uncertainties in the indications for ultrasound in a selective ultrasound strategy, as well as in the ultrasound indications for treatment with abduction splinting. Our findings also suggest that clinical screening, as currently performed in the UK, is of marginal benefit relative to no screening but could be improved by use of more expert primary screening examiners who have been specifically trained to screen and by using ultrasound to assess infants with positive screening results. This is consistent with the experience of those implementing a recent quality improvement initiative in Northern Ireland which focused on staff training and careful assessment of high risk infants.54
Further research is required to assess the effectiveness of abduction splinting, particularly in those with stable hips but with ultrasound appearances of dysplasia and/or recognised risk factors.32 While structured training for advanced neonatal nurse practitioners appears promising,49 further work is needed to develop training and methods of assessing performance in clinical screening. Equally there is a need to define methods, standards, training, and accreditation in ultrasound imaging of the infant hip, particularly for dynamic imaging. Finally, prospective assessment of the longer term outcomes of surgical and abduction splinting treatment is required.
In policy terms, decisions to stop or modify established screening programmes introduced without prior evaluation require evidence which is often by definition lacking. This model has explored a range of policy and implementation options in a UK setting which decision makers might like to consider. However, the costs and efficiency of these options need further evaluation to provide a basis for informed policy discussion. These are assessed in the accompanying paper.33
We would like to thank the following for their expert advice and access to data and information that have contributed to this review: Sue Banton, Martin Becker, Mike Benson, Sheila Bird, Patrick Cartlidge, Nick Clarke, John Clegg, David Conlon, Rosemary Dove, Diana Elbourne, Charis Glazener, Alastair Gray, Sara Godward, Marion Hall, Tina Higgins, Edmund Hey, Carol Lefebvre, and members of the British Society of Paediatric Radiology.
This work was supported by the Medical Research Council (UK). Work carried out at the Institute of Child Health and GOS NHS Trust benefits from NHS R&D research funding. The University of Bristol is the lead centre for the MRC Health Services Research Collaboration.
CD is a member of the Child Health Group of the UK National Screening Committee.
Probabilities used in the model
Table A Screening and treatment probabilities by screening strategy
Table B Favourable and unfavourable treatment outcomes
The Tables are available as a downloadable PDF (printer friendly file).