Evidence or enthusiasm? Why yields from UK newborn screening programmes for congenital hypothyroidism are increasing
Rodney J Pollitt

Correspondence to Professor Rodney J Pollitt, Clinical Chemistry and Newborn Screening, The Children's Hospital, Western Bank, Sheffield S10 2TH, UK; rodney.pollitt{at}


Newborn screening for congenital hypothyroidism (CHT) is generally regarded as a highly successful public health measure. It was officially added to the UK newborn screening programme for phenylketonuria in 1981,1 though many areas had already started. A two-tier protocol based on assay of thyrotropin (thyroid-stimulating hormone, TSH) was recommended, with a second blood-spot sample taken at 2–6 weeks of age from babies with borderline initial results. As found elsewhere in Europe, the incidence of screening-detected cases in the early years was approximately double that of the clinically diagnosed disorder (table 1).

Table 1

Incidences of clinically diagnosed and screening presumptive-positive cases of congenital hypothyroidism

Presumptive-positive rates (cases referred for clinical evaluation) have increased further over time, partly because many laboratories lowered their primary cut-offs as more sensitive TSH assays became available.6–9 However, other changes have contributed, as the Sheffield newborn screening laboratory has experienced increased presumptive-positive rates even though its primary cut-off has remained unchanged.

Screening performance

The Sheffield region laboratory serves the East Midlands and South Yorkshire and has screened between 55 000 and 75 000 babies a year. The screening protocol (figure 1) follows the outline scheme recommended in 1981.1 Between 1980 and 1985 (inclusive), the Pharmacia Phadebas Dry-Spot TSH assay was used, with 10 mU/L primary and second sample cut-offs and 40 mU/L as the secondary cut-off leading to immediate clinical referral. Incidence and sex ratios were broadly similar to those recorded for 1982–1984 by the UK Medical Research Council (MRC) Register5 and from Wales10 and Scotland11 over longer periods.

Figure 1

Outline of the laboratory protocol for screening for congenital hypothyroidism. Usually the primary cut-off value is used also for the second-sample cut-off. TSH, thyroid-stimulating hormone.
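The two-tier decision logic outlined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the laboratory's actual software: the function and enum names are hypothetical, the default cut-offs are the 1980–1985 Sheffield values quoted in the text, and treating a result exactly equal to a cut-off as raised is an assumption, since the text does not say how boundary values were handled.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    NORMAL = "normal"
    REPEAT_SAMPLE = "borderline: second blood-spot sample requested"
    REFER = "presumptive positive: refer for clinical evaluation"


def classify(initial_tsh: float,
             second_tsh: Optional[float] = None,
             primary_cutoff: float = 10.0,
             secondary_cutoff: float = 40.0,
             second_sample_cutoff: Optional[float] = None) -> Outcome:
    """Classify a baby from blood-spot TSH values (mU/L whole blood)."""
    # Usually the primary cut-off is also used for the second sample.
    if second_sample_cutoff is None:
        second_sample_cutoff = primary_cutoff
    # Assumption: a value equal to a cut-off counts as raised (>=).
    if initial_tsh >= secondary_cutoff:
        return Outcome.REFER              # immediate clinical referral
    if initial_tsh < primary_cutoff:
        return Outcome.NORMAL
    # Borderline initial result: the decision rests on the second sample.
    if second_tsh is None:
        return Outcome.REPEAT_SAMPLE
    if second_tsh >= second_sample_cutoff:
        return Outcome.REFER
    return Outcome.NORMAL
```

For example, `classify(25.0)` asks for a repeat sample with the 40 mU/L secondary cut-off, whereas `classify(25.0, secondary_cutoff=20.0)` refers the baby immediately, which is the kind of shift a lower secondary cut-off produces.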

From August 2001 onwards, Sheffield used the Perkin-Elmer DELFIA assay, with either AutoDELFIA or Perkin-Elmer Genetic Screening Processor instrumentation, and a secondary cut-off of 20 mU/L as in the standard UK protocol.12 This lower cut-off resulted in the immediate clinical referral of many babies who would previously have been classified as normal on the basis of a second blood sample. Thus, in 1980–1985, only 6 of the 107 referred cases had TSH <40 mU/L in the initial blood sample. A 20 mU/L secondary cut-off would have given a 46% increase in presumptive-positive rate bringing this and the sex ratio close to those recorded in 2002–2007. Some of the remaining differences are accounted for by changes in assay calibration; TSH assays used for screening in the early 1980s varied greatly in both accuracy and precision.13 Based on the upper inflexion points of their sex ratio plots (see below), the DELFIA assay gives readings approximately 40% higher than did the Phadebas kit.
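As a rough worked example of the calibration difference, assume that the approximate 40% offset applies uniformly across the measuring range. This is a simplification, and the symbols are introduced here purely for illustration:

```latex
\[
\mathrm{TSH}_{\mathrm{DELFIA}} \approx 1.4 \times \mathrm{TSH}_{\mathrm{Phadebas}},
\qquad
1.4 \times 40~\mathrm{mU/L} = 56~\mathrm{mU/L}.
\]
```

On this basis the earlier 40 mU/L Phadebas secondary cut-off would correspond to roughly 56 mU/L on the DELFIA scale, well above the 20 mU/L secondary cut-off now in use.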

A further increase in presumptive-positive rates followed the introduction of a formal two-sample protocol for screening premature babies in 2008. Changes in the ethnic composition of the screened populations may also be contributing as babies with Pakistani or Bangladeshi ancestry in particular are disproportionately affected.5 ,6 ,14

In addition to changes to the official protocol, many UK laboratories reacted to the improved performance of commercially available TSH assays by adopting lower primary cut-offs. By 2011–2012, only 4 of the 16 UK laboratories were still using a 10 mU/L primary cut-off. The resultant increase in presumptive-positive rates proved controversial15–18 and the trend has recently been reversed. Thus, in 2013–2014, eight laboratories were using 10 mU/L as the primary cut-off and four 6 mU/L or less, with aggregate presumptive-positive rates of 6.3 and 10.2 per 10 000 babies screened, respectively. In the UK as a whole, 344 babies were referred as screening-positive on the basis of the first sample and 237 after the second, with, respectively, 88% and 55% of cases (equivalent to 5.5 per 10 000 screened) where data were available being started on thyroxine treatment at the first clinic visit.6
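The figures in this paragraph can be combined in a short back-of-envelope calculation. The sketch below uses only numbers quoted in the text; the variable names are hypothetical, and applying the 88% and 55% treatment proportions to all referrals is an assumption, because the text reports them only for cases where data were available.

```python
# Aggregate presumptive-positive rates per 10 000 screened, 2013-2014.
rate_cutoff_10 = 6.3           # laboratories using a 10 mU/L primary cut-off
rate_cutoff_6_or_less = 10.2   # laboratories using 6 mU/L or less

# Relative referral rate with the lower primary cut-offs (about 1.62).
rate_ratio = rate_cutoff_6_or_less / rate_cutoff_10

# UK-wide referrals by the sample that triggered them.
first_sample_referrals = 344
second_sample_referrals = 237
total_referrals = first_sample_referrals + second_sample_referrals

# Assumption: the quoted proportions hold for every referral, although
# the source gives them only for cases with available data.
treated_estimate = (0.88 * first_sample_referrals
                    + 0.55 * second_sample_referrals)   # about 433 babies
treated_fraction = treated_estimate / total_referrals    # about 0.75
```

Under these assumptions, laboratories with the lower cut-offs referred about 60% more babies per 10 000 screened, and roughly one in four referred babies overall was not started on thyroxine at the first clinic visit.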

These changes, both official and unofficial, do not appear to be driven primarily by the incidence of clinically presenting false-negative cases though lower cut-offs ought to result in a marginal improvement in sensitivity. There were two false-negatives in the 495 cases reported to the UK MRC Register.5 Both had thyroxine synthesis defects and presented with goitre. Similar screening-negative cases have been reported from elsewhere. Screening in Scotland between 1979 and 1993 using 15 mU/L as the primary cut-off identified 235 definite or probable cases of CHT, with one false-negative result.11 A much longer French series with a primary cut-off of 20 mU/L and a positive rate of 2.7 per 10 000 experienced five false-negative cases among 2.6 million babies screened.19

Clinical implications

The status of the ‘additional’ cases detected by screening is crucial to the evaluation of the programme's performance. Most of Sheffield's 1980–1985 screening-positive babies showed some indication of being clinically affected at the time of referral. Of 99 cases where data were available, 10 had been diagnosed prior to the screening result being communicated. A further 70 showed signs, mainly prolonged jaundice or delayed bone age, indicating that they were indeed experiencing some degree of hypothyroidism irrespective of long-term outcome.

For obvious reasons, there has never been a real-time trial of treatment versus non-treatment following a positive screening result, but a retrospective study by Alm and coworkers in Sweden has helped to clarify the situation.4 Blood-spot samples that had been stored at 4°C for 5 years were analysed in a single-stage screen with a cut-off of 40 mU/L plasma (equivalent to 20 mU/L in whole blood). Similarly stored samples from infants with clinically diagnosed CHT all had a TSH greater than 50 mU/L plasma.20 The rate of 3.2 positive results per 10 000 samples was similar to the overall incidence of 3.4 per 10 000 in the concurrent real-time Swedish screening programme. Of the 31 cases available for follow-up, only 15, all with TSH concentrations >100 mU/L plasma, had already been diagnosed with CHT. Of the remaining 16 cases, nine were euthyroid with serum TSH ≤5 mU/L and seven, described as undiagnosed, had serum TSH ranging from 6 to 83 mU/L (see below).

Clinically diagnosed CHT prior to screening showed a marked preponderance of female cases, usually outnumbering males by a factor of 2. A marked predominance of female cases was observed in the Swedish retrospective study, the first 3 years of the UK screening programme (no sex data are included in the more recent UK-wide reports) and in longer series from Wales and Scotland.10 ,11 The female to male ratio of the Sheffield presumptive-positive cases decreased markedly with increasing incidence (table 1). In the series screened using the DELFIA assay, the female excess was largely confined to babies with an initial sample TSH ≥65 mU/L (figure 2). This is consistent with the majority having either athyreosis or ectopy, those with TSH >265 mU/L mostly having athyreosis as this tends to show higher TSH concentrations and less marked sexual dimorphism.21 ,22 With an incidence of 2.9 per 10 000, not all these babies would have presented clinically in the absence of screening.

Figure 2

Sex distribution of screening-positive cases with increasing thyroid-stimulating hormone (TSH) concentration in the initial screening blood-spot (Sheffield data with DELFIA assays, August 2001–January 2014). (A) Cumulative numbers of male and female cases. (B) Cumulative excess of female cases. There were 2.68 babies with TSH ≥65 mU/L per 10 000 screened. The shaded arrow indicates the DELFIA equivalent of 40 mU/L with the Phadebas assay. (C) Cumulative excess of female cases as in (B), plotted against case number in ascending order of the TSH value. Cases between the upper and lower inflexion points indicated by the open arrow had a female to male ratio of 1.95. TSH, thyroid-stimulating hormone.
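The cumulative female excess plotted in panels (B) and (C) can be computed as a running count of females minus males over cases sorted by initial TSH. The following is a minimal sketch of that calculation, not the author's actual analysis; the function name is hypothetical and the example cases are invented for illustration, not taken from the Sheffield data.

```python
from itertools import accumulate
from typing import List, Tuple

# +1 for each female case, -1 for each male case.
STEP = {"F": 1, "M": -1}


def cumulative_female_excess(
        cases: List[Tuple[float, str]]) -> List[Tuple[float, int]]:
    """Return (TSH, females minus males so far) in ascending TSH order.

    `cases` holds (initial blood-spot TSH in mU/L, sex as "F" or "M").
    """
    ordered = sorted(cases, key=lambda case: case[0])
    running = accumulate(STEP[sex] for _, sex in ordered)
    return [(tsh, excess) for (tsh, _), excess in zip(ordered, running)]


# Hypothetical cases for illustration only (NOT the Sheffield data).
example_cases = [
    (12.0, "M"), (15.0, "F"), (30.0, "M"), (48.0, "F"), (70.0, "F"),
    (90.0, "F"), (120.0, "M"), (150.0, "F"), (200.0, "F"),
]
curve = cumulative_female_excess(example_cases)
```

A flat stretch of such a curve indicates roughly equal numbers of each sex, while a steady rise indicates a female excess. Changes in slope correspond to the inflexion points referred to in the caption.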

The outlook for the majority of cases with TSH <65 mU/L is even less clear. Unless an anatomical thyroid abnormality can be demonstrated, classification will usually rest on plasma TSH and thyroid hormone levels, but there is no general agreement on case definition. Thus, a study using a second-sample TSH cut-off of 5 mU/L revealed ‘an unexpected frequency of congenital hypothyroidism’ mainly with gland in situ.23 Longitudinal studies on a similar group, described as having subclinical hypothyroidism, showed that mild abnormalities often persist beyond the first year of life. Some subjects with serum TSH >4.0 mU/L beyond the age of 4 years old had demonstrable morphological abnormalities (hypoplasia or hemiagenesis) or goitre.24 Though some authors have speculated that such subclinical hypothyroidism may foretell juvenile hypothyroidism or a variety of other problems (growth or cognitive defects, hyperlipidaemia, heart problems, maternal hypothyroidism during pregnancy), several recent reviewers have commented that there is insufficient evidence of harm to justify medical intervention in the context of a newborn screening programme.15 ,18 ,25 ,26

A contrary view, supporting the use of 6 mU/L as the primary cut-off, cites the ‘undiagnosed’ group of cases in the Swedish retrospective study as showing that subclinical CHT may lead to an average decrease of 7 IQ points, ‘a significant reduction in IQ potential’.8 Whether it is appropriate to extrapolate from a heterogeneous group with neonatal TSH values ranging from 45 to >100 mU/L plasma to a group with blood-spot TSH of 6–10 mU/L is debatable. Additionally, the Swedish ‘undiagnosed’ group included a subject with a serum TSH of 83 mU/L and a development quotient (DQ) of 84. If this subject is removed, the group mean Griffiths development quotient becomes 103, equal to that of the concurrent controls. Though the study lacks statistical power and cannot totally exclude any effects at the milder end of the spectrum, it is striking that all nine transient cases and five of the six ‘undiagnosed’ cases had apparently escaped damage, indicating a homeostatic system with a considerable degree of resilience.

In contrast to this, recent studies have shown a correlation between marginally increased TSH levels in the newborn period and impaired cognitive development.27 ,28 This may well be an indirect relationship with iodine insufficiency as the underlying cause. Even in an area of the UK previously thought to be iodine sufficient, the iodine concentration in first-trimester maternal urine correlates well with IQ of the child at 8 years of age.29 Iodine insufficiency results in an increased frequency of moderately elevated TSH in newborn screening samples from affected populations.30 Whether newborn screening programmes should be adjusted to detect neonates affected in this way and whether they require treatment, other than to ensure an adequate iodine intake, are still matters of debate.

Screening policy

In its current form, the UK screening programme for CHT fails to satisfy some of the basic principles outlined by Wilson and Jungner in 1968, let alone the more stringent quantitative requirements of the UK National Screening Committee. In particular, the natural history of CHT, as detected in its various forms by newborn screening, is not fully understood and there is no general agreement as to cut-off levels and who (not) to treat as patients.

Muir Gray, formerly the programme director for the UK National Screening Committee, is quoted as saying that all screening programmes do harm, though some can do good as well. Screening for CHT has undoubtedly done some good. It has reduced the overall burden of intellectual disability associated with the late-diagnosed disorder, though probably to a lesser extent than predicted by early studies.18 Prompt treatment does not guarantee a completely normal outcome and the degree of in utero hypothyroidism, judged by pretreatment blood T4 levels, affects intelligence at 5 years of age31 and is also reflected in motor skills and the severity of behavioural problems at 10 years of age.32

Harm is more varied and difficult to assess. A presumptive-positive screening diagnosis, even if disproved on further investigation, is a traumatic event for the family concerned and may have lasting effects.33 Treatment and the concomitant monitoring also have a cost, financial to the healthcare system, financial and psychological to the family itself. Children being treated for CHT show lower mean health-related quality of life and self-worth than the general population, with no significant difference between severe and moderate-to-mild CHT.34 Having carried out a study of children with transient CHT and hyperthyrotropinaemia and normal subsequent development, Köhler et al35 conclude that, while growth and development should be followed, frequent monitoring of thyroid function and mental development should be avoided because the concerns and anxieties raised in parents are more likely to impair development than the thyroid disorder itself.

There are also uncertainties about treatment: the optimum starting dose of thyroxine and the need to normalise blood TSH concentrations as soon as possible.36 Although Alm et al4 found that moderately increased TSH, even over the age of 5 years old, had little or no effect on DQ, current UK standards aim for normalisation within the first month of treatment.37 A recent study indicates the need for caution in that overtreatment during the first 2 years of life was more damaging than undertreatment, while fast normalisation, which leads to above-normal development scores in the early years, had no effect on IQ at 11 years of age.38

How to determine the optimum balance between benefit and harm? Even with the rather conservative cut-offs of the early 1980s, the number of babies benefiting from the screening programme was matched by an equal number who would experience only varying degrees of harm. In recent years, official UK standards have been progressively tightened, lowering the secondary cut-off for immediate clinical referral and, more recently, reducing the interval before repeat blood samples are taken from babies with borderline initial results.37 Presumptive-positive rates are now some fourfold higher than the incidence of clinically diagnosed CHT prior to the introduction of screening (higher still where laboratories have lowered their primary cut-offs) and for many screening-positive babies there is no way of predicting whether they would benefit significantly from treatment. A culture of continuous improvement needs to be applied prudently and these changes appear to have been driven more by enthusiasm than by evidence. Firm evidence has proved elusive despite the mass of information available. Formal trials of any sort would present huge logistic and ethical problems. However, several of our European neighbours are using primary cut-offs of 15–20 mU/L TSH with incidences between 2.5 and 3.5 per 10 000 and apparently satisfactory overall results.39 Closer comparison might be rewarding.



  • Contributors RJP assembled the Sheffield data, conducted the literature search and wrote the paper.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.