Background The authors aimed to evaluate the benefits and harms of universal newborn hearing screening programmes in the detection of hearing impairment.
Objectives In the absence of randomised trials evaluating whole screening programmes, the study divided the objective into three systematic reviews of non-randomised controlled studies of diagnostic accuracy of screening tests, screening versus no screening, and therapeutic effect of early versus later treatment.
Methods The authors searched 11 bibliographic databases, and included 17 studies (diagnostic: 9, screening: 2, and treatment: 6). All studies apart from one treatment study showed major quality deficits. Eight diagnostic studies comparing otoacoustic emissions with auditory brainstem response showed sensitivities (and specificities) between 50% (49.1%) and 100% (97.2%).
Results The studies comparing screening versus no screening showed an improvement of speech development of children in the screening group compared with the group without screening. Early treatment was associated with better language development in comparison to children with later treatment.
Conclusions The authors concluded that there is a lack of high-quality evidence regarding all elements of newborn hearing screening. Early identification and early treatment of children with hearing impairments may be associated with advantages in language development. Other patient-relevant parameters, such as social aspects, quality of life, and educational development, have not been adequately investigated.
The development of the organs of the auditory system is almost completed before birth,1 so that a functional sense of hearing is usually present at the end of pregnancy. Restrictions to the quality of life and development of children with congenital hearing loss have been described, depending on the severity of the loss of hearing and the ability to compensate.1 2 According to the estimates of the German registry for hearing loss in children, the prevalence of congenital hearing abnormalities is about 1.2 per 1000 births. For neonates with risk factors, the prevalence is estimated to be 10–30 per 1000.1 3
The objective of neonatal hearing screening is to identify hearing impairments shortly after birth to initiate treatment as soon as possible and to allow affected children to enjoy largely normal development.1 4
This study, which focuses on outcomes relevant to the patient, was commissioned by the German Institute for Quality and Efficiency in Health Care (IQWiG). Reports prepared by the IQWiG support decisions of the Federal Joint Committee (Gemeinsamer Bundesausschuss). Both together act as “the German version of the British NICE”.7 Our objective was to evaluate the benefits and harms of (universal) hearing screening in newborns in the early detection of hearing impairment. Based on the full report that has been published on the IQWiG website,8 the Federal Joint Committee decided to implement a national newborn hearing screening programme starting on 1 January 2009.9
What is already known about this topic?
The objective of neonatal hearing screening (NHS) is to identify hearing impairments shortly after birth to initiate treatment as soon as possible and to allow affected children to enjoy largely normal development.
Several countries, for example Great Britain and many states in the USA, have initiated an NHS programme.
What this study adds?
A detailed description of limitations of the current literature.
There is a need for high-quality studies evaluating the benefit and harm of (universal) NHS for early detection of hearing impairment.
There is low-quality evidence that children with hearing impairments identified in universal NHS have advantages with respect to language development.
The results presented here come from an update of the IQWiG report.
Systematic literature search
The literature search was conducted using 11 bibliographic databases: MEDLINE (Ovid), EMBASE, CINAHL, PsycINFO, PSYNDEX, ERIC as well as the Cochrane Library databases on primary publications (Clinical Trials), systematic reviews (Cochrane Reviews), other reviews (DARE), economic evaluations (NHS EED), and health technology assessments (HTA). The search strategy in Medline (Ovid) was based on combinations of medical subject heading terms and text words and was not restricted to specific languages or years of publication. The search strategies for other databases were conducted following similar search algorithms; the one for the treatment part using Medline (Ovid) is presented in table 1. The last search was carried out on 1 October 2007.
We searched the reference lists of included studies and identified reviews for additional references. Moreover, we contacted authors to gain additional information on included studies and we sent enquiries to hospitals and to manufacturers of screening instruments, hearing aids, and cochlear implants.
All stages of study selection, data extraction, and quality assessment were carried out independently by two reviewers (RW; RR or JK). Any disagreement during the selection, extraction, and assessment process was resolved by discussion and consensus.
Search findings were screened for potentially eligible studies. Abstracts and full articles were obtained for detailed evaluation, and eligible studies were included in the systematic reviews.
For evaluations of whole screening programmes and earlier versus later treatment with cochlear implants and/or hearing aids, we addressed outcomes such as speech and social development and included screening studies with parallel control groups. To evaluate tests in screening populations we assessed the diagnostic accuracy of otoacoustic emissions (OAEs) and auditory brainstem response (ABR) against any reference test.
Table 2 shows the pre-specified inclusion criteria for the different sections of our review.
Data extraction and quality assessment
A quality evaluation tool of the Centre for Reviews and Dissemination (CRD)10 was modified and used to evaluate screening and treatment studies. Particular attention was paid to aspects of sample size planning, blinding, comparability of groups in baseline characteristics, consideration of confounding factors, and transparency of patient flow.
The QUADAS instrument11 was used for the quality assessment of diagnostic studies.
Based on the limitations of the included studies, no meta-analysis or sensitivity analysis could be performed. Graphs were generated using Version 5.0.17 of the Review Manager.12
Description of search and selection process
The searches identified 15 354 citations (fig 1). We excluded 15 052 citations after checking the title and abstracts (table 2). Three hundred and two full papers were retrieved for further assessment. Of these, 274 were excluded. Selected populations, for example, only high-risk children, and studies without comparison group were frequent reasons for exclusion. An update search was carried out on 1 October 2007.
Of nine studies (12 publications) included for the diagnostic part, eight13–20 investigated the diagnostic accuracy of the OAE measurement. One study21–24 observed a two-stage screening procedure (OAE and ABR).
Ten publications reporting on two studies were included for the screening part. Yoshinaga-Itano 200125 26 compared hearing-impaired children from hospitals with screening with children from hospitals without a screening programme. Kennedy 200621–24 27–30 compared periods/regions with screening to periods/regions without screening.
In total, six studies31–36 were included for the treatment part. All studies investigated language development. Wake 200535 is a population-based cohort study; the other studies were retrospective analyses on the basis of available data.
Study quality was generally poor, for example, for items such as the sample size planning, blinded assessment of outcome parameters, the consideration of confounding factors, and the documentation of uninterpretable tests or tests that were not performed. Therefore, our summary assessment of study quality10 11 showed “major deficiencies” in nearly all of the included studies. Only one treatment study35 showed “minor deficiencies”.
Results of diagnostic studies
OAEs were investigated in eight13–20 out of nine studies. Compared to (automated) ABR, the values for sensitivity varied between 0.50 and 1.0 and the values for specificity between 0.49 and 0.97 (fig 2). Because of their heterogeneity, a quantitative summary of the results in a meta-analysis or in a summary receiver operating characteristic (SROC) curve did not seem sensible. Therefore, results are presented in receiver operating characteristic (ROC) space (fig 3). In addition, the reference test used in most studies (ABR) has a marked error rate.37 Furthermore, children with auditory neuropathy/auditory dyssynchrony will not be detected correctly in all cases.38–40
One study24 supplied data on the diagnostic quality of two-stage screening, that is, sensitivity and specificity of the combination of OAE and ABR and the programme sensitivity. Even though there was no actual follow-up of the screen-negative children, we assumed that identification of at least a portion of children with a false negative test result was guaranteed. Based on this assumption the estimated sensitivity of the two-stage screening is 0.917 (95% CI 0.742 to 0.977) and the specificity is 0.985 (95% CI 0.983 to 0.987). If the children not participating in the screening are included (intention-to-screen), the programme sensitivity can be calculated as 0.710 (95% CI 0.520 to 0.858).
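As a minimal sketch of how a single diagnostic study is placed in ROC space, the following Python snippet derives sensitivity, specificity, and the ROC coordinates (1 − specificity, sensitivity) from a 2×2 table. The class name `TwoByTwo` and all counts are hypothetical and chosen for illustration only; they are not taken from any of the included studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from comparing a screening test (e.g. OAE) with a reference test."""
    true_pos: int   # test positive, reference positive
    false_neg: int  # test negative, reference positive
    false_pos: int  # test positive, reference negative
    true_neg: int   # test negative, reference negative

    def sensitivity(self) -> float:
        # Share of reference-positive children that the test flags.
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        # Share of reference-negative children that the test passes.
        return self.true_neg / (self.true_neg + self.false_pos)

    def roc_point(self) -> tuple[float, float]:
        # (false positive rate, sensitivity): one point per study in ROC space.
        return (1.0 - self.specificity(), self.sensitivity())


# Hypothetical counts for illustration only; not data from the review.
example = TwoByTwo(true_pos=9, false_neg=1, false_pos=30, true_neg=960)
fpr, sens = example.roc_point()
print(f"sensitivity={example.sensitivity():.3f}, "
      f"specificity={example.specificity():.3f}, "
      f"ROC point=({fpr:.3f}, {sens:.3f})")
```

Plotting one such point per study, rather than pooling them, keeps the heterogeneity between studies visible.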
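The relation between test sensitivity and intention-to-screen programme sensitivity can be sketched as follows. The counts (22 children detected, 24 affected children among those screened, 7 affected children among non-participants) are assumptions chosen only because they give proportions close to the reported 0.917 and 0.710; the text above does not state the underlying counts or the method used for the confidence intervals. The Wilson score interval shown here is one common choice, so its output need not match the reported intervals.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need total > 0 and 0 <= successes <= total")
    p = successes / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (centre - half_width, centre + half_width)


def programme_sensitivity(detected: int, affected_screened: int,
                          affected_unscreened: int) -> float:
    """Intention-to-screen sensitivity: affected non-participants count as missed."""
    return detected / (affected_screened + affected_unscreened)


# Hypothetical counts (assumption, not reported in the text).
detected, affected_screened, affected_unscreened = 22, 24, 7

test_sens = detected / affected_screened
low, high = wilson_interval(detected, affected_screened)
prog_sens = programme_sensitivity(detected, affected_screened, affected_unscreened)

print(f"test sensitivity      = {test_sens:.3f} (Wilson 95% CI {low:.3f} to {high:.3f})")
print(f"programme sensitivity = {prog_sens:.3f}")
```

Counting affected non-participants in the denominator is what lowers the programme sensitivity relative to the test sensitivity.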
Results of screening studies
Both included studies give an account of language development (receptive, expressive), communicative abilities, and spontaneous language (table 3). No data were reported on other patient-relevant outcome parameters, such as general and social development, quality of life, and emotional or educational development.
Concerning receptive language development, both studies report significant differences in favour of universal hearing screening. Adjusted mean difference in Kennedy 200621–24 27–30 is 0.56 (95% CI 0.03 to 1.08, p = 0.04; Test for Reception of Grammar, British Picture Vocabulary Scale). Yoshinaga-Itano25 26 reported mean scores of 81.5 (screened group) and 66.8 (unscreened group, p<0.001; Minnesota Child Development Inventory). Regarding the expressive language development, Yoshinaga-Itano 2001 reported that unscreened children (mean 62.1) exhibited a significantly lower expressive vocabulary than the screened group (mean 82.9; p<0.001) while Kennedy 2006 indicated a favourable trend for screened children (adjusted mean difference 0.30 (95% CI −0.22 to 0.81, p = 0.25)).
Only Yoshinaga-Itano 2001 reported how many children exhibited delayed language development if expressive and receptive language development were counted together (total language development). Seventeen of 25 children (68%) in the unscreened group showed delayed language development, compared with 6 of 25 (24%) children in the screened group. Conversely, more screened children showed normal language development, that is, comparable to hearing children (56% vs 24%; p = 0.008).
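The reported counts for delayed language development can be turned into proportions and simple effect measures, as in this short sketch. The risk difference and risk ratio are not reported in the source; they are derived here from the stated counts (17 of 25 and 6 of 25) purely to illustrate the size of the contrast, and they carry the same design limitations as the underlying study.

```python
def proportion(events: int, total: int) -> float:
    """Share of children with the event (here: delayed language development)."""
    return events / total


# Counts as stated in the text for Yoshinaga-Itano 2001.
delayed_unscreened = proportion(17, 25)
delayed_screened = proportion(6, 25)

# Derived measures (not reported in the source).
risk_difference = delayed_unscreened - delayed_screened
risk_ratio = delayed_unscreened / delayed_screened

print(f"delayed, unscreened: {delayed_unscreened:.0%}")
print(f"delayed, screened:   {delayed_screened:.0%}")
print(f"risk difference:     {risk_difference:.2f}")
print(f"risk ratio:          {risk_ratio:.2f}")
```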
With regard to communicative abilities and spontaneous speech, Yoshinaga-Itano 2001 reported that the screened children scored statistically significantly better on the number of different consonant forms (screened children: mean (SD) 13.3 (10.39); children without screening: mean (SD) 9.4 (8.31); difference in mean number: 3.9; p<0.01) and number of intelligible words (no data given; p = 0.004). The difference in the mean number of intelligible vowel forms was 1.1 (screened children: mean (SD) 10.8 (6.24); children without screening: mean (SD) 9.7 (4.16); p = 0.22).
Taken together, the study results indicate a benefit for universal newborn hearing screening for the language development of children with hearing impairments with average ages of 3 or 8 years. However, it must be kept in mind that the methodologically superior study of Kennedy 2006 – albeit with a number of deficiencies – found much less optimistic results compared with Yoshinaga-Itano 2001. Furthermore, the clinical relevance of the observed differences remains unclear.
Results of treatment studies
We included six studies31–36 about the benefit of early versus later intervention with respect to the patient-relevant outcome parameters defined in advance (table 2). These studies only provided information on the language development of children with hearing impairments. Other patient-relevant outcome parameters were not investigated.
Markides 198631 reported a statistically significant advantage of children provided with hearing aids (no information on type of device given) at an age of up to 6 months compared with children with later intervention, with respect to language intelligibility at the age of 8–12 years (p = 0.01–p = 0.02, depending on the control group, no data on effect size given; 7-item scale; no name mentioned).
Five of the included studies investigated the receptive language development. McDonald Connor 200632 found significantly larger rates of vocabulary growth (Peabody Picture Vocabulary Test 3) for the first 3 years after implantation for children who received a cochlear implant (own calculation based on publication: 48% Cochlear Corp Mini-22 (Cochlear Corporation, Sydney, Australia); 37% Nucleus-24M and RCS (Cochlear Corporation, Sydney, Australia); 15% other) between 1 and 2.5 years of age compared with later implanted children. After 4 years of use, rates of growth were similar between the groups. Nicholas 200634 reported no natural outcomes. The authors found a significant quadratic trend in the relation between duration of implant use and spoken language score (including Children Language Analysis programs for quantification of direct observation variables (CLAN), McArthur Communicative Development Inventory (CDI), and Scales of Early Communication Skills for Hearing Impaired Children (SECS)) revealing a steady increase in language skill for each additional month of use of a cochlear implant (own calculation based on publication: 62% Nucleus-24; 37% Clarion 1.2 or CII from Advanced Bionics Corporation (Sylmar, California, USA); 1% Med-El (Med-El Corporation, Durham, North Carolina, USA); implantation between 12 and 38 months of age) after the first 12 months of implant use. This relation became more pronounced with longer implant use and did not reach asymptote even approaching 32 months of use. Yoshinaga-Itano 199836 found that children’s receptive language development after diagnosis and treatment up to the age of 6 months (“early intervention services that focused on improving the child’s communication and language skills”) was better than that of children who had been diagnosed and treated later (adjusted mean: 79.6 vs 64.6, p<0.001; Minnesota Child Development Inventory). Moeller 200033 reported data on receptive vocabulary. Children who were older at the time of intervention (Diagnostic Early Intervention Program (DEIP), “a parent/infant program operated in metropolitan community”) had poorer results in comparison with early intervention (up to 11 months of age). The children treated early scored within the normal range; the children treated later scored about 1–1.5 SDs lower (Peabody Picture Vocabulary Test). Wake 200535 found no difference with respect to language abilities (Clinical Evaluation of Language Fundamentals) at an age of about 8 years between children with early intervention and those with late intervention (identified children were fitted with hearing aids at a mean (SD) age of 23.2 (14.7) months; 14% of children received cochlear implants). Only the receptive vocabulary (Peabody Picture Vocabulary Test) was weakly correlated with age at intervention.
Yoshinaga-Itano 1998 reported differences in expressive language development in favour of children given early care (adjusted mean: 78.3 vs 63.1, p<0.001; Minnesota Child Development Inventory, expressive language subscale).
Taken together, most of the study results indicate favourable differences with respect to language development for early rather than later interventions for children with bilateral hearing impairment. Because of the severe deficiencies in study design in five of the six studies, this can only be regarded as an indication that the expressive and receptive language abilities, the communicative abilities, and spontaneous language are better in children treated earlier.
No overall reliable evaluation is possible for the diagnostic accuracy of OAEs and ABR as initial screening tests, as there has been no evaluation in an adequately large group of children without risk factors. On the other hand, one study21–24 indicates that sequential screening (first OAE and then, if the finding is abnormal, ABR) in practical use might achieve acceptable sensitivity of >90%, with specificity of >98%. However, this estimate must be confirmed; as it is based on a relatively small number of children with hearing impairments, the 95% CI for sensitivity extends from 74% to 98%. In addition, it must be considered that the proportion of unidentified children markedly increases if the children not participating in screening are included in the evaluation (intention-to-screen analysis).
The assumption that universal neonatal hearing screening can lead to earlier diagnosis of congenital paediatric hearing impairment is supported by the two included screening studies.21–30 Substantial benefit from screening can only be expected if there is no unnecessary delay between diagnosis and treatment.
There is evidence that early treatment of children with hearing impairments is advantageous for language development. However, the included studies do not allow for any confident conclusions. Other factors, which were not controlled for, may play an even more important role, such as parental involvement in (language) development or the severity of the hearing impairment. Other patient-relevant objectives, such as social aspects, educational development or professional situation, have not been investigated.
Because of the lack of reliable studies, possible harms from neonatal hearing screening could not be evaluated. The potential of harm exists, particularly from false positive findings. The frequency of these is primarily dependent on the quality regulations and quality assurance measures in a screening programme.
In a recent publication, the US Preventive Services Task Force (USPSTF) recommended screening for hearing loss in all newborn infants.41 Their recommendations are based on a comprehensive review of the effects of screening versus no screening, the effects of early interventions, and the adverse effects of screening. There are a number of differences between the USPSTF report and ours.
Our review places more emphasis on studies focusing on diagnostic accuracy, and we provide a systematic review of accuracy studies of OAE and ABR. We excluded studies that involved high-risk groups of newborns (eg, newborns from neonatal intensive care units), because there is good empirical evidence of variation of diagnostic accuracy with different disease prevalence and severity.42
In contrast to the report published by the USPSTF which focused on treatment before 6 months in infants who would not have been identified by targeted screening, our study includes a more general evaluation of effects of earlier versus later treatment. Furthermore, our systematic review offers evidence on a broader set of outcomes when comparing screening programmes versus no screening. The USPSTF report evaluated adverse effects in more detail than our report.
There is a lack of high-quality evidence regarding all elements of newborn hearing screening. The included studies show that early identification and early treatment of children with hearing impairments may be associated with advantages in language development. Other patient-relevant parameters, such as social aspects, quality of life, and educational development, have not been adequately investigated.
Funding The project was commissioned by the Federal Joint Committee (Gemeinsamer Bundesausschuss, Auf dem Seidenberg 3a, 53721 Siegburg Germany) to the Institute for Quality and Efficiency in Health Care (Institut für Qualität und Wirtschaftlichkeit im Gesundheitswesen, IQWiG, Dillenburger Str. 27, 51105 Cologne, Germany). IQWiG has final responsibility for the original report and funded all authors to participate in the project.
Competing interests None.
Provenance and peer review Not commissioned; externally peer reviewed.