AIMS To summarise and critically evaluate research conducted in the UK between 1962 and 1996, on the effectiveness and efficiency of the school entry medical (SEM) examination.
METHODS An electronic search of a large number of databases, in conjunction with a search of reference lists, and sources in the grey literature produced a total of 64 studies.
RESULTS Only one overview and 16 primary studies met the review’s broad inclusion criteria. The results showed significant differences in the identification and referral of new and ongoing problems not only between the routine and selective SEM but also within the two types of SEM examination. There were also large differences in the numbers of children selected for SEM examination. No study included in the review defined either the methods or the criteria used to identify children as screen positive. No study provided follow up of children after referral to estimate the positive predictive value or yield of the screening, or follow up of the whole cohort to identify false negative cases.
CONCLUSION Data on the effectiveness and efficiency of both the routine and selective SEM examination in accurately identifying children with new or ongoing health problems are not available at the present time. The studies reviewed here demonstrate the fragility of the evidence on which the school entry medical is based, and call into question the ethical basis of this programme.
A systematic review of the UK literature from 1962 to date assessed the efficiency and effectiveness of the routine school entry medical (SEM) examination compared with the selective SEM
There were significant differences in the identification and referral of new and ongoing problems not only between the routine and selective SEM but also within the two types of SEM
There is insufficient evidence available to assess the effectiveness or efficiency of either the routine or selective SEM
This review demonstrates the fragility of the evidence on which the SEM is based, and questions the ethical basis of this programme
- systematic review
- selective school entry medical examination
Over the past two decades there has been a gradual shift from the routine to the selective school entry medical (SEM) examination. While there is very little evidence available at the present time to indicate the full extent of this change, it would appear that many purchasing authorities have viewed the continuation of the routine SEM as an unnecessary expenditure. This change has meant an end in many parts of the UK to the routine examination of all children at school entry.
SEM examinations were established almost a century ago, before concepts such as “effective health care” and “evidence-based medicine” were developed. Political concern about public health led to the development of a service in which all children were examined by doctors in school in order to document the prevalence of disease and disability. It was an exercise in population health needs assessment carried out with a view to defining and implementing public health interventions. The information about health needs in school children was collated and published in the reports of the medical officers of health, and was used to make the case for public health interventions such as free school milk, free school meals, communicable disease control measures, and new services such as school eye clinics. As clinical services were set up to meet the needs of children with health problems, the SEM examination acquired another function—the identification of individual health needs in order to offer clinical or individual interventions. This is now defined as screening. Information from screening services can be collated and used to monitor changes in health over time. This function is defined as surveillance. However, as the discipline of epidemiology has developed it has become unnecessary to examine the entire population of children in the UK to gather essential information about population health needs. Although this information may be a useful by-product of screening programmes, it is an inefficient way to gather such data, and the SEM cannot be justified on this basis. Information gathered from the SEM is now rarely published and where it is used to make the case for new services, these tend to be clinical rather than public health interventions. We have therefore based our evaluation on the effectiveness of the SEM examination as a screening procedure.
The research questions raised by this review were as follows: Is the SEM examination effective and efficient, and is the selective SEM, in which children are seen by the school doctor only when there is concern about their health, as efficient as the routine SEM, in which the doctor sees all children? Effectiveness is here defined as the capacity of the selective SEM to improve children’s health, and efficiency is defined as the extent to which the SEM is successful in identifying children with health problems which are amenable to intervention.
INCLUSION AND EXCLUSION CRITERIA
The search strategy was aimed primarily at the identification of meta-analyses, and secondarily at the identification of first order evidence in the form of randomised controlled trials (RCTs). However, no restrictions were placed on the study design to be included in this review, due to the paucity of first order evidence on this topic, and retrospective/prospective studies and audits were included in the analysis.
The review was based on studies of children entering primary schools.
The review included all studies of the efficiency of the doctor’s contribution to the routine or selective SEM examination. Data concerning the screening carried out by the school nurse for vision, hearing, and growth problems were excluded from this review since this component of the SEM has not been affected by the change from the routine to selective SEM examination.
Country of origin
Studies from countries other than the UK were excluded as we deemed that the SEM examination in other countries was sufficiently different from the UK system to be unhelpful in answering our questions.
The years 1962–96 inclusive were searched, these being the years for which there are data available on electronic sources.
The primary outcome measures sought were as follows: uptake rates, referral rates, yield of target conditions, positive predictive value, negative predictive value, sensitivity, specificity, costs, outcome of treatment, and patient satisfaction measures.
ELECTRONIC SEARCH STRATEGY
The following electronic databases were identified by CROS search (covers 60 biomedical science databases on DATASTAR) ranking them according to the largest number of references identifiable using the search terms developed for this review: Medline; Biological Abstract; PsycLIT; Sociofile; Cinahl; Embase; SciSearch.
The search terms used were modified to meet the requirements of individual databases in terms of differences in fields. The Cochrane search strategy was adapted in order to identify three types of evidence in the first instance: (i) meta-analyses/overviews; (ii) RCTs/clinical controlled trials; and (iii) other study designs.
We examined reference lists and bibliographies of review articles to identify relevant studies. Leading researchers and practitioners were consulted, and notification of the study was placed in the newsletters of the British Paediatric Association (now the Royal College of Paediatrics and Child Health) and the British Association of Community Child Health. Letters were sent to 261 community paediatricians and school doctors, and to directors of research and development at the then regional and district health authorities, requesting information on unpublished and ongoing studies.
A data extraction sheet was used specifying the methodological criteria by which the studies were evaluated, and data were extracted from the studies by JB. The data were organised using Reference Manager. The bibliography includes details of all studies identified by the search.
JB selected studies for inclusion; JB and SS-B carried out critical appraisal and assessments of validity. Meta-analyses and overviews were critically appraised using published criteria.1
Criteria for evaluating the effectiveness of screening programmes were developed in the 1960s, and the Wilson and Jungner criteria are now well established and widely accepted as the gold standard in reviews of screening programmes.2 These criteria were designed to be applied to programmes in which a single disease entity is sought as opposed to general medical examinations. A preliminary review of the literature suggested that the conditions sought in SEM examinations were not sufficiently well defined to be able to apply the full Wilson and Jungner criteria. A new set of criteria was therefore designed specifically for evaluating the quality of these studies. These were as follows:
Clearly defined population—This refers to the number of children eligible for the SEM. This is essential to calculate uptake rates to establish coverage of the programme, and to calculate prevalence rates for the different conditions identified.
Characteristics of the school(s) and their catchment population described—Information on the school, and the population being recruited, is necessary in order to compare studies.
Uptake recorded—The recorded uptake is the number or proportion of eligible children who received a SEM examination.
Conditions being sought clearly defined—The threshold for diagnosing many conditions such as behaviour problems, speech and language problems, learning difficulty, enuresis, asthma, eczema etc, is dependent on the judgment of individual clinicians. Detailed definitions of the criteria used for diagnosis are necessary in order to make comparisons of the findings between studies. They are also required for the application of the Wilson and Jungner criteria for evaluating the effectiveness of screening programmes as they are an essential starting point in searches for evidence relating to the natural history, disability, and effectiveness of interventions.
Components of the SEM, that is screening tests, clearly defined—Clinical practice varies from one practitioner to another. Clear definitions of the component parts of the SEM examination are necessary in order to assess the effectiveness of the screening test, and to ensure comparability between studies.
Prospective recording of the outcome of the SEM—Retrospective recording of data is subject to a number of methodological flaws and is less likely to provide an accurate assessment of the effectiveness of an intervention.
Conditions being identified are reported by—(i) Whether the problem was already known about (old/new problems) and (ii) by the action taken: options for action in the SEM include advice/reassurance, referral to primary care for treatment, referral to secondary care for further assessment, and recall for repeat assessment.
Presentation of the referral and recall criteria—In the absence of clear definitions of the conditions being sought, referral and recall criteria may be used as proxy measures.
Follow up of children after referral—Calculation of the positive predictive value and yield of the screening test depend on the outcome of further assessment or secondary screening examinations. Children who are referred for further assessment and who are found not to have a problem requiring intervention are false positive cases.
Follow up of the whole cohort to identify false negative cases—The calculation of the sensitivity, specificity, and negative predictive value requires the identification of false negative cases: children who have a problem which could benefit from intervention but who were not identified by the screening examination. This requires follow up or re-examination of the entire cohort.
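The last two criteria determine which accuracy measures can be calculated. The following is a minimal sketch, assuming a conventional two by two table of screening result against confirmed outcome. The class name, function name, and all counts are hypothetical illustrations; none of the reviewed studies reported such figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """Two by two table for a screening examination (hypothetical counts)."""
    true_pos: int   # referred, problem confirmed on follow up
    false_pos: int  # referred, no problem requiring intervention
    false_neg: int  # not referred, problem found on follow up of the cohort
    true_neg: int   # not referred, no problem found


def screening_measures(c: ScreeningCounts) -> dict[str, float]:
    """Return the standard accuracy measures as proportions."""
    return {
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        "specificity": c.true_neg / (c.true_neg + c.false_pos),
        "ppv": c.true_pos / (c.true_pos + c.false_pos),
        "npv": c.true_neg / (c.true_neg + c.false_neg),
    }


# Illustrative counts only; they are not taken from any study in the review.
example = ScreeningCounts(true_pos=30, false_pos=20, false_neg=10, true_neg=140)
measures = screening_measures(example)
```

The positive predictive value needs only the referred children to be followed up, whereas sensitivity, specificity, and negative predictive value all need the false negative and true negative counts, which are available only if the whole cohort is followed up.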
One overview of the routine and selective SEM was identified.3 A number of outcome measures were used in this overview to assess whether the selective SEM is as effective as, and/or cheaper than, the routine SEM: (i) the total number of problems and the number found for the first time at school entry; (ii) the prevalence of hearing, vision and growth abnormalities, and the percentage found for the first time at school entry; (iii) the prevalence of undetected undescended testes, confirmed on referral, and significant speech delay requiring referral.
The review included both retrospective and prospective observational studies conducted during the period 1986–91. Studies outside the UK were excluded. No further inclusion or exclusion criteria were specified and the nine studies included in the review focused on both the routine (seven studies) and selective (two studies) SEM examination.
The findings show that the number of problems detected per 100 children ranged from 55 to 132 for the routine SEM and 42 to 46 for the selective SEM. The percentage of problems first identified at school entry ranged from 28 to 75 for the routine SEM and 23 to 71 for the selective SEM. The number of referrals per 100 children ranged from four to seven for the selective SEM and 10 to 18 for the routine SEM. The mean rate for the identification of undescended testes and speech delay was 1 for the routine SEM and 0.3 for the selective SEM.
Using the published criteria referred to earlier,1 we identified a number of methodological flaws in this review. In particular, no critical appraisal of the validity of individual studies was undertaken, and it was difficult to assess whether relevant studies had been missed. As a result of these methodological limitations a further search and critical appraisal of primary studies on the routine and selective SEM was undertaken.
PRIMARY STUDIES OF SCREENING
A total of 64 studies were identified, but only 16 of these met all the inclusion criteria.4-19 The remaining studies were excluded for a number of reasons—they did not focus on the outcome of the SEM examination, the methodology was unclear, or they were not conducted in the UK. Three unpublished studies were identified as a result of letters sent to community paediatricians.5-7 The results of the 16 primary studies included in this review are presented in the . Table 1 provides a summary of the results of the critical appraisal of these studies. A list of all 64 studies identified is available from the author.
The results of this review are based on the findings of one RCT,5 two comparative studies,6 7 and 13 prospective and retrospective observational studies or audits of the routine and selective SEM examination.4 8-19
CRITICAL APPRAISAL OF PRIMARY STUDIES
Clearly defined population
All studies apart from one4 provided a clear definition of the population of school entrants eligible for SEM, providing accurate denominators with which to calculate the uptake rate and prevalence rate for the number of new and ongoing problems identified in each study, and to compare the findings across studies.
Characteristics of the school and catchment population described
Only a small number of studies provided information about the catchment population of children in terms of their ethnicity, social class, Jarman index, etc.7 9 12 13 19 Three studies showed that social and demographic characteristics appeared to influence the number of children selected, and both the number and type of problems identified.7 9 18 Many studies included in this review failed to provide any information concerning the characteristics of the school(s) in which the children being tested were located, and it was therefore difficult to assess the extent to which these factors might have influenced the outcome of the SEM examination.
We identified only three studies that used comparison groups,5-7 and only one of these studies provided details as to the similarity of the two groups of children at the outset of the study (age and sex).6
Uptake rate recorded
The uptake rate was not recorded in the one RCT identified,5 in one of the two comparative studies,7 or in five of the prospective observational studies.4 9 15 16 17 Where these data were provided, the evidence showed a high uptake rate for both the routine and selective SEM.
Conditions being sought are clearly defined
All the studies reported the number of children with different conditions, but none provided definitions of these problems. Conditions such as undescended testes are reasonably well defined clinical entities for which diagnostic criteria are well known. However, the criteria for diagnosis of conditions such as neurodevelopmental disorders, speech/language disorders, and behaviour problems are not so well defined/accepted or clear cut. In the absence of clear definitions it is unlikely that the rate of identification by different practitioners will be the same.
Table 2 shows the total problems identified per 100 children eligible for examination, and the 95% confidence intervals, for both the routine and selective SEM examination in the 11 studies that provided sufficient data to make this calculation. The results show wide disparities both within and between the two types of SEM. The one RCT identified by this review, in which the same doctor is reported to have examined both routine and selective school entrants in each area, showed that the routine SEM identified more problems per 100 children than the selective SEM: 27 compared with 19 problems respectively.5 Similarly, one of the two comparative studies, in which two doctor and nurse teams examined children in both the routine and selective groups, showed that the routine SEM identified more problems per 100 children than the selective SEM: 179 compared with 155 problems respectively.6 It should be noted that this study reported the identification of “physical problems” as a group. It does not provide a breakdown of “physical problems” and it is possible that vision, hearing, and growth problems have been included as physical problems. It seems unlikely, however, that this explains the large discrepancies in the findings between these two studies. The difference is more likely to be due to the use of widely differing definitions of the problems being sought. The range of total problems identified by the remaining observational studies was 27–40 problems per 100 children in the routine SEM and 2–46 problems per 100 children in the selective SEM. The 95% confidence intervals show little overlap.
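The review does not state how the 95% confidence intervals were calculated. As an illustration only, the sketch below assumes that the number of problems follows a Poisson distribution and uses a normal approximation; this allows for rates above 100 per 100 children, which arise when children have more than one problem. The function names and the denominators of 1000 children are hypothetical and are not taken from any study.

```python
import math


def rate_per_100_with_ci(problems: int, children: int, z: float = 1.96):
    """Problems per 100 children with an approximate 95% confidence interval.

    Assumes the problem count is Poisson distributed, so that its standard
    error is sqrt(problems); the interval is a normal approximation.
    """
    if children <= 0 or problems < 0:
        raise ValueError("children must be positive and problems non-negative")
    rate = 100 * problems / children
    half_width = 100 * z * math.sqrt(problems) / children
    return rate, max(0.0, rate - half_width), rate + half_width


def intervals_overlap(a, b) -> bool:
    """True if two (rate, lower, upper) tuples have overlapping intervals."""
    return a[1] <= b[2] and b[1] <= a[2]


# Hypothetical counts chosen to give 27 and 19 problems per 100 children.
routine = rate_per_100_with_ci(problems=270, children=1000)
selective = rate_per_100_with_ci(problems=190, children=1000)
no_overlap = not intervals_overlap(routine, selective)
```

With smaller denominators the same rates would give wider intervals, so whether two studies overlap depends on their sample sizes as well as on the rates themselves.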
Components of the SEM examination, that is screening tests clearly defined
None of the studies reviewed provided clear definitions of the component parts of the SEM examination or the screening tests which were used, making it impossible to determine whether the same tests had been conducted, or to compare the outcome of these studies.
Prospective recording of the outcome of the SEM
Eleven of the 16 studies used prospective recording of the outcome of the SEM examination.4-7 9 12 13 15-17 19 The use of retrospective recording of data in the remaining five studies raises questions concerning the validity of the data.
Conditions identified are reported by the following
(1) The number of new problems identified
Only eight of the studies reported whether the problems identified were new (that is not previously identified), or old (that is already known about), and all of these were observational studies and audits.11-14 16-19 No study for which these data were provided had clear definitions of “new” or “ongoing” problems. The wide variation in results clearly suggests the use of differing definitions of “new problems”. Table 3 shows the number of new problems identified at routine and selective SEM examination per 100 children eligible for examination. The range of newly identified problems was 8–45 for the routine SEM and 2–8 for the selective SEM. Once again, there was no overlap between the 95% confidence intervals in the studies of the routine SEM.
(2) The action taken—Advice/reassurance, recall, and referral are important types of action, which may result from a SEM examination. The identification of problems for which no action is taken may not be justifiable. No studies reported the frequency with which advice/reassurance was given, and this may reflect the difficulty of extracting data on this component of the SEM as an independent activity from routine statistics. Only five studies reported on the frequency of referrals,6 7 11 16 18 and a further five studies reported on the frequency of recalls.6 7 12 14 18 Only three studies reported both the rate for referrals and recalls.6 7 18 The range for referrals was 13–31% of children who were screened at the routine SEM and 20–33% of children who were screened at the selective SEM. The range for recalls was 4–50% for the routine SEM and 48–58% for the selective SEM. These data show, once again, large differences within and between the two types of SEM for both referrals and recalls, and very little overlap of the 95% confidence intervals.
Presentation of the referral and recall criteria
No study provided a clear definition of the criteria used for referral and recall. One study referred to problems in need of referral or recall as “significant problems needing attention” and these were defined as follows: “The doctors rated the problems subjectively as to whether they were likely to be insignificant (such as mild asthma or food fads), moderately significant (such as squint) or very significant (such as congenital heart disease or speech delay) in terms of influencing the health of the child within school and affecting the child’s education”.6 The subjective nature of these definitions means that they could not be replicated in other settings. However, this was one of the most explicit definitions of the criteria used for referral in all of the studies included in this review.
Follow up of children after referral
No study was identified with data on the follow up of children after referral. This prevented any estimation of the positive predictive value or yield of either type of SEM in accurately identifying the children with specific conditions.
Follow up of the whole cohort to identify false negative cases
No study was identified with data on the false negative cases produced by either the routine or selective SEM examination, due to the absence of follow up data on the whole cohort. This prevented the calculation of the sensitivity and specificity of the selective and routine screening examination.
The one RCT identified provided follow up, in the year succeeding the trial, of children not selected for SEM. The findings showed that from a cohort of 302 children, 12 were discovered to have serious language development problems, and nine had behaviour problems. No definition of a “serious language problem” was provided, and the nine behaviour problems identified could have developed during the intervening year between test and follow up. This study did not involve the re-examination of children who had been screened in either the selective or routine SEM and as a result, it is impossible to know whether the same number of children would have been missed in the routine medical group.
Selection rate for the selective SEM
The results of all the studies for which it was possible to calculate the percentage of children selected for SEM showed a range of 19–73%. The wide variation in selection rate is likely to be explained by the use of widely differing selection criteria. An interdistrict audit of the SEM in Cheshire which examined this issue in more depth, showed that had the criteria for selection been based on the absence of a satisfactory three year check and/or parental concerns, 217 of the 491 children with new problems which were identified, would have been missed.18 A further prospective observational study of 82 routine SEM examinations in which 48 problems were detected in 37 children, examined what the impact would have been of a number of different selection criteria in terms of the number of children that would have been selected for SEM.17 Various permutations of five criteria were constructed—immunisation status; previous developmental assessments; availability of records; existing defects; and current concerns. Each permutation selected a different number of children while missing children that had been shown to have health problems in the routine SEM that was actually undertaken.
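The counts quoted from these two studies can be expressed as proportions. The short sketch below only restates the figures given above; the helper function is a hypothetical convenience and not part of either study.

```python
def percentage(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Cheshire audit: children with new problems who would have been missed
# had selection rested on the three year check and/or parental concerns.
missed_pct = percentage(217, 491)

# Study of 82 routine SEM examinations: 48 problems detected in 37 children.
children_with_problem_pct = percentage(37, 82)
problems_per_100_children = percentage(48, 82)
```

On these figures, about 44% of the children with new problems in the Cheshire audit would have been missed, and in the second study about 45% of the children examined had at least one problem (roughly 59 problems per 100 children).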
Cost effectiveness of the selective compared with the routine SEM examination
Only two studies provided any data on the costs associated with the use of the selective SEM.4 5 The RCT showed that the selective system consumed 23.3% more medical time than the routine system.5 While this excess time was not translated into costs, we consider it indicates an increase in expenditure on the selective SEM. One further study concluded that the selective SEM may be more expensive than the routine SEM examination especially where immunisation is included.4 These are important findings, albeit rather limited, in the present political and economic climate with fewer funds available for health care, and an emphasis on the streamlining of services.
One further observational study concluded that although medical time was not substantially reduced as a result of a selective SEM, it was, nevertheless, “put to better use” by focusing on children whose health needs were affecting their education.16 However, health needs affecting education were not defined, they were not used to select children for a SEM, and none of the children who were not selected for a SEM examination in this study were followed up to confirm that health problems affecting education had not been missed.
This review focused specifically on the part of the SEM examination affected by the use of routine or selective methods of recruiting children—the examination carried out by the school doctor. The rate for the identification of problems by doctors at SEM varies dramatically. The evidence shows that large numbers of children are identified as having a problem at school entry, and that many of these problems are newly identified as a result of the SEM. A large proportion of these problems result in referrals for further examination or investigation. There were, however, significant differences in the findings between these studies in terms of the number of children selected for medical, the identification of new and ongoing problems, and the number of children both referred and recalled.
One of the biggest problems with the studies was their failure to define the methods used to screen children and the criteria used to define “significant” problems. The only possible explanation of the widely varying rates of identification is the use of different criteria. This problem means that no useful comparisons can be made between different studies. It also makes it impossible to consider the potential benefits or effectiveness of the programme because it is not possible to establish the prevalence, natural history, disability, or treatment effectiveness of ill defined conditions.
The studies included in this review also failed to provide evidence about the efficiency of the SEM in finding children with specific conditions. This was due to the failure to follow up referrals in order to assess the number of false positive cases, and the failure to follow up other children to identify false negative cases. It was not even possible from the data to establish the relative efficiency of the routine and selective SEM. The results showed significant differences not only between routine and selective SEMs but also within the two types of SEM.
The results do, however, show that the selective SEM may be more expensive in terms of doctor time than the routine medical, as a result of the selection process. The extra cost is due largely to the time required to conduct class reviews, which involve communication with other professionals and teachers, and the reading of preschool records, hospital letters and health visitor notes, in order to select children for a SEM examination. Cost depends in part on the number of children selected, but it appears that irrespective of the criteria used or the number selected, the selection process is an unavoidable and expensive part of selective SEM, which may result in a greater expenditure on a smaller number of children. The use of school nurses for this task would require further evidence of the effectiveness of school nurses in selecting all children with health problems, and that selection by school nurses can be conducted at a cost equivalent to or less than that of selection by school medical officers.
The proportion of children selected for examination in studies of the selective SEM varied from 20–70%, and this may well reflect differences in the criteria used, rather than differences in need. A variety of criteria are used for the selection of children for SEM at the present time, and there appears to be little evidence of an awareness of the importance of using validated and standardised questionnaires for this purpose.
The role of the SEM has evolved over time, and the shift from routine to selective SEM may well reflect this. However, there has not yet been a real attempt to address the effectiveness of the SEM in meeting its aims, and the studies we have reviewed demonstrate the fragility of the evidence on which this programme is based. Given the increasing body of evidence showing that screening programmes may be harmful,20 providing a screening programme for which there is no rigorous evidence of benefit is ethically questionable.
None of the research carried out to date provides evidence from either the routine or selective SEM concerning the effectiveness of the broader public health function of the examination—the positive promotion of health, guidance on important health topics, and the maintenance of a body of knowledge in the community regarding child health and development etc. While SEMs are unlikely to be justifiable in terms of these functions alone, the absence of evidence concerning their public health role is a further area for concern.
Data on the efficiency of either the routine or selective SEM in accurately identifying children with new or ongoing health problems, or demonstrating their effectiveness in improving children’s health, are not available at the present time. Because of this, it is impossible to make evidence-based decisions justifying either conducting SEM examinations per se, or using the selective SEM to recruit children.
While more robust studies of the type identified by this review would provide evidence concerning the efficiency of the SEM in identifying children with problems, such studies would not answer questions concerning the effectiveness of the programme. Studies of the efficiency of the SEM should only be undertaken after it has been demonstrated that the other criteria for effectiveness have been fulfilled. This would involve a concerted attempt to define the conditions being sought at SEM, and a review of the evidence to show that these conditions are common, disabling, and amenable to treatment. In the absence of such evidence, we question the ethical basis of the SEM examination.
The shows details of the studies discussed.