Comparing two methods of follow up in a multicentre randomised trial


AIMS To evaluate a parental questionnaire as a means of providing outcome measures for a multicentre randomised controlled trial of treatment for post-haemorrhagic ventricular dilatation.

METHODS The parents of 88 survivors were sent a questionnaire before a paediatric assessment at the age of 30 months. The parents’ responses to individual questions taken mainly from the Griffiths’ mental development scales and their perception of the child’s ability to see and hear were compared with the paediatric findings. A model, based on the parents’ responses to particular questions, allowed the categorisation of the children as normal, impaired, moderately or severely disabled; this was compared with similar categorisation based on the full paediatric assessment.

RESULTS Agreement on items concerning gross motor function ranged between 81 and 99%, concerning dressing between 77 and 80%, concerning feeding between 91 and 99%, and concerning language between 85 and 93%. Similar proportions of children were identified as disabled by the parents (60%) and by the paediatrician (66%). Of 29 children who had developmental quotients less than 70, parents identified 28 as disabled, 18 of them as severely disabled. They were not so good at identifying children with impairments without functional loss.

CONCLUSIONS Further work is required but there is sufficient encouragement from the results to pursue this methodology further for use in comparing groups in randomised trials.

  • outcome measures
  • multicentre randomised controlled trial
  • post-haemorrhagic ventricular dilatation

Randomised controlled trials provide the most scientific way of evaluating interventions in the perinatal period. As well as reporting the immediate short term outcomes of the treatment being offered, it is important to determine longer term beneficial or harmful effects in infancy and later childhood. Some effects may not become apparent until the children are much older and some early problems may resolve.

There are a number of reasons why longer term follow up may be difficult. Firstly, populations of babies identified at the time of birth often scatter geographically in the preschool years. Secondly, many randomised controlled trials require multicentre and sometimes international collaboration to enrol sufficient numbers of participants to assess the usefulness of a treatment either because the condition being treated is uncommon or because the outcome being sought is infrequent. In such trials, the number of children may be large, and widely scattered in this and other countries, presenting problems with language and differences in the organisation of health care and follow up as well as having considerable cost implications.

Follow up in a clinical or research setting is commonly done by a developmental paediatrician using a clinical examination and a variety of developmental tests or by a psychologist administering standardised psychometric tests. In addition, other specialists may provide information on the status of vision or hearing, a teacher may evaluate the child’s progress at school and parents may give information on the child’s behaviour or daily living skills. It is clear that, traditionally, a follow up assessment is a complex composite of history and observation by a number of different people. This is an expensive and time consuming exercise.

Simpler, reliable ways of following up large groups of children need to be found and we have tested a simple postal questionnaire for parents in which information about their child’s health and abilities was sought. The questionnaire was used in the follow up of babies with post-haemorrhagic ventricular dilatation who were entered into a randomised controlled trial between January 1983 and December 1986 and who were randomly allocated to one of two treatments: early tapping of cerebrospinal fluid or conservative management. The children were followed up at the corrected age of 1 year and these findings have been reported.1 A second follow up was done at the corrected age of 30 months and information on the outcome of the children at this age was obtained in two ways. First, a postal questionnaire was sent to parents asking for information on the health and development of their child; second, a full neurodevelopmental assessment was done by a paediatrician in the child’s home. All the assessments were done by one paediatrician and the results of these assessments have already been reported.2

The aim of this study is to evaluate the use of a questionnaire to parents in a group of children with a very high prevalence of impairment and disability by comparing the information from the questionnaires with the findings at the paediatric assessment done at the same age.



Parents were sent the questionnaire with the letter confirming the date and time of the paediatrician’s visit, which was arranged to be as close as possible to the date when the child reached the corrected age of 30 months. The paediatrician collected the questionnaire when she visited the family to do her assessment, normally only a few days later. There were 30 questions overall covering gross and fine motor performance, language, social development, and responsiveness. The questionnaire included 15 items from the Griffiths’ infant development scales covering the age range from 12 to 34 months. Parents responded to these questions with ‘yes’, ‘no’, or ‘uncertain’. The parents were also asked for their opinion about their child’s vision and hearing. Finally they were asked to say at what age level (in months) they felt their child was performing in respect of understanding others and in expressing himself or herself, and, overall, the age level at which their child was behaving.


All but 11 of the 112 children were assessed within 10 days of reaching the corrected age of 30 months. These 11 were mostly seen within a further five days, with the latest being seen 35 days later. The assessment included the Griffiths’ infant development scales,3 a standardised neuromotor assessment,4 and the revised Reynell developmental language scales.5

For most items on the Griffiths’ scale, the examiner observed the child’s response to a variety of stimuli ranging from verbal requests to responding to toys presented in a specific way. For some of the items in the personal-social and hearing and speech subscales, the test manual states that the examiner is permitted to ask the parent if a child usually says a word or phrase or does a particular task, if this is not observed during the assessment.

The Reynell language scales are reported as age equivalent performance levels for comprehension and expression. The developmental tests, together with the Amiel Tison-Stewart neuromotor assessment, formed the basis for an overall assessment of health status as normal, impaired or disabled.

An assessment of the child’s visual acuity was done following the method of Sonksen6 and included a clinical assessment of visual fields and examination for the presence of squint. No fundal examination was carried out. The paediatrician then allocated each child to one of three groups: functional vision, impaired vision (which included those with a field defect), or blind. Likewise, the child’s functional hearing was assessed according to Sheridan7 and categorised as normal, impaired but unaided, or aided.


Comparisons between the parents’ and paediatrician’s assessments were made as follows:

(1) Parents responded to individual questions taken from the Griffiths’ test with yes, no, or uncertain and these responses were compared with the paediatrician’s answers (as yes or no) to the same questions. If a response of ‘uncertain’ was given by a parent, this was regarded as an inability to perform a task, that is, the same as the response ‘no’.

(2) The parents’ perceptions of the child’s ability to hear and to see were compared with the paediatrician’s assessments of hearing and vision. If the parents or the paediatrician were uncertain about the child’s vision or hearing, the responses were categorised separately in the analyses.

(3) The age level estimated by the parents for understanding and speaking (converted to months by the researchers) was compared with the age level (in months) on standard language scales for comprehension and expression (Reynell language scales).

(4) The developmental age level estimated by the parents (converted to months) was compared to the overall age equivalent (in months) obtained from the standard developmental test (Griffiths’ mental development scales).

(5) Parents were not asked directly whether they thought their child was normal, impaired, or disabled. Instead, children were allocated to one of four categories of ‘overall status’ from the parents’ questionnaire responses using a hierarchical approach (Appendix). Firstly, a set of questions was identified to which the answer had to be ‘yes’ before a child could be considered ‘normal’. Secondly, another set of questions was identified, the answer to any one of which had to be either ‘no’ or ‘uncertain’ for a child to be regarded as ‘severely disabled’. A third group was formed using answers to a different set of questions indicating less severe disability (that is, inability to perform certain tasks); this was designated the ‘mild to moderately disabled’ group. Remaining children were considered to be in the ‘impaired’ group without disability. These four groups based on parental responses were then compared with the paediatrician’s allocation into four categories based on information recorded on the full paediatric assessment. The levels of disability were designated moderate or severe based on predefined criteria (Appendix).

(6) The overall status of the children as defined in the parents’ questionnaire was compared directly with the overall Griffiths’ developmental quotient (DQ) in order to answer the question ‘Can parents identify children who are in the lower range of the DQ distribution?’
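The hierarchical allocation described in comparison (5) can be sketched in code. This is an illustration only: the question identifiers below are hypothetical placeholders (the actual question sets are those listed in the Appendix), and the order in which the checks are applied is an assumption that follows the order of the description above. Children with missing answers to key questions are left unallocated.

```python
from typing import Dict, Optional

# Hypothetical question identifiers; the real question sets are in the Appendix.
NORMAL_SET = {"q_sits_unsupported", "q_walks_alone", "q_feeds_self", "q_names_objects"}
SEVERE_SET = {"q_sits_unsupported"}
MODERATE_SET = {"q_walks_alone", "q_feeds_self"}
KEY_QUESTIONS = NORMAL_SET | SEVERE_SET | MODERATE_SET


def is_unable(answer: str) -> bool:
    """'uncertain' is treated as an inability to perform the task, like 'no'."""
    return answer in ("no", "uncertain")


def overall_status(responses: Dict[str, str]) -> Optional[str]:
    """Allocate one child to an overall status from the parents' answers."""
    # A missing key question means no overall status can be allocated.
    if any(q not in responses for q in KEY_QUESTIONS):
        return None
    # 1. Every question in the first set must be answered 'yes' for 'normal'.
    if all(responses[q] == "yes" for q in NORMAL_SET):
        return "normal"
    # 2. Any 'no' or 'uncertain' in the second set means 'severely disabled'.
    if any(is_unable(responses[q]) for q in SEVERE_SET):
        return "severely disabled"
    # 3. Inability on the third set indicates less severe disability.
    if any(is_unable(responses[q]) for q in MODERATE_SET):
        return "mild to moderately disabled"
    # 4. Everyone else is impaired without disability.
    return "impaired"
```

With these placeholder sets, a child whose parents answer ‘yes’ to everything except naming objects would fall into the ‘impaired’ group, because none of the disability questions was failed.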


For questions concerned with the child’s ability or inability to perform a task (yes/no answer), the level of agreement between the parents and the paediatrician was measured by the κ statistic.8 For ordered categories a weighted κ (which adjusts for the seriousness of each discrepancy) was used. Values of κ generally range from 0 (indicating only chance agreement) to 1 (perfect agreement), but are difficult to interpret as they are much affected by the overall totals in each category.9 We have therefore also presented either the raw data, or enough information for the raw data to be deduced.
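As a minimal sketch of these agreement measures, the code below computes percentage agreement, κ, and a weighted κ from paired ratings. Two points are assumptions rather than details from the study: the weighted version uses linear weights (the weighting scheme is not stated here), and the example responses are made up for illustration. The helper `recode_uncertain` mirrors the rule in comparison (1) that a parent's ‘uncertain’ counts as ‘no’.

```python
from collections import Counter


def recode_uncertain(answer):
    """Treat a parent's 'uncertain' as 'no', as in comparison (1)."""
    return "no" if answer == "uncertain" else answer


def percent_agreement(pairs):
    """Percentage of (parent, paediatrician) pairs with identical ratings."""
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def kappa(pairs, categories, weighted=False):
    """Cohen's kappa for two raters; linear weights if weighted=True.

    `categories` needs at least two entries and, for the weighted version,
    must be listed in order of severity.
    """
    n = len(pairs)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    cells = Counter((index[a], index[b]) for a, b in pairs)
    rows = Counter(index[a] for a, _ in pairs)
    cols = Counter(index[b] for _, b in pairs)

    def penalty(i, j):
        # Weighted: larger category gaps count as more serious discrepancies.
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    grid = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(penalty(i, j) * cells[(i, j)] for i, j in grid) / n
    expected = sum(penalty(i, j) * rows[i] * cols[j] for i, j in grid) / n**2
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed / expected


# Made-up responses to one question (not data from the study).
parent = ["yes", "no", "uncertain", "yes", "yes", "no", "yes", "uncertain"]
paediatrician = ["yes", "no", "no", "yes", "no", "no", "yes", "yes"]
pairs = [(recode_uncertain(a), b) for a, b in zip(parent, paediatrician)]

print(percent_agreement(pairs))     # 75.0
print(kappa(pairs, ["yes", "no"]))  # 0.5
```

In this made-up example the raters agree on 75% of children, yet κ is only 0.5 because half of that agreement would be expected by chance given the marginal totals, which illustrates why the raw data are worth presenting alongside κ.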

Where an age level of performance (in months) in the developmental tests was assigned to each child, the differences between the paediatrician’s and parents’ assessments were examined graphically.10 For each child, the difference was plotted against the mean of the two estimates of assessed age (assumed to be the best estimate of the actual age at which the child was performing). Such a plot reveals whether or not differences in assessment appear to be random.
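The quantities behind such a plot can be sketched as follows. The function returns the points to plot (mean of the two estimates against their difference), the mean difference, and the mean ± 2SD range used in the results. The function name, the dictionary keys, and the example ages are illustrative assumptions, not study data; differences are taken as parents' estimate minus paediatrician's estimate.

```python
from statistics import mean, stdev


def agreement_summary(parent_ages, paediatrician_ages):
    """Summarise agreement between two sets of age estimates (in months)."""
    # Difference (parent minus paediatrician) and mean of the two, per child.
    diffs = [p - q for p, q in zip(parent_ages, paediatrician_ages)]
    means = [(p + q) / 2 for p, q in zip(parent_ages, paediatrician_ages)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    lower, upper = bias - 2 * sd, bias + 2 * sd
    return {
        "points": list(zip(means, diffs)),  # x = mean, y = difference
        "mean_difference": bias,
        "sd": sd,
        "range_of_agreement": (lower, upper),
        "n_within_range": sum(lower <= d <= upper for d in diffs),
    }


# Made-up age estimates for four children (not data from the study).
summary = agreement_summary([30, 24, 36, 18], [26, 24, 30, 20])
print(summary["mean_difference"], summary["range_of_agreement"])
```

A positive mean difference indicates that parents, on average, place their child at a higher age level than the paediatrician does.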



Of the original 157 babies recruited into the trial, 32 died (20%) and a further 13 were lost to follow up mainly through emigration. One hundred and twelve were available for review at the age of 30 months. Questionnaires were not sent to the parents of 17 (15%) of the 112 eligible children for the following reasons. The parents of four of the children did not read English. Thirteen children were found to be very disabled and delayed at the time of the 1 year assessment and it was thought that sending a questionnaire was insensitive as the children could achieve so little and completion of the questionnaire might further upset their parents. A further six parents did not receive the questionnaire in time for completion before the paediatrician’s visit.

Of the 89 questionnaires sent out, one was not returned. In six of the 88 questionnaires which were returned, responses were not available to some of the key questions that were used to allocate the overall status. These children could not, therefore, be included in this part of the assessment.
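The flow of children through the study can be checked with simple arithmetic. The sketch below uses only the counts given above; the variable names are illustrative.

```python
recruited = 157
died = 32
lost_to_follow_up = 13
eligible = recruited - died - lost_to_follow_up  # available at 30 months

# Reasons a questionnaire was not sent or not completed before the visit.
not_sent = {
    "parents did not read English": 4,
    "child very disabled at 1 year": 13,
    "not received in time": 6,
}
sent = eligible - sum(not_sent.values())
returned = sent - 1                     # one questionnaire was not returned
with_overall_status = returned - 6      # six lacked answers to key questions

print(eligible, sent, returned, with_overall_status)  # 112 89 88 82
```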


The responses to the 15 Griffiths’ questions answered by both parents and paediatrician are shown in table 1. The percentage agreement ranged from 77 to 99%, with κ ranging from 0.35 to 0.97. For the seven motor tasks, the level of agreement ranged from 81 to 99%; in particular, compared with the paediatrician, parents appeared to underestimate their child’s ability to walk up stairs, although there was a slight difference in the wording of the question. Parents also underestimated their children’s cooperation with dressing and removing articles of clothing, but more parents felt their child could feed himself or herself adequately than was suggested by the paediatrician’s observations.

Table 1

Level of agreement on individual questions (full text of questions in Appendix)


For reasons discussed earlier, most of the parents of blind children were not sent a questionnaire. We compared, therefore, the allocation of the children by the paediatrician into three groups: normal, impaired, or uncertainty about visual status with a similar grouping of the parents’ responses on the questionnaire (table 2). There was agreement between the paediatrician and the parents for 65 of the 84 children (77%) where a comparison could be made (κ = 0.64). Four of the parents did not answer the question about the child’s vision; three of these children were normal and one had a vision impairment according to the paediatrician’s report.

Table 2

Comparison of assessment of vision by paediatrician and parents


There was agreement between the paediatrician and the parents for 74/86 (86%) children on the reporting of hearing status (weighted κ = 0.88) (table 3). Two parents did not respond to the question on the child’s hearing; one of these children had hearing aids. Parents identified only one of the seven children thought by the paediatrician to have impaired hearing. Of the other six, four were considered normal, one parent was uncertain, and one parent did not respond.

Table 3

Comparison of assessment of hearing by paediatrician and parents


Sixteen children were not tested with the Reynell scales because their language development was very delayed.

(A) Comprehension

Eighteen parents did not estimate an age level for comprehension; 63 comparisons were possible. Parents tended to overestimate their child’s performance compared with the paediatrician. The mean (SD) age estimated by the parents was 29.0 (6.5) months and by the paediatrician 25.7 (8.0) months (difference 3.3 months, 95% confidence interval (CI) 1.6 to 5.0; p<0.001). The plot of parent-paediatrician differences in assessed age shows an apparently random scatter (fig 1). The appearance of diagonal lines on this and subsequent plots was due to the tendency for parents to estimate age levels to the nearest half year. If differences are normally distributed, 95% of them would be expected to lie between the mean ± 2SD; this is sometimes referred to as the 95% range of agreement.9 In the study sample, the estimated 95% range of agreement was from −10.2 to 16.8 months and 58/63 (92.1%) of the differences fell within this range.
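As a quick consistency check on the figures just reported, the sketch below works backwards from the range of agreement. It assumes only that the range is the mean difference ± 2SD; the implied SD of the differences is derived here and is not a value reported in the text.

```python
# Figures reported above for comprehension (parents minus paediatrician, months).
mean_difference = 3.3
lower_limit, upper_limit = -10.2, 16.8

# If the limits are mean ± 2SD, their width is 4SD and their midpoint is the mean.
implied_sd = (upper_limit - lower_limit) / 4
midpoint = (upper_limit + lower_limit) / 2

within, total = 58, 63
share_within = 100 * within / total

print(f"implied SD of differences: {implied_sd:.2f} months")  # 6.75
print(f"midpoint of limits: {midpoint:.1f} months")           # 3.3
print(f"share within limits: {share_within:.1f}%")            # 92.1
```

The midpoint of the limits matches the reported mean difference of 3.3 months, so the reported range is internally consistent.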

Figure 1

Comprehension: difference between parents’ and paediatrician’s assessments of age level plotted against mean assessed age level.

(B) Expressive language

Seventeen parents did not estimate an age equivalent for expressive language; 62 comparisons were possible. Parents and the paediatrician assessed expressive language performance at a similar level. The mean (SD) age estimated by the parents was 26.7 (7.7) months and by the paediatrician 25.9 (6.9) months (difference 0.8 months, 95% CI −0.5 to 2.1). The estimated 95% range of agreement was from −9.4 to 11.1 months; 61/62 (98.4%) of the observed differences fell within this range (fig 2).

Figure 2

Expressive language: difference between parents’ and paediatrician’s assessments of age level plotted against mean assessed age level.


As with comprehension, parents tended to overestimate their child’s overall developmental age level compared with the paediatrician. The mean (SD) age estimated by the parents was 26.2 (7.3) months and by the paediatrician 23.9 (6.5) months (difference 2.3 months, 95% CI 1.3 to 3.3; p<0.001). The estimated 95% range of agreement was from −6.0 to 10.6 months; 65/69 (94.2%) of observed differences fell within this range (fig 3).

Figure 3

Overall Griffiths’ developmental age level: difference between parents’ and paediatrician’s assessments of age level plotted against mean assessed age level.


Due to missing responses on key questions on the parent questionnaire, it was not possible to allocate an overall status for six children. In only 42/82 (51%) of the children was there agreement between the overall status allocated by the paediatrician and the level derived from the responses to preselected key questions on the parent questionnaire (weighted κ = 0.62) (table 4). Seventeen of the 19 children allocated to the severely disabled group by the paediatrician were categorised similarly by their parents. Of the other two, one was blind and the parents had said they were unsure about his vision, and the other was thought to have a severe hearing problem which the parents did not regard as so disabling since the child had hearing aids which had been very helpful. While there was some mismatch in the perception of severity of disability between parents and the paediatrician, parents and the paediatrician agreed on the presence of a disabling condition in 45/54 (83.3%) instances.

Table 4

Comparison of overall status of children based on paediatrician’s assessment and parents’ questionnaire


In the context of trials, interest in the use of questionnaires is often focused on the question ‘can questionnaires identify children with developmental delay?’ To answer this question, the ability of parents to identify children with a Griffiths’ DQ of less than 70 was examined (table 5). Of 29 children with DQ less than 70 for whom the parents had given sufficient information for categorisation, 18 were thought by parents to be severely disabled and a further 10 moderately disabled. In the one case where the parents’ report led to their child being categorised as normal, the paediatrician assessed him as having a DQ of 66, with global delay. Interestingly, however, the parents remarked that he was already being surpassed in development by his younger brother born 14 months later.

Table 5

Comparison of overall status of children based on Griffiths’ overall developmental quotient and parents’ questionnaire


Most paediatricians will acknowledge that much of what they learn from an assessment of a child is derived from the observations of the parents. Asking parents directly by questionnaire about their child’s health and abilities is, therefore, an obvious method of gathering information, particularly in large multicentre studies where children may have moved from the place of original care before data about their subsequent course are sought.

There have been a number of different approaches to the problem of determining later outcome in children while avoiding the expense of a full paediatric developmental assessment. These have centred mainly on the use of questionnaires either directly to their parents or for administration by trained (non-medical) interviewers either in person or by telephone.11 In some cases, these questionnaires have been designed to screen larger populations in order to identify those children who require more detailed assessment by a developmental paediatrician12; in others, the aim is to provide sufficient information on which to base an assessment of the child’s health status.13

Questionnaires to parents concentrate on issues of disability or ability: what the child is unable or able to do at any particular stage. Health professionals, on the other hand, will also want to identify impairments that do not have an associated functional loss, as these may provide clues to the nature of the underlying pathology and are prognostically helpful. Within this trial, the aim was to identify both impairments and disabilities. This is one important reason why a questionnaire to parents about disability will not agree totally with an assessment by a paediatrician. For example, an impairment such as a visual field defect identified by a paediatrician may not be obvious to a parent.

Even within the area of ability and disability, however, there are a number of reasons why observations made by parents and paediatricians may differ. First, parents may fail to respond to the questionnaire or to individual questions and this could lead to misleading conclusions about a group of children if the children of the non-responders differed from those of the responders. There is evidence that children who are difficult to trace are more likely to be disabled than those who are easier to locate14; the same may be true of non-response to questionnaires. It is possible that parents of severely disabled children may find it too stressful to respond to questions about their child’s progress or lack of it. Such non-response could lead to an underestimate of the prevalence of disability in a group of children. In this study, although some parents did not answer some questions, there was no clear evidence of response bias, although the numbers overall were small.

Secondly, an assessment carried out on a single occasion by someone unfamiliar to the child may not represent the ‘true’ picture of the child’s ability. A parent living with the child may have observed an achievement some time in the past and therefore knows that the child can do a particular task, even if he or she refuses to repeat the task on demand.

Thirdly, parents with their first child may have different perceptions of ability and disability from those who have had previous children, particularly if those children have been healthy. On the other hand, parents attending assessment centres will see other children who are more disabled than their own and may therefore underestimate their child’s disability.

Other sources of disagreement may arise because some questions may seem inappropriate in certain circumstances; for example, a question about the ability to climb stairs may appear to disadvantage a child who has no stairs in the home, even though most children will attempt to climb outside the home. Furthermore, parents who are aware that their child has a motor problem may not allow the child to attempt such a task for fear of injury.

Disagreements may also arise if a questionnaire has been completed some time before the paediatric assessment. For instance, hearing may change if a child has had a cold in the intervening period, and motor skills, such as walking, may be acquired.

In some cases, the questions are not understood or are ambiguous; this can be corrected only by piloting the questionnaire in different populations and by using simple everyday language, or by translation. Parental literacy or unfamiliarity with written English may also present problems. Furthermore, there may be difficulties with the choice of answer. In this study, we gave the options of the answers ‘yes’, ‘no’, or ‘uncertain’ to most of the questions. In retrospect, this caused problems in the analysis. Occasionally parents used the option of ‘uncertain’ when they had not observed a child doing something, such as picking up a small object with one or other hand. This could be overcome by having a clause—if you have not seen your child doing this, try getting him or her to do it now. In this analysis, however, the response ‘uncertain’ was treated as ‘no’ except in relation to questions about hearing and vision.

In testing language development, the assessments were not quite the same and this might account for some of the disagreement. Parents were asked how many objects a child could name; the paediatrician presented him or her with toys representing everyday objects: a cup, a spoon, a cat, a car, a baby doll, and a ball and the child ‘passed’ by naming four of these objects. In a standardised assessment it is necessary to present all children with the same objects in order to stimulate speech and hence the situation is an artificial one.

Although the mean age levels assessed by the two sources for comprehension, expression, and overall development were not very different (differences of 3.3, 0.84, and 2.3 months respectively), the differences between the parents’ and paediatrician’s estimates for individual children ranged widely. For instance, for comprehension the 95% range of agreement ran from an overestimate by the parents (compared with the paediatrician) of 17 months to an underestimate of 10 months.

Others have also found language to be a problem area. Sonnander in her study,15 validating the use of parental questionnaires against a Griffiths’ developmental assessment at the age of 18 months, reported levels of agreement of the order of 82% in respect of language development, whereas on a number of items assessing fine and gross motor development, she found much higher levels of concordance (95–100%) between parents and the paediatric assessment.

Overall, we found, like Coplan,16 that parents were good at identifying children whose Griffiths’ DQs were less than 70. Furthermore, within the whole study, we found that the parents’ responses categorised 60% of the children as disabled compared with the paediatrician who reported that 66% were disabled, with 21% compared with 23% respectively being in the severely disabled group. At the other end of the scale, parents thought 26% of the children were normal, whereas the paediatrician considered only 6% to be normal. This mismatch arose from the paediatrician’s greater ability to detect impairments not yet manifest in everyday life as a loss of function. For example, a child with abnormal neurological findings without functional loss would have been categorised by the paediatrician as impaired but not disabled, whereas the parent might not have detected the impairment.

The population in this study was an unusual one of children with the serious neonatal complication of post-haemorrhagic ventricular dilatation which carries with it a very high risk of subsequent impairment and disability.2 Most of the parents had been in regular contact with health professionals over the time period of the study. Such parents might therefore be expected to show better agreement with the assessment made by a developmental paediatrician than parents of less disabled children.

In summary, although parents and the paediatrician did not achieve a high level of agreement in respect of the identification of impairment, they agreed well on the presence of disability and on the severity of that disability. In the context of randomised trials, where the outcome sought is the rate of death or disability in the comparison groups, and where it is anticipated that the populations being compared are similar in all respects, including their knowledge of child development, the lack of precise information about impairment in individuals is not so important. We were unfortunately unable to analyse the questionnaire data according to the trial groups in the original study as not all parents were sent the questionnaire because of the severity of their child’s disability at 1 year.

Further development of this questionnaire, and others, to assess different ages and different types of populations, is required but we feel sufficiently encouraged by the results of this study to proceed with developing this methodology.


We are grateful to the parents of the children who participated in the original trial of treatment for post-haemorrhagic ventricular dilatation for their contribution in completing these questionnaires.

Jean Fooks was funded by the Anglia and Oxford Regional Health Authority and the Medical Research Council, Patricia Yudkin by the University of Oxford, Ann Johnson and Diana Elbourne by the Department of Health.


Categorisation of children by parental responses to questionnaire


Criteria for coding children by the paediatrician

The children were assessed on each of four domains: intellectual, neuromotor, vision, and hearing. If a child had multiple systems affected, the final grading depended on the most severely affected system.