Background: There is limited knowledge of the accuracy of height and weight measurements from child health records, despite widespread use for research and clinical care. We assess the accuracy of such measurements, using research measurements as the gold standard.
Methods: We compare height/length and weight measurements from clinics of the Avon Longitudinal Study of Parents and Children with routinely collected child health records within 2 months of the clinic date at age 4 (n = 345), 8 (n = 1051), 12 (n = 139), 18 (n = 649), 25 (n = 183), and 43 months (n = 123). To adjust for age differences at measurement, growth data were converted into standard deviation scores using the UK 1990 growth reference.
Results: Mean weight standard deviation score (SDS) differences were ⩽0.08, with mean predicted differences ⩽0.1 kg (eg, mean predicted difference at 8 months −0.011 kg, 95% level of agreement −0.64 to 0.62 kg). Mean height SDS differences were ⩽0.45, with mean predicted differences ⩽0.9 cm (eg, mean predicted difference at 8 months −0.59 cm, 95% level of agreement −3.84 to 2.66 cm). There was indication of lower accuracy at 4 months old (mean predicted height difference at 4 months −0.91 cm, 95% level of agreement −4.61 to 2.79 cm), but this decreased when the age difference between measurements was reduced. Routine measurements slightly overestimated heights of tall children and underestimated those of short children, but otherwise differences were not associated with sex, social class, birth weight, birth length, or maternal anthropometry.
Conclusion: Routinely collected child health record height/length and weight data are compatible with no systematic bias, at least in children over 8 months old, supporting their use in clinical practice and research.
Anthropometric data are widely used to measure general health in children. Although initially genetically programmed, both height and weight are strongly affected by social and environmental conditions, making growth a useful marker of health and living conditions.1 Weight responds quickly to acute environmental cues. Height, conversely, reflects chronic conditions. Growth is also of interest for its impact on later health; rapid growth in infancy is associated with increased risk of vascular and metabolic disorders in later life.2 The importance of growth is reflected in the fact that regular growth monitoring of children is a feature of many health systems internationally.
What is already known on this topic
Height and weight measurements from child health records are a potentially useful source of data for both clinical practice and research, but there is limited knowledge of their accuracy.
What this study adds
This study uses research clinic data to demonstrate that height and weight measurements routinely collected by health workers have good accuracy.
In the UK, children are measured regularly, with data plotted on growth reference curves. Since 1991, Personal Child Health Records (PCHRs) have been issued to all new parents,3 and are now endorsed in the National Service Framework for Children.4 PCHRs are intended to improve communication between parents and health workers, increase parental understanding of their child’s health and development, and assist with continuity of care.5 Booklets contain tables/charts for length and weight throughout infancy. The majority of Primary Care Trusts use health visitors to complete the PCHRs,6 although staff nurses, community nurses, and student health visitors may also record measurements. PCHRs are generally well used, although the percentage of mothers able to produce them when requested, and the proportion with measurements recorded was lower in disadvantaged than more advantaged groups.7
Routine growth measurements are used to identify children whose growth is of concern, for example, those failing to thrive. In addition to clinical use, data could be used for research, in particular for birth cohort studies. Despite the resources invested in data collection, and the wide-reaching potential of the data, there is limited knowledge of the accuracy of the measurements in PCHRs. One study demonstrated good test-retest reliability of community-based length and weight measurements following training of health visitors in anthropometric measurement (70 health visitors returned test-retest data for five children).8 A further study compared measurements on 10 children taken by health visitors with those by a trained auxologist, and found good accuracy of health visitor measurements.9 This study was preceded by health visitor training. There is, to our knowledge, no evidence on the accuracy of routinely collected measurements that did not involve training of the health workers.
This study investigates the accuracy of length/height and weight measurements from the PCHR using research clinic measurements as a gold standard, and evaluates predictors of differences.
Data and methods
The Avon Longitudinal Study of Parents and Children (ALSPAC) is a prospective cohort study investigating the health and development of children. Full study methodology is published elsewhere10 and is detailed on the study website (http://www.bristol.ac.uk/alspac). Briefly, pregnant women resident in one of three Bristol-based districts with an expected delivery date between 1 April 1991 and 31 December 1992 were invited to participate. Of these women, 14 541 were recruited. A sub-sample of the cohort, known as Children in Focus (CiF), was selected for in-depth follow-up at clinics. CiF was a random 10 per cent sample of the last 6 months of ALSPAC births. Mothers who had moved out of the study area, those lost to follow-up, and those taking part in another study of infant development were excluded. One thousand four hundred and thirty-two families attended at least one CiF clinic. Ethical approval of the study was obtained from the ALSPAC Law and Ethics Committee and three Local Research Ethics Committees.
CiF clinics were held at 4, 8, 12, 18, 25, 31, 37, 43, 49 and 61 months. We compare heights/lengths and weights from CiF clinics with those recorded in the PCHR within 2 months of the CiF clinic. Numbers were insufficient for meaningful comparisons at the 31-, 37-, 49-, and 61-month CiF clinics (table 1). For the remaining clinics, between 123 and 1051 children (11 to 80%) attending the clinic also had measurements in their PCHR within 2 months (table 1). CiF clinics were analysed separately to assess whether accuracy of the PCHR data varies with age.
Measurement of weight and length
Routinely collected weight and height/length data were extracted from PCHRs. The equipment used for weight measurements at CiF clinics was: at 4 months, Fereday 100 kg combined scale; at 8 months, Soehnle scale or Seca scale, model 724; at 12 months, Seca 724 or Seca 835 (for children who could only be weighed with parent); from 18 months onwards, Seca 835. The equipment used for height/length measurement at CiF clinics was: recumbent length (crown-heel) at 4 months, Harpenden Neonatometer (Holtain Ltd, Crymych, Pembs, UK); length from 8 months to 24 months inclusive, Kiddimetre (Raven Equipment Ltd, Dunmow, Essex, UK); standing height from 25 months onwards, Leicester height measure. Staff received training in measurement. Interobserver reliability was established at each clinic.
Assessment of predictors of differences in anthropometry between PCHR and research visits
The following characteristics were considered a priori to be potential predictors of differences between PCHR and CiF measurements: height and weight (mean of PCHR and CiF measurements), sex, birth weight and length (from obstetric records), household social class (highest of mother’s and her partner’s occupational social class assessed from mother’s questionnaire at birth, categorised as manual/non-manual), multiple births (from obstetric records), and maternal height and pre-pregnancy weight (from mother’s questionnaire at recruitment). These factors are predictors of differences between clinic and self-reported heights and weights in other studies (eg, Dubois and Girard11).
To assess whether the children with CiF and PCHR measurements are representative of the full CiF cohort, they were compared with the full CiF cohort in terms of sex, birth length and weight, social class, and maternal anthropometry.
Using CiF measurements as the gold standard, CiF and PCHR measurements were compared to assess the accuracy of PCHR measurements. Since PCHR and CiF measurements were taken at slightly different ages (although always within 2 months), and since children grow rapidly, comparisons needed to take account of age differences. To do this, measurements were converted into standard deviation scores (SDS) using the UK 1990 growth reference.
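As an illustration of the SDS conversion, the sketch below assumes the growth reference is supplied as age- and sex-specific LMS parameters (Box-Cox power L, median M, coefficient of variation S), which is the form in which the UK 1990 reference is usually distributed. The function name and the parameter values are hypothetical; the values are not taken from the reference tables, and this is not necessarily the software the authors used.

```python
import math


def lms_sds(measurement: float, L: float, M: float, S: float) -> float:
    """Return the SDS of a measurement given LMS parameters.

    L (Box-Cox power), M (median) and S (coefficient of variation) are
    assumed to have been looked up for the child's exact age and sex.
    """
    if measurement <= 0 or M <= 0 or S <= 0:
        raise ValueError("measurement, M and S must be positive")
    if abs(L) < 1e-12:
        # Limiting case of the Box-Cox transformation as L approaches 0.
        return math.log(measurement / M) / S
    return ((measurement / M) ** L - 1.0) / (L * S)


# Hypothetical LMS values for illustration only; NOT from the UK 1990 tables.
HYPOTHETICAL_LMS = {"L": 0.2, "M": 8.6, "S": 0.11}

sds_at_median = lms_sds(8.6, **HYPOTHETICAL_LMS)  # child exactly on the median
sds_heavier = lms_sds(9.5, **HYPOTHETICAL_LMS)    # child above the median
```

Because each measurement is scored against the reference at the child's own age at measurement, two measurements taken a few weeks apart can be compared on a common scale.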
Differences between SDS from CiF clinics and PCHR measurements (CiF minus PCHR) were calculated, together with the mean and standard deviation of these differences. A mean SDS difference of 0 is expected if PCHR measurements are unbiased. Bland–Altman graphs were plotted, allowing graphical assessment of agreement. In these graphs the mean of the height/weight SDS at CiF clinic and height/weight SDS from the PCHR is plotted against the difference between SDS, with lines showing the mean difference and the 95% bounds of agreement.12
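The quantities behind a Bland–Altman graph can be sketched as follows. This is a minimal illustration, assuming paired lists of SDS values and the CiF minus PCHR sign convention used in the results; the function name and the sample values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(cif_sds, pchr_sds):
    """Return plotting points and agreement statistics for paired SDS values."""
    if len(cif_sds) != len(pchr_sds) or len(cif_sds) < 2:
        raise ValueError("need at least two paired measurements")
    # Difference is CiF minus PCHR, so a positive value means the PCHR
    # measurement is an underestimate relative to the research clinic.
    diffs = [c - p for c, p in zip(cif_sds, pchr_sds)]
    means = [(c + p) / 2 for c, p in zip(cif_sds, pchr_sds)]
    d_mean = mean(diffs)
    d_sd = stdev(diffs)  # sample standard deviation of the differences
    return {
        "points": list(zip(means, diffs)),  # x = pair mean, y = difference
        "mean_diff": d_mean,
        "sd_diff": d_sd,
        "lower": d_mean - 1.96 * d_sd,
        "upper": d_mean + 1.96 * d_sd,
    }


# Made-up SDS values for four children, for illustration only.
result = bland_altman([0.1, 0.5, -0.3, 1.2], [0.0, 0.7, -0.2, 1.0])
```

The `points` entries give the coordinates to plot, and `lower` and `upper` give the horizontal lines marking the 95% bounds of agreement.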
CiF and PCHR SDS were used to predict measurements at the target ages of CiF clinics. Predicted heights and weights from the CiF and PCHR were compared using the mean difference and 95% level of agreement (mean ± 1.96 × standard deviation of the difference). This allows differences between CiF and PCHR measurements to be assessed in centimetres/kilograms rather than SDS, for ease of interpretation.
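One way to turn SDS back into predicted measurements at a clinic's target age is to invert the LMS transformation, then summarise the paired differences as the mean and the mean plus or minus 1.96 standard deviations. The sketch below assumes LMS parameters for the target age are available and, for brevity, ignores sex-specific parameters; the function names are hypothetical and the authors' exact procedure may differ.

```python
import math
from statistics import mean, stdev


def sds_to_measurement(sds: float, L: float, M: float, S: float) -> float:
    """Invert the LMS transformation: SDS -> measurement at the target age."""
    if abs(L) < 1e-12:
        return M * math.exp(S * sds)
    base = 1.0 + L * S * sds
    if base <= 0:
        raise ValueError("SDS outside the range these LMS parameters allow")
    return M * base ** (1.0 / L)


def predicted_agreement(cif_sds, pchr_sds, L, M, S):
    """Mean predicted difference (CiF minus PCHR) and 95% level of agreement."""
    if len(cif_sds) != len(pchr_sds) or len(cif_sds) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [
        sds_to_measurement(c, L, M, S) - sds_to_measurement(p, L, M, S)
        for c, p in zip(cif_sds, pchr_sds)
    ]
    d_mean, d_sd = mean(diffs), stdev(diffs)
    return d_mean, d_mean - 1.96 * d_sd, d_mean + 1.96 * d_sd
```

The reported intervals are consistent with this form: at 8 months the predicted weight interval of −0.64 to 0.62 kg is roughly symmetric about the mean difference of −0.011 kg.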
Precision of the measurement methods was examined by calculating the average within-person standard deviation of SDS (for both weight and length/height). This assumes individuals remain on the same centile throughout the period studied. As this assumption applies to both methods, comparison of their precisions should be valid, although the exact values may be underestimated.
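A minimal sketch of one reading of this precision measure: compute the standard deviation of each child's SDS values across visits, then average over children. The data layout (a mapping from a child identifier to a list of SDS values) and the function name are assumptions; a pooled estimate based on the mean within-person variance would be an alternative.

```python
from statistics import mean, stdev


def average_within_person_sd(sds_by_child):
    """Average, over children, of the SD of each child's SDS values.

    Children with fewer than two measurements contribute no information
    about within-person variability and are skipped.
    """
    per_child = [stdev(v) for v in sds_by_child.values() if len(v) >= 2]
    if not per_child:
        raise ValueError("no child has two or more measurements")
    return mean(per_child)


# Made-up SDS values, for illustration only.
example = average_within_person_sd({
    "child_a": [0.0, 1.0],       # varies between visits
    "child_b": [0.5, 0.5, 0.5],  # stays on the same centile
    "child_c": [2.0],            # single measurement, skipped
})
```

A child who tracks along one centile contributes a standard deviation of 0, so larger values reflect some combination of true centile crossing and measurement error.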
Univariable and multivariable linear regressions of factors (see above) potentially predictive of differences were performed.
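For a single predictor, such a regression reduces to ordinary least squares of the SDS difference on that predictor. The sketch below covers only the univariable case with a hypothetical function name and made-up data; a multivariable model would normally be fitted with a statistics package.

```python
from statistics import mean


def univariable_ols(x, y):
    """Least-squares slope and intercept for y regressed on a single x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variation")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up data: x = mean height SDS of each pair, y = CiF minus PCHR SDS.
slope, intercept = univariable_ols([-1.0, 0.0, 1.0, 2.0], [0.1, 0.0, -0.1, -0.2])
```

With the CiF minus PCHR convention, a negative slope on mean SDS corresponds to routine measurements underestimating small children and overestimating large ones.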
Results
Children with measurements from both CiF clinics and PCHRs appear to be representative of the full CiF cohort (defined by those attending at least one CiF clinic) (see supplementary table 1). Note, however, that the CiF cohort is not necessarily representative of all those invited to participate: attendees tended to have older and better-educated mothers (data not shown).
Differences between CiF and PCHR SDS and predicted measurements are presented in table 2. Positive differences indicate that PCHR measurements are underestimates, and vice versa. Mean differences can be close to 0 even with overall low agreement between the two measures. More important are 95% levels of agreement, which provide the range of differences between the CiF and PCHR for 95% of the sample and hence demonstrate the degree to which the two measurements may differ.
Mean differences in weight SDS were all <0.08 SDS in magnitude, and the 95% levels of agreement for predicted weights lay within −1.5 to 1.6 kg at every clinic (table 2) (eg, mean difference in predicted weight at 4 months −0.037 kg, 95% level of agreement −0.78 to 0.71 kg and at 12 months −0.0050 kg, −0.81 to 0.80 kg). Bland–Altman plots and regressions on the difference demonstrate that differences are evenly distributed apart from at 4 months, where PCHR measurements tend to underestimate the weight of heavier children (supplementary fig 1 and supplementary table 2). Other factors (social class, sex, birth weight and length, and maternal anthropometry) do not predict the differences (supplementary table 2).
There is some indication that the accuracy of weight measurements is lower in the younger infants. The lower bound of the 95% level of agreement represents approximately 12% of the mean weight of infants at 4 months, compared with 7%, 8%, 8%, 10% and 9% at 8, 12, 18, 25 and 43 months, respectively.
The estimated within-person standard deviation for SDS of weight was 0.49 for CiF measurements and slightly higher at 0.57 for PCHR measurements. The standard deviation of the SDS difference can be compared with the within-person standard deviation of the SDS to estimate the extent to which observed differences between the CiF and PCHR SDS might be a result of inherent variability in weight SDS of the children. At each age, the standard deviation of the SDS difference between CiF and PCHR measurements is less than the within-person standard deviation of the PCHR SDS, and less than or very similar to the within-person standard deviation of the CiF SDS. This provides reassurance that PCHR weight measurements are precise, since comparing a CiF with a PCHR measurement is similar to comparing two CiF measurements.
Mean SDS height/length differences were larger than for weight, although the mean SDS difference was <0.45 in magnitude in each case. The 95% levels of agreement for predicted heights lay within approximately −4.6 to 3.1 cm at every clinic (eg, mean difference in predicted height at 12 months −0.40 cm, 95% level of agreement −3.24 to 2.44 cm). Although the mean trend is for PCHR measurements to overestimate height, Bland–Altman plots (supplementary fig 1) and regressions on the difference (supplementary table 2) demonstrate that PCHR measurements tend to underestimate the height of short children and overestimate that of tall children. Other factors (social class, sex, birth weight and length, and maternal anthropometry) are not associated with differences (supplementary table 2).
The mean difference and 95% levels of agreement of predicted height are largest at 4 months (mean difference −0.91 cm, 95% level of agreement −4.61 to 2.79 cm). This represents a proportionally greater difference in the infants’ height than at older ages. The lower bound of the 95% level of agreement represents approximately 7% of mean length at 4 months, compared with 5%, 4%, 5%, 4% and 4% at 8, 12, 18, 25 and 43 months, respectively.
The estimated within-person standard deviation for height SDS was 0.47 for CiF and considerably higher at 0.75 for PCHR measurements. Comparing the within-person standard deviation of CiF and PCHR SDS to the standard deviation of the difference in SDS demonstrates that at 4 months, the variability of differences (SD 0.88) is greater than within-person variability of either CiF or PCHR SDS. At older ages, the standard deviation of the differences reduces to be less than or similar to the within-person standard deviations. These findings suggest the PCHR measurements of length at 4 months are relatively imprecise.
Sensitivity analyses examined whether the lower accuracy at 4 months reflected the relatively large age difference between CiF and PCHR measurements at this age. The measurements appeared more accurate when the age difference was restricted to 7 or 6 weeks. At 4 months, mean SDS difference was −0.43 for length and −0.049 for weight when measurements within 2 months were analysed (n = 345 and 382); mean differences reduced to −0.33 for length and −0.033 for weight when the maximum difference was 7 weeks (n = 118 and 142), −0.20 for length and 0.0043 for weight when the maximum difference was 6 weeks (n = 46 and 65). Because of small numbers it was not possible to reduce the age difference further.
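The restriction used in this sensitivity analysis can be sketched as a filter on the age gap between the paired measurements, followed by recomputing the mean SDS difference. The record layout, field names and sample values below are hypothetical.

```python
from statistics import mean


def mean_sds_difference(pairs, max_gap_weeks):
    """Mean CiF minus PCHR SDS difference among pairs within the age gap.

    Each pair is a dict with hypothetical keys 'cif_age_weeks',
    'pchr_age_weeks', 'cif_sds' and 'pchr_sds'. Returns (mean, n).
    """
    kept = [
        p for p in pairs
        if abs(p["cif_age_weeks"] - p["pchr_age_weeks"]) <= max_gap_weeks
    ]
    if not kept:
        raise ValueError("no pairs fall within the requested age gap")
    diffs = [p["cif_sds"] - p["pchr_sds"] for p in kept]
    return mean(diffs), len(kept)


# Made-up records for three infants, for illustration only.
pairs = [
    {"cif_age_weeks": 17, "pchr_age_weeks": 9, "cif_sds": 0.0, "pchr_sds": 0.6},
    {"cif_age_weeks": 17, "pchr_age_weeks": 11, "cif_sds": 0.2, "pchr_sds": 0.4},
    {"cif_age_weeks": 18, "pchr_age_weeks": 13, "cif_sds": -0.1, "pchr_sds": 0.0},
]
all_pairs = mean_sds_difference(pairs, max_gap_weeks=9)
within_six_weeks = mean_sds_difference(pairs, max_gap_weeks=6)
```

Tightening the threshold trades sample size for comparability, which is why the analysis above could not be restricted below 6 weeks.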
Discussion
Anthropometric data recorded in the PCHR are clinically important and potentially of great use for birth cohort and other studies. Evidence suggests important and systematic inaccuracies in self- or parental-reported weight and height compared with measured values.11,13–22 The accuracy of measurements in child health records, however, has been little investigated.
This study has demonstrated good accuracy of routine weight measurements, particularly from age 8 months onwards, supporting their use for both clinical practice and research.
Accuracy of routine height/length measurements appears to be slightly lower, with PCHR measurements tending to be underestimates for shorter children and overestimates for tall children. The more pronounced differences for length than weight potentially reflect the difficulty of measuring length in small children; infants must lie still and stretched out. For children over 8 months old, accuracy of PCHR length measurements was good.
There was some indication that accuracy of routine length measurements, and to a lesser degree weight, is lower in younger infants. This may be an artefact of study design. For 4 month olds, the mean age difference between measurements was almost 2 months. This is an unavoidable limitation of using secondary data that did not include this research as a primary objective. Whilst establishing a study to specifically examine accuracy of routine measurements of infant size would avoid this limitation, such a design could introduce bias if health visitors became aware of the study. In our study the routine measurements reflect true practice, uninfluenced by health visitors knowing that their measurements would be compared to research clinic measurements. For 4-month-old infants, a 2-month age difference could be substantial. The assumption of not crossing reference centiles may not be valid, resulting in the false impression of inaccuracy of PCHR measurements. Restricting analysis to individuals with smaller time differences between measurements suggested that the apparent inaccuracy of PCHR measurements at this young age was at least in part due to the age differences between measurements.
Important inaccuracies in routine measurements would be concerning for both researchers and clinicians. For clinical purposes, underestimated measurements would cause unnecessary concern about infants’ growth. Conversely, overestimated results would mean some slow-growing children were falsely recorded as having good growth. Our data, however, imply little systematic bias in reporting of weights and only slight inaccuracy of length/height measurements.
Strengths and limitations
To our knowledge this is the first study to compare routine measurements of children’s length/height and weight with measurements from research clinics. An important strength of our study is that comparisons were made across a range of ages. A limitation is that the measurements were taken over 15 years ago in one area of the UK; accuracy of routine measurements for contemporary children in other areas may differ. Nonetheless our results demonstrate that it is possible to accurately record weight, and to a slightly lesser degree height/length, in routine clinical practice.
We are extremely grateful to all the families who took part in this study, the midwives for their help in recruiting them, and the whole ALSPAC team, which includes interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists, and nurses. We thank Paul Clarke (Centre for Market and Public Organisation, University of Bristol), Fiona Steele (Centre for Multilevel Modelling, University of Bristol), and reviewers for helpful comments on an earlier draft.
Funding This work was funded by a grant from the UK Economic and Social Research Council (RES-060-23-0011). This grant provides the salary for LDH. The UK Medical Research Council, the Wellcome Trust and the University of Bristol provide core funding support for ALSPAC. The UK Medical Research Council and the University of Bristol provide core funding for the MRC Centre of Causal Analyses in Translational Epidemiology.
Competing interests None.
Ethics approval Ethical approval of the study was obtained from the ALSPAC Law and Ethics Committee and three Local Research Ethics Committees.
Provenance and peer review Not commissioned; externally peer reviewed.