Failure to thrive: the prevalence and concurrence of anthropometric criteria in a general infant population
E M Olsen (1), J Petersen (1), A M Skovgaard (2), B Weile (3), T Jørgensen (1), C M Wright (4)

  1. Research Centre for Prevention and Health, Glostrup University Hospital, Glostrup, Denmark
  2. Child and Adolescent Psychiatric Centre, Glostrup University Hospital, Glostrup, Denmark
  3. Department of Paediatrics, Gentofte University Hospital, Gentofte, Denmark
  4. Unit of Paediatric Epidemiology and Child Health, Glasgow University, Glasgow, UK

Correspondence to: Dr E M Olsen, Research Centre for Prevention and Health, Glostrup University Hospital, 2600 Glostrup, Denmark; emao{at}


Background: Failure to thrive (FTT) in early childhood is associated with subsequent developmental delay and is recognised to reflect relative undernutrition. Although the concept of FTT is widely used, no consensus exists regarding a specific definition, and it is unclear to what extent different anthropometric definitions concur.

Objective: To compare the prevalence and concurrence of different anthropometric criteria for FTT and test the sensitivity and positive predictive values of these in detecting children with “significant undernutrition”, defined as the combination of slow conditional weight gain and low body mass index (BMI).

Methods: Seven criteria of FTT, including low weight for age, low BMI, low conditional weight gain and Waterlow’s criterion for wasting, were applied to a birth cohort of 6090 Danish infants. The criteria were compared in two age groups: 2–6 and 6–11 months of life.

Results: 27% of infants met one or more criteria in at least one of the two age groups. The concurrence among the criteria was generally poor, with most children identified by only one criterion. Positive predictive values of different criteria ranged from 1% to 58%. Most single criteria identified either less than half the cases of significant undernutrition (found in 3%) or included far too many, thus having a low positive predictive value. Children with low weight for height tended to be relatively tall.

Conclusions: No single measurement on its own seems to be adequate for identifying nutritional growth delay. Further longitudinal population studies are needed to investigate the discriminating power of different criteria in detecting significant undernutrition and subsequent outcomes.

  • BMI, body mass index
  • CCCC2000, Copenhagen County Child Cohort 2000
  • FTT, failure to thrive
  • PEM, protein energy malnutrition

Failure to thrive (FTT) is regarded as an indicator of physical or psychosocial problems in early childhood and is associated with subsequent growth delay and cognitive deficiencies.1–3 Although the concept of FTT is widely used, no consensus exists regarding a specific definition.4 Thus, FTT has been used to cover a broad range of different anthropometric indicators, usually based on centile charts for weight or height.5,6 Criteria involving behavioural characteristics of the child or quality of the mother–child relationship were proposed in early work, which linked the condition to emotional deprivation,7,8 but a consensus in 1985 concluded that the diagnosis should be based solely on anthropometric parameters.9 Reviews further recognised that the unifying characteristic in FTT was relative undernutrition,10,11 thus approaching the concept of “protein energy malnutrition” (PEM), a term used to describe nutritional deprivation among children in developing countries.4 However, FTT and PEM are described in different literatures, with FTT mainly comprising children in more affluent societies.

Most early studies on FTT used criteria based on attained low weight or, sometimes, height with a cut-off around the 3rd or 5th centile.5 Dynamic measures of weight gain are now increasingly being used,6 including fall from a normal birth weight below a given cut-off, dropping through major centile spaces and, recently, slow conditional weight gain, taking into account the normal phenomenon of regression to the mean, with small children tending to move upwards through the centiles and large children tending to cross downwards.12–15 Although FTT and PEM both refer to paediatric undernutrition resulting in growth deviation,4 different criteria are often used in developing societies. Thus, weight may be expressed as a percentage of the median weight for age, like the Gomez criterion, whereas severe undernutrition is often assessed using weight for height, which has the advantage of not requiring age to be known. Thus, Waterlow’s criterion expresses weight as a percentage of the median weight for measured height.16,17 However, weight for height has not been used much to diagnose FTT in affluent countries, but recently published age-specific body mass index (BMI; weight (kg)/height² (m²)) standards for childhood18–20 could make this method more feasible.

Thus, several definitions of FTT are in use, but it is unclear to what extent these definitions concur, hampering comparison between studies. The few studies that have compared different definitions found poor concordance, but were performed in highly selected clinical cohorts.21,22 To our knowledge, no such comparison has been carried out in a whole population of children from affluent societies.

In this study, we compare the prevalence and concurrence of different anthropometric criteria of FTT when applied to a birth cohort of Danish infants.


Methods

Study population

The birth cohort “The Copenhagen County Child Cohort 2000” (CCCC2000) consisted of all 6090 children born in 16 municipalities in the county of Copenhagen during 2000. The catchment area is the suburban area of Copenhagen city, comprising a mixture of socioeconomic areas with families having low, medium and high income. The CCCC2000 covered 9% of children born in Denmark in 2000, and was representative concerning sex, birth weight, gestational age and Apgar scores.23 The project was approved by the Regional Scientific Ethics Committee of the Copenhagen County, Denmark.

Anthropometric data

Weight and length at birth and perinatal data were obtained from the National Birth Registry. Postnatal measurements were collected by public health nurses using a standardised record as part of four routine home visits conducted when the children were aged about 1–5 weeks, 2–3 months, 4–6 months and 8–10 months. The Danish public health nurses are specifically educated in monitoring child health, and their core duty is to assess the health and development of infants, including standardised measurements of weight and length.

To optimise the availability of data, standardise ages and minimise possible bias due to different numbers of anthropometric measurements, data were grouped into two age groups: 2–6 months (61–180 days) and 6–11 months (181–340 days). One age per child was chosen from each age group, giving priority to ages with both a weight and a length measurement. Growth data were then converted into z scores and centiles using the LMS method.24 Growth curves estimated from the entire growth data of the CCCC2000 cohort were used as reference.25
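
The conversion of a measurement into a z score and centile can be illustrated with the standard LMS transformation. The sketch below is a minimal illustration and not the code used in the study; the function names and the reference values passed in the example are hypothetical and are not CCCC2000 estimates.

```python
import math
from statistics import NormalDist


def lms_z_score(x: float, lam: float, mu: float, sigma: float) -> float:
    """Convert a measurement to a z score with the LMS transformation.

    lam (L) is the Box-Cox power, mu (M) the median and sigma (S) the
    coefficient of variation of the reference at the child's age and sex.
    """
    if x <= 0 or mu <= 0 or sigma <= 0:
        raise ValueError("x, mu and sigma must be positive")
    if abs(lam) < 1e-12:
        # Limiting case of the transformation when L is zero.
        return math.log(x / mu) / sigma
    return ((x / mu) ** lam - 1.0) / (lam * sigma)


def z_to_centile(z: float) -> float:
    """Map a z score to a centile (0 to 100) of the standard normal distribution."""
    return 100.0 * NormalDist().cdf(z)


# Hypothetical reference values for one age and sex (illustration only).
z = lms_z_score(x=6.0, lam=1.0, mu=7.0, sigma=0.12)
print(round(z, 2), round(z_to_centile(z), 1))
```

A z score of about -1.645 corresponds to the 5th centile, which is the cut-off used for several of the criteria in this study.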

Criteria of FTT

Seven clinically used anthropometric criteria for FTT (box 1) were applied to the cohort, using the cut-off points corresponding to “moderate” failure to thrive, and the prevalence and concurrence were compared within and across the two age groups.

Box 1: Anthropometric criteria of failure to thrive

  • Weight <75% of median weight for chronological age (Gomez criterion)

  • Weight <80% of median weight for length (Waterlow criterion)

  • Body mass index for chronological age <5th centile

  • Weight for chronological age <5th centile

  • Length for chronological age <5th centile

  • Weight deceleration crossing more than two major centile lines; centile lines used: 5, 10, 25, 50, 75, 90, 95, from birth until weight within the given age group

  • Conditional weight gain = lowest 5%, adjusted for regression towards the mean from birth until weight within the given age group
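
The percent-of-median and centile-based criteria in box 1 can be sketched as simple threshold checks. This is an illustrative sketch only: the function names are hypothetical, the median weights and centiles are assumed to come from a suitable reference, and counting a line as crossed only when it lies strictly between the birth centile and the later centile is an assumption. The function returns the number of lines crossed and leaves the cut-off to the caller.

```python
MAJOR_CENTILE_LINES = (5, 10, 25, 50, 75, 90, 95)


def percent_of_median(weight_kg: float, median_weight_kg: float) -> float:
    """Express a weight as a percentage of a reference median weight."""
    return 100.0 * weight_kg / median_weight_kg


def meets_gomez(weight_kg: float, median_weight_for_age_kg: float) -> bool:
    """Weight below 75% of the median weight for chronological age."""
    return percent_of_median(weight_kg, median_weight_for_age_kg) < 75.0


def meets_waterlow(weight_kg: float, median_weight_for_length_kg: float) -> bool:
    """Weight below 80% of the median weight for length."""
    return percent_of_median(weight_kg, median_weight_for_length_kg) < 80.0


def below_fifth_centile(centile: float) -> bool:
    """Shared check for the BMI, weight and length for age criteria."""
    return centile < 5.0


def centile_lines_crossed_downwards(birth_centile: float,
                                    later_centile: float,
                                    lines=MAJOR_CENTILE_LINES) -> int:
    """Count major centile lines lying strictly between birth and later centile.

    Returns 0 when the child has stayed level or moved upwards.
    """
    return sum(1 for line in lines if later_centile < line < birth_centile)


# A child falling from the 80th to the 8th weight centile crosses four lines.
print(centile_lines_crossed_downwards(80.0, 8.0))
```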

Conditional weight gain was calculated using the thrive index method,13 where thrive index is the change in weight z scores from birth to the later age, adjusted for regression to the mean, with an average thrive index being zero. This method is described in detail in the appendix.

No child was excluded, as anthropometric screening of children in primary care is normally carried out on the whole population.

Statistical analysis

Crude prevalence was determined using all children screenable by each of the seven criteria, whereas the concurrence among criteria was analysed in subgroups of children for whom growth status could be evaluated for all the given criteria. In the absence of a single gold standard measure of undernutrition, we considered that a child with both poor weight gain and low weight for height was most likely to be significantly undernourished. We thus tested the sensitivity and the positive predictive value of each of the seven anthropometric criteria in detecting children with the combination of conditional weight gain and BMI below the 5th centile, termed “significant undernutrition” in this analysis. BMI was chosen in preference to weight for length, as BMI in childhood is assessed using age-specific centiles, which, at least in theory, make it superior to weight for length in infancy.
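
The sensitivity and positive predictive value of a criterion against the reference definition of significant undernutrition can be computed as in the following sketch. The function names and the toy data are hypothetical and serve only to show the calculation.

```python
from typing import Sequence, Tuple


def significant_undernutrition(cwg_centile: float, bmi_centile: float) -> bool:
    """Reference definition: conditional weight gain and BMI both below the 5th centile."""
    return cwg_centile < 5.0 and bmi_centile < 5.0


def sensitivity_and_ppv(criterion: Sequence[bool],
                        reference: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, positive predictive value) of a criterion.

    criterion[i] is True if child i is identified by the criterion;
    reference[i] is True if child i meets the reference definition.
    """
    if len(criterion) != len(reference):
        raise ValueError("inputs must have the same length")
    tp = sum(1 for c, r in zip(criterion, reference) if c and r)
    fp = sum(1 for c, r in zip(criterion, reference) if c and not r)
    fn = sum(1 for c, r in zip(criterion, reference) if not c and r)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, ppv


# Toy data for five children (illustration only).
criterion_flags = [True, True, False, False, True]
reference_flags = [True, False, True, False, False]
print(sensitivity_and_ppv(criterion_flags, reference_flags))
```

A criterion that flags a large share of the cohort can reach high sensitivity while its positive predictive value stays low, which is the trade-off examined in this analysis.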

Descriptive statistics were used for analyses of prevalence and concurrence, whereas χ² tests and logistic regression models were used to investigate differences between groups. All analyses were performed using SAS.26



Results

Of the 6090 eligible children, 12 (0.2%) died before they could enter the study, 17 (0.3%) never received any visits, 215 families (3.5%) left the county within the child’s first year of life and 222 (3.7%) records were not located, leaving 5624 (92.3%) children with some data from routine visits. Comparison showed that the 466 children without data were more likely to be living with a single parent (odds ratio 2.2, 95% confidence interval 1.6 to 3.1); otherwise the two groups were demographically alike.

Data available for analysis

We analysed the growth data of 4640 children in the first age group (median (interquartile range (IQR)) age 17.3 (14.1–19.6) weeks) and of 4642 children in the second age group (median (IQR) age 35.6 (34.7–37.1) weeks), with 4635 children (82% of children with nurse visits) having a complete set of weight and length data in at least one of the two age groups (table 1).

Table 1

 Number of children with anthropometric measurements among the 5624 children with nurse visits

Weight was the most prevalent measurement recorded, with 5323 (95%) children having at least one weight value compared with 4749 (84%) children having at least one length value. However, children with weight data generally had lower weight and BMI values at birth than children without any weight data. By contrast, children with full versus partial growth data showed no significant differences.

Prevalence of FTT

A total of 17% (n = 942) of the 5624 children with visits met one or more of the anthropometric criteria in the younger age group and 20% (n = 1126) in the older age group, with 27% (n = 1524) meeting one or more criteria in at least one age group, and 10% (n = 545) meeting one or more criteria in both age groups. The total yield for each criterion varied from 1.3% (Waterlow) to 22.2% (crossing at least two major weight centiles downward; table 2).

Table 2

 Crude prevalence of failure to thrive identified by each of seven anthropometric criteria*

Concurrence among criteria

The concurrence among all seven criteria was generally poor. None of the children identified as failing to thrive met all the criteria, and most (623/942 from the younger and 804/1126 from the older age group) met only one criterion. Significant undernutrition was found in 77 (2% of fully screenable children) from the younger age group and 66 (1.8%) from the older age group, with 129 (2.8%) children from at least one age group, but only 14 (0.5%) children from both age groups. Most single criteria either identified less than half of these children or included too large a proportion of the total cohort, resulting in a low positive predictive value (table 3). In particular, only 40% of those with slow conditional weight gain also had low BMI and vice versa, and only 30.5% of children crossing two or more major centiles also had slow conditional weight gain. All children identified by the Waterlow criterion had BMI below the 5th centile, but less than half had a weight below the 5th centile.

Table 3

 Sensitivity and positive predictive values of each criterion in identifying significant undernutrition (BMI and conditional weight gain below the 5th centile).

Characteristics of screen-positive infants

Children identified by low weight, as well as children identified by low length, were significantly smaller at all ages than the rest of the cohort and with significantly lower gestational age, whereas children identified by slow weight gain had only lower mean weight and length in the given age group (table 4). By contrast, children identified by the Waterlow criterion and BMI were significantly taller than average and had higher gestational age. However, although those identified by the Waterlow criterion were of average size at birth, those identified by low BMI were also significantly thinner at birth. Finally, children crossing two or more major weight centiles downwards were significantly larger at birth and had higher gestational age, and although significantly lighter than average in both age groups, they were substantially heavier than children identified by all the other criteria.

Table 4

 Characteristics of children identified as “cases” by the different criteria among fully screenable children


Discussion

As a concept, FTT assumes that the growth pattern seen is a failure that should be identified and reversed. In population-based cases, however, there is also the risk that otherwise flourishing children will be mislabelled, resulting in unnecessary and potentially harmful parental anxiety. It could well be argued that, as undernutrition is relatively uncommon in affluent countries such as Denmark, most children below a single growth threshold are simply growing normally at the extreme end of the distribution. However, undernutrition still does occur and its identification in population screening, by necessity, relies heavily on anthropometric measures.

The concurrence among clinically used criteria for growth delay and undernutrition proved to be poor, with no single measure reliably identifying children with significant undernutrition. Criteria based on a low weight or height centile have the limitation of including normally growing, constitutionally small children, yet missing larger children not falling below the cut-off. By contrast, definitions relying on a decrease on an ordinary centile chart tend to overidentify initially large infants and miss very small ones. Accordingly, low length for age and downward crossing of centiles, without adjustment for regression to the mean, proved to have low sensitivity and positive predictive value and thus seem invalid on both theoretical and practical grounds. The concordance between low weight for length and conditional weight gain proved to be surprisingly low, and a low weight for length seemed to be associated more with relatively tall stature than with low weight. Although the Waterlow criterion may have a high positive predictive value in populations where wasting is common, in this affluent population it predominantly seems to identify relatively tall, slim children at low risk of being nutritionally compromised. This was shown to be an intrinsic limitation of the method’s lack of adjustment for age,27 but the same tendency was also seen with BMI, although to a lesser extent.

This is the first population-based study to examine this question. The study population was a large representative birth cohort with prospectively recorded data, a high participation rate and no substantial differences between participants and non-participants.

Using a routine screening system still involves the risk of selection bias, with over-representation of data on at-risk children—for example, children of low birth weight more often had measurements from all four visits. Condensing data into two age groups reduced this tendency and subsequently no differences were found between children with full versus partial growth data, making it likely that the prevalence rates found were not inflated.

Misclassification due to inadequate growth references, which has been a problem in other studies,28 was dealt with by using the cohort as its own reference.25 Inevitably, there will be some measurement errors, as data were not collected under careful research conditions, but this is the case in any real-life screening situation, which this study aims to assess. Combining deceleration in weight gain with low weight for length seems a theoretically valid definition of significant undernutrition, but criteria depending on multiple measurements are also prone to multiple sources of error, particularly if they use length, which is especially difficult to measure. Overestimation of length in proportion to weight would increase the probability of being identified by criteria using both of these measures. The relatively high gestational age of children identified by the Waterlow criterion and BMI suggests a true difference in length, but the association found between low BMI and higher length does question the validity of our measure of significant undernutrition in infancy.

Two previous studies have compared criteria for FTT or undernutrition in affluent societies. Wright et al21 compared the criteria of Gomez and Waterlow in referred children with weight for age less than the 5th centile or deceleration of weight crossing two major centiles. Likewise, Raynor and Rudolf22 investigated five criteria for undernutrition in children referred to an FTT clinic with weight below the 3rd centile or deceleration in weight crossing two centile channels. Both studies found large differences regarding prevalence, but concurrence was not investigated. Raynor and Rudolf also investigated the cross-sectional association of the criteria with developmental delay and eating difficulties. They found no definition to be more predictive than another, and suggested weight as the most feasible measure of FTT. However, both study populations had already been referred because of low weight or poor weight gain and were thus highly selected, so results cannot easily be applied to the general paediatric population. Ideally, we would test the performance of screening criteria in an unselected population at high risk of undernutrition. In developed countries, however, the only possible high-risk groups would usually have underlying somatic morbidity, covering less than a quarter of children with FTT, and thus rendering the findings non-generalisable and unsuitable for screening purposes.

However, only testing criteria against subsequent outcomes can truly determine whether a growth pattern is pathological. Bairagi et al29 found weight for age to be the best discriminator of the 1-year mortality in a Bangladeshi paediatric population, whereas weight velocity seemed to be a good indicator of short-term mortality. Mortality is a rare outcome in more affluent societies, but poor conditional weight gain was recently found to be associated with sudden infant death syndrome.30 Alternatively, cognitive delay and behavioural difficulties are possible outcomes of FTT, and, in a recent meta-analysis, Corbett and Drewett3 found that poor early weight gain, measured in a range of ways, was associated with minor cognitive deficits in later childhood. Slow conditional weight gain is commonly followed by catch-up weight gain,31 more rapidly in children receiving intervention,32 suggesting that this is not a constitutional pattern. No studies have yet considered outcomes for low BMI in infancy.


Conclusion

The concurrence among different criteria for FTT is low and no single measurement on its own is adequate to identify nutritional growth delay in the general population. The criteria do not always identify the expected group of children, with the international criterion of Waterlow primarily identifying tall children in this affluent population. Longitudinal population studies are needed to investigate the relative contribution of different criteria, particularly BMI, in relation to potential covariates, and to predict outcomes such as neurodevelopmental and behavioural problems.


Appendix

Adjusting for regression towards the mean

A given weight of a child is correlated with earlier weights, with the degree of correlation depending on both age (A) and interval between the measurements (I).

Cole12 calculated the correlations (r) between weight measurements at fixed ages and intervals using a British child cohort with regular weight measurements. He then estimated the correlations between weight measurements differing from this reference dataset regarding age and interval using linear interpolation on r depending on I and A.

Weight gain in z scores (standard deviation scores) was calculated as

$$\text{z score gain} = \frac{z_2 - r\,z_1}{\sqrt{1 - r^2}}$$
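
As a worked illustration of this formula, the sketch below computes the conditional gain for two hypothetical children. The correlation r = 0.6 is an arbitrary illustrative value, not one taken from Cole's reference data or from this cohort, and the function name is hypothetical.

```python
import math


def conditional_weight_gain(z1: float, z2: float, r: float) -> float:
    """Change in weight z score from z1 to z2, adjusted for regression to the mean.

    r is the correlation between weight z scores at the two ages.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return (z2 - r * z1) / math.sqrt(1.0 - r * r)


# A large baby (z1 = 1.0) who ends up at the median (z2 = 0.0).
print(round(conditional_weight_gain(1.0, 0.0, 0.6), 2))

# A small baby (z1 = -2.0) who stays at the same z score (z2 = -2.0).
print(round(conditional_weight_gain(-2.0, -2.0, 0.6), 2))
```

With r below 1, a child who starts small is expected to move towards the median, so remaining at the same low z score counts as slower than expected gain, whereas a fall of one z score from a high starting point is penalised less than the raw change would suggest.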

However, correlations were not available on any Danish cohort and could not be calculated easily from our material, as measuring points were too scattered. Instead, the relationship among weight z scores was estimated using every possible combination of weight pairs for every given child in the cohort, including nearly 15 000 weight pairs for each sex.

This was carried out by linear regression modelling z2 as a function of z1, age (A) and interval (I). The fit of the different models was evaluated using the sum of the squared deviations.

What is already known on this topic

  • Failure to thrive (FTT), often defined by low weight or poor weight gain, is a state of paediatric undernutrition.

  • Poor weight gain is associated with increased mortality in developing countries, and with sudden infant death syndrome and cognitive deficits in affluent societies.

What this study adds

  • The concurrence between clinically used criteria for FTT is low when applied to a general affluent infant population.

  • The sensitivity and positive predictive value of single criteria are poor at detecting children with growth patterns likely to reflect significant undernutrition.

The regression modelling was performed separately for boys and girls, including all children in the cohort. However, the models were also tested after excluding preterm children with gestational age below 37 full weeks.

In all cases, the best fitting model was found to be:

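$$
\begin{aligned}
z_2 ={} & \beta_0 z_1 + \beta_1 z_1 I + \beta_2 z_1 A + \beta_3 z_1 (1/A) + \beta_4 z_1 \log A + \beta_5 z_1 (\log A)^2 \\
& + \beta_6 z_1 (1/I) + \beta_7 z_1 \log I + \beta_8 z_1 A I + \beta_9 z_1 \log A \log I \\
& + \beta_{10} z_1 (\log A)^2 (\log I) + \beta_{11} z_1 (\log I)^2
\end{aligned}
$$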

where z1 is the weight z score at time1, z2 is the weight z score at time2, I is the age interval = (age at time2)–(age at time1) and A is the mean age = (age at time2 + age at time1)/2.
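
The structure of this model, in which every term is the earlier z score multiplied by a function of mean age and interval, can be sketched as follows. The term order follows the coefficients β0 to β11 as read from the model above; the function names are hypothetical, the units of A and I are not stated in the text (both must be positive for the logarithms), and no fitted coefficients are given in the source, so the example uses a placeholder vector.

```python
import math
from typing import List, Sequence, Tuple


def mean_age_and_interval(age1: float, age2: float) -> Tuple[float, float]:
    """Return (A, I): the mean of the two ages and the interval between them."""
    return (age1 + age2) / 2.0, age2 - age1


def design_row(z1: float, mean_age: float, interval: float) -> List[float]:
    """Build the 12 regressors; each is z1 times a function of A and I."""
    a, i = mean_age, interval
    log_a, log_i = math.log(a), math.log(i)
    multipliers = [
        1.0,                 # beta0
        i,                   # beta1
        a,                   # beta2
        1.0 / a,             # beta3
        log_a,               # beta4
        log_a ** 2,          # beta5
        1.0 / i,             # beta6
        log_i,               # beta7
        a * i,               # beta8
        log_a * log_i,       # beta9
        log_a ** 2 * log_i,  # beta10
        log_i ** 2,          # beta11
    ]
    return [z1 * m for m in multipliers]


def predict_z2(beta: Sequence[float], z1: float,
               mean_age: float, interval: float) -> float:
    """Predicted later z score for a given coefficient vector (no intercept)."""
    row = design_row(z1, mean_age, interval)
    if len(beta) != len(row):
        raise ValueError("beta must contain 12 coefficients")
    return sum(b * x for b, x in zip(beta, row))


# Placeholder coefficients: only beta0 is non-zero, so z2 is predicted as z1.
placeholder_beta = [1.0] + [0.0] * 11
a, i = mean_age_and_interval(10.0, 30.0)
print(predict_z2(placeholder_beta, -1.5, a, i))
```

Because the model has no intercept and every term contains z1, a child at the median at the first measurement (z1 = 0) is always predicted to remain at the median; the coefficients themselves would be estimated by ordinary least squares on the weight pairs.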

To validate the model, we performed Efron’s optimism bootstrap procedure as described by Harrell.33 A thousand random samples, for boys and girls separately, including z2 and the predicted values, ẑ2, were drawn from the original data, and the following linear model was fitted:

$$z_2 = \gamma_0 + \gamma_1 \hat{z}_2$$


The means of γ0 and γ1 across the samples are the estimates of any possible overfitting. For boys, γ0 = –0.01303 (standard error (SE) 0.01062) and γ1 = 0.99917 (SE 0.01446). For girls, γ0 = –0.02323 (SE 0.01097) and γ1 = 0.99843 (SE 0.01505). As γ0 was close to 0 and γ1 close to 1 for both sexes, we concluded that no overfitting was present.
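
The resampling step described here can be sketched as below: pairs of observed and predicted z scores are drawn with replacement, a straight line is fitted to each sample, and the intercepts and slopes are averaged. This is a simplified illustration under stated assumptions and not the authors' SAS procedure; in particular, it resamples fixed predictions rather than refitting the full model in each sample as in Harrell's complete optimism bootstrap, and all names are hypothetical.

```python
import random
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = g0 + g1 * x; returns (g0, g1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bootstrap_calibration(observed: Sequence[float],
                          predicted: Sequence[float],
                          n_samples: int = 1000,
                          seed: int = 0) -> Tuple[float, float]:
    """Mean intercept and slope of observed on predicted over bootstrap samples."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    rng = random.Random(seed)
    n = len(observed)
    intercepts, slopes = [], []
    for _ in range(n_samples):
        # Draw n pairs with replacement from the original data.
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            g0, g1 = fit_line([predicted[k] for k in idx],
                              [observed[k] for k in idx])
        except ValueError:
            continue  # skip a degenerate resample with identical predictions
        intercepts.append(g0)
        slopes.append(g1)
    if not slopes:
        raise ValueError("no usable bootstrap samples")
    return sum(intercepts) / len(intercepts), sum(slopes) / len(slopes)


# Toy data in which predictions match observations exactly (illustration only).
toy_predicted = [k / 10.0 for k in range(-20, 21)]
toy_observed = list(toy_predicted)
print(bootstrap_calibration(toy_observed, toy_predicted, n_samples=200))
```

A mean intercept near 0 and a mean slope near 1 indicate that the predictions are well calibrated, which is the pattern reported for both boys and girls.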


Acknowledgements

We are grateful for the contribution of the public health nurses from the County of Copenhagen who obtained the unique prospective data on the birth cohort. We also thank David H Rubin, MD, Chairman, Department of Pediatrics, St Barnabas Hospital, Bronx, and Professor of Clinical Pediatrics, Weill Medical College of Cornell University, New York, USA; and Sannie Nordly, MD, PhD, Department of Paediatrics, Hillerød University Hospital, Hillerød, Denmark, for kindly reading and commenting on the paper.




  • Published Online First 10 March 2006

  • Funding: This study was funded by the Egmont Foundation, the Danish Health Insurance Foundation, the Foundation of Carl August and Jenny Andersen, the Lundbeck Foundation, the Gangsted Foundation, the Beatrice Surovell Haskell Fund for Child Mental Health Research of Copenhagen, the Rosalie Petersen Foundation, the Foundation of Director Jacob Madsen and Wife Olga Madsen, the Linex Foundation and the Danish Ministry of Social Affairs.

  • Competing interests: None declared.
