The effects of a double blind, placebo controlled, artificial food colourings and benzoate preservative challenge on hyperactivity in a general population sample of preschool children
- B Bateman1,
- J O Warner1,
- E Hutchinson3,
- T Dean5,
- P Rowlandson4,
- C Gant5,
- J Grundy5,
- C Fitzgerald3,
- J Stevenson2
- 1Infection, Inflammation and Repair Division, University of Southampton, Southampton, UK
- 2Department of Psychology, University of Southampton, Southampton, UK
- 3Department of Clinical Psychology, St Mary’s Hospital, Isle of Wight, UK
- 4Department of Paediatrics, St Mary’s Hospital, Isle of Wight, UK
- 5David Hide Asthma and Allergy Research Centre, St Mary’s Hospital, Isle of Wight, UK
- Correspondence to:
Professor J Warner
University Child Health, Southampton General Hospital, Tremona Road, Southampton SO16 6YD, UK
- Accepted 14 September 2003
Aims: To determine whether artificial food colourings and a preservative in the diet of 3 year old children in the general population influence hyperactive behaviour.
Methods: A sample of 1873 children were screened in their fourth year for the presence of hyperactivity at baseline (HA), of whom 1246 had skin prick tests to identify atopy (AT). Children were selected to form the following groups: HA/AT, not-HA/AT, HA/not-AT, and not-HA/not-AT (n = 277). After baseline assessment, children were subjected to a diet eliminating artificial colourings and benzoate preservatives for one week; in the subsequent three week within subject double blind crossover study they received, in random order, periods of dietary challenge with a drink containing artificial colourings (20 mg daily) and sodium benzoate (45 mg daily) (active period), or a placebo mixture, supplementary to their diet. Behaviour was assessed by a tester blind to dietary status and by parents’ ratings.
Results: There were significant reductions in hyperactive behaviour during the withdrawal phase. Furthermore, there were significantly greater increases in hyperactive behaviour during the active than the placebo period based on parental reports. These effects were not influenced by the presence or absence of hyperactivity, nor by the presence or absence of atopy. There were no significant differences detected based on objective testing in the clinic.
Conclusions: There is a general adverse effect of artificial food colouring and benzoate preservatives on the behaviour of 3 year old children which is detectable by parents but not by a simple clinic assessment. Subgroups are not made more vulnerable to this effect by their prior levels of hyperactivity or by atopy.
- artificial food colouring
- benzoate preservatives
- double blind placebo controlled challenge
- ADHD, attention deficit-hyperactivity disorder
- APHR, aggregated parental hyperactivity ratings
- AT, atopy
- ATH, aggregated test hyperactivity
- BCL, Behaviour Checklist
- HA, hyperactivity
- WWP, Weiss–Werry–Peters Activity Scale
There have been no population based studies examining the prevalence of hyperactivity related to intolerance to food additives following the initial claims of the detrimental effect of artificial additives on children’s behaviour.1 Subsequent studies, despite improved methodology, have failed to substantiate this claim2–7 or have only shown a small effect.8–18
A double blind placebo controlled high dose azo dye challenge in a highly selected group of children with behaviour disturbance suggested a small adverse effect on the children’s behaviour based on ratings on the Connor scale.16 There was no association between response and atopy, leading the authors to conclude that any effect was pharmacological rather than IgE mediated. Further clinical evidence from research on urticaria linked artificial food additive responses to IgE independent histamine (and other mediator) release.19 An in vitro study showed that circulating basophils released histamine in a non-IgE dependent response on exposure to azo dyes,20 and an in vivo study found that high doses of tartrazine administered to normal subjects induced significant histamine release.21 Despite this suggested mechanism of action, the belief persists, particularly in the public mind, that “allergy” to artificial food additives is linked to behaviour disturbance. The generalisability of findings from previous studies is limited by samples which are small, depend on an attention deficit-hyperactivity disorder (ADHD) diagnosis,9 are in patients already thought to show adverse behaviour triggered by artificial additives,16 or are recruited from specialist clinics.11 Some studies have identified a higher than expected proportion of atopic children within those whose behaviour appeared to be affected,13 but this has never been systematically examined.
The present study used population based screening to identify children with or without hyperactivity (HA) and with and without atopy (AT). Children were selected from this population for the dietary challenge phase of a within subject double blind placebo controlled study examining the impact of artificial colourings and benzoate preservatives on hyperactive behaviour. The study was designed to test the hypothesis that food additives have a pharmacological effect on behaviour irrespective of other characteristics of the child.16
Figure 1 presents details of the children in the study. The study population comprised 2878 children (dates of birth 1 September 1994 to 31 August 1996), resident and registered with general practitioners on the Isle of Wight (IOW), UK on their third birthday. This includes all children living on the IOW. The study was approved by the local Research Ethics Committee (Reference Number 40/96) and written informed consent was obtained from the parents. Screening with the behaviour questionnaires (phase I) was completed on 1873 children; of these, 1246 subsequently underwent skin prick testing for atopy (phase II). Therefore, of the 2731 children resident on the IOW, 1246 (46%) were potentially available for entry to the food challenge (phase III). One hundred and eighty two did not consent to take part in the challenge and a total of 397 children were selected to enter phase III.
Study design and treatment protocols
The children were initially assessed for hyperactivity, using two scales. Those who scored at least a mean of 4 on the EAS activity scale22 and 20 on the Weiss–Werry–Peters Activity Scale (WWP)23 were designated hyperactive. This definition has been shown in a previous epidemiological study to identify a distinct group of hyperactive 3 year olds.24,25 These two measures appraise hyperactivity in terms of the degree to which the child shows inattention, overactivity, fidgetiness, and impulsivity. The ratings are made by parents on the basis of the child’s usual current behaviour. The children were also assessed for a wider range of behaviour problems using the Behaviour Checklist (BCL).26
Children were defined as atopic if on skin prick testing (Dermatophagoides pteronyssinus, grass pollen, cat allergens, cows’ milk, egg, and peanut) (ALK, Hørsholm, Denmark) they had one or more reactions with a mean wheal diameter ⩾2 mm in the presence of a positive histamine control and negative saline control.27
Children were entered into the four group randomised, placebo controlled, double blind, crossover challenge study. The four groups were in a 2×2 between group design with the following groups: HA/AT, non-HA/AT, HA/non-AT, and non-HA/non-AT.
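The screening rules described above can be sketched in a few lines. This is only an illustration: the thresholds (EAS mean of at least 4, WWP of at least 20, wheal of at least 2 mm with valid controls) come from the text, while the class, field, and function names are hypothetical, and the handling of invalid skin test controls is an assumption.

```python
from dataclasses import dataclass

# Thresholds as stated in the text; all names below are illustrative.
EAS_MEAN_CUTOFF = 4.0
WWP_CUTOFF = 20
WHEAL_CUTOFF_MM = 2.0


@dataclass
class Screening:
    eas_activity_mean: float   # mean score on the EAS activity scale
    wwp_total: int             # Weiss-Werry-Peters Activity Scale score
    wheal_mm: dict             # allergen name -> mean wheal diameter (mm)
    histamine_positive: bool   # positive control reacted
    saline_negative: bool      # negative control did not react


def is_hyperactive(s):
    # Both scale criteria must be met.
    return s.eas_activity_mean >= EAS_MEAN_CUTOFF and s.wwp_total >= WWP_CUTOFF


def is_atopic(s):
    # Assumption: a test with invalid controls is treated as not atopic here;
    # the source does not say how such cases were handled.
    if not (s.histamine_positive and s.saline_negative):
        return False
    return any(d >= WHEAL_CUTOFF_MM for d in s.wheal_mm.values())


def group(s):
    ha = "HA" if is_hyperactive(s) else "non-HA"
    at = "AT" if is_atopic(s) else "non-AT"
    return f"{ha}/{at}"


child = Screening(4.2, 23, {"cat": 3.0, "egg": 0.0}, True, True)
print(group(child))
```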
For the four week study period the child followed an artificial colouring and sodium benzoate free diet. During the second and fourth week they received, daily, and to be taken at home over the course of the day, 300 ml of mixed fruit juices (placebo or active randomly assigned) in identical, sealed bottles, of the same appearance. The active drink included 20 mg in total of artificial food colourings (sunset yellow, tartrazine, carmoisine, and ponceau 4R; 5 mg of each) (Forrester Wood, Oldham, UK) and 45 mg of sodium benzoate (J Loveridge, Southampton, UK). The washout periods used in other studies have varied from days17 to weeks.11,16 There was no carry over effect noted by Rowe and Rowe despite repeated challenges with tartrazine with only two day long placebo periods.17 A period of one week was felt to be both suitable and practical for both the challenge and the washout periods.
Preliminary blind tasting of the placebo and active drinks by 34 adults showed that they were no more likely to identify the content of the drink than expected by chance. Fifteen of the 59 parents who withdrew their child from the study did so due to perceived adverse behavioural changes. Nine of these withdrawals occurred during an active week and six during a placebo week. At the end of the study period the parents were equally divided into those who did or did not correctly identify the drink order. All the study team and the family were blind, apart from the dietician who prepared the drinks and randomly allocated each child using a random number table to receive either active or placebo drinks first.
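As a rough illustration of the allocation step, the sketch below assigns each child a drink order and lays out the resulting four week schedule, with drinks given in the second and fourth weeks as described above. It substitutes a seeded pseudo-random generator for the random number table used in the study, and the function names and labels are hypothetical.

```python
import random


def allocate_order(child_ids, seed=0):
    # A seeded pseudo-random generator stands in for the random number table.
    rng = random.Random(seed)
    return {cid: rng.choice(["active-first", "placebo-first"]) for cid in child_ids}


def schedule(order):
    # Weeks 1 and 3: additive free diet only; weeks 2 and 4: diet plus drink.
    first, second = ("active", "placebo") if order == "active-first" else ("placebo", "active")
    return ["diet only", first, "diet only", second]


orders = allocate_order(["c01", "c02", "c03"], seed=42)
for cid, order in orders.items():
    print(cid, schedule(order))
```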
The child’s behaviour was assessed weekly in the clinic by research psychologists, using validated tests. There was a baseline assessment at the beginning of the challenge month, then four subsequent weekly assessments (time 1 to time 4). The parents also rated changes within their child’s behaviour daily, using behaviours from the WWP:23 (1) switching activities; (2) interrupting or talking too much; (3) wriggling; (4) fiddling with objects or own body; (5) restless; (6) always on the go; (7) concentration. Parents kept a daily “snack” diary to allow an estimate of their compliance with the consumption of the challenge drinks as well as with the diet over the four week study period. Two hundred and twenty four (81%) of children drank all or nearly all of the active and placebo drinks; only 14 (5%) children drank less than two thirds of the active and placebo drinks. Dietary infractions were estimated from the “snack diary”. Each time a portion of drink or food was recorded containing sodium benzoate or an artificial colour this was counted as one “mistake”. Over the study month 34% of children recorded no “mistakes”, 58% recorded 1–6, and 8% more than 6 total “mistakes”. There was no difference in infractions during active or placebo weeks.
Of the 397 selected for phase III, 120 (30%) failed to complete all four weeks of the study. There was no effect of order (children were no more likely to drop out on active than placebo). Gender, hyperactivity, or atopy were also not related to the failure to complete the study.
Children were observed during a period of free play,28 then assessed with three structured tasks: the “bear and dragon” task,29 a delay-aversion “hiding stickers” task,29 and “draw-a-line slowly and walk-a-line-slowly”.30
The clinic based tests produced 12 measures for each visit based on task performance and tester recordings of behaviour: three of inattention, three of activity, and six of impulsivity (for further details of these measures, please contact the authors). The three attention and three activity measures were aggregated into a single index since these aspects of behaviour were so highly correlated. The six impulsivity measures were also aggregated. These summary measures were calculated as a mean of the available constituent measures. An overall aggregate test hyperactivity (ATH) index was also calculated using the same methodology.
The weekly mean of the daily parental behaviour ratings was calculated. Three parental ratings were calculated from the seven item weekly behaviour questionnaire, measuring activity (items 1, 3, 4, 5, and 6), attention (item 7), and impulsivity (item 2).
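A minimal sketch of this scoring step follows. The item to subscale mapping is taken from the text; combining the items of a subscale by taking their mean is an assumption (the text does not say whether sums or means were used), and the function names are illustrative.

```python
from statistics import mean

# Item numbers follow the seven parent-rated behaviours listed in the text.
SUBSCALES = {
    "activity": (1, 3, 4, 5, 6),
    "attention": (7,),
    "impulsivity": (2,),
}


def weekly_item_means(daily_ratings):
    """daily_ratings: one dict per day, mapping item number (1-7) to rating."""
    return {item: mean(day[item] for day in daily_ratings) for item in range(1, 8)}


def subscale_scores(item_means):
    # Assumption: a subscale is the mean of its items' weekly means.
    return {name: mean(item_means[i] for i in items) for name, items in SUBSCALES.items()}


week = [{i: 1 for i in range(1, 8)}, {i: 3 for i in range(1, 8)}]
print(subscale_scores(weekly_item_means(week)))
```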
There were 277 children who completed the trial and for whom test data were available at all five measurement time points (see fig 1). Inevitably with studies on children as young as 3 years, there were missing data in the testing. To deal with this, two procedures were adopted. If the scores were aggregated across a number of measures the mean score was taken for those measures on which the participants had data. If a child had sporadic missing data this was replaced by the modal value for that variable. By this means it was possible to achieve an n of 277 for each of the three measures (aggregated test hyperactivity, test impulsivity, and test activity and attention) at each of the five time points (baseline, pre 1, post 2, pre 3, post 4).
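The two missing data procedures can be sketched as follows. The function names are hypothetical, missing values are represented by None, and ties between modes are resolved by taking the first mode encountered (the behaviour of statistics.mode in Python 3.8 and later), which is an assumption since the text does not discuss ties.

```python
from statistics import mean, mode


def aggregate_available(values):
    # Mean over the constituent measures on which the child has data.
    present = [v for v in values if v is not None]
    return mean(present) if present else None


def impute_modal(column):
    # Replace sporadic missing values with the modal value for that variable.
    observed = [v for v in column if v is not None]
    fill = mode(observed)
    return [fill if v is None else v for v in column]


print(aggregate_available([1.0, None, 3.0]))
print(impute_modal([2, None, 2, 3]))
```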
Each of the measures was based on a different scaling and therefore had different mean values and variances. To facilitate interpretation of the data analysis, all measures were standardised as follows. Each score was expressed as deviation from the baseline mean for that measure divided by the standard deviation at baseline. The test-retest reliabilities for all measures were established (please contact the authors for details).
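The standardisation described above amounts to a z score computed against the baseline distribution. The sketch below assumes the sample standard deviation (n − 1 denominator), which the text does not specify; the function name is illustrative.

```python
from statistics import mean, stdev


def standardise(scores, baseline):
    # Deviation from the baseline mean, in units of the baseline SD.
    mu = mean(baseline)
    sd = stdev(baseline)  # assumption: sample SD (n - 1 denominator)
    return [(x - mu) / sd for x in scores]


baseline = [1, 2, 3]
print(standardise([3, 4], baseline))
```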
To avoid inflating type I errors two primary outcome variables were identified: aggregated test hyperactivity (ATH) and aggregated parental hyperactivity ratings (APHR). This aggregation was made after standardisation and resulted in the standard deviations of the aggregated measures being less than unity.
The initial analysis was based on the pooled data for all subjects. The effect of the order with which the active and placebo supplements were administered was tested in an analysis of variance (ANOVA) by the interaction of the between subject factor of order and the within subject five-level factor of time of measurement. Subsequent analyses pooled subjects across order and were concerned with detecting a difference between the changes in scores in the placebo and active periods. This was tested using a repeated measures ANOVA and shown by the interaction between the two within subject two-level factors of period (active/placebo) and time (pre and post).
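For a design with two within subject factors of two levels each, the period × time interaction is equivalent to a one sample t test on each child's difference in change scores (active change minus placebo change), with F equal to t squared on 1 and n − 1 degrees of freedom. The sketch below illustrates that equivalence on made-up data; it is not the authors' analysis code, and the function and variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def period_by_time_interaction(pre_active, post_active, pre_placebo, post_placebo):
    # Per-child difference of differences: active change minus placebo change.
    dd = [
        (pa_post - pa_pre) - (pp_post - pp_pre)
        for pa_pre, pa_post, pp_pre, pp_post in zip(
            pre_active, post_active, pre_placebo, post_placebo
        )
    ]
    n = len(dd)
    t = mean(dd) / (stdev(dd) / sqrt(n))
    return {"mean_difference": mean(dd), "t": t, "F": t * t, "df": (1, n - 1)}


# Illustrative data for four children (not from the study).
result = period_by_time_interaction(
    pre_active=[0, 0, 0, 0],
    post_active=[1, 2, 1, 2],
    pre_placebo=[0, 0, 0, 0],
    post_placebo=[0, 1, 1, 0],
)
print(result)
```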
The initial sample selection was designed to allow a mixed ANOVA analysis with 2 two-level between subject factors: hyperactive/non-hyperactive and atopic/non-atopic. The interaction of these between subject factors and the within subject factors of period (active/placebo) × time (pre/post) interaction effect was tested. These analyses were based on the total sample of 277 subjects. To take advantage of the matching of cases it was possible to repeat the analysis using 35 matched quartets in the 2×2 design with n = 140.
There were pre-period inequalities in APHR for the active and placebo periods. The analysis was repeated casting the active and placebo periods as a between subject factor. The post-period scores were the dependent variable and the pre-period scores a covariate. The main effect of period in this analysis of covariance indicates differential change in behaviour in the active and placebo periods with initial level of hyperactivity controlled.
Treating the study as a crossover challenge with two treatments with a total of 240 children, the probability is 94% that the study will detect a challenge difference at α = 0.05, if the true difference between the treatments is 0.3 standard deviation units of pre- to post-treatment change scores. As a 2×2 design with 30 subjects per cell the main effects of each factor of 0.35 could be detected with power greater than 0.80 and α = 0.05.
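A power figure of this kind can be approximated with a normal approximation to the paired test. The sketch below is a generic calculation, not the authors' method: the standard deviation of the within child difference in change scores depends on the correlation between the two periods, which the text does not report, so it is left as an explicit, hypothetical parameter (sd_diff), and the result should not be expected to reproduce the 94% quoted above for arbitrary inputs.

```python
from math import sqrt
from statistics import NormalDist


def paired_power(n, delta, sd_diff, alpha=0.05):
    """Two-sided power for a paired comparison (normal approximation).

    delta   : true mean difference between treatments
    sd_diff : SD of the within-child differences (an assumed input)
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    ncp = delta * sqrt(n) / sd_diff
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


# Illustrative call: 240 children, difference of 0.3, assumed sd_diff of 1.0.
print(round(paired_power(240, 0.3, 1.0), 3))
```

Power falls as sd_diff rises, so the assumed variability of the within child differences drives the result as much as the sample size does.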
Table 1 presents the characteristics of the children entering the crossover challenge phase of the study. There were no significant differences between the four groups in terms of gender and mother’s age at leaving full time education. As would be expected the children in the HA/AT and HA/not-AT groups had a significantly higher rate of behaviour problems than the other two groups (χ2 (3, n = 277) = 67.8, p < 0.001).
Validation of the tests
To establish that the tests administered to the children were sensitive to cognitive and behavioural differences between hyperactive and non-hyperactive preschoolers a preliminary analysis was conducted to compare the scores at baseline for these two groups. It was found that on the test measures of impulsivity (t (275) = 3.0, p < 0.004) and attention and activity (t (275) = 3.0, p < 0.004) as well as the ATH measure (t (275) = 3.9, p < 0.001), the hyperactive children had significantly worse scores.
Mean scores on testing and parent ratings from baseline to time 4
Figures 2 and 3 show the pattern of mean scores for children in the active-then-placebo and placebo-then-active groups. There is no evidence for any changes across time for the ATH score. For the parent ratings by contrast there is a pattern indicating a reduction in hyperactivity (an increase in APHR) between baseline and time 1; a period over which food additives were removed from the diet. In the active-then-placebo and the placebo-then-active groups there were increases in hyperactivity for both the placebo and active challenge periods. However in both groups the slope of the lines indicates a greater increase in hyperactivity during the active periods.
Effects of withdrawal of food colourings and additives
There is a similar APHR increase for both groups between time 2 and time 3, that is, during the washout period between challenges. This indicates that the removal of food additives and colourings from the diet may have a beneficial effect detected by parental ratings (fig 3) but not by formal clinic testing (fig 2). The changes in APHR between baseline and time 1 (t (274) = 6.0, p < 0.001) and between time 2 and time 3 (t (275) = 7.4, p < 0.001) are both significant. There were no significant interactions of order with the effects of active and placebo on the mean scores for either ATH scores (F(1,275) = 1.3, NS) or APHR (F(1,274) = 1.4, NS); subsequent analyses are based on the pooled scores for the active and placebo periods ignoring the order. Table 2 presents the means for the five time points.
Effects of challenges
A repeated measures analysis of variance showed that there were no significant changes in the test scores in either the active or placebo periods for the impulsivity (F(1,276) = 1.1, NS), activity and inattention (F(1,275) = 0.3, NS), or ATH (F(1,276) = 1.1, NS) measures. There were, however, significant changes in parental ratings that were shown to interact with type of dietary supplement, indicating a significantly greater increase in hyperactive behaviour during the active period. These significant interactions were found for activity (F(1,275) = 7.7, p < 0.007) as well as for the APHR (F(1,275) = 6.2, p < 0.02), but not for impulsivity (F(1,266) = 2.8, NS (p < 0.10)) or inattention (F(1,271) = 3.5, NS (p < 0.07)). To reduce the risk of type I errors, the remaining analyses were conducted only on the ATH scores and the APHR.
To test whether the child’s initial hyperactivity level or atopy status influenced these changes in hyperactivity under dietary challenge, a set of 2×2 analyses of variance were conducted to detect interactions between these between subject factors and the interaction between time and challenge type. With the ATH score as the dependent variable there was no significant effect of challenge (F(1,273) = 0.2, NS) nor any evidence of interactions between challenge type and atopy or initial hyperactivity.
With the APHR there was a significant effect of challenge type (F(1,272) = 6.5, p < 0.02) but this effect did not interact with either initial hyperactivity status (F(1,272) = 0.0, NS) or atopy (F(1,272) = 0.5, NS), nor was there a joint interaction between these two factors and challenge type (F(1,272) = 0.5, NS).
It can be seen in table 2 that by chance the mean scores on the APHR pre-placebo were lower than pre-active (t (275) = 2.5, p < 0.02). It is necessary to establish whether the significant differences in changes in behaviour under the placebo and active challenges remain when the initial score differences are controlled. An analysis of covariance was conducted on the post-period scores with placebo/active as a between group factor and the pre-period scores as covariates. There was a significant effect of the covariate (F(1,550) = 43.1, p < 0.001) and the effect of challenge type remained significant (F(1,550) = 3.9, p < 0.05).
The observed effect of food additives and colourings on hyperactivity in this community sample is substantial, at least for parental ratings. The change in aggregated hyperactivity as rated by parents while the child was on placebo was 0.38 and for the active supplement was 0.77. The difference between these changes is 0.39 and represents an effect size of 0.51 in relation to the baseline standard deviation of 0.76. The standard deviation at baseline was chosen for this comparison since it represents the extent of variance in hyperactive behaviour in this general population sample before any intervention or dietary manipulations. The change effect size of removing additives and colourings is shown in the increase from baseline to the time 1 scores and was approximately 0.5; a value slightly higher than the 0.39 estimate above. This would be expected given that the parents were not blind to the removal of additives/colourings from their children’s diets and expectancy effects would therefore inflate this change estimate. Nevertheless these two estimates of the impact of food additives/colourings on 3 year old children’s hyperactive behaviour both indicate a statistically substantial effect detectable by parents. The effect size is less than that obtained for methylphenidate (0.82)31 but similar to that for clonidine (0.58)32 in the treatment of children with ADHD.
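The effect size arithmetic in this paragraph can be checked directly. The three input values are those quoted in the text; the variable names are illustrative.

```python
# Values quoted in the text (standardised APHR units).
change_placebo = 0.38
change_active = 0.77
baseline_sd = 0.76

difference = change_active - change_placebo
effect_size = difference / baseline_sd

print(round(difference, 2), round(effect_size, 2))  # 0.39 0.51
```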
These results are based on a sample constituting 10% (277/2731) of a general population of 3 year olds. The starting point of the study was all the 3 year old children living on the IOW. There may have been some self-selection of families to take part in the food challenge phase of the study. However, where checks were made on broad sociodemographic factors, selective attrition for the various stages of the study was not detected. The loss of families during the challenge phase was low considering that these families were not ones that entered the study because of a referral. From this general population sample, 70% (277/397) of those invited to take part in the food challenge completed all phases.
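The participation percentages quoted in the paper follow from the counts given in the text, as the short check below shows. The counts are from the source; the variable and function names are illustrative.

```python
# Counts reported in the text.
resident = 2731        # children resident on the IOW
skin_tested = 1246     # completed phases I and II
selected = 397         # selected for the challenge (phase III)
non_completers = 120   # failed to complete all four weeks
completers = selected - non_completers


def pct(part, whole):
    return round(100 * part / whole)


print(completers)                     # children completing the challenge
print(pct(skin_tested, resident))     # % available for the food challenge
print(pct(completers, selected))      # % of those selected who completed
print(pct(non_completers, selected))  # % who did not complete
print(pct(completers, resident))      # % of the resident population
```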
It was not possible in the present study to obtain parallel evidence for changes in hyperactivity on the basis of psychologist administered tests. This has proved difficult to obtain in previous studies of dietary changes in selected hyperactive samples.33 Parents’ reports have also been found to show the largest effects in drug trials of treatment for ADHD.32 One possible explanation of this is that the tests are not sensitive to hyperactivity in this age group. This was shown not to be the case since the hyperactive children did show significantly worse scores on these tests at baseline.
Parental ratings might be more sensitive to changes in behaviour in that parents experience their child’s behaviour over a longer period of time, in more varied settings and under less optimal conditions. The tests conducted in clinic are liked by the majority of children who see them as an entertaining game; they are given when the children are optimally alert and engaged. In contrast, parents will observe the child’s behaviour when they are competing with siblings for attention; at times when the child is hungry or tired; when the child has less devoted attention from one adult; when the child is interacting with other children; or in a constraining setting such as on public transport or in a supermarket queue. This range of disparate settings will provide the parent with a greater opportunity to observe the child’s hyperactive behaviour.
An additional possibility is that the test-retest reliability of the tests being used was simply insufficient to detect systematic effects of dietary supplements. The reliabilities of the test scores were only modest (0.24–0.72) but comparable with a number of physiological measures at this age (0.25–0.50).34,35
These findings therefore suggest that significant changes in children’s hyperactive behaviour could be produced by the removal of artificial colourings and sodium benzoate from their diet. The results were obtained in a general population sample with only a modest degree of self-selection. A total of 397 families were invited to enter the double blind food challenge phase. Although approximately 30% of families (120/397) did not complete the challenge phase, the completers were no different from the non-completers on any of our baseline measures. Such losses from the study would be expected given the heavy demands placed on these general population families to modify their children’s diet over a five week period.
The reduction in hyperactive behaviour that would arise from removal of the additives used in this study from the diet of preschool children is not related to initial levels of hyperactivity in the child. The child with more extreme hyperactivity showed changes no greater but also no less than other children. The potential long term public health benefit that might arise is indicated by the follow up studies which have shown that the young hyperactive child is at risk of continuing behavioural difficulties, including the transition to conduct disorder and educational difficulties.36,37
Our study has shown that the effect of food additives on behaviour occurs independently of pre-existing hyperactive behaviour or indeed atopic status. This is consistent with other studies which have tended to suggest that if food additives have an effect at all, it is via a pharmacological effect which is best exemplified by the non-IgE dependent histamine release.20,21 We believe that this suggests that benefit would accrue for all children if artificial food colours and benzoate preservatives were removed from their diet. These findings are sufficiently strong to warrant attempts at replication in other general population samples and to examine whether similar benefits of the removal of artificial colourings and sodium benzoate from the diet could be identified in community samples at older ages.
The authors would like to acknowledge the contribution to the study from Dave Pearson, Hasan Arshad, Sharon Matthews, Brenda Fishwick, Karen Simms, and Sophie Dodswell.
The research reported in this paper was funded by research grants from the Food Standards Agency, UK (Grant: FS 3015) and the South West Regional Research and Development Directorate. Smith Kline Beecham contributed to the challenge materials.