Original articles
1 Department of Paediatrics, College of Medicine, Blantyre, Malawi
2 Postgraduate Statistics Centre, Department of Mathematics and Statistics, Lancaster University, Lancaster, UK
3 Centre for Medical Statistics and Health Evaluation, University of Liverpool, Liverpool, UK
4 Department of Community Health, College of Medicine, Blantyre, Malawi
5 Department of International Health, University of Tampere Medical School, Finland and Department of Paediatrics, Tampere University Hospital, Finland
6 Institute of Child and Reproductive Health, University of Liverpool, Liverpool, UK
Correspondence to:
Dr Melissa Gladstone, Institute of Child Health, University of Liverpool, Royal Liverpool Childrens Hospital, Eaton Rd, Liverpool L12 2AP, UK; mgladstone{at}btinternet.com
Accepted for publication 4 March 2007
| ABSTRACT |
|---|
Design: Through focus groups, piloting work and validation, a more culturally appropriate developmental tool, based on the style of the Denver II, was created. Age standardised norms were estimated using 1130 normal children aged 0–6 years from a rural setting in Malawi. The performance of each item in the tool was examined through goodness of fit on logistic regression, reliability and interpretability at a consensus meeting. The instrument was revised with removal of items performing poorly.
Results: An assessment tool with 138 items was created. Face, content and respondent validity were demonstrated. At the consensus meeting, 97% (33/34) of gross motor items were retained in comparison to 51% (18/35) of social items, and 86% (69/80) of items from the Denver II or Denver Developmental Screening Test (DDST) were retained in comparison to 69% (32/46) of the newly created items, many of these having poor reliability and goodness of fit. Gender had an effect on 23% (8/35) of the social items, which were removed. Items not attained by 6 years came entirely from the Denver II fine motor section (4/34). Overall, 110 of the 138 items (80%) were retained in the revised instrument with some items needing further modification.
Conclusions: When creating developmental tools for a rural African setting, many items from Western tools can be adapted. The gross motor domain is more culturally adaptable, whereas social development is difficult to adapt and is culturally specific.
When child development is assessed in clinical studies in developing countries, Western developmental tools are often utilised.4 5 These include the Bayley scales,6 the Griffiths,7 the McCarthy scales8 and the Denver II,9 all designed and validated in Western countries. These tools may be tailored for use in non-Western settings. Often translation (changing of the language used) is all that is carried out.10 11 If this is not accompanied by a process of adaptation, translation alone may not allow completely for local expressions and customs, therefore leading to misinterpretation of results.12 In other settings, tools are adapted and items are modified and in some cases new items are created for use within a Western tool.13 Sometimes these tools are piloted (tried out before use)14 and validated (assessed that they are measuring what they are supposed to be measuring) in the local population.15 Even these adapted tools, however, are of limited value without normal ranges for their defined population. Standardisation studies (finding norms for a population) have taken place in many non-Western countries mainly using the Denver Developmental Screening Test (DDST) in a translated and occasionally adapted form,16–18 but none of these studies was in Africa. Only two studies have attempted standardisation in Africa, one using a translated form of the Bayley scales with an urban black South African population19 and the other on a limited age range in a rural Nigerian population.20
It is clear that Western developmental assessment tools may include tasks and materials which are completely alien to other cultures. These tools may therefore fail to identify and assess children adequately in cultural settings other than those for which they were created.21 This may be less of a problem when comparing groups of children, but when Western tools are used alone as an outcome measure, culture may have an effect. In theoretical studies, culture has been demonstrated to have an influence on child development, particularly in the area of social development.22–24 Cognitive abilities such as memory, categorisation techniques and pattern recognition have also been reported to be influenced by culture.25–28 Even gross motor development may possibly be affected by culture.29–31
In this study, we aimed to create a simple, culturally appropriate developmental assessment tool adapted and modified from Western tools and standardised for use in rural Malawi. The first stage in the development of this tool was to identify which items from Western tools (eg, the DDST or Denver II) were not relevant to the age-appropriate experiences of rural Malawian children. These items were then replaced with ones more appropriate to this cultural context. We did this firstly by holding focus groups to agree which items should be replaced and to create alternative items. All items (both retained and new) were then validated and standardised in a large population study. The performance of all items was examined in a consensus meeting and a revised instrument proposed.
| METHODS |
|---|
The population of children used for this study is the original LCSS cohort of children aged 3.5–6.5 years and younger siblings aged 0–3.5 years. Out of the 1237 LCSS children and siblings available, 1197 were seen, with 40 families either refusing to take part or not being available. The ages of the children were known from LCSS birth data or from the "health passport" given to mothers at the birth of their baby where the date of birth is recorded and which almost all mothers carry with them for all health appointments. A quota sampling strategy was used as in the DDST and Denver II34 with target numbers of children being sought in each of 33 age groups (see supplementary table A). A total of 67 children were excluded due to premature birth (34 weeks or less measured by fundal height at the antenatal clinic),32 twin birth or significant disability including severe malnutrition (weight for height z score of less than –2), leaving 1130 children in the final analysis.
The LCSS received approval from the National Health Science Research Committee in Malawi (HSRC 93/94). Informed verbal consent was sought from each mother at the beginning of the LCSS and again before a development assessment was carried out.
Creation of the developmental assessment tool
The Denver II, DDST and Griffiths instruments were examined by the Malawian research team. Items considered to be culturally appropriate were included and translated, whereas those considered inappropriate (such as "prepares cereal" or "plays board/card games") were removed. New items and modifications to Western test items were then created through discussions with a series of focus groups. Key informants were the eight local research workers. They were all women of child-bearing age with at least 8 years education and research experience of at least 5 years. Themes relating to developmental milestones were discussed and ideas from these sessions were used to create new items. Illustrations were made for most items in the instrument and used as prompts for the research workers. Some came with permission from Disabled village children.35
Face validity36 and content validity were assessed by all research assistants, five Malawian paediatricians, a language expert from the University of Malawi and six medical students at the College of Medicine, Malawi. Once the new instrument was created, the team was trained in its use and it was piloted in two stages. At each stage, feedback and training were given and problematic items were re-adapted or re-translated. The process of creating and refining the more culturally appropriate tool is shown in fig 1.
Data entry and analysis were carried out using Microsoft Excel 6.0, SPSS 11.1, Stats-direct and STATA computer programs. Each child in the study was identified by a code. Data were checked prior to analysis and any outlying results were reviewed.
Standardisation is the process of determining normal age ranges for which children pass the items for a developmental assessment tool. A logistic regression analysis was carried out with decimal age and sex as explanatory variables. The observed and predicted probability of passing was determined and graphs were drawn for each item. The goodness of fit of the graph was visually assessed and discrepancies reviewed. To determine statistically whether or not the fitted curve was a sufficiently good representation of the data, a goodness of fit statistic was calculated.37 If this was significant at the 5% level, indicating a poor fit, then the data were re-examined and refitting was done using triple split spline regression. The ages corresponding to the 35th and 65th percentiles were calculated from the original fit to determine the cut-points. For some items that performed less well, the cut-points were chosen by viewing the graphs to facilitate a good fit. Three logistic curves were then fitted, one for each region, based on the split.38 39 Any items with significant gender effects were removed or considered for further modification to ensure the tool was applicable to all children irrespective of gender. Using the predicted probabilities found from the logistic regression analyses, the ages corresponding to 25%, 50%, 75% and 90% of the children passing were determined for each item. These were then used to plot the age norms of achievement of each milestone in a box-type representation.
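The step from a fitted logistic curve to age norms can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes the simplest model consistent with the description above, in which the log-odds of passing an item are linear in decimal age with an additive sex term, and it does not cover the spline refitting. The function names, the sex coding and the coefficient values are all hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def age_at_pass_probability(p, intercept, age_coef, sex_coef=0.0, sex=0):
    """Decimal age at which the fitted probability of passing equals p.

    Assumed model (hypothetical parameter names):
        logit(P(pass)) = intercept + age_coef * age + sex_coef * sex
    Solving for age gives the expression returned below.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if age_coef == 0:
        raise ValueError("age_coef must be non-zero to solve for age")
    return (logit(p) - intercept - sex_coef * sex) / age_coef


# Hypothetical coefficients for a single item (not taken from the study).
intercept, age_coef = -6.0, 3.0

# Ages at which 25%, 50%, 75% and 90% of children are predicted to pass.
norms = {
    q: age_at_pass_probability(q, intercept, age_coef)
    for q in (0.25, 0.50, 0.75, 0.90)
}

for q, age in norms.items():
    print(f"{int(q * 100)}% pass at {age:.2f} years")
```

With these made-up coefficients the 50% point falls at exactly 2.0 years, because the log-odds are zero where intercept + age_coef * age = 0. The four ages together would define one box in a box-type chart of the kind described above.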
Reliability of the items
Reliability for each item was tested by using two subsamples of 60 (inter-observer) and 28 (intra-observer) randomly selected children who were seen at 7 and 14 days after initial assessment. Of the 60 children, 46 completed the follow-up using two different examiners (inter-observer), while 25 of the 28 children used the same examiner (intra-observer). All items in the tool were assessed for both types of reliability. Kappa statistics (κ) with 95% confidence intervals (CI) were used to calculate the degree of observer agreement for each question. Positive values of 0 to <0.2 indicate poor agreement, >0.2 to 0.4 fair agreement, >0.4 to 0.6 moderate agreement, >0.6 to 0.8 good agreement and >0.8 to 1 very good agreement.40
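As an illustration of the agreement measure, the sketch below computes an unweighted Cohen's kappa for two sets of pass/fail ratings of one item and maps it onto the bands listed above. It is a hedged example rather than the study's procedure: the ratings are invented, the confidence interval is omitted, and the handling of the band boundaries (for instance, a value of exactly 0.2 counted as poor, and negative values also counted as poor) is an assumption, since the text does not specify them.

```python
def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    categories = set(ratings_a) | set(ratings_b)
    # Observed proportion of children on whom the two assessments agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal proportions.
    expected = sum(
        (ratings_a.count(c) / n) * (ratings_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        # Both raters used a single identical category: kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa):
    """Map kappa to the descriptive bands used in the text.

    Boundary handling is an assumption: each upper limit is inclusive,
    and values at or below 0.2 (including negatives) are labelled poor.
    """
    if kappa > 0.8:
        return "very good"
    if kappa > 0.6:
        return "good"
    if kappa > 0.4:
        return "moderate"
    if kappa > 0.2:
        return "fair"
    return "poor"


# Invented ratings for one item: 1 = pass, 0 = fail, eight children.
first_visit = [1, 1, 1, 1, 0, 0, 0, 0]
second_visit = [1, 1, 1, 0, 0, 0, 0, 1]

kappa = cohen_kappa(first_visit, second_visit)
print(kappa, agreement_band(kappa))
```

Here the two assessments agree on 6 of 8 children (0.75) against a chance agreement of 0.5, so kappa is 0.5, which falls in the moderate band.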
Respondent validation was carried out after the preliminary analysis. This method of validation involves the reporting of findings back to the participants. Findings were fed back at the end of the study to the Lungwena Health Centre Management Committee. This consisted of four chiefs, one overall representative and three women representatives, all from the local area.
Consensus meeting
Once all the items were analysed, an expert panel (MG, AJ, EM and GL), which included a Malawian paediatrician, met to review the results and decide which items should remain, which should be modified and which should be removed. Items were judged on their graphical representation, goodness of fit on logistic regression, reliability and subjective ratings of "interpretability" by participants and researchers.
| RESULTS |
|---|
Examples of graphs created through logistic regression during the standardisation procedure, including cases where triple split joined regression was used, are shown in fig 2. In terms of goodness of fit on logistic regression and on spline regression, social items had the highest number of poor fits (51%, 18/35), sex being an independent predictor in some of these (23%, 8/35) (see table 2). A larger proportion of the newly created items had a poor fit on logistic regression (15%, 7/46) and an effect of sex (17%, 8/46) than those from the Western tools. The few items not attained by 6 years came from the fine motor area of development and included "draws a man with 6 parts" and "draws a square". The results of the Lungwena milestones for the language section of development are shown in fig 3. The other areas of development are described in supplemental fig C (see supplementary data).
For inter-observer reliability, 82% (113/138) of the questions had moderate to very good reliability (κ >0.4). There are no figures in the Denver technical manual for inter-observer reliability for comparison. Intra-observer reliability demonstrated moderate to very good reliability (κ >0.4) for 75% (106/138) of the questions. This compares well with Denver II figures,34 where 81% of their items had a κ >0.4. In relation to the domains of development, GM items had the best overall inter-observer (29/34 items) and intra-observer (32/34 items) reliability with κ >0.4. Items from the social area performed less well, with only 74% (26/35) of the items on inter-observer reliability and 60% (21/35) of the intra-observer items having a κ >0.4. In relation to the source of the item, more of the locally-derived items had poor inter-observer (33/46) and intra-observer (15/46) reliability (κ <0.4) in comparison to those items derived from the Denver II (12/80 and 8/80).
After a consensus meeting, 110 of the 138 items (80%) were retained in the revised instrument, with some needing further modification. Only 69% (32/46) of the newly created items were retained in comparison to 86% (69/80) of the DDST or Denver items used (see table 2). The results of this meeting giving examples of items removed are detailed in the last two columns of table 1.
| DISCUSSION |
|---|
In the creation of new items, however, many newly created items were less reliable, more sex-specific and had poorer goodness of fit in logistic regression. This was most evident in the social domain and least evident for gross motor skills. Social skills seem to have the least "universality" and in measuring them, we need to question the appropriateness of the concepts being measured in such different settings. When measuring "social skills" we may be determining the ability of the child to have learned important skills instilled by parents and carers in particular cultural settings, but this can only be measured if pertinent skills are tested for. The difficulty when creating new social items for a tool such as this is that the items must be specific enough to distinguish between the developmental age ranges of children, but also be clear and easy to explain in a developmental tool. This will continue to be a challenge.
It was not a primary aim of our study to compare our results with the Denver II or DDST, and a formal statistical comparison has not been possible. However, on gross comparison of our charts with those from the Denver II or DDST, there do seem to be obvious differences in the ages at which milestones are attained compared with children from the West. For example, the item "combines two words" in the Denver II is attained at between 17 and 21 months, whereas in our sample this was attained at between 21 months and 2 years 4 months. This demonstrates the importance and necessity of creating norms for a given African population, as they are likely to be different from those in the West.
A second phase of work is currently underway using the methodology that we have formulated in this first study to refine a further tool with a larger standardisation sample. This work will include creating a scoring system, and carrying out more detailed reliability measurements and further validity tests of between-group and construct validity. Once this new version has been created and has undergone the strict procedures that we have instituted in our methodology, we hope to have created a tool that may benefit community health workers in other rural settings in Africa after local validation. The complete tool may also be used by research workers who are investigating developmental outcomes as part of their intervention strategies.
| FOOTNOTES |
|---|
Published Online First 22 March 2007