Improved measurement in early child development (ECD) is a strategic focus of the WHO, UNICEF and World Bank Nurturing Care Framework. However, evidence-based approaches to monitoring and evaluation (M&E) of ECD projects in low-income and middle-income countries (LMIC) are lacking. The Grand Challenges Canada®-funded Saving Brains® ECD portfolio provides a unique opportunity to explore approaches to M&E of ECD programmes across diverse settings. A focused literature review and a participatory mixed-method evaluation of the Saving Brains portfolio were undertaken using an adapted impact framework. Findings related to measurement of quality, coverage and outcomes for scaling ECD were considered. Thirty-nine ECD projects implemented in 23 LMIC were evaluated. Projects used a ‘theory of change’ based M&E approach to measure a range of inputs, outputs and outcomes. Over 29 projects measured cognitive, language, motor and socioemotional outcomes. Eighteen projects used developmental screening tools to measure outcomes, with a trade-off between feasibility and preferred practice. Environmental inputs such as the home environment were measured in 15 projects. Qualitative data reflected the importance of measurement of project quality and coverage, despite challenges measuring these constructs across contexts. Improved measurement of intervention quality and measurement of coverage, which requires definition of the numerator (ie, intervention) and denominator (ie, population in need/at risk), are needed for scaling ECD programmes. Innovation in outcome measurement, including intermediary outcome measures that are feasible and practical to measure in routine services, is also required, with disaggregation to better target interventions to those most in need and ensure that no child is left behind.
- child development
- low- and middle-income countries
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
WHY? Tracking programme impact at scale requires effective ‘real world’ approaches: While a multitude of child development assessment and screening tools are available, in practice the lack of standardised, meaningful measures of ECD outcomes that can be used at scale impedes measurement of intervention impact.
WHAT’S NEW? Measurement of coverage in ECD programmes has had relatively little attention: This requires agreed definition of ECD interventions and target population with available data, either for all children in a given age group or to target based on child development status and risk factors.
WHAT TO DO? Intervention quality is crucial but inconsistently defined and measured: Lessons from approaches used in early childhood education may inform efforts to improve quality metrics in ECD programmes.
KEY GAPS? Implementation research gaps for ECD programme measurement: Application of more standardised measures of the caregiving environment (i.e. ‘home-readiness’ for ECD promotion) is feasible but challenges remain especially for simpler, routine, multi-domain outcome measurement and metrics regarding coverage and quality.
Improved measurement is crucial to accelerating and tracking progress towards the Sustainable Development Goals related to early child development (ECD), and the WHO, UNICEF and World Bank Nurturing Care Framework (NCF) emphasises strengthened monitoring as a crucial component of implementation for ECD programmes at national and subnational levels.1 2
However, most literature on child development metrics is focused on challenges in assessing outcomes in research and programmes at relatively small scale.3–11 As yet, measurement of outcomes in larger scale programmes and quality and coverage have received little attention. With increased focus on moving to scale, tracking and accelerating progress in ECD will require improved approaches to monitoring and evaluation (M&E) of programmes in diverse contexts.2 6
Purpose, scope and structure of series
The purpose of this series is to inform design and implementation of ECD interventions at national and subnational level in diverse low-income and middle-income countries (LMIC). The series is structured around a programme cycle, outlined in figure 1 and built on a mixed-method evaluation of the Saving Brains, Grand Challenges, ECD portfolio as well as several additional analyses.12–15 The first paper in this series detailed methods for Saving Brains portfolio evaluation and findings related to programme design and implementation decisions in diverse LMIC.12 Subsequent papers explore measurement of ECD in routine health services,13 global funding for ECD14 and multistakeholder perspectives on scaling.15
Aims and objectives
This paper aims to examine approaches to M&E of ECD programmes especially in LMIC.
Review requirements for monitoring and evaluation: consider guidance for M&E in other areas of maternal, newborn and child health (MNCH) as well as implications of targeted published and grey literature review to adapt a M&E framework for ECD programmes, implemented through health programmes at national and subnational levels.
Describe findings of Saving Brains portfolio evaluation related to measurement: analyse approaches to M&E across the Saving Brains projects, especially measurement of intervention quality, coverage and outcomes, using the adapted M&E framework for ECD and consider implications for improved measurement.
Consider implications for improved M&E in ECD programmes at scale.
Objective 1: review requirements for monitoring and evaluation of ECD programmes at scale
Based on experience in other areas of MNCH plus pragmatic literature review, we adapted a generic programme impact framework for ECD programmes.16–19 Our literature review used MEDLINE and Embase, with the following MeSH terms: ‘Child development’ OR ‘Developmental Disabilities’ AND ‘Developing Countries’. Additional papers were identified through reference lists of papers retrieved. Grey literature was searched via websites of major multilateral organisations engaged in ECD programming such as the WHO, UNICEF, Save the Children Fund, the World Bank, World Vision International and other related organisations. From articles and documents retrieved, themes explored included measurement of ECD and intermediary outcomes, quality, and coverage, challenges experienced, and strategies used to address these in research and programming.
Objective 2: describe findings of Saving Brains portfolio evaluation related to measurement
As detailed in the first paper of this series,12 a participatory mixed-method evaluation of the Saving Brains portfolio was undertaken (2016–2017) by a team at the London School of Hygiene & Tropical Medicine in collaboration with the Saving Brains platform team and innovators. A conceptual evaluation framework informed by the Medical Research Council Guidance on Evaluation of Complex Interventions20 and based around a portfolio-level theory of change was developed to systematically describe and assess direct and intermediary ECD outcomes, quality and coverage measurement across the portfolio (supplementary web appendices figure A and textbox A).12 Between 2012 and 2016, Saving Brains awarded 39 ‘Seed’ and ‘Transition-To-Scale’ (TTS) grants to innovators in 23 LMIC, with diverse backgrounds and variable innovation design and implementation approaches. Seed grants focused on demonstrating ‘proof of concept’ over 18–24 months, while TTS grants aimed to increase scale and sustainability in partnership with other organisations over 3 years.21 Project types within the portfolio were categorised using domains in the NCF; responsive care, early learning, good health, adequate nutrition, and security and safety.12
Projects collected quantitative impact and process data using Grand Challenges Canada prespecified tools and reporting mechanisms that were structured around the portfolio theory of change (see supplementary web appendices figure A and table A).
Qualitative methods were guided by the Consolidated Criteria for Reporting Qualitative Research (see supplementary web appendices textbox A and tables B,C).22 Broad thematic areas of enquiry relevant to ECD programming were established based on literature review, stakeholder consultation and analysis of written portfolio documents. Themes included: impact metrics, intervention content, integration, place of delivery, human resources, coverage and quality, working in partnership, technology and sustainable financing.
Key informants were purposively selected from professional networks and included national and international programmers and policy makers, ECD researchers, Saving Brains innovators and members of the Saving Brains platform (see supplementary web appendix table B). All were invited to focus group discussions and in-depth interviews that were conducted online or face to face at Saving Brains meetings and workshops. In-depth interviews and focus group discussions were facilitated using ‘topic guides’ (see supplementary web appendix table C). Audio recordings of meetings were submitted to a third party for transcription.
Quantitative data were cleaned and analysed using Stata V.14 and Microsoft Excel. Descriptive statistics related to frequency and mode of outcome measurement across the portfolio were generated. Project documents, in-depth interviews and focus group discussion transcripts were imported and coded in NVivo 11.0. An inductive approach was used to create a coding framework, and thematic content analysis was undertaken to explore themes related to outcome, quality and coverage measurement. Qualitative data were coded by two separate members of the evaluation team (KMM and MK-L) until data saturation was reached.
Objective 3: consider implications for improved measurement in ECD programmes at scale
Results of objectives 1 and 2 were synthesised to describe and discuss challenges, opportunities and next steps for improved M&E of ECD programmes at scale (box 1).
Definitions of measurement levels in ECD programmes
Impact: measured change in early child development (ECD) outcomes including on cognitive development and human capital.21
Intermediary outcomes: factors that could be considered intermediary to ECD outcomes using an ecological model of child development (eg, child or caregiver nutrition, parental mental health and caregiving environment).21
Coverage: the number of individuals receiving an intervention or service (the numerator) compared with the population in need of the intervention or service (the denominator).47
Quality: variously defined including fidelity (the extent to which the delivered intervention is consistent with that intended) and/or evidence-based content and/or client satisfaction.
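The coverage definition above reduces to a simple ratio of numerator to denominator. As a minimal sketch, assuming hypothetical counts (the function name `coverage` and the figures used are illustrative and are not drawn from the Saving Brains data), it could be computed as follows:

```python
def coverage(received: int, in_need: int) -> float:
    """Return coverage as a proportion: numerator / denominator.

    received: number of individuals receiving the intervention (numerator).
    in_need:  population in need of the intervention (denominator).
    """
    if in_need <= 0:
        raise ValueError("population in need (denominator) must be positive")
    if received < 0:
        raise ValueError("number receiving the intervention cannot be negative")
    # Values above 1 are not rejected: they usually signal an
    # underestimated denominator and are worth inspecting, not hiding.
    return received / in_need


# Hypothetical example: 1200 children reached, 4000 estimated to be in need.
example = coverage(1200, 4000)
print(f"Coverage: {example:.0%}")  # prints "Coverage: 30%"
```

The arithmetic is trivial; the difficulty discussed in this paper lies in agreeing what counts as having received an intervention and in obtaining a defensible estimate of the population in need.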
Objective 1: review of measures required to monitor and evaluate ECD programmes at scale
Based on the above methods, we considered that the following measurement levels and constructs need to be considered in M&E of ECD programmes at scale (table 1 and figure 2).
Objective 2: consider Saving Brains projects approach to M&E and implications for measurement in programmes at scale
Saving Brains portfolio evaluation included 39 projects implemented across 23 LMIC between September 2013 and November 2016. Of these, 34 were Seed and 5 were TTS projects. Most projects (n=32) included some aspect of responsive caregiving and early learning. Details of individual projects are provided in the first paper of this series.12 Table 2 outlines larger TTS projects and approaches taken to M&E within them.
Most implementing teams reported on cognitive, language, motor and socio-emotional development (table 3). Few projects reported on broader aspects of ECD (eg, executive functioning, adaptive skills) and no projects specifically measured disability or sensory impairments, including those following high-risk newborn populations. Given the age focus of the portfolio and short grant duration, no projects measured educational or longer-term outcomes.
Of the 37 projects that measured ECD outcomes, 51% (n=19) used a comprehensive developmental assessment. Of these, the majority used the Bayley Scales of Infant and Toddler Development (BSID II or Bayley-III), although very few formally translated and adapted this tool and none were locally validated. Regionally developed instruments including the Malawi Development Assessment Tool and the Kilifi Developmental Inventory were used in several projects in sub-Saharan countries.23–27
Among those using screening tools (49%, n=18), the Ages and Stages Questionnaires (ASQ) was the most common; ASQ was used for 22% (n=8) of projects across the whole portfolio with similar challenges in adaptation and validation. Alternative screening tools included those developed in high-income settings (eg, Parents’ Evaluation of Developmental Status, Survey of Well-Being of Young Children) as well as some specifically for LMIC settings (eg, Ten Questions Questionnaire, Saving Brains Early Child Development Scale [a precursor to the Caregiver Reported Early Development Index]).28–33
Forty-three percent (n=16) of project teams used a combination of tool types. Biological methods for assessment were used by one project examining event-related electroencephalograms in infants. Details of instrument translation and adaptation were reported for a minority (19%, n=7) of projects.
Approximately half of projects (51%, n=20) measured child growth and/or nutritional outcomes, mostly anthropometry with a few reporting more detailed nutritional outcomes.
Caregiver capabilities, caregiver–child interactions and/or the home environment were measured in 39% (n=15) of projects. Details of specific tools were frequently unavailable; however, the Home Observation for Measurement of the Environment was most commonly used as well as the Family Care Indicators.34
Several projects measured maternal physical (13%, n=5) or mental health (31%, n=12), particularly maternal depression using a wide range of tools.
Coverage and equity
Within the Saving Brains portfolio most project teams measured number of children receiving an intervention, often in comparison with initial project targets. However, few projects measured coverage in terms of broader population-level need. This was likely due to the small size and emphasis on ‘proof of concept’ rather than scaling in the majority of Seed projects.
However, even for TTS grants that were more actively focused on scaling (table 2), population need for ECD interventions was typically defined on population-level data of risk factors (rather than ECD status per se) or implicit local knowledge. For example, in the absence of direct population-level data on ECD, nutritional or socioeconomic status, for which population-level data were more commonly available, was used to consider both population level need and coverage. Box 2 provides an example of the use of population-level data on socioeconomic status for targeting interventions, in implementation of an enhanced Family, Women and Infancy (FAMI) Programme to deprived rural communities in Colombia.
Monitoring quality of an integrated nutrition and stimulation intervention for families in rural Colombia52
Context and intervention: in rural Colombia a pre-existing government-funded parenting programme (the Family, Women and Infancy [FAMI]programme) was enhanced through adaptation and implementation of a structured responsive caregiving curriculum, combined with nutritional education and supplementation. Participating rural families were identified by socioeconomic vulnerability according to a pre-existing government ‘Proxy Means Test’.
The enhanced FAMI programme was delivered through combined group and home visits, implemented by local women who were high-school graduates but had no previous child development training.
Approach to measurement of quality: the intervention team considered the following elements in measuring the quality of intervention delivery: feasibility, fidelity and acceptability. Feasibility was defined as the expected ability to implement the intervention given existing resources, including personnel, context and intervention characteristics. Fidelity was defined as implementation of the intervention compared with that intended. Acceptability was defined from the front-line worker perspective. These elements were measured and tracked through all stages of intervention design and implementation.
Monitoring methods used included active supervision, in-depth interviews, focus groups and surveys of both front-line workers and participants. Process data related to quality indicators were used to make adjustments to both intervention design and implementation throughout implementation.
Implications for measurement of quality with scaling: based on monitoring and evaluation findings, several adaptations were considered necessary for ongoing quality with scaling. In addition, with scaling, intensive methods of monitoring, including videos and detailed qualitative methods, were considered likely to need adjustment. However, front-line worker supervision through intermittent observation and participant satisfaction surveys was considered to be both important and feasible moving forwards.
During focus group discussions, key informants reflected that the quality of intervention implementation was crucial to achieving and maintaining intervention impact with scaling. However, as reflected by the examples provided in boxes 2 and 3, there was no consensus about how to define quality. For example, the enhanced FAMI programme in Colombia, an integrated nutrition and parenting programme implemented through government services, developed their own definition and approach to monitoring quality. Their definition of quality included feasibility, acceptability and fidelity, and the project team developed their own approaches to measurement of these elements. By contrast, the Mobile Crèches project, implemented as an early childhood education (ECE) initiative in urban slums in India, drew on the large existing quality improvement literature within ECE plus specific, related tools to measure quality.
Monitoring quality in an enhanced early learning intervention for children of construction workers in India54
Context and intervention: Mobile Crèches (MC) has worked for several decades in partnership with other non-government organisations (NGOs), the construction industry, government and local communities to increase access to quality crèches and early childhood education services across 20 states of India. A Saving Brains Transition-To-Scale grant enabled the MC team to test the feasibility, effectiveness and scalability of its workplace-based childcare programme for young children of migrant construction workers.
Within the TTS project, MC reached 5000 children under 6 years of age with a childcare programme delivered through partnerships with 11 NGOs and 24 builders at 40 crèches. The model for scaling was based on partnership to support local NGOs with rigorous supportive supervision and on-the-job training for the first 6 months while at the same time strengthening NGO capacity for ongoing monitoring and supervision. After 6 months, MC continued with less frequent monitoring according to NGO capabilities.
Approach to measurement of quality: quality of the daycare programmes was assessed using the Early Childhood Education Quality Assessment Scale (ECEQAS), which included domains of infrastructure, physical setting, meals, naps, learning/play aids, classroom management and organisation, personal care, hygiene and habit formation, language and reasoning experiences, fine and gross motor activities, creative activities, activities for social development and disposition of childcare workers. NGO-run centres performed well on most domains of the ECEQAS, and observer reports were consistent with these measured findings.
Future directions for scaling: based on the experiences of implementing the pilot project, MC continues to invest with current partners in strengthening the childcare provisions at construction sites and gathering data against child development outcomes and has expanded into new cities. In the near future, MC plans to replicate this model across other worksites such as tea plantations, factories and brick kilns to ensure quality ECD services for vulnerable young children, through low cost crèches and day care centres.
In this paper, through evaluation of measurement approaches used across the largest donor-funded ECD intervention portfolio, we have highlighted the urgent need to strengthen ECD M&E frameworks for scaling.
Specifically, our evaluation, based around an adapted generic M&E framework, has highlighted the importance of ensuring that programming inputs, quality, coverage and impact are all considered, defined and measured.4 Table 1 summarises these elements, building on our evaluation findings and considering indicators often available in routine government services as a basis for developing improved approaches to monitoring in existing systems.
Furthermore, as has been recognised in M&E of other areas of MNCH programming, we suggest that agreement on a select number of indicators in each of these areas will be important to meaningful monitoring and improved programme accountability for ECD moving forward.19 For example, figure 2, which has been used in development of a global newborn measurement improvement roadmap, illustrates that at lower health system levels multiple indicators are typically collected by different groups for diverse purposes, but at higher health service levels, fewer indicators are collected.35 36 While all health system levels are crucial, measurement needs vary. High levels of precision are needed at individual or lower system levels whereas lower sensitivity or precision for the individual is acceptable for measurement at higher system levels. Although alignment of indicators across levels is not possible, agreement on a select number, as indicated in yellow in figure 2, will be important to allow greater comparability across settings and with programme variation.19
However, our evaluation also highlights two major challenges for M&E of ECD programmes at scale.
Challenge 1: measuring ECD outcomes in routine programmes
Consistent with previous literature and despite substantial investment, our evaluation has demonstrated challenges in measurement of ECD outcomes across LMIC, which limit comparative understanding of impact.3–11 Issues highlighted include the use of screening tools to monitor the effect of interventions (used in 49% of projects), although these are not designed to measure impact. Furthermore, there was variability in translation, adaptation, piloting and standardisation of tools and emphasis on short-term cross-sectional rather than longer term measurement of outcomes. Additionally, important domains (eg, vision, hearing, functioning and disability) were often omitted, limiting holistic understanding of individual results.
As such, practical challenges to outcome measurement (eg, cost, access, staff training and cultural adaptability) need to be addressed or alternative approaches explored to improve understanding of ECD programme impact across diverse settings.
As Boggs et al discuss, limitations in existing outcome measurement approaches warrant further research to improve their use in routine services.13 While recent developments in population-level child development tools are encouraging, application of these tools for programme monitoring is not yet established.37
Challenge 2: defining and monitoring quality and coverage
Our analysis has highlighted major measurement challenges related to quality and coverage of ECD programmes.
Intervention quality was considered important but variably defined and measured. Examples such as the enhanced FAMI programme in Colombia demonstrate how, in a research context, multimethod approaches can be used to assess quality and can be fed back into existing programmes to enhance their design and implementation (R Bernal, unpublished data, 2018). However, it is also important to note the observation by authors of that study that quality monitoring approaches need to continue to be adjusted to remain feasible and achievable with scaling (R Bernal, unpublished data, 2018). Furthermore, given relative underexploration of monitoring of quality within ECD programming in health, lessons can perhaps be translated from more developed quality monitoring standards and approaches in ECE.38 The Mobile Crèches project provides an important example of using ECE quality standards in a low-income setting. At larger scale, initiatives such as the Measuring Early Learning Quality and Outcomes provide examples of deliberate efforts to improve feasible and meaningful measurement of quality of preprimary learning environments, which may also be informative for measurement of quality of ECD programmes for younger children.38
Within Saving Brains, emphasis on coverage measurement was mostly limited to coverage within individual projects rather than in the population more generally. For many Seed projects, emphasising proof of concept, this is not surprising. However, as projects and programmes are scaled, further evaluation of coverage related to the overall population in need will be crucial to accelerating impact. This resonates with previous literature emphasising that in other areas of MNCH, monitoring intervention coverage has been crucial to accelerating and tracking progress.39 40
However, within ECD, specific barriers to measurement of both numerator (intervention) and denominator (population in need of intervention) pose challenges to improving coverage measurement. First, the NCF describes many effective ECD interventions, but achieving consensus on what defines a core intervention package (ie, numerator) remains challenging. The NCF has broadly defined ‘responsive caregiving’ but more focused definitions of interventions are needed for estimations of coverage at scale.2 41
Furthermore, the population in need (ie, denominator) needs to be more clearly defined based on population-level data rather than perceptions that may be incorrect. The NCF suggests periodic population-based assessment of child development and home-care practices, along with risk and protective factors for nurturing care.2 As we have highlighted, many relevant indicators are available in Demographic and Health Surveys (DHS) or Multiple Indicator Cluster Surveys (MICS) data that may support better understanding of population need. However, more research is required to see if population-based measures under development can be used for this purpose.32 37 42–44
Given current challenges in coverage measurement beyond small-scale research or projects, it may be possible to use intermediary measures or inputs as a proxy while improving more routinely useable measures of ECD. Consistent with other literature, stakeholders within our evaluation reported the value of measuring intermediary outcomes, especially to policy makers who value short-term demonstration of change.45 Furthermore, previous literature has demonstrated the predictive value of intermediary factors, such as caregiving environment, on long-term development and educational outcomes.11 46
Finally, as the NCF outlines, improving equity while scaling ECD programmes will require data disaggregated by a broad range of stratifiers including sex, age, income, wealth, ethnicity, migratory status, disability and geographic location.2
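To illustrate why disaggregation matters for equity, the sketch below computes coverage separately for each value of a single stratifier. It is a minimal example under stated assumptions: the record fields (`wealth`, `in_need`, `received`), the function name `coverage_by_stratum` and all values are hypothetical and do not represent data from any project in the portfolio.

```python
from collections import defaultdict


def coverage_by_stratum(records, stratifier):
    """Coverage (received / in need) for each value of one stratifier."""
    need = defaultdict(int)
    reached = defaultdict(int)
    for rec in records:
        if not rec["in_need"]:
            continue  # only children in need count towards the denominator
        key = rec[stratifier]
        need[key] += 1
        if rec["received"]:
            reached[key] += 1
    return {key: reached[key] / need[key] for key in need}


# Hypothetical records: four children in need in each of two wealth groups.
records = (
    [{"wealth": "poorest", "in_need": True, "received": r}
     for r in (True, False, False, False)]
    + [{"wealth": "richest", "in_need": True, "received": r}
       for r in (True, True, True, False)]
)

overall = sum(r["received"] for r in records) / len(records)
by_wealth = coverage_by_stratum(records, "wealth")

print(overall)    # 0.5
print(by_wealth)  # {'poorest': 0.25, 'richest': 0.75}
```

In this invented example an overall coverage of 50% conceals a threefold gap between the poorest and richest groups, which is the kind of inequity that aggregate reporting alone would miss.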
Strengths, limitations and future research directions
To improve M&E of ECD programmes at scale, applied research that tests what is feasible and useful in routine health services, as well as approaches to intersectoral coordination, is needed. Innovative ways to address well-recognised limitations of current approaches to measuring outcomes, including further consideration of long-term function and well-being as well as educational outcomes, are required. Greater investment in defining and measuring ECD programme quality and coverage is also crucial to accelerating progress.
Translating calls for action into positive change for child development requires clearer recommendations on what to measure, how to track ECD programmes and how to course correct when indicated. Evaluation of the Saving Brains portfolio highlights the strengths of a ‘theory of change’ based approach to M&E, and the importance of measuring intermediary outcomes, such as changes in caregiving environment, on the pathway to impact. However, substantial challenges in measurement of child development outcomes reflect the need for further research and development to ensure that outcomes can be meaningfully and feasibly measured in routine programmes. Clearer definitions of ECD interventions, quality and the population in need are also required to understand and improve equitable coverage. Approaches to data disaggregation in M&E are needed to ensure that ECD programmes deliver on their efforts to support all children to thrive.
We would like to thank all the innovators, participants and researchers involved in projects included in the Saving Brains portfolio and evaluation. We would also like to thank Grand Challenges Canada as funder of unpublished data. We would like to thank the Early Child Development Expert Advisory Group for their guidance, and we are grateful to Claudia da Silva and Fion Hay for administrative assistance.
Contributors Technical oversight of the series was led by JEL and KMM. The first draft of the paper was undertaken by KMM. Other specific contributions were made by SB, MB, TD, MG, JH, RH, MK-L, KM, VPH, JR, SS, FT, CT and JEL. The Early Child Development Expert Advisory Group (alphabetically: Pia Britto, TD, Esther Goh, Sally Grantham-McGregor, MG, JH, RH, KM, JR, Muneera Rasheed, Karlee Silver and Arjun Upadhyay) contributed to the conceptual process throughout. All authors reviewed and agreed on the final manuscript.
Funding This supplement has been made possible by funding support from the Bernard van Leer Foundation. Saving Brains impact and process evaluation funded by Grand Challenges Canada.
Disclaimer The authors alone are responsible for the views expressed in this article and they do not necessarily represent the views, decisions or policies of the institution with which they are affiliated.
Competing interests None declared.
Provenance and peer review Commissioned; externally peer reviewed.
Data sharing statement All data were shared under an agreement with Saving Brains. Further information is available if required.
Patient consent for publication Not required.