Analysis of both the content and process of examinations is central to planning the appropriate education and training of examiners in paediatric clinical examinations. This paper discusses the case for developing training, reviews the current literature, and suggests the desirable attributes of examiners and the components of a training programme. Potential areas of further research are also considered.
- examiner training
As assessment drives learning, analysis of the content and process of examinations will allow the development of more robust systems of assessment and is central to the training and professional development of tomorrow’s paediatricians. In clinical examinations the conduct of the examination is dependent on examiners who are able to maintain set standards and they, therefore, need defined attributes and expectations, as well as training and monitoring of their performance as examiners.
The published literature relating to the training of examiners is limited. However, the Royal College of General Practitioners,1,2 the Royal College of Surgeons, and the Resuscitation Council3 have developed programmes for selecting and/or training examiners and instructors. This paper arose from an initiative of the Royal College of Paediatrics and Child Health (RCPCH) to improve examiner training. It describes the components of a new programme for training paediatric examiners in the UK and the desirable attributes of examiners. The paper draws on current literature relating to examinations, both undergraduate and postgraduate, to illustrate the principles of paediatric examiner training, although not all the contexts presented necessarily relate equally to its development.
THE STATUS QUO
The RCPCH assesses doctors’ readiness to become middle grade paediatricians with the MRCPCH (Membership of the Royal College of Paediatrics and Child Health) examination. It consists of written papers in basic child health (part 1A), the scientific basis of child health (part 1B), and clinical problem solving (part 2 written). There is also a clinical examination (part 2 clinical), which, from October 2004, will consist of 10 stations, including communication and consultation skills, and clinical examination and child development. All apart from one station with clinical video clips will still rely on the assessment of clinical ability by examiners. The RCPCH also appoints and trains examiners for the Diploma in Child Health (DCH) which is an optional postgraduate examination for doctors in primary care or other medical specialties. There is a written paper (part 1A) and a clinical examination. In addition to paediatricians, examiners are general practitioners, child psychiatrists, and paediatric surgeons.
Examiners have historically been selected on the basis of clinical experience as a consultant (or general practitioner for DCH), personal recommendation by the Principal Regional Examiner, Paediatric Regional Adviser, or college tutors, or by self-proposal. Potential examiners submit a structured curriculum vitae, concentrating on their expertise in teaching and education. Selection is made by the Examinations Committee and ratified by the College Council, ensuring that examiners have sufficient clinical and teaching experience and that there is appropriate distribution by specialty, hospital, and region. Until recently, the post-selection training of examiners has consisted of written information about the nature and conduct of the examination, observation of a clinical examination, followed by examining for the first time paired with the senior examiner. Subsequent monitoring of performance has sometimes been by a “hawk: dove” index4 devised by the Royal College of Physicians but has been inconsistently fed back to examiners. Until October 2004 marking was by pairs of examiners. Although marks were awarded independently, this system allowed examiners to assess their own level of expectation of a candidate’s performance against each other’s marks. There is also a Senior Examiner Review of the marks when discrepant marks between examiners affecting a pass/fail decision are scrutinised and these may be amended. Feedback is given to examiners in these occasional cases. All candidates who fail are offered copies of all mark sheets and examiner comments.
More recently, two day examiner training conferences have been introduced to teach both the practicalities of the examination process, as well as generic skills including discrimination, the principles of assessment of clinical reasoning, decision making, and communication skills. All examiners for the new MRCPCH clinical examination have received specific training, including simulated or videoed examination encounters as a basis for discussion about conduct of the encounter and standard setting.
PROBLEMS WITH THE STATUS QUO
Assessment methods and practices that have been unchanged for many years need to be re-evaluated and refined. There is evidence that examiners’ current knowledge of technical aspects of assessment, including robust approaches to standard setting, is not yet optimal and requires more development.5 The competencies of a well trained, satisfactory examiner have not yet been adequately defined.
The process needs to be developed to ensure that it is transparent, fair, and objective. This should be done building on existing models of good practice.
Concerns about reliability and validity
Agreement between examiners can be poor because of undue emphasis on examinee factors such as apparent confidence, appearance, and use of language, particularly in oral examinations.6 Decision making by examiners during oral examinations, for example, is frequently a process of some complexity, and includes initial impressions, hypothesis generation, and hypothesis testing.7 In order to achieve a highly consistent or reliable result from a multiple station clinical examination, about 10 hours of testing time per candidate is needed.8 Clearly this is not feasible, so strategies to increase reliability while maintaining implementation feasibility will be necessary.
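The trade-off between testing time and reliability can be illustrated with the Spearman-Brown prediction formula, a standard psychometric result that is not named in the studies cited above. The sketch below assumes that each additional hour of testing behaves like a parallel form of the first. The single-hour reliability of 0.5 and the target of 0.9 are hypothetical values chosen for illustration, not figures from any College examination, and the function names are our own.

```python
def projected_reliability(base_reliability: float, length_factor: float) -> float:
    """Spearman-Brown: reliability of a test lengthened by length_factor."""
    r, k = base_reliability, length_factor
    return (k * r) / (1 + (k - 1) * r)


def required_length_factor(base_reliability: float, target_reliability: float) -> float:
    """Invert Spearman-Brown: how many times longer the test must be."""
    r, t = base_reliability, target_reliability
    if not (0 < r < 1 and 0 < t < 1):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    return (t * (1 - r)) / (r * (1 - t))


# Hypothetical figures: one hour of testing yields reliability 0.5,
# and the examination board wants 0.9.
hours_needed = required_length_factor(0.5, 0.9)
print(f"Testing time needed: about {hours_needed:.0f} hours")
print(f"Reliability at 3 hours: {projected_reliability(0.5, 3):.2f}")
```

Under these assumed values the required testing time is several times the starting length, which is the kind of result that makes alternative strategies for improving reliability, such as examiner training, worth examining.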
Management of change
The nature, content, and design of clinical examinations are in evolution. Examiners need to be equipped with new, relevant skills which require ongoing monitoring.
Conduct of examiners
Examiners must approach the examination with a standardised set of competencies to apply in their assessment of candidates, and they must use these techniques in a consistent way. It is not up to them to introduce idiosyncratic practices of assessment and rating. Because of the potential magnitude of examiner variance, many educators have expressed concerns about pass/fail decisions being made on the basis of clinical examinations.9 Candidates and examiners must have a clear definition of the clinical competencies required to be demonstrated and a clear understanding of what constitutes suitable evidence of these in order for the examination to be judged fair.10
Managing candidate diversity
A fair examination requires equal opportunity for candidates. Examiner awareness about a candidate’s disabilities will assist in making reasonable accommodation in examinations where required.11 While studies show no evidence of explicit examiner behaviour that might disadvantage students from ethnic minorities in comparison to white students in oral examinations, there is evidence that some ethnic minority candidates tend to use communicative styles that are poorly rated. This may influence the assessment of the candidates’ communication skills.12
The cost of legal challenge
In an environment of increasing awareness of clinical governance and an expectation that those responsible for appraisal and assessment will be trained and accountable for their decisions, complaints or legal challenges relating to decisions are possible. In addition, candidates have access to data documenting their performance in examinations and this highlights the need for documentation of decisions to be rigorous and transparent.
Research that underpins the process of the examination of clinical competence has been limited. There is a need for closer collaboration between research and practice.13
DOES TRAINING IMPROVE VALIDITY, RELIABILITY, OR FAIRNESS OF EXAMINATIONS?
Validity is a measure of how completely an assessment tool measures what it purports to measure. Examiners need to be conversant with the concepts of content and construct validity.14 Content validity is a measure of the ability of the specific assessment to assess, systematically and representatively, the full range of activities it is designed to reflect. In other words, if the examination is supposed to evaluate potential clinical competence, does the test measure an adequate range of abilities from which to draw a conclusion that the candidate is indeed competent? Construct validity is the extent to which performance on a test fits into a theoretical scheme about the attribute the test seeks to measure. In clinical terms the assessment must measure skills that do, in fact, reflect the abilities theoretically characteristic of a competent clinician at this stage in their training. Although these issues are the responsibility of those designing the examination and are not the primary responsibility of examiners, a better understanding of such concepts is required for examiners to develop uniformity in assessment. Table 1 presents the key features of the various types of validity.
Reliability is the degree to which an assessment method produces reproducible outcomes. The impact of examiner training on reliability remains to be clarified. The literature is sparse and, where available, is divided on this issue.16 In a study of undergraduate examiners, the most significant improvements in reliability were achieved by identifying and eliminating the most inconsistent examiners and, surprisingly, training as such had no significant impact on reliability.17 In another study, the training of doctors as examiners provided no significant benefit in terms of reliability over the use of medical students and lay staff as examiners in objective structured clinical examinations (OSCEs).18 There is also a debate around whether long examinations are needed to achieve higher levels of reliability in assessment by OSCEs.19 Studies on oral examinations show that unless examiner behaviour is extreme, it can be changed by training.1
Inter-examiner variability is the degree to which the scores of different examiners differ. Variability is an important contributor to overall error in examinations. Previously, significant inter-examiner variability in scoring was adjudicated by the senior examiner and the Examinations Committee; in the restructured clinical examination to be introduced in October 2004, examiners will no longer assess candidates in pairs but will work alone to judge performance on a specific parameter of clinical competence. While this strategy will allow a more detailed assessment by increasing the components of the clinical examination, it is important to reduce the problem of inter-examiner variability while maintaining the validity of individual examiner input.20 Influences on variability include differences in the training of examiners and their involvement in station construction.21 Another can be a lack of explicit rating criteria.22 In any situation there are several potential sources of error, all of which will contribute to the overall reliability of the assessment. In addition to inter-observer error, these include intra-observer error and test-retest reliability, as well as complex interactions between examiners, candidates, and stations. Examiner involvement and commitment are more strongly associated with inter-examiner reliability than is experience in clinical medicine or as an examiner.23 Increasingly, therefore, a more sophisticated approach to the determination of reliability is being used, known as generalisability, which allows all sources of error to be taken into account. In addition, it is possible to take the analysis one step further and model the circumstances necessary to achieve a given level of reliability.24
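The generalisability approach can be illustrated with a deliberately simplified sketch. It assumes a fully crossed design in which every examiner marks every candidate once, which is not how the College clinical examinations are organised. The marks are invented, the function names are hypothetical, and a real analysis would include stations and their interactions as further sources of error. The first function estimates how much of the variation in marks is attributable to candidates, to examiners, and to residual error; the second models how dependable the result would be if a given number of examiners' marks were averaged.

```python
def variance_components(marks):
    """Estimate variance components for a fully crossed design.

    marks[i][j] is the mark given to candidate i by examiner j;
    every examiner is assumed to mark every candidate exactly once.
    """
    n_p, n_r = len(marks), len(marks[0])
    grand = sum(sum(row) for row in marks) / (n_p * n_r)
    cand_means = [sum(row) / n_r for row in marks]
    exam_means = [sum(marks[i][j] for i in range(n_p)) / n_p for j in range(n_r)]

    # Mean squares from a two-way layout without replication.
    ms_cand = n_r * sum((m - grand) ** 2 for m in cand_means) / (n_p - 1)
    ms_exam = n_p * sum((m - grand) ** 2 for m in exam_means) / (n_r - 1)
    ss_res = sum(
        (marks[i][j] - cand_means[i] - exam_means[j] + grand) ** 2
        for i in range(n_p) for j in range(n_r)
    )
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    # Negative estimates are truncated at zero, a common convention.
    return {
        "candidate": max((ms_cand - ms_res) / n_r, 0.0),
        "examiner": max((ms_exam - ms_res) / n_p, 0.0),
        "residual": ms_res,
    }


def dependability(components, n_examiners):
    """Absolute-decision coefficient if n_examiners marks were averaged."""
    error = (components["examiner"] + components["residual"]) / n_examiners
    return components["candidate"] / (components["candidate"] + error)


# Invented marks: 4 candidates (rows) each seen by 3 examiners (columns).
marks = [[6, 7, 5], [4, 5, 4], [8, 8, 6], [5, 7, 5]]
components = variance_components(marks)
for n in (1, 2, 4):
    print(n, round(dependability(components, n), 2))
```

The examiner component reflects systematic stringency or leniency, so a large value relative to the candidate component would suggest that training aimed at calibrating standards is where the gain lies.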
COMPETENCIES OF EXAMINERS AND THEIR SELECTION
Work is currently being undertaken by the RCPCH to define the competencies of an appropriately trained examiner. We suggest the desirable attributes of an examiner should include:
Be able to use defined techniques to elicit the best performance from candidates
Keep abreast of current developments and issues in the profession25
Have knowledge of the principles and practicalities of the examination
Have an understanding of educational theory and practice in relation to assessment
Be able to make consistent and unbiased judgements
Have an understanding of reliability and validity
Have the ability to make and justify pass/fail decisions and develop the skill of marking candidates using the full marking spectrum
Be active clinically2
Act as an effective member of a small team2
Possess effective interpersonal skills2
Be dedicated to respect, fairness, and courtesy towards candidates while maintaining an appropriate level of enquiry
Be willing to accept training and regular monitoring of performance
Be objective in analysing and comparing a candidate’s performance against defined levels of competence
Be able to manage the diversity of candidates, incorporating the adaptation of examining style to candidate needs
Have the commitment and professionalism to examine and host the examination regularly, actively participate in regular examiner technique updates, and provide questions for the written examination
Be appropriately qualified with respect to degree requirements, level and length of general paediatric experience, professional credentials, revalidation, and accreditation
Be involved with junior staff training to be conversant with the standard expected of them.
In order to maintain the credibility of the examination, the selection of examiners should be rigorous, open, and well defined. Consistency in evaluation has been shown to be crucial, and prior assessment of this ability improves the selection of appropriate examiners.20,26,27 A model of selecting generic examiners (for both MRCPCH and DCH) adapted from the comparable model of the RCGP2 might be:
Self-proposal or recommendation/nomination
Applicants must be formally supported by at least two colleagues (e.g. Principal Regional Examiner, Regional Adviser in Paediatrics, Member of the College Examining Board, or a current examiner)
Applicants must also (in concordance with the competencies of the examiner listed above):
– Hold FRCPCH or equivalent postgraduate qualification
– Be clinically active in either general paediatrics or an appropriate specialty
– Be up to date with a continuous professional development programme (CPD)
– Demonstrate their ability to judge performance by ranking candidates in an order that correlates well with other examiners
– Have experience in managing and supervising junior and middle grade staff
– Have completed some appraisal and assessment training
– Demonstrate to referees evidence of prior experience/training in communications and managing diversity awareness
– Be prepared to commit to a training programme which will include preliminary assessment (examples could be sitting the part 2 written paper, creating questions, and possibly being observed performing during a mock clinical examination)
– Be prepared to commit to the ongoing requirements of being an active examiner
– Demonstrate the ability to receive and act on feedback.
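One way the ranking criterion in the list above might be checked is with a rank correlation between an applicant's ordering of candidates and that of a reference panel. The sketch below uses Spearman's coefficient as an assumption on our part; the RCGP model does not prescribe a particular statistic, and the rankings shown are invented.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two rankings of the same candidates.

    Uses the shortcut formula, which is valid only when neither
    ranking contains tied ranks.
    """
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rankings of the same length (at least 2)")
    if len(set(ranks_a)) != n or len(set(ranks_b)) != n:
        raise ValueError("tied ranks are not supported by this formula")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))


# Invented example: ranks given to five videoed candidates (1 = best).
reference_panel = [1, 2, 3, 4, 5]
applicant = [2, 1, 3, 4, 5]
print(f"rho = {spearman_rho(reference_panel, applicant):.2f}")
```

What counts as correlating "well" would need to be agreed in advance by those running the selection process; no threshold is implied here.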
Final ratification as an examiner would depend on satisfactorily completing examination training. Should an examiner not be selected in spite of completing a training programme, it is essential that this can be justified. An area that merits further evaluation is allowing middle grade doctors such as specialist registrars (SpRs) to nominate and serve as referees for consultants who are potential candidates for the post of examiner. SpRs have been shown to have considerable insight into the components of both clinical competence and good training.28
COMPONENTS OF A TRAINING PROGRAMME
The components of a formal training programme have yet to be determined but should be based on the principles discussed above.
An indicative (not exhaustive) list of areas to be covered might include:
Principles of assessment
Ethics, accountability, discrimination, fairness
Understanding cultural differences and expectations
Handling different learning and working styles in candidates
Planning and running structured medical examinations
Giving feedback (for counselling)
Revision on clinical examination skills (what to look for and what to ask)
Role play (by examiner trainees) and feedback (videoed sessions)
Observation of actual examiners by experienced supernumerary examiners for an exchange of ideas and approaches and a refinement of content and style of questioning6
Assessment of clinical reasoning and clinical decision making
Critique of videoed clinical examinations and evaluation of candidates’ performance
Final assessment of examiners’ skills.
The final elements of selection and training need collaboration between clinicians and those with educational experience. Training of examiners is a necessary but not a sufficient condition for the conduct of examinations. Hospital Trusts will have to commit to providing protected time for courses and assist with hosting of examinations, and additional research will have to be conducted to assess whether what has been learnt is applied in regulatory assessments.29 Examiners must be seen to be transparent in decision making in order to ensure fairness.30
AREAS FOR FURTHER RESEARCH
Reliability and validity: Further studies are needed to continue to establish reliability of postgraduate examinations. It is necessary to discover what reliability indicators are intended to measure and what conclusions can be claimed from their use. The use of generalisability to determine reliability of existing examinations would allow the sources of error contributing most to the overall error to be determined. Training of examiners and any subsequent modification of the examinations could be targeted at those sources of error contributing most to the overall error of measurement. Further work on all aspects of validity is important. There is some evidence to suggest that measurement of face validity has proved feasible and valuable and will assist in the further development of RCPCH examinations themselves,33 with obvious implications for relevance in examiner training.
Further exploration of the variation in marking among current examiners: The effects of examiners’ gender, experience, academic rank, regional affiliation, and country of qualification on examiner behaviour have been studied and have proved useful in eliminating deviant patterns of grading and presenting candidates with a balanced panel of examiners, in addition to improving the standardisation of content.22 Factors influencing examiner behaviour in College examinations could be investigated.
Evaluation of the training programme envisioned here: The training programme for examiners should be evaluated periodically so that it can continue to be fine tuned and updated. Data collection and analysis should focus on outcome measures such as examiners’ views on the training process, candidates’ confidence in the MRCPCH examination for fairness, and educationalists’ analysis of changes in validity and reliability.
The training of examiners is an essential component of a process to ensure validity and reliability in UK paediatric clinical examinations. It is crucial, however, that training is rigorous, evidence based, contains formative and summative assessment of examiners, is regularly evaluated, and has a defined curriculum directed towards achieving the required competencies of examiners.