Table 1


| Type of validity | Key features |
| --- | --- |
| Face validity | How well does the test appear to measure what it is supposed to? May be important in terms of doctors’ willingness to accept assessment tools. It does not mean that the test actually measures what it appears to. |
| Content validity | Refers to the concept that we can infer from a given task or series of tasks how well a doctor functions in a given domain. |
| Criterion validity | Refers to the correlation of a measure of an attribute or skill of interest with some other measure of that attribute or skill, ideally a “gold standard” that has been used and accepted. Comparison may be contemporaneous (concurrent validity) or with a measure some time in the future (predictive validity). A gold standard is often not available for performance assessment. |
| Construct validity | Refers to how well a test measures attributes that are not directly observable; it is also useful where there is no gold standard to measure an attribute. Establishing construct validity is an ongoing process, as it often involves testing a number of constructs: for example, “based on my theory of construct X, people who score high on a test of X differ from people who score low in terms of attributes A, B, and C, where A, B, and C can be other instruments, behaviours, diagnoses and so on” [15]. |
| Consequential validity | Refers to the effect that assessment has on learning. Although less well described than other types of validity, it is highly relevant because it recognises the close link between assessment and learning and the need to use assessment strategically. |