Sample size considerations for the external validation of a multivariable prognostic model: a resampling study

Gary S Collins; Emmanuel O Ogundimu; Douglas G Altman

doi:10.1002/sim.6787

Stat Med. 2016 Jan 30;35(2):214-26. doi: 10.1002/sim.6787. Epub 2015 Nov 9.

Authors

Gary S Collins¹, Emmanuel O Ogundimu¹, Douglas G Altman¹

Affiliation

¹ Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Windmill Road, Oxford, OX3 7LD, U.K.

Abstract

After developing a prognostic model, it is essential to evaluate the performance of the model in samples independent from those used to develop the model, which is often referred to as external validation. However, despite its importance, very little is known about the sample size requirements for conducting an external validation. Using a large real data set and resampling methods, we investigate the impact of sample size on the performance of six published prognostic models. Focussing on unbiased and precise estimation of performance measures (e.g. the c-index, D statistic and calibration), we provide guidance on sample size for investigators designing an external validation study. Our study suggests that externally validating a prognostic model requires a minimum of 100 events and ideally 200 (or more) events.

Keywords: external validation; prognostic model; sample size.
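
Illustrative sketch (not part of the published abstract)

The resampling idea described in the abstract can be illustrated with a small simulation: repeatedly draw validation samples containing a fixed number of events and observe how the spread of the estimated c-statistic changes as that number grows. This is a minimal sketch under stated assumptions, not the authors' procedure. It uses synthetic data, a binary outcome instead of the time-to-event outcomes implied by the c-index and D statistic, and a single hypothetical risk score. The function names (`simulate_population`, `c_statistic`, `resample_c`), the intercept of -2.5, the population size and the number of repetitions are all illustrative choices.

```python
import math
import random
import statistics


def simulate_population(n, rng, intercept=-2.5):
    """Synthetic (score, outcome) pairs. ASSUMPTION: the score is a standard
    normal linear predictor and the outcome is Bernoulli with logistic risk."""
    population = []
    for _ in range(n):
        lp = rng.gauss(0.0, 1.0)
        risk = 1.0 / (1.0 + math.exp(-(intercept + lp)))
        population.append((lp, 1 if rng.random() < risk else 0))
    return population


def c_statistic(scores, outcomes):
    """Concordance for a binary outcome via the rank-sum formula,
    using average ranks so tied scores count as half-concordant."""
    pairs = sorted(zip(scores, outcomes))
    n = len(pairs)
    rank_sum_events = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1 .. j
        rank_sum_events += avg_rank * sum(y for _, y in pairs[i:j])
        i = j
    n1 = sum(outcomes)
    n0 = n - n1
    return (rank_sum_events - n1 * (n1 + 1) / 2.0) / (n1 * n0)


def resample_c(population, n_events, n_reps, rng):
    """Draw n_reps validation samples with exactly n_events events each.
    ASSUMPTION: non-events are drawn so that the event fraction of the
    full population is preserved in every sample."""
    events = [s for s, y in population if y == 1]
    nonevents = [s for s, y in population if y == 0]
    n_non = round(n_events * len(nonevents) / len(events))
    estimates = []
    for _ in range(n_reps):
        scores = rng.sample(events, n_events) + rng.sample(nonevents, n_non)
        outcomes = [1] * n_events + [0] * n_non
        estimates.append(c_statistic(scores, outcomes))
    return estimates


rng = random.Random(2016)  # fixed seed so the run is repeatable
population = simulate_population(20000, rng)

spread = {}
for n_events in (10, 50, 100, 200):
    estimates = resample_c(population, n_events, 200, rng)
    spread[n_events] = statistics.stdev(estimates)
    print(n_events, round(statistics.mean(estimates), 3), round(spread[n_events], 3))
```

Each printed row gives the number of events, the mean estimated c-statistic and its standard deviation across the resampled validation sets. The standard deviation is one simple proxy for precision; the abstract's thresholds of 100 and 200 events come from the authors' own analysis of real data and should not be inferred from this toy simulation.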

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Biostatistics / methods
Cardiovascular Diseases / etiology
Databases, Factual
Diabetes Mellitus, Type 2 / etiology
Humans
Models, Statistical*
Multivariate Analysis
Prognosis*
Risk Factors
Sample Size*
Validation Studies as Topic
