The use of 3D face shape modelling in dysmorphology
Facial appearance can be a significant clue in the initial identification of genetic conditions, but their low incidence limits exposure during training and inhibits the development of skills in recognising the facial “gestalt” characteristic of many dysmorphic syndromes. Here we describe the potential of computer-based models of three-dimensional (3D) facial morphology to assist in dysmorphology training, in clinical diagnosis and in multidisciplinary studies of phenotype–genotype correlations.
Many genetic conditions involve characteristic facial features that are the first clue to a diagnosis. Skill in recognising the facial “gestalt” of some dysmorphic syndromes takes time to develop as rarity limits exposure during training and expertise is typically perfected on the job. Hence, published cases, textbooks and electronic databases such as OMIM and the London Dysmorphology Database (LDDB) are important resources when examining children whose appearance is dysmorphic. Even with adequate knowledge, there remains the problem of reconciling sometimes imprecise descriptions of dysmorphic features in the literature with a personal and potentially subjective examination of an individual patient. International experts in dysmorphology are currently developing standardised terminology to address issues of imprecision and inconsistency,1 and there are well documented approaches to recording craniofacial dysmorphology in a more objective fashion.2 In terms of future technological support, three-dimensional (3D) models of facial morphology are showing potential in syndrome delineation and discrimination, in analysing individual dysmorphology, and in contributing to multi-disciplinary and multi-species studies of genotype–phenotype correlations.
More than 30 years ago, Farkas pioneered techniques for studying facial morphology using direct anthropometry.3 His approach, using a ruler, calipers, tape measure and protractor, has been applied widely in the analysis of facial dysmorphology. Many clinicians undertake such a manual craniofacial assessment and compare a patient’s phenotype to the norms of a control population of comparable age and sex. Ethnic variation can sometimes be taken into account, but normative values exist for relatively few dysmorphic syndromes.3 4 Anthropometry has the advantages of being low cost, simple to undertake (following appropriate training) and relatively non-invasive. Disadvantages include the need for co-operation from the subject and the inability to make additional measurements without recalling the patient.2 Although manual assessment of an individual can be quick, such an approach would be prohibitively time consuming for collecting normative sample data.
Conventional and digital two-dimensional (2D) photography offer rapid capture of facial images, almost permanent retention and the opportunity for repeated measurement. The 2D photographs, or occasionally the subjects prior to imaging, are annotated with anatomical landmarks whose spatial co-ordinates can be acquired using 2D and 3D digitising devices on the images or the subjects themselves.5 Once landmark positions are known, measurements can be derived automatically and compared to norms. Photography is less invasive than manual anthropometry but still requires co-operation from the patient and is subject to the skill of the camera operator and available lighting conditions. Furthermore, measurements derived directly from a single 2D image are adversely affected by projection distortion and pose; for example, a backwardly tilted head may give the illusion of low-set ears.
Three-dimensional surface imaging systems have the potential to compensate for inadequacies of 2D imaging by capturing an image that can be inspected from any desired viewpoint (fig 1A). They have already been successfully introduced to a range of clinical situations such as dermatology,6 burns,7 forensic science,8 radiotherapy planning,9 orthodontics10 and maxillofacial surgery.11 Laser and photogrammetric devices, the two most commonly used, capture meshes of tens to hundreds of thousands of 3D points on a human face (fig 1B). The triangulated mesh of points constitutes a 3D surface (fig 1C). The fineness of the mesh, speed of image capture, surface coverage, accuracy and ease of use depend on the underlying technology employed and the features of the individual device. In parallel with device improvement, there has been considerable development of statistical techniques and computer software for modelling and analysing large sets of face images in both 2D and 3D. This short review describes how 3D face shape modelling can be used in syndrome delineation and discrimination, in the categorisation of individual facial dysmorphology and in phenotype–genotype studies. Although the review focuses on surface-based image capture, the reader is reminded that surfaces derived from CT, ultrasound and MRI modalities can be manipulated in much the same way. Constraints on space have forced the omission of descriptions of individual surface imaging devices.
SYNDROME DELINEATION USING AVERAGE FACES
Developments in geometric morphometrics, the statistical analysis of shape variation, have been pioneered in recent years by Bookstein, Kendall and Mardia.12 In the morphometrics approach, sets of 2D or 3D landmark positions for large numbers of individuals can be aligned so their mean positions can be computed and deviations therefrom analysed statistically. Three-dimensional landmark positions produce a polyhedral-like visualisation of the face (fig 1D). Such sparse sets of anatomical landmarks can also be used to warp a set of 3D face surfaces close together (fig 1C), as if they were rubber masks, so as to enable the generation of a correspondence of tens of thousands of common points on each face.13 Once a dense set of corresponded points is established, it is possible to calculate their average positions and produce a representation of the average face surface of the set. The average of a homogeneous set of individual faces typically produces a good representation of characteristic features common to the set.
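The averaging step can be illustrated with a minimal Python sketch. It assumes that the faces have already been brought into dense correspondence, so that point i on one face matches point i on every other face, and it aligns faces only by translating each to its centroid; the rotation and scale alignment of a full morphometric analysis is omitted. The function names and toy co-ordinates are illustrative and are not taken from the cited software.

```python
def centre_on_centroid(face):
    """Translate a face (list of (x, y, z) points) so its centroid is the origin."""
    n = len(face)
    centroid = [sum(p[k] for p in face) / n for k in range(3)]
    return [tuple(p[k] - centroid[k] for k in range(3)) for p in face]


def average_face(faces):
    """Average corresponded points across faces after centring each face."""
    centred = [centre_on_centroid(face) for face in faces]
    n_faces = len(centred)
    n_points = len(centred[0])
    return [
        tuple(sum(face[i][k] for face in centred) / n_faces for k in range(3))
        for i in range(n_points)
    ]


# Two toy "faces" of three corresponded points each (hypothetical values, in mm).
face_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
face_b = [(10.0, 10.0, 10.0), (12.0, 10.0, 10.0), (11.0, 14.0, 10.0)]
mean_face = average_face([face_a, face_b])
```

With real data each face would contain tens of thousands of corresponded points rather than three, but the calculation is the same.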
Row C of fig 2 contains portrait and profile views of the average face surface of a group of individuals (n = 187) with no known genetic condition. The first two columns of the remaining rows of fig 2 show similar views of the average faces of individuals with Noonan (fig 2A, n = 63), Williams (fig 2B, n = 69), velocardiofacial (fig 2D, n = 64) and fragile X (fig 2E, n = 31) syndromes. Provided there are enough 3D images, the average face is not visually dominated by an individual face in the original set and is an excellent vehicle for demonstrating the important features of a facial phenotype of a syndrome.14–19 An average face also avoids issues of patient or parental consent which is an essential requirement when publishing images of individuals. From recent experience of about 10 different dysmorphic conditions, 30 or so images of children of a similar age can be sufficient to obtain a reasonable average face for visualisation purposes.
An alternative and semi-quantitative visualisation is to highlight surface shape differences between the unaffected and affected faces by colouring the densely corresponded points on the affected group mean to reflect their altered location from that on the control group mean. The position difference can be the 3D Euclidean distance between corresponding points or their separation parallel to an axis of choice, for example, laterally across the face, vertically up the face or depth-wise for anterior-posterior comparison. The third and fourth columns of each row of fig 2 show views similar to those in the first two columns, but now the points on the mean syndromic faces are colour-distance coded. Typically, green indicates where the two surfaces coincide. Red indicates points within the mean control surface at a distance at least that shown on the lowest part of the scale. For example, the red on the nasal alae of the average velocardiofacial syndrome face emphasises their smaller size compared to controls. Blue indicates points outside the mean control surface at a distance at least that shown on the highest part of the scale. For example, blue on the eyes and nose of the Noonan syndrome average face reflects ptosis and greater nose width, respectively. Intermediate colours represent intermediate distances, but care should be taken in their interpretation since the same scale is not used for each syndrome.
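The colour-distance coding described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the method used to produce fig 2: the inside/outside sign is taken from the displacement along an outward unit normal of the control surface, the axis numbering is arbitrary, and the linear blend between green, red and blue is one simple choice of colour scale.

```python
import math


def euclidean_distance(p, q):
    """Unsigned 3D distance between two corresponding points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def signed_distance(control_point, affected_point, outward_normal):
    """Displacement of the affected point along the control surface's outward
    unit normal: negative inside the control surface, positive outside."""
    return sum((a - c) * n
               for a, c, n in zip(affected_point, control_point, outward_normal))


def axis_difference(control_point, affected_point, axis):
    """Separation parallel to one axis (assumed here: 0 lateral, 1 vertical, 2 depth)."""
    return affected_point[axis] - control_point[axis]


def colour_code(distance, scale_limit):
    """Map a signed distance to (red, green, blue), each between 0 and 1.
    Green where the surfaces coincide, full red at or beyond -scale_limit,
    full blue at or beyond +scale_limit, and a linear blend in between."""
    t = max(-1.0, min(1.0, distance / scale_limit))
    if t < 0:
        return (-t, 1.0 + t, 0.0)
    return (0.0, 1.0 - t, t)


# Hypothetical corresponding points (mm) on the control and affected mean faces.
control_point = (10.0, 5.0, 20.0)
affected_point = (10.0, 5.0, 18.5)
normal = (0.0, 0.0, 1.0)  # assumed outward unit normal at the control point
d = signed_distance(control_point, affected_point, normal)
rgb = colour_code(d, scale_limit=2.0)
```

Because scale_limit is a free parameter, the same distance can be coloured differently on two scales, which is why intermediate colours should not be compared across syndromes coded with different scales.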
A revealing visualisation of face shape difference in a dysmorphic syndrome is shown by a morph, or rapidly interpolated image sequence, between the affected and unaffected mean faces. For training purposes, it is even more informative if a slight exaggeration of the affected mean is used. Morphs between the exaggerated mean face of each affected group in fig 2 and the mean of an appropriately age-matched control group are available for viewing on-line (see http://adc.bmj.com/supplemental). For example, the morph for Williams syndrome clearly demonstrates peri-orbital fullness, a shorter turned-up nose, temporal narrowing, fullness of the lips and backward rotation of the mandible. For velocardiofacial syndrome, the observable differences are malar flattening, hypertelorism, smaller nares, backward rotation of the mandible and slight upward and outward arching of the upper lip. The reader is encouraged to view each of the four morphs in order to evaluate their efficacy in visualising facial dysmorphology.
CONTROL–SYNDROME AND SYNDROME–SYNDROME DISCRIMINATION
For dysmorphic syndromes with known genetic causes, molecular analysis is the appropriate route of investigation in order to confirm a diagnosis. Even then, there are situations where a clinical examination may suggest multiple possibilities for a diagnosis or none at all. How might 3D models of face shape help to distinguish between different syndromic facial phenotypes? In general, single linear anthropometric facial measures are unlikely to discriminate well between controls and a syndrome or between different syndromes. Multiple measurements, following normalisation, can be combined to determine a craniofacial index of dysmorphology and hence give an average profile for each syndrome against which an individual can be compared.20 Combining measures provides a richer description of the dysmorphology, but the loss of the associated 3D geometry ultimately limits their potential. For example, philtrum length and inner canthal separation might be useful discriminators in isolation or in tandem. It is likely, however, that greater discrimination is achievable using the local geometry, that is, relative 3D juxtaposition, of the landmarks affording these two measurements (left and right inner canthi, subnasale and labiale superius).
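As a small illustration of deriving and normalising linear measures, the Python sketch below computes inner canthal separation and philtrum length from the four landmarks named above and expresses each as a z-score against a norm. The landmark co-ordinates and the normative means and standard deviations are hypothetical values chosen for the example, not published norms, and the sketch does not reproduce the craniofacial index of the cited work.

```python
import math


def distance(p, q):
    """Straight-line distance between two 3D landmarks."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical 3D landmark co-ordinates in mm.
landmarks = {
    "inner_canthus_left": (-15.0, 0.0, 0.0),
    "inner_canthus_right": (15.0, 0.0, 0.0),
    "subnasale": (0.0, -40.0, 10.0),
    "labiale_superius": (0.0, -52.0, 5.0),
}

# Hypothetical age- and sex-matched norms as (mean, SD) in mm.
norms = {
    "inner_canthal_separation": (28.0, 2.0),
    "philtrum_length": (14.5, 1.5),
}

measurements = {
    "inner_canthal_separation": distance(
        landmarks["inner_canthus_left"], landmarks["inner_canthus_right"]),
    "philtrum_length": distance(
        landmarks["subnasale"], landmarks["labiale_superius"]),
}

# Normalise each measurement: number of SDs above or below the norm mean.
z_scores = {
    name: (value - norms[name][0]) / norms[name][1]
    for name, value in measurements.items()
}
```

Each z-score summarises one distance only; the relative 3D arrangement of the four landmarks is lost, which is the limitation noted above.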
An analysis of landmarks annotating 3D face surfaces (fig 1B), together with derived anthropometric measurements, found no significant difference in facial asymmetry between controls and syndrome-affected individuals.21 No firm conclusions about specific syndromes could be drawn because the 30 syndrome-affected subjects were of mixed ethnicity and each was affected by one of 18 different conditions. Landmark-based analyses have established strong discriminating features in a series of elegant studies of male–female and control–schizophrenia face shape differences.22 23 These morphometric studies employ a statistical analysis technique, principal component analysis (PCA), to reduce the many variables corresponding to the landmark positions to a much smaller number of important principal components or modes of shape variation. For example, 24 3D landmarks result in 72 parameter values being recorded for each face. The use of PCA can reveal as few as three modes explaining discriminating face shape differences.24 The application of a similar PCA-based approach to sets of face surfaces made up of tens of thousands of densely corresponded points, rather than a sparse set of landmarks, gives rise to a similar set of modes of face shape variation. The surface of each face can be reconstructed using a linear weighted sum of the PCA modes. The term dense surface model (DSM) has been coined for such a model of 3D face shape.13 A range of other shape modelling techniques are described elsewhere.12 23 24
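The reduction from landmark co-ordinates to a few modes, and the reconstruction of a face as the mean plus a weighted sum of modes, can be sketched in standard-library Python. The sketch is illustrative only: the faces are synthetic (24 landmarks, hence 72 values per face, generated from two hidden directions of variation), and the modes are found by simple power iteration, which is an assumption made for self-containment and not the numerical method of the cited studies.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pca_modes(faces, n_modes, n_iter=100, seed=0):
    """Return (mean, modes, coverage) for faces given as flat co-ordinate lists.
    n_modes must not exceed the number of non-trivial directions of variation."""
    n, d = len(faces), len(faces[0])
    mean = [sum(col) / n for col in zip(*faces)]
    centred = [[x - m for x, m in zip(face, mean)] for face in faces]
    total_var = sum(x * x for row in centred for x in row) / (n - 1)

    def cov_times(v):
        # Covariance matrix times v, computed as X^T (X v) / (n - 1).
        proj = [dot(row, v) for row in centred]
        return [sum(p * row[j] for p, row in zip(proj, centred)) / (n - 1)
                for j in range(d)]

    rng = random.Random(seed)
    modes, coverage = [], []
    for _ in range(n_modes):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(n_iter):
            for m in modes:  # stay orthogonal to the modes already found
                c = dot(v, m)
                v = [x - c * y for x, y in zip(v, m)]
            w = cov_times(v)
            norm = math.sqrt(dot(w, w))
            v = [x / norm for x in w]
        modes.append(v)
        coverage.append(dot(v, cov_times(v)) / total_var)
    return mean, modes, coverage


def unit(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


# Synthetic data: 40 faces, 24 landmarks x 3 co-ordinates = 72 values per face.
rng = random.Random(1)
d = 24 * 3
base = [rng.uniform(-50, 50) for _ in range(d)]
u1 = unit([rng.gauss(0, 1) for _ in range(d)])
u2 = unit([rng.gauss(0, 1) for _ in range(d)])
faces = []
for _ in range(40):
    a, b = rng.gauss(0, 5), rng.gauss(0, 1)
    faces.append([m + a * p + b * q for m, p, q in zip(base, u1, u2)])

mean, modes, coverage = pca_modes(faces, n_modes=2)

# Reconstruct the first face as the mean plus a weighted sum of the modes.
offset = [x - m for x, m in zip(faces[0], mean)]
weights = [dot(offset, mode) for mode in modes]
rebuilt = [m + sum(w * mode[j] for w, mode in zip(weights, modes))
           for j, m in enumerate(mean)]
max_error = max(abs(x - y) for x, y in zip(faces[0], rebuilt))
```

Because the synthetic faces vary along only two directions, two modes account for essentially all of the variation and the 72 values of each face are recovered from two weights; real faces need more modes.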
It is possible to compute the proportion of face shape variation covered by a single DSM mode, and typically the modes are ordered in terms of decreasing coverage. By far the greatest amount of variation, often over 80%, captures variation in overall size of the face (fig 3, mode 1, 79.3%). Subsequent modes may correspond to oval/round face shape variation (fig 3, mode 2, 5.3%) or differences in ear and mandible position (fig 3, mode 3, 2.3%). Depending on the mix of faces, the amount of coverage varies and additional shape complexities will be involved. For a DSM for a mixed collection of faces, for example 187 controls and 69 individuals with Williams syndrome, the first, or dominant mode, still reflects face size and correlates highly with age. Separate regressions of mode 1 against age, for the control and Williams syndrome subgroups, enable a quantitative comparison of facial growth (fig 4) that can also be visualised as a diagnostic aid (fig 5). The colour-distance codings in fig 2 are computed with mode 1 set to 0 in the appropriate DSM and thus emphasise mean shape rather than shape and size differences.
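The separate regressions of mode 1 against age can be illustrated with an ordinary least-squares line fitted to each subgroup. In the Python sketch below the ages and mode 1 values are invented for the example and carry no clinical meaning; a straight line is also an assumption, since the form of regression used for fig 4 is not stated here.

```python
def fit_line(ages, mode1_values):
    """Least-squares fit of mode1 = slope * age + intercept."""
    n = len(ages)
    mean_age = sum(ages) / n
    mean_val = sum(mode1_values) / n
    sxx = sum((a - mean_age) ** 2 for a in ages)
    sxy = sum((a - mean_age) * (v - mean_val)
              for a, v in zip(ages, mode1_values))
    slope = sxy / sxx
    return slope, mean_val - slope * mean_age


# Hypothetical (age in years, mode 1 value) data for two subgroups.
control_ages = [2, 4, 6, 8, 10]
control_mode1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
syndrome_ages = [2, 4, 6, 8, 10]
syndrome_mode1 = [-2.6, -1.8, -1.0, -0.2, 0.6]

control_slope, control_intercept = fit_line(control_ages, control_mode1)
syndrome_slope, syndrome_intercept = fit_line(syndrome_ages, syndrome_mode1)
```

Comparing the two fitted slopes and intercepts gives a simple quantitative summary of how the size-related mode changes with age in each subgroup.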
The later modes resulting from the PCA, those corresponding to extremely small amounts of shape variation, can be ignored and typically only those leading modes covering in total 95–99% are included in a DSM. Frequently, only 50–100 modes are required to cover 99% of shape variation in a set of faces. Thus a face can be represented by an ordered sequence of 50 or so numbers. This is a huge data compaction, reducing the representation of a face surface from as many as 75 000 parameters (25 000 3D points each with x, y and z co-ordinates) down to 50 or so DSM mode values. The average surface of a set of faces is then represented by the sequence of average values of the different DSM modes. A simple and intuitively appealing way to compare an individual face with two sets of faces is to calculate how close, in terms of the 50 or so mode values, that face surface is to the average face surfaces of each set. Whichever of the average faces is closest determines the classification of the individual. This so-called closest mean classification algorithm has achieved control–syndrome discrimination rates of between 85% and 95% for Cornelia de Lange,16 Noonan, Smith-Magenis, velocardiofacial and Williams syndromes. By considering face patches it is also possible to identify regions of the face that are the most discriminating.14 15 Discrimination rates for syndrome–syndrome comparisons are typically a few percentage points lower.
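Closest mean classification is simple enough to sketch in a few lines of Python. The example assumes Euclidean distance between vectors of mode values, which is one natural reading of "how close" but is not specified above, and it uses three hypothetical mode values per face where a real DSM would use 50 or so. In a fair test the face being classified would be excluded from the faces used to build the model and the group means.

```python
import math


def mean_vector(rows):
    """Average each mode value across a group of faces."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def closest_mean(face, group_means):
    """Return the label of the group whose mean mode vector is nearest to face."""
    return min(group_means, key=lambda label: distance(face, group_means[label]))


# Hypothetical mode values for small control and affected groups.
controls = [[0.1, -0.2, 0.0], [-0.3, 0.1, 0.2], [0.2, 0.1, -0.2]]
affected = [[1.9, 1.1, 0.1], [2.2, 0.8, -0.1], [1.9, 1.1, 0.0]]
group_means = {
    "control": mean_vector(controls),
    "syndrome": mean_vector(affected),
}

unseen_face = [1.5, 0.9, 0.0]
label = closest_mean(unseen_face, group_means)
```

The same function handles syndrome–syndrome comparison by supplying the means of two affected groups instead of a control and an affected group.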
INDIVIDUAL FACE ANALYSIS AND GENOTYPE–PHENOTYPE STUDIES
The average faces of the controls and syndromic groups (both monochrome and colour-distance coded versions), the morphs and the facial growth sequences depicted earlier can be compared with the unaided eye to an individual child’s facial morphology. This may help to identify syndromes with a similar facial phenotype. If a 3D surface imaging system is available, then a semi-automated comparison is possible. The image of the proband’s face can be landmarked and anthropometric measurements automatically derived and compared to appropriate norms for a set of controls or a set of previously diagnosed, and preferably molecularly confirmed, children with a common condition. A next step would be to compare the proband’s face with DSMs of 3D face shape for a range of genetic conditions. For example, for a particular syndrome, the proband’s face shape could be compared using a bootstrapping approach in which multiple DSMs are generated for randomly generated subsets of unaffected controls and individuals affected by the same syndrome. The mean of the unseen classification positions of the proband relative to the mean control and affected faces of the different DSMs can then be calculated along with a confidence interval. This could be repeated, automatically, with DSMs for other syndromes. The resulting league table of the most similar facial phenotypes could then help to determine subsequent investigations including more appropriate genetic testing, and possibly even avoiding or delaying the need to undertake some of the more expensive genetic tests.
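The repeated-subset comparison proposed above might be sketched as follows. Several simplifications are assumed and should be read as such: faces are already expressed as mode values and the DSM is not rebuilt for each subset, the proband's "classification position" is taken to be its projection onto the line joining the control and affected subset means (0 at the control mean, 1 at the affected mean), and the interval is a simple percentile interval over the subsets. All data are synthetic.

```python
import random


def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def position(face, control_mean, affected_mean):
    """Projection of face on the control-to-affected axis:
    0 at the control mean, 1 at the affected mean."""
    axis = [a - c for a, c in zip(affected_mean, control_mean)]
    offset = [f - c for f, c in zip(face, control_mean)]
    return sum(o * x for o, x in zip(offset, axis)) / sum(x * x for x in axis)


def subset_positions(proband, controls, affected, subset_size, n_rounds, seed=0):
    """Mean proband position over random subsets, with a 95% percentile interval."""
    rng = random.Random(seed)
    positions = []
    for _ in range(n_rounds):
        control_subset = rng.sample(controls, subset_size)
        affected_subset = rng.sample(affected, subset_size)
        positions.append(position(proband,
                                  mean_vector(control_subset),
                                  mean_vector(affected_subset)))
    positions.sort()
    mean_pos = sum(positions) / n_rounds
    low = positions[int(0.025 * n_rounds)]
    high = positions[int(0.975 * n_rounds) - 1]
    return mean_pos, (low, high)


# Synthetic two-mode data: controls centred near 0, affected centred near 4.
data_rng = random.Random(42)
controls = [[data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)] for _ in range(30)]
affected = [[data_rng.gauss(4.0, 1.0), data_rng.gauss(0.0, 1.0)] for _ in range(30)]
proband = [3.5, 0.2]  # hypothetical unseen face, not used to form any mean

mean_pos, (low, high) = subset_positions(
    proband, controls, affected, subset_size=20, n_rounds=200)
```

Running the same function against models for several syndromes and ranking the resulting positions would give the kind of league table described above.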
An exciting prospect is the use of 3D face shape models in multi-disciplinary and multi-species genotype–phenotype studies. A combination of behavioural, facial morphometric and molecular analyses of an atypical individual with Williams-Beuren syndrome and a related knock-out mouse model identified a gene candidate affecting human and mouse craniofacial development.25
VALIDATION OF 3D SURFACE IMAGING AND LANDMARKING OF FACES
Because laser-based surface scanning devices have been in use for much longer, there have been a number of studies of their reliability.26 Some laser scanners require the subject to be rotated or the scanner moved relative to the subject. The associated motion of the scanner or subject sometimes introduces serious artifacts, for example ridges, on the captured surface.27 Photogrammetric devices tend to be quicker and more suited to capturing images of children who have communication difficulties or impulsive movements that affect their co-operation. Photogrammetric devices and landmarking of the associated surfaces have proven to be accurate and consistent,28 although some differences have been found between photogrammetric and caliper-based measurements.29 A recent study evaluated the reproducibility of soft tissue landmarks on 3D face scans.30 Intraoperator data established reproducibility to within 1 mm SD for 12 of 24 landmarks. Interoperator reproducibility was more variable depending on the experience of the operator and the location of the landmark.
Three-dimensional models of facial morphology are beginning to have an impact on clinical genetics and studies of craniofacial development. The visualisations of average syndromic faces and colour-distance coded comparisons with appropriate control group average faces have obvious potential. However, their efficacy in helping trainees improve their ability to recognise dysmorphic syndromes is yet to be tested formally. Landmark- and surface-based models of face shape have shown high levels of discriminating accuracy in research projects in a small number of conditions.
Currently, there are obvious limitations to the use of 3D face shape models in clinical situations. Few genetics clinics have access to a 3D scanning device. The models of facial dysmorphology that exist incorporate relatively few images and so mix individuals of both sexes and from a wide age range. Restriction to a single ethnic group, typically Caucasian, is an additional limitation. As has been suggested for 2D images, population-specific 3D face image databases need to be established for different ethnic groups according to their geographic location, for larger numbers with wider age range, and possibly even for particular anatomical features. With current trends in obesity, body mass variation should also be considered. It is important to study the adult facies in order to recognise individuals who may have a subtle phenotype (including single aspects of a disease) who are at risk of having children later with full clinical manifestations.31 Family-based studies are an important component of a much needed human phenome project.32
Until larger numbers of 3D images have been collected for a wider range of dysmorphic conditions, there are technological developments that may fill the gap. There have been successful discrimination studies using 2D images of children with a variety of dysmorphic syndromes.33–35 More recently, computer scientists specialising in image analysis have developed sophisticated techniques for converting a 2D image to a 3D form that can be compared with a 3D model of facial morphology.36 Already, the approach has been tested with some success on a small number of individuals with acromegaly.37
The recognition of syndromes is not usually based on the presence of major malformations such as a cleft palate or heart defect, but on combinations of minor malformations and minor variants. Therefore, clinical experience and knowledge of normal ranges of morphological features continue to be essential for evaluating dysmorphic features. Furthermore, the identification of atypical individuals for phenotype–genotype correlation studies cannot succeed without the involvement of vigilant clinicians able to identify affected children who are inconsistent with expected behavioural or morphological phenotypes.38 This is yet further motivation for improving the gestalt recognition of facial dysmorphology.
Face animations are available at http://adc.bmj.com/supplemental.
The author is very grateful to Raoul Hennekam and Katrina Prescott for commenting on an earlier draft of the paper. The families and volunteers whose 3D images contributed to the average faces in fig 2 are gratefully acknowledged, as are the family support groups and clinicians who provided the face scanning opportunities: France: Génération 22; Italy: Dr Francesca Faranelli, Dr Francesca Forzano, Professor Teresa Mattina; UK: NewLife, MaxAppeal, The Fragile X Society, the Williams Syndrome Foundation; USA: TNSSG, VCFSEF, The National Fragile X Foundation, the Williams Syndrome Association.
Funding: Professor Hammond’s research is currently funded by the UK charity NewLife and by the US organisations National Institutes of Health (P50 DE016215-01 and Fogarty/NIH R21TW06761-01), Autism Speaks/NAAR and the Angelman Syndrome Foundation.
Competing interests: None.
Abbreviations: DSM, dense surface model; PCA, principal component analysis.