Introduction

Fabry disease (MIM 301500) is an X-linked lysosomal storage disorder (LSD) caused by deficient activity of the lysosomal enzyme α-galactosidase A (α-GAL A; EC 3.2.1.22). In its absence, globotriaosylceramide (Gb3) accumulates in the lysosomes of various cell types, particularly in endothelial cells. Clinical onset of the disease is often in childhood or adolescence and is characterized by recurrent pain episodes in the hands and feet (acroparesthesias), angiokeratoma, fever and abdominal complaints. Later in life, progressive substrate accumulation leads to renal failure, cardiomyopathy and multiple cerebrovascular accidents, which are associated with a reduced life expectancy. Affected female patients tend to show a more protracted course, but may also suffer from severe complications. The disease is rare with a reported incidence of 0.2–0.8 per 100 000 newborns.1, 2

In recent years, treatment of Fabry disease by administration of recombinant α-galactosidase preparations (agalsidase α (Replagal™) and agalsidase β (Fabrazyme™)) has become available. This so-called enzyme replacement therapy (ERT) aims to reduce the storage of glycosphingolipids, and stabilization of renal function and reduction of left ventricular hypertrophy have been reported.3, 4, 5

In some LSDs, such as in the mucopolysaccharidoses and oligosaccharidoses, facial dysmorphology is a striking feature and may even be present at birth.6 The characteristic facial coarsening may assist in diagnosing patients. Fabry patients may exhibit minor facial abnormalities. Skeletal abnormalities, including widening of the sella turcica, increased head size and acromegalic-like appearance, have been reported. Furthermore, maxillary and mandibular prognathism have been described.7 The occurrence of dysmorphic features in Fabry disease was recently confirmed in a large cohort of male patients.8

Three-dimensional (3D) imaging is becoming a widely used tool for facial analysis. Classical anthropometric analysis of distances and angles can be based on 3D facial landmarks generated using electromechanical digitizers.9 Volumetric analysis has been used to assess HIV-related facial lipoatrophy using laser scanners.10 Surface-based shape differences can be analysed using dense surface models (DSMs) derived from images captured with 3D photogrammetric cameras as previously described for Cornelia de Lange syndrome11 and for Noonan, velo-cardio-facial, Smith–Magenis and Williams syndromes.12, 13, 14 In this study, we aimed to elucidate dysmorphic facial features in patients with Fabry disease using a similar DSM-based approach.

Patients and methods

Patients

The Fabry patient group comprised 20 male patients 19.2–60.4 years old (mean: 38.4 years) and 22 female patients 21.0–60.5 years old (mean: 44.0 years). All Fabry patients had been diagnosed by means of reduced α-GAL A activity in leucocytes (male patients) and/or by mutation analysis of the α-GAL A gene (female patients). Thirteen of the 22 female patients and 17 of the 20 male patients were imaged while being treated with ERT (either agalsidase-α (Replagal™, TKT/Shire) or agalsidase-β (Fabrazyme™, Genzyme)). Two of the male patients were imaged before and following 6 and 12 months of treatment. Mean treatment duration was 14 months (range: 2–31) in female patients and 32 months (range: 9–69) in male patients. Images of healthy subjects were selected from an existing collection and included 80 unrelated male subjects 18.0–66.5 years old (mean: 38.8 years) and 80 unrelated female subjects 21.2–67.5 years old (mean: 42.8 years). All subjects in the study were European Caucasians. The study had research ethics committee approval, and participants, parents or guardians gave informed written consent (JREC 00/E042).

3D imaging

A total of 206 3D face images of controls and individuals with Fabry disease were captured using commercial photogrammetric devices (http://www.canfield.com; http://www.3dMD.com). The number of surface points captured by the most recent models of these devices is of the order of 25 000 on an adult face.

Each image was manually annotated by the same operator (PH) with 25 landmarks: left and right endocanthion, exocanthion, palpebrale superius, palpebrale inferius, preaurale, otobasion inferius, crista philtrum, cheilion, alare and lower lip third; nasion, pronasale, subnasale, labiale superius and gnathion. 3D photogrammetric devices and the associated landmarking have been shown to be highly reliable, with lip- and eye-based locations being the most reproducible.15 Left and right frontotemporal landmarks are generally not reproducible because of variation in placement along the face surface. Hence, these two additional landmarks were used solely to calculate the bitemporal width of the face, where the placement error has much less effect, and not in the construction of the face-shape models.

Following the approach described previously,13, 16 a dense correspondence of points on the face surfaces was generated separately for the male and female subgroups. Four average faces were computed for the affected and unaffected individuals in the male and female subgroups. A DSM of face shape was then generated for each of the male and female subgroups using only those principal components (PCs) of the associated analysis that in total covered 99% of face-shape variation. Using these two DSMs, static images and dynamic morphs were computed comparing the mean faces of the controls and the patients. Analogous DSMs were computed for patches of the face surface: peri-aural, peri-orbital, peri-oral, peri-nasal and supra-orbital regions.
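The 99% criterion for retaining PCs can be illustrated with a small sketch. It assumes the variance along each PC mode is already available as a list of eigenvalues; the function name and the numbers are hypothetical and are not taken from the DSM software used in the study.

```python
def components_for_coverage(eigenvalues, target=0.99):
    """Return the smallest k such that the k largest eigenvalues
    account for at least `target` of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return k
    return len(ordered)


# Invented variances along six PC modes, for illustration only.
example_eigenvalues = [50.0, 30.0, 15.0, 4.5, 0.3, 0.2]
n_modes = components_for_coverage(example_eigenvalues)
```

With these invented values, the first three modes cover 95% of the variance and the first four cover 99.5%, so four modes would be kept.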

For both the male and female subjects, the densely corresponded points used to compute each DSM were colour coded on the mean face of the affected subgroup in order to visualize localized distance between it and the unaffected mean surface (Figures 1 and 2). Points on the affected mean surface are shown in green if they are inseparable from their corresponding points on the unaffected mean surface with respect to a particular scale range (−r,+r). Red is used where the affected surface is smaller and r mm or more within the control mean surface and blue is used where the affected surface is larger and r mm or more beyond the control mean surface. Intermediate colours of the red–blue spectrum shown on the associated scales reflect distances between −r mm and r mm.
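A minimal sketch of such a colour coding is given below. It assumes a simple linear ramp from green through to red or blue; the actual colour map of the visualization software is not specified here, so the function name and the interpolation are illustrative assumptions only.

```python
def colour_for_distance(d_mm, r_mm=3.0):
    """Map a signed surface distance (mm) to an (R, G, B) triple in [0, 1].

    Negative d_mm: the affected surface lies within the control surface.
    Positive d_mm: the affected surface lies beyond the control surface.
    Distances at or beyond -r_mm / +r_mm saturate to pure red / pure blue.
    """
    if r_mm <= 0:
        raise ValueError("r_mm must be positive")
    t = max(-1.0, min(1.0, d_mm / r_mm))  # clamp to the scale range (-r, +r)
    if t < 0:
        return (-t, 1.0 + t, 0.0)  # green fading towards red
    return (0.0, 1.0 - t, t)       # green fading towards blue


coincident = colour_for_distance(0.0)   # surfaces coincide: green
recessed = colour_for_distance(-3.0)    # 3 mm within the control surface: red
prominent = colour_for_distance(5.0)    # beyond +r: saturates at blue
```

Because the ramp saturates at ±r, the same pair of surfaces can look very different under different choices of r.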

Figure 1

The top row shows portrait and profile views of the average male control (n=80). The inner two images of the middle row show the average face of the male Fabry group (n=20) and its colour-coded comparison with the average of the male control group (outer two images). The colour scale uses blue to indicate regions where the average male Fabry face is 3 mm or more outside (larger than) the corresponding regions of the average male control face, and red where it is 3 mm or more within (smaller than) them. The bottom row repeats row 2 but employs a version of the mean male Fabry face that is exaggerated as far as the most extreme classification of the male Fabry faces in Figure 4.

Figure 2

The top row shows portrait and profile views of the average female control (n=80). The inner two images of the middle row show the average face of the female Fabry group (n=22) and its colour-coded comparison with the average of the female control group (outer two images). The colour scale uses blue to indicate regions where the average female Fabry face is 3 mm or more outside (larger than) the corresponding regions of the average female control face, and red where it is 3 mm or more within (smaller than) them. The bottom row repeats row 2 but employs a version of the mean female Fabry face that is exaggerated as far as the most extreme classification of the female Fabry faces in Figure 5.

A 20-fold cross-validation of the classification of the faces of the 80 male controls and 20 male Fabry patients was undertaken. Each of the 20 randomized and stratified splits involved computing a DSM for a training set of 76 controls and 19 patients. An analogous procedure was undertaken with the female subjects. For any particular DSM, the distance between any two face surfaces, whether individuals or means, is computed as the Euclidean distance between their representative PC weight vectors. Unseen classification using the closest mean algorithm was carried out on the remaining Fabry patient and four controls. Thus, each Fabry patient was omitted in turn from the construction of a DSM and then tested against it unseen. In the case of the two male patients scanned three times (before and twice during treatment), all images captured during treatment were omitted from the computation of these DSMs. Using an approach described elsewhere,12 each individual was introduced into the particular DSM that did not include them. Their positions relative to the mean faces of the control and the other affected individuals were then calculated (Figures 4 and 5). Thus, the face of each Fabry patient was categorized unseen, giving an unbiased and objective classification of his or her face in terms of relative similarity to the means of the Fabry and control faces in the DSM multi-fold cross-validation training sets.
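The closest mean step can be sketched as follows. This is a simplified leave-one-out illustration on invented two-dimensional PC weight vectors: it keeps the PC weights fixed, whereas the procedure described above rebuilds the DSM for every split and holds out four controls with each patient. All names and data are hypothetical.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of equally long PC weight vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two PC weight vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_mean(face, controls, patients):
    """Label a face by whichever training-group mean is nearer."""
    to_control = euclidean(face, mean_vector(controls))
    to_patient = euclidean(face, mean_vector(patients))
    return "patient" if to_patient < to_control else "control"


def leave_one_out(controls, patients):
    """Classify every face against means computed without that face."""
    results = []
    for i, face in enumerate(controls):
        rest = controls[:i] + controls[i + 1:]
        results.append(("control", closest_mean(face, rest, patients)))
    for i, face in enumerate(patients):
        rest = patients[:i] + patients[i + 1:]
        results.append(("patient", closest_mean(face, controls, rest)))
    return results


# Invented PC weight vectors, for illustration only.
toy_controls = [[-1.0, 0.0], [-1.2, 0.3], [-0.8, -0.2], [0.9, 0.1]]
toy_patients = [[1.0, 0.2], [1.1, -0.1], [0.8, 0.0]]
toy_results = leave_one_out(toy_controls, toy_patients)
n_correct = sum(truth == predicted for truth, predicted in toy_results)
```

In this toy set the fourth control lies close to the patient mean and is misclassified, while the other six faces are labelled correctly.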

Figure 3

(a) A colour-coded comparison of the average male Fabry and control faces along a coronal axis using a scale as shown. Thus, the redness of the nose and yellowness of the surrounding region indicate mid-facial hypoplasia. (b) A similar comparison for the average of the female control and Fabry groups. Here, the colour coding relates to a vertical axis. The blueness of the nose indicates its higher relative position on the mean Fabry female face. The yellowness of the eyelids reflects mild ptosis.

Figure 4

Unseen classification of male faces in terms of similarity to mean male control and mean male Fabry faces using 20 separate DSMs. The y-axis measures age in years and the x-axis measures relative position between the average male control and male Fabry faces. The labels C1 and F1 on the x-axis indicate the positions of the average male control and average male Fabry faces, respectively. EF1 identifies the position of the most extremely classified face of the male Fabry patients.

Anthropometric measurements were also derived from the anatomical landmarks. These were suggested by the clinical experience of the authors, by the previous studies of the Fabry phenotype or following inspection of the dynamic morphs and colour-coded depictions of the mean face-shape differences calculated above.
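As an illustration of how such measurements follow from 3D landmark coordinates, the sketch below computes a linear distance and an angle. The coordinates are invented, the helper names are hypothetical, and the particular landmarks that define each measurement in Table 1 are not assumed here.

```python
import math


def distance_mm(a, b):
    """Euclidean distance between two 3D landmarks (same units as the input)."""
    return math.dist(a, b)


def angle_deg(vertex, p, q):
    """Angle in degrees at `vertex` between the rays towards p and q."""
    u = [pc - vc for pc, vc in zip(p, vertex)]
    v = [qc - vc for qc, vc in zip(q, vertex)]
    norm = math.hypot(*u) * math.hypot(*v)
    if norm == 0:
        raise ValueError("landmarks must be distinct from the vertex")
    cosine = sum(a * b for a, b in zip(u, v)) / norm
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


# Invented frontotemporal landmark coordinates in mm, for illustration only.
frontotemporal_left = (62.0, 40.0, 10.0)
frontotemporal_right = (-60.0, 40.0, 10.0)
bitemporal_width = distance_mm(frontotemporal_left, frontotemporal_right)
right_angle = angle_deg((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```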

Statistical analysis

The anthropometric measurements of the affected and unaffected individuals were compared using two-tailed t-tests. The female and male groups were analysed separately. For the discrimination testing, three statistical pattern recognition algorithms, closest mean, linear discriminant analysis and support vector machines,17 were used to categorize the unseen test faces. With closest mean, the average faces are computed for the control and syndrome subgroups in the training set and each unseen test face is classified according to which average it is nearest using its DSM representation. For the linear discriminant analysis, the goal is a linear combination of PC modes that exhibits the largest difference in the subgroup means relative to the within-group variance. Support vector machines, or large margin classifiers, focus on individual cases in the overlap of the subgroups to be classified that help to define a separating surface with the largest margin between the subgroups.
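The goal stated for linear discriminant analysis corresponds to Fisher's criterion, whose solution is the direction w = Sw⁻¹(m_b − m_a), where Sw is the pooled within-group scatter matrix. The sketch below works this out for just two PC modes using a closed-form 2×2 inverse; the data and function names are invented, and it is not the implementation used in the study.

```python
def mean2(points):
    """Mean of a list of 2D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter2(points, m):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix about mean m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fisher_direction(group_a, group_b):
    """Return w = Sw^{-1} (mean_b - mean_a) for two 2D groups."""
    ma, mb = mean2(group_a), mean2(group_b)
    axx, axy, ayy = scatter2(group_a, ma)
    bxx, bxy, byy = scatter2(group_b, mb)
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy  # pooled within-group scatter
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-group scatter matrix is singular")
    dx, dy = mb[0] - ma[0], mb[1] - ma[1]
    # Closed-form inverse of [[sxx, sxy], [sxy, syy]] applied to (dx, dy).
    return ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)


# Invented weights on two PC modes, for illustration only.
toy_group_a = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0), (1.0, -2.0)]
toy_group_b = [(3.0, 0.0), (4.0, 2.0), (5.0, 0.0), (4.0, -2.0)]
w = fisher_direction(toy_group_a, toy_group_b)
```

Here the groups differ only along the first mode, so the discriminant direction has no component along the second mode despite its larger spread.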

Results

Mean male and female faces and dynamic morphs

Static portrait and profile views of the mean faces of the unaffected and affected male subjects are shown as the inner pairs of images in the first and second rows, respectively, of Figure 1. Points on the affected mean surface (row 2 of Figure 1) are shown in green if they are inseparable from their corresponding points on the unaffected mean surface. They are coloured red if they are 3 mm or more within the mean control surface and blue if they are outside by 3 mm or more. Intermediate colours of the red–blue spectrum shown on the scale are used to reflect distances between −3 and +3 mm. Thus, the colour coding is sensitive to the scale used as well as the distance between corresponding points and needs to be interpreted carefully. The colour-coded comparison in Figure 3a uses a distance measure orthogonal to axes formed by the left–right exocanthia (x-axis) and nasion–gnathion (y-axis) of the overall mean face for the mixed group of faces in the DSM. Thus, it emphasizes differences parallel to the coronal z-axis of the head. The colour coding, in this case, uses blue to indicate displacement in the positive direction of the axis and red for displacement in the negative direction. The colour-coded image in Figure 3b is for a similar female comparison along the y-axis, reflecting upward displacement (blue) and downward displacement (red) using the same scale.
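The axis-based comparison of Figure 3a can be sketched as follows: a depth axis is taken as the normalized cross product of the exocanthion-to-exocanthion and gnathion-to-nasion directions, and each point displacement is projected onto it. The landmark coordinates and function names are invented, and which direction counts as positive depends on the coordinate convention of the capture system, which is assumed here.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(v):
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return tuple(x / norm for x in v)


def depth_axis(exo_right, exo_left, nasion, gnathion):
    """Unit vector orthogonal to the exocanthion and nasion-gnathion axes."""
    x_axis = sub(exo_left, exo_right)
    y_axis = sub(nasion, gnathion)
    return unit(cross(x_axis, y_axis))


def signed_displacement(control_point, affected_point, axis):
    """Component along `axis` of the move from control to affected point."""
    return dot(sub(affected_point, control_point), axis)


# Invented landmark positions in mm, for illustration only.
z_axis = depth_axis((-45.0, 0.0, 0.0), (45.0, 0.0, 0.0),
                    (0.0, 10.0, 0.0), (0.0, -100.0, 0.0))
nose_tip_shift = signed_displacement((0.0, -30.0, 40.0), (0.5, -29.0, 38.0), z_axis)
```

In this example the affected point sits 2 mm behind the control point along the depth axis, which under the colour convention above would be shown towards the red end of the scale.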

Dynamic morphs of portrait and lateral views between the mean control and Fabry patient are available in the Supplementary Material provided. By inspecting these video morphs, the viewer is able to see a much more informative comparison of the face-shape differences between the male controls and the male Fabry patients. When compared with the mean face of the male controls, the colour-distance-coded comparisons of the mean Fabry male and the dynamic morphs suggest that the mean male Fabry face has periorbital fullness, more prominent supra-orbital ridges, a larger bitemporal width, bushier eyebrows, ptosis of the eyelids, a more acute nasal angle, a broader nasal base, a slightly shorter and more bulbous nose, fuller lips, some additional fullness to the cheeks, and a slightly larger chin. The shallower depth to the orbit at the endocanthion and the rotation of the philtrum, and also Figure 3a, suggest very localized minor midfacial hypoplasia. By viewing the dynamic comparisons of the mean faces (shown statically in the inner pairs of rows 1 and 2 of Figure 1) these face-shape differences are shown in a more dramatic manner. The third row of Figure 1 shows a similar comparison but uses the mean Fabry male face exaggerated slightly further from the mean control face, but only as far as the face of the male Fabry patient who is shown (later) to be furthest from both the male control and the male Fabry mean faces.

The static comparisons of Figures 2 and 3b, as well as the dynamic morphs in the Supplementary Material, show only minor face-shape differences in the female case. The forehead shape is more prominent in the female Fabry mean face compared with the control mean. The mean female Fabry face is also wider at the temples and at the lower attachment points of the ears, and the nose is shorter but has a wider alar base.

Hypotheses about face-shape differences deduced from the colour-coded comparisons of the mean faces, or from the dynamic morphs between them, can be validated by measurements derived from the landmarks used in the DSM building process. For a range of anthropometric measurements, Table 1 gives the mean and variance for the affected and unaffected males as well as the results of a two-tailed t-test comparison where this is appropriate. For the male case, statistically significant differences confirm the wider nose, larger bitemporal width and ptosis, but there are no significant differences in upward displacement of the nose or in nose length. The female Fabry mean face has statistically significant differences in a wider and shorter nose that is displaced upwards, and a greater head width. The detected ptosis is not statistically significant, nor does there appear to be any midfacial hypoplasia.

Table 1 Facial measurements derived from 3D landmarks

Face-shape categorization using statistical pattern recognition algorithms

Figure 4 shows a scatter plot of age against the position of each male face in terms of similarity to the mean faces of the male control and Fabry groups. The plot is an amalgamation of the closest mean classifications of the 20 male Fabry patients and the 80 controls arising from the 20 DSMs described above. The x-axis is labelled so as to indicate the position of the control mean (C1) at −1, the Fabry mean (F1) at +1 and that of the exaggerated Fabry mean (EF1) used in the colour-coded visualizations above.
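One way to obtain an x-axis position of this kind is to project a face's PC weight vector onto the line joining the two means and rescale so that the control mean maps to −1 and the Fabry mean to +1. The exact formula behind the plot is not given here, so the sketch below is an assumption; it is at least consistent with closest mean classification, because a face equidistant from both means maps to 0.

```python
def relative_position(face, control_mean, fabry_mean):
    """Project `face` onto the control-to-Fabry axis.

    Returns -1.0 at the control mean, +1.0 at the Fabry mean, and values
    beyond +1.0 for faces more extreme than the Fabry mean.
    """
    axis = [f - c for f, c in zip(fabry_mean, control_mean)]
    length_sq = sum(a * a for a in axis)
    if length_sq == 0:
        raise ValueError("the two means must differ")
    offset = [x - c for x, c in zip(face, control_mean)]
    t = sum(o * a for o, a in zip(offset, axis)) / length_sq
    return 2.0 * t - 1.0


# Invented two-mode PC weights, for illustration only.
toy_control_mean = [0.0, 0.0]
toy_fabry_mean = [2.0, 0.0]
at_boundary = relative_position([1.0, 5.0], toy_control_mean, toy_fabry_mean)
beyond_fabry = relative_position([3.0, 0.0], toy_control_mean, toy_fabry_mean)
```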

Using a criterion of strict closest mean, the faces of 11 of the 20 male Fabry patients are correctly categorized and the remainder are classified as more ‘control’ like. Ten of the controls are also misclassified giving a sensitivity of 55% and a specificity of 87.5%. By altering the decision boundary, it is possible to generate a receiver-operating-characteristic (ROC) curve as shown in Figure 6. A convenient summary statistic for an ROC curve, the equal error rate, corresponds to the point on the curve where sensitivity=specificity. We refer to this single statistic as the overall classification accuracy. For the ROC in Figure 6, the overall accuracy for closest mean classification using the full area of the face is 75%.
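The arithmetic behind these figures, and the idea of moving the decision boundary to find the equal error point, can be sketched as below. The counts are those reported above; the scores passed to the threshold sweep are invented, and the helper names are hypothetical.

```python
def sensitivity(true_positives, n_patients):
    return true_positives / n_patients


def specificity(false_positives, n_controls):
    return (n_controls - false_positives) / n_controls


def rates_at(threshold, control_scores, patient_scores):
    """Sensitivity and specificity when scores >= threshold are called 'patient'."""
    sens = sum(s >= threshold for s in patient_scores) / len(patient_scores)
    spec = sum(s < threshold for s in control_scores) / len(control_scores)
    return sens, spec


def equal_error_point(control_scores, patient_scores):
    """Return (threshold, sensitivity, specificity) at the observed score
    where sensitivity and specificity are closest to each other."""
    best = None
    for threshold in sorted(set(control_scores) | set(patient_scores)):
        sens, spec = rates_at(threshold, control_scores, patient_scores)
        gap = abs(sens - spec)
        if best is None or gap < best[0]:
            best = (gap, threshold, sens, spec)
    return best[1:]


# Counts reported for the male subgroup under strict closest mean.
male_sensitivity = sensitivity(11, 20)
male_specificity = specificity(10, 80)

# Invented positions along the control-to-Fabry axis, for illustration only.
toy_control_scores = [-1.5, -1.0, -0.6, 0.2]
toy_patient_scores = [-0.8, 0.4, 0.9, 1.3]
toy_eer = equal_error_point(toy_control_scores, toy_patient_scores)
```

With real data the two rates rarely coincide exactly at an observed score, so the equal error rate is usually read off, or interpolated from, the ROC curve.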

Figure 6

ROC curves for closest mean classification of faces of the female and male subgroups.

Figure 5 shows a scatter plot of age against the position of each female face in terms of similarity to the mean faces of the female control and Fabry groups. The plot is an amalgamation of the closest mean classifications of the 22 female Fabry patients and the 80 controls arising from 22 DSMs generated analogously to those for the male classification testing. Similarly, the x-axis is labelled so as to indicate the position of the female control mean (C2) at −1, the female Fabry mean (F2) at +1 and that of the exaggerated female Fabry mean (EF2) used in the colour-coded visualizations of Figure 2. Using a criterion of strict closest mean, the faces of 13 of the 22 female Fabry patients are correctly categorized and the remainder are classified as more ‘control’ like. Eleven of the female controls are also categorized incorrectly, giving a sensitivity of 59.09% and a specificity of 86.25%. Figure 6 shows the corresponding ROC curve for the female classification results using the closest mean.

Figure 5

Unseen classification of female faces in terms of similarity to the mean female control and mean female Fabry faces using 22 separate DSMs. The y-axis measures age in years and the x-axis measures relative position between the average female control and female Fabry faces. The labels C2 and F2 on the x-axis indicate the positions of the average female control and average female Fabry faces, respectively. EF2 identifies the position of the most extremely classified face of the female Fabry patients.

Table 2 summarizes the overall classification accuracy for the three statistical pattern recognition algorithms (closest mean, linear discriminant analysis and support vector machines), both for the entire face and for the face patches identified earlier. For the male subgroup, the peri-orbital region is the most successful in correctly categorizing faces and achieves a maximal rate of 85% with the support vector machine (SVM) algorithm. For the female subgroup, no combination of region and algorithm categorizes the faces of the affected and unaffected subjects at an overall classification rate higher than 66.5%.

Table 2 Average overall discrimination performance for three pattern recognition algorithms for control-Fabry comparisons for both male and female subjects

Discussion

In this cross-sectional study facial dysmorphism in patients with Fabry disease is objectively assessed by 3D dense surface modelling and anthropometric analysis, revealing facial dysmorphic features in males and to a lesser extent in female patients. The most prominent abnormalities found in males are peri-orbital fullness, prominent supra-orbital ridges, large bitemporal width, bushy eyebrows, ptosis of the eyelids, acute nasal angle, broad nasal base, shorter and more bulbous nose, full lips, fullness of the cheeks and a larger chin. In female patients, a more prominent forehead with a wider face at the temples and at the lower attachment points of the ears was found. Female subjects further tend to have a shorter nose with a wider alar base.

Our results are in line with the features described by Ries et al,8 who assessed facial characteristics in 38 male Fabry patients based on medical photography and phenotype assessment by three different experts. Complementing their results, our approach enables a more reliable quantification of the characteristics seen in Fabry patients, because the results are generated by objective quantitative analysis and are not hampered by inter-observer variability. In addition, we compare the results obtained in both male and female Fabry patients with the images of healthy controls.

Coarsening of facial features is common in a number of LSDs,6 but facial characteristics differ between the various diseases. Although we demonstrated statistically significant face-shape differences from healthy controls, the reported abnormalities are without doubt less obvious than in patients suffering from a number of other LSDs. We believe, in contrast to Ries et al,8 that the characteristic facial features now reported in Fabry patients will not be helpful in the diagnostic process. Using our DSM-based approach, a maximal overall classification accuracy of 85% for male and 67% for female patients is reached. Such accuracy rates for detection of face-shape differences will be unsatisfactory for clinical use. Unaffected relations of the Fabry patients were excluded from the study. Previously, we found that inclusion of unaffected siblings in the control set affected the discrimination accuracy by only 1 or 2 percentage points.

In genetic conditions where the facial dysmorphism is usually stronger than in Fabry disease, such as Cornelia de Lange, Noonan, Smith–Magenis and Williams syndromes, the DSM-based approach has achieved discrimination rates between 85 and 95%.12, 13 At these higher levels of accuracy, less experienced clinicians may be encouraged to undertake appropriate genetic testing where it is available and there is uncertainty about the diagnosis. The dynamic and colour-coded visualizations of face-shape differences may be of use in clinical training. The approach is beginning to be useful in determining subtle facial endophenotypes and so it should assist in genotype–phenotype studies. Generally speaking, however, it will not replace the experienced eye.

Our results further suggest that the effects of ERT on face shape are not easy to detect in the time intervals used here. This is in line with the observations of Ries et al.8 This lack of effect is probably due to the fact that the shape of the facial bones, as well as the connective tissue structure, is fully developed in early adulthood. A reduction of storage products may thus not result in a major alteration of facial appearance in adult patients. However, we cannot exclude the possibility that a longer period of ERT results in a very slow change in facial features due to the slow remodelling of (facial) bones. It is further worth mentioning that detection of subtle changes following ERT will be very difficult because the images are inconsistent, being subject to variation in pose and body weight. Perhaps an early start of ERT prevents the development of characteristic adult facial features in Fabry disease.

A limitation of this study is that 85% of male and 59% of female patients were treated with ERT at the time of the study. We can therefore not exclude the possibility that facial features may be more pronounced in untreated Fabry patients.

We did not formally evaluate intra-operator landmarking variability. The individual undertaking all manual landmarking (PH) has landmarked over 2000 images. Most of the landmarks we used were in the reliable category of a related study, which identified that variation in reproducibility depended upon operator experience and landmark location.18

The aetiology of the observed coarse facial features is unclear. α-GAL A is a housekeeping gene and its deficiency leads to storage of Gb3 in nearly all cells in the body. Probably, continuous storage of Gb3 in growing facial bones and developing facial connective tissues results in disturbed cellular functioning and thus an altered facial appearance. Female patients, who store Gb3 as well, show less pronounced facial abnormalities compared with male subjects. Apparently, the residual enzyme activity present in almost all female patients protects them, at least partially, from facial changes.

In conclusion, we report a valid method for quantification of facial dysmorphic features in Fabry disease. Male patients and, to a lesser extent, female patients show significant facial changes compared with healthy controls. However, the value of these facial changes in the diagnostic process appears to be limited.