A scoring system for bruise patterns: a tool for identifying abuse
- 1Department of Child Health, University of Wales College of Medicine, Academic Centre, Llandough Hospital, Penarth, Wales CF64 2XX, UK
- 2Department of Epidemiology, Statistics and Public Health, University of Wales College of Medicine
- Correspondence to: Prof. Jo Sibert, Professor of Community Child Health, Department of Child Health, University of Wales College of Medicine, Academic Centre, Llandough Hospital, Penarth, Wales CF64 2XX, UK
- Accepted 22 January 2002
Aims: To determine whether abused and non-abused children differ in the extent and pattern of bruising, and whether any differences which exist are sufficiently great to develop a score to assist in the diagnosis of abuse.
Methods: Total length of bruising in 12 areas of the body was determined in 133 physically abused and 189 control children aged 1–14 years.
Results: Our method of recording bruises by site, maximum dimension, and shape was easy to use. There were clear differences between cases and controls in the total length of bruises. These differences were greatest in the head and neck and less notable in the limbs. A scoring system was developed by logistic regression analysis of the total lengths of bruising in five regions of the body. Good discrimination between the two sets of children was achieved using this score; including a variable that indicates whether a bruise had a recognisable shape improved the discrimination further. Given a prior probability of abuse, the score can be used to give the posterior odds of abuse for a particular bruising pattern.
Conclusions: The scoring system provides a measure that discriminates between abused and non-abused children, which should be straightforward to implement, though the results must be interpreted carefully. We do not see this score as replacing the complex qualitative analysis involved in the diagnosis of abuse, which clearly includes history as well as examination, but rather as the beginning of the development of an important aid in this process.
Paediatricians are often asked for an opinion on whether a particular pattern of bruising is caused by abuse. This might arise in a variety of settings—clinical, child protection, or in legal proceedings. Although some studies have looked at the age of children and bruising,1,2 and others have looked at the age of individual bruises,3–5 the evidence base6,7 for coming to a conclusion on an individual pattern of bruising is very limited. One reason for this is that child protection is a multidisciplinary activity, led by social workers whose research base is largely qualitative. Another is the difficulty of obtaining data on bruises on non-abused children. There is also the problem of recording information on bruises in a way that is not invasive and yet is in sufficient detail for the results to be analysed statistically.
There are two related but separate issues to be investigated. Is the extent and pattern of bruising different in abused and non-abused children? Are any differences sufficiently great to develop a score to assist in the diagnosis of abuse? In a preliminary study7 we collected data on bruises in three areas of the body of abused and non-abused children, and used Bayes' theorem8 to arrive at a posterior probability that a particular bruising pattern was the result of abuse. That work was limited by certain assumptions about the independence of bruising patterns in different regions of the body. Therefore, we carried out a study in which bruises were recorded in more detail in more children to test the assumptions and build on this earlier work.
The subjects studied were children aged 1–13 years attending the Llandough Children's Centre, which serves the Vale of Glamorgan and the West of Cardiff. In the centre we see child outpatients, children with special needs, and referrals under child protection procedures, but there is no accident and emergency department. We decided to study children under 1 year of age separately because they are not mobile, so bruising in any area has greater significance than in older children.1 We set an upper age limit of 14 years to fit in with the divisions used by the WHO. Children with significant special needs were excluded.
The abused cases were identified from our child protection database. They were children who had attended the centre between 1992 and 1996, whose notes were obtainable, and who were classified as having been physically abused following a case conference or other multidisciplinary meeting.
The bruising patterns of control children were obtained from those attending the centre for ambulatory outpatient consultation for reasons other than abuse between 1998 and 1999, during a clinical examination that would have been undertaken anyway. When this study was initially planned the controls were to be children attending the accident department, but this proved impractical because of the extra undressing of children that would be required. The timescale of collection of cases and controls was therefore different, but we do not believe that this invalidates our results.
Bruises were measured using paper tape measures. Parental consent was obtained; no parent declined to take part. Cases and controls were examined by consultants or specialist registrars (residents) in community child health. The sex ratios in abused and controls were nearly identical, with 66% boys and 34% girls. The mean age of cases was 7.7 years and of controls 6.4 years.
Details of bruises were recorded in each of 12 regions of the body: anterior chest and abdomen, back, buttocks, left and right arms, left and right legs, left and right face, left and right ears, and other head and neck. In each region, the number of bruises was recorded, together with the maximum dimension of each bruise, and whether or not each bruise had a specific shape, such as being linear or shaped like a hand.
In order to establish a scoring system we divided regions as follows:
Head, neck, and face
Chest, abdomen and back
The total length of bruising in each of these was calculated for each child and the totals analysed mathematically.
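The calculation of total lengths per combined region can be sketched as follows. This is only an illustration: the text above names two of the five combined regions, so the remaining three groups (arms, buttocks, legs), all identifier names, and the example measurements are assumptions made here, not details taken from the study.

```python
# ASSUMED grouping of the 12 recorded regions into 5 combined regions.
# Only "head, neck, and face" and "chest, abdomen and back" are named in
# the text; the other three groups are a guess for illustration.
COMBINED_REGIONS = {
    "head_neck_face": ["left_face", "right_face", "left_ear",
                       "right_ear", "other_head_neck"],
    "chest_abdomen_back": ["anterior_chest_abdomen", "back"],
    "arms": ["left_arm", "right_arm"],
    "buttocks": ["buttocks"],
    "legs": ["left_leg", "right_leg"],
}


def total_lengths(bruises):
    """Sum the maximum dimension (cm) of every bruise per combined region.

    bruises: iterable of (region, max_dimension_cm) pairs for one child.
    """
    region_to_group = {
        region: group
        for group, regions in COMBINED_REGIONS.items()
        for region in regions
    }
    totals = {group: 0.0 for group in COMBINED_REGIONS}
    for region, length_cm in bruises:
        totals[region_to_group[region]] += length_cm
    return totals


# Hypothetical child with four bruises (values invented for the example).
example = [("left_ear", 1.5), ("left_face", 3.0), ("back", 2.0), ("left_leg", 1.0)]
totals = total_lengths(example)
```

A region with no bruises simply contributes a total length of zero.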
Clinicians involved in the study found it easy to record this information, though for ethical reasons it was not possible to compare different observers' findings on the same child, to estimate interobserver variation. There were 133 abused children, who had 763 bruises, and 189 controls with 282 bruises. Table 1 shows the percentages of abused and non-abused children who had bruises in the 12 regions, with p values derived by a Mann–Whitney test, for comparing the median number of bruises between groups. There are clear statistical differences between abused and control children for all regions except the legs.
Table 2 shows the mean lengths of bruises in abused and control children in the five combined regions. The distributions are heavily skewed, so these means must be interpreted with caution. The different regions have different discriminatory power, with bruising on the head, neck, and face being much more suggestive of abuse than bruising on the limbs. We found that the lengths in different regions were dependent. This means that to estimate the probability of a given pattern of bruising, a complex multivariate model is needed. Therefore, the approach adopted in our previous study cannot be used here. Instead we decided to construct a scoring system so that the pattern of bruises would lead to a score which could then be used to classify the child as abused or not.
In order to devise a scoring system using all the information on bruising, we used logistic regression to model the probability of abuse in terms of the total lengths of bruising in these five regions; all terms were highly significant. Age and gender were also considered for inclusion but were not significant as predictors. The coefficients in the resulting model were scaled and rounded to give integer values. The resulting score is a weighted sum of the total lengths of bruising in the five regions, with these integer values as weights and all lengths measured in cm.
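The step of scaling and rounding coefficients, and then applying them to a child's bruise lengths, might look like the sketch below. The coefficient values, the scale factor, and the region labels are placeholders invented for illustration; they are not the fitted model and must not be used clinically.

```python
def scale_and_round(coefficients, scale):
    """Multiply each fitted coefficient by a common factor and round to an integer."""
    return {region: round(value * scale) for region, value in coefficients.items()}


def bruise_score(lengths_cm, weights):
    """Weighted sum of total bruise lengths (cm); missing regions count as zero."""
    return sum(weight * lengths_cm.get(region, 0.0)
               for region, weight in weights.items())


# PLACEHOLDER coefficients and region labels, NOT the published model.
placeholder_coefficients = {"region_a": 0.52, "region_b": 0.18}
weights = scale_and_round(placeholder_coefficients, scale=10)

# Hypothetical total lengths for one child.
score = bruise_score({"region_a": 2.0, "region_b": 5.0}, weights)
```

Rounding to integers trades a little precision for a score that can be added up by hand at the bedside.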
The mean score in the abused children was 87.6 (SD 59.7), while the mean in the controls was 5.9 (SD 9.0); clearly the distributions are very different. Are they sufficiently different to enable accurate prediction of abuse status using this score? As abused children tend to have higher scores than controls, an obvious procedure would be to classify a child as abused if the score exceeds some threshold. The sensitivity and specificity of such a procedure depend on the threshold chosen. To calculate them the score was modelled by gamma distributions,8 separately for the abused and control children (see fig 1). Table 3 shows values for the sensitivity and specificity for different thresholds based on these distributions.
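A rough sketch of how sensitivity and specificity follow from two fitted gamma distributions is given below. It assumes the gamma parameters are obtained by matching the means and standard deviations quoted above (method of moments); the fitting method actually used is not stated here, so the numbers it produces need not reproduce table 3. The thresholds in the loop are arbitrary examples.

```python
import math


def gamma_from_moments(mean, sd):
    """Shape and scale of a gamma distribution with the given mean and SD."""
    return (mean / sd) ** 2, sd ** 2 / mean


def gamma_cdf(x, shape, scale, tol=1e-12, max_terms=10_000):
    """Gamma CDF via the power series for the regularised incomplete gamma."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape          # n = 0 term of the series
    total = term
    n = 0
    while abs(term) > tol * abs(total) and n < max_terms:
        n += 1
        term *= z / (shape + n)
        total += term
    log_prefactor = shape * math.log(z) - z - math.lgamma(shape)
    return min(1.0, total * math.exp(log_prefactor))


# Means and SDs of the score as reported in the text; moment matching is
# an ASSUMPTION of this sketch.
abused = gamma_from_moments(87.6, 59.7)
controls = gamma_from_moments(5.9, 9.0)


def sensitivity_specificity(threshold):
    """Classify as abused when the score exceeds the threshold."""
    sensitivity = 1.0 - gamma_cdf(threshold, *abused)   # abused above threshold
    specificity = gamma_cdf(threshold, *controls)       # controls at or below
    return sensitivity, specificity


for threshold in (20, 40, 60):  # illustrative thresholds only
    sens, spec = sensitivity_specificity(threshold)
    print(f"threshold {threshold}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Raising the threshold lowers sensitivity and raises specificity, which is the trade-off tabulated for the different thresholds.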
Such an assessment does not take account of other information which might be available, such as the method of referral, nor does it reflect the fact that a score of 140 is rather more likely to indicate abuse than one of 40.
If we can assess the prior odds of a child being abused, where the odds are defined as:

$$\text{odds} = \frac{\text{probability of abuse}}{1 - \text{probability of abuse}}$$

then we can take into account the bruising to calculate the posterior odds. These are given by:

$$\text{posterior odds} = \text{likelihood ratio} \times \text{prior odds}$$

where the likelihood ratio is derived from the ratio of the two fitted gamma probability distributions referred to earlier. This is calculated as the ratio of the ordinates of the two curves in fig 1: that for abused divided by that for the controls; typical values are shown in table 3. To illustrate, suppose that in our clinical situation children referred from the child protection procedures have a prior probability of 0.4; this was actually the value observed in a retrospective examination of records of children referred under the child protection procedures from social services. The prior odds are then 0.4:0.6 or 2:3. Consider a child with a score of 40. The likelihood ratio for this score is 8, from table 3, and so the posterior odds are 8 × 2/3 = 5.33. This corresponds to a posterior probability of 5.33/6.33 = 0.84. If a child presented through an accident and emergency department, the prior probability might be much lower, say 0.01. Then the posterior probability is 0.075, a much lower value, reflecting the fact that abused children are unlikely to present by such a route. Table 4 shows the posterior probabilities for a variety of scores for four different prior probabilities, whose range is likely to cover most clinical situations.
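The worked example can be checked with a few lines of code. The function `posterior_probability` uses only the prior probability and likelihood ratio quoted in the text. The function `likelihood_ratio` is a separate, more speculative sketch: it assumes the two gamma curves are fitted by matching the reported means and standard deviations of the score, which may not be how the curves in fig 1 were obtained. All function names are introduced here for illustration.

```python
import math


def posterior_probability(prior_probability, likelihood_ratio_value):
    """Convert a prior probability to a posterior probability via the odds."""
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = likelihood_ratio_value * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


def gamma_pdf(x, mean, sd):
    """Density at x > 0 of a gamma distribution matched to a mean and SD."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    log_pdf = ((shape - 1.0) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def likelihood_ratio(score):
    """Ordinate for abused divided by ordinate for controls.

    ASSUMPTION: moment-matched gamma fits to the reported means and SDs.
    """
    return gamma_pdf(score, 87.6, 59.7) / gamma_pdf(score, 5.9, 9.0)


# Values from the text: score 40 has likelihood ratio 8 (table 3).
referred = posterior_probability(0.4, 8)     # child protection referral
emergency = posterior_probability(0.01, 8)   # accident and emergency
print(round(referred, 2), round(emergency, 3))  # prints: 0.84 0.075
```

The same likelihood ratio leads to very different conclusions depending on the prior, which is why the route of referral matters so much.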
The data collection system recorded whether bruises had specific shapes. Fewer than 2% of those classified as non-abused had a bruise with an identifiable shape. Of the abused children, 57% had at least one with an identifiable shape. A new scoring system was developed using an extra variable, defined to be 0 if there were no bruises with a shape and 1 if there were some with a recognisable shape. Logistic regression was used as before to define a new score that included this variable. Including this extra variable increased the specificity, for a given sensitivity, by between 2% and 3%, and this is obviously of value. It does presuppose, however, that different observers will record such information in the same way. It is certainly possible that this information is more prone to subjective assessment than simply measuring the maximum dimension, and for that reason we have concentrated on the simpler score here.
It is even easier to record the number of bruises rather than measure their lengths. An exercise similar to the above was carried out using simply the numbers of bruises in the five regions, deriving a different score. Choosing the threshold for this score to give approximately the same sensitivities as shown in table 3 led to specificities about 10% lower; thus there is added value from using the lengths.
Structuring studies on bruise patterns in children is difficult. In particular, there are problems with collecting information on controls. We had planned to obtain this information from children who present to the accident department, assuming that this would be part of the routine examination of the children. However, in practice a detailed examination is very difficult without unnecessarily undressing the child. We believe our method of examining outpatient children with consent who are visiting the same children's centre as the abused children is the best achievable.
We chose to record the maximum dimension of bruises, together with their site and shape. We believed that measuring the area of a bruise would prove practically difficult and might involve considerable error. Our method proved easy to use and produced results which give an important description of the extent of bruising in abused children.
The extent of bruising appears to be a good discriminator between children who were abused and those who were not. Bruising in a region such as the head and neck is a better discriminator than that on the limbs, as it is seldom present in non-abused children; we noted a greater extent of bruising to the left ear and left side of the face, possibly reflecting that the majority of adults are right handed. It appears to be most effective, however, to combine the results from different regions. Because of the lack of independence of different regions, this is more easily effected by a scoring system than by developing multivariate probability models for the joint distributions in abused and non-abused children.
We have shown that such a score can be developed by logistic regression analysis, using the lengths of bruising in the various regions, and that it can help to differentiate between abused and non-abused children. A clinician would, before performing a clinical examination, estimate the prior probability of abuse in the clinical situation where the child is examined. A bruising score would be calculated and then a table, such as table 4, would be consulted to estimate the posterior odds of abuse. As this table shows, the prior probability can have a considerable impact on the posterior odds and its specification is obviously important; Healy,9 and Spiegelhalter and colleagues10 are among many authors who have discussed this. Here it would be based on past experience of children referred by different routes and of the case history. The values in this table were chosen to represent the range that might be plausible in a variety of clinical situations, and were not meant to be definitive in any sense, but they illustrate interesting points. For example, from table 4 we can see that it is unlikely that a child with a score of 20 had been abused, whatever the prior probability, while unless the prior probability is very low a score of 100 is highly likely to indicate abuse.
We have tested the use of the score in distinguishing between abused and non-abused children using the same study sample from which the scoring system was derived. This procedure is biased and will tend to give overoptimistic results. We are gathering more data on appropriate children and will test the score on those data once available. This should give a more realistic measure of its effectiveness in practice and also give information about the robustness of the scoring system.
Such a scoring system must clearly be used with caution. In particular, in calculating the posterior probability of abuse, the value taken for the prior probability is very important and it is essential that due thought be given to this. We do not see this score as replacing the complex qualitative analysis involved in the diagnosis of abuse, which clearly includes history as well as examination; rather, we see it as the beginning of the development of a useful aid to the clinician in this process. Indeed, it could be added to such a qualitative analysis.
We thank the Wellcome Trust for the initial funding of this project. We also thank the NSPCC for supporting our work on the preparation of this paper.