The Swedish Agency for Health Technology Assessment and Assessment of Social Services (SBU) has recently published what it purports to be a systematic review of the literature on ‘isolated traumatic shaking’ in infants, concluding that ‘there is limited evidence that the so-called triad (encephalopathy, subdural haemorrhage, retinal haemorrhage) and therefore its components can be associated with traumatic shaking’. This flawed report, from a national body, demands a robust response. The conclusions of the original report have the potential to undermine medico-legal practice. We have conducted a critique of the methodology used in the SBU review and have found it to be flawed, to the extent that children’s lives may be put at risk. Thus, we call for this review to be withdrawn or to be subjected to international scrutiny.
- child abuse
- evidence based medicine
- forensic medicine
What is already known on this topic?
Abusive head trauma (AHT) is a well-recognised and serious form of physical abuse.
Sound evidence-based research shows that there are several clinical features that are significantly associated with AHT.
There is a growing body of published studies with questionable methodology that attempt to throw doubt on these two statements.
What this study adds?
A thorough critical appraisal of a systematic review that erroneously states ‘there is limited evidence that the so-called triad …and therefore its components can be associated with traumatic shaking’.
Findings that suggest significant flaws in the methodology of the systematic review.
Due to the critical methodological flaws, we recommend that this review is withdrawn from publication for the sake of the unbiased protection of children who may have suffered from AHT.
When an infant presents with a serious traumatic head injury and there is no clear explanation for the trauma, abusive head trauma (AHT) must be considered within the differential diagnosis. The identification of AHT is challenging, and any attempt to improve this, as proposed in the recently published report of the Swedish Agency for Health Technology Assessment and Assessment of Social Services (SBU)1 and summarised by Lynøe et al 2, is welcome. However, there are many flaws and misconceptions in this systematic review (SR) that, if left unchallenged, have the potential to adversely affect infants who suffer AHT. We present a careful critique of the methodology used in the SBU report to ensure that readers can accurately assess its validity before allowing it to influence their practice.
It is well recognised that a flawed scientific publication has the potential to cause considerable harm to children, and their families, for many decades. The most obvious example of this relates to the article by Wakefield et al 3 hypothesising a link between the measles, mumps and rubella (MMR) vaccine and autism in 12 children in 1998. Although the Lancet issued a partial retraction4 once the deeply flawed methodology had been identified, the damage was done. The legacy of this flawed article continues today as MMR vaccine uptake remains below target and outbreaks of measles continue to occur.
Several international commentators have questioned the validity of the review5–7 and suggested that the SBU attempted to test a legal premise, not a clinical one. The review’s conclusions rely on only two studies8 9 from the vast literature on this topic and end with the statement that ‘there is limited scientific evidence that the triad and therefore its components can be associated with traumatic shaking’. This is incorrect. As the SBU never set out to examine the relationship between the components of the triad and so-called ‘traumatic shaking’, they cannot draw this conclusion.
Critical appraisal of the SBU method
An SR is of value if conducted in a methodologically robust way, adhering to international standards. We confirm this appraisal is based on QUADAS-2, the tool that was also used by the authors of the SBU review.10 It is a validated tool to evaluate the quality of diagnostic test studies and assesses the risk of bias and applicability in four phases.
Phase I: review setting
The question proposed by any clinical SR must be focused, answerable and clinically relevant. The authors pose the following question: ‘With what certainty can it be claimed that the triad, [subdural haemorrhage, retinal haemorrhage, encephalopathy] is attributable to isolated traumatic shaking (ie, when no external signs of trauma are present)? […] due to the healthcare principle that the triad is attributable exclusively to traumatic shaking’. There is no such ‘healthcare principle’, and therefore, the question posed is clinically irrelevant. In clinical practice, the decision regarding the likelihood of AHT is made after a rigorous assessment of the history, the examination and the findings of comprehensive clinical investigations, in the context of a forensic assessment of the proposed mechanism of injury and family risk factors. The purpose is ultimately to determine whether an infant with no independent mobility, who is entirely dependent on their carer and unable to offer a history themselves, needs protection from future harm. This decision is challenging, requiring input from multiple disciplines, and is never made exclusively on a triad of clinical features. During clinical investigations, the clinician considers a differential diagnosis of all potential causes for the presenting symptoms and signs.
Many terms are used to describe this clinical entity, the most widely accepted of which is AHT.11 The authors coined their own terminology ‘isolated traumatic shaking (ie, when no external signs of trauma are present)’. This implies a very specific scenario, which is neither identified nor diagnosed clinically nor used within published studies, thus undermining its validity. Any SR needs to explore an outcome that primary study authors will have set out to determine. By creating a novel outcome, the authors make it impossible for their review to identify relevant studies and set themselves up to fail.
The review question proposed requires an SR of diagnostic test accuracy. It is standard practice when designing a research study, and thus an SR of primary studies, to carefully delineate who the population under study are, what ‘test’ is being evaluated, in comparison to the current standard test for this condition, and what outcome of interest will be sought to determine the value of the new test.10 Accordingly, the authors propose a Population, Index test, Reference standard and Outcome (PIRO). However, there are a number of concerns regarding the PIRO design.
The population (P) is defined as children ≤12 months of age, justified by the statement ‘the mean age of children subjected to traumatic shaking is […] 2–3 months’.12 However, the reference is misquoted, as Park et al (in common with many other studies) identified this as the peak age, which is not equivalent to the mean age of their study population. Despite the proposed age limit, there were children >12 months in the included studies, consistent with most studies on this topic that include an age range up to 2 years to coincide with the age range of reported cases of AHT.8 9
The index test (I) is defined as ‘the triad’, not the individual components, which were neither tested nor described in detail in the results, yet they form a major component of the report’s conclusions. The ‘triad’ is the cornerstone of the SBU report; however, it is never fully defined nor operationalised in their method. The ‘triad’ is neither a continuous, nor an ordinal, nor a categorical variable and is not used as a diagnostic test in clinical practice. The combined features itemised in the triad are physical signs and symptoms recognised as a consequence of head trauma. Again, it is therefore unlikely that the authors of the SBU report will identify a study that sets out to evaluate the ‘triad’ as a diagnostic test. The authors do not describe how they identify whether the ‘triad’ is present or absent in a retrieved paper. There is no clear definition of encephalopathy, which is a clinical condition that presents with a broad spectrum of symptoms that vary from drowsiness and vomiting to impaired levels of consciousness, with or without radiological changes. In the absence of any search terms for the clinical signs of encephalopathy or a categorical definition of the condition as applied within the ‘triad’, it is impossible to know, for example, whether a study would be included if the neurological status were simply described as ‘drowsy’ in the presence of retinal and subdural haemorrhages (SDHs).
The index test also appears to overlap with the population, ‘The triad in cases of suspected traumatic shaking’, thus introducing potential circularity and bias into the review question. If the diagnostic test is applied purely to ‘cases of suspected traumatic shaking’, then the index test is already reflecting the reference standard. The index test must distinguish between those cases that meet the reference standard and those that do not.
The reference test/gold standard (R) is categorised as ‘admitted or witnessed traumatic shaking or other trauma’. While it is accepted that there is no gold standard diagnostic test for AHT, setting the threshold for inclusion at the level of admitted or witnessed shaking, or ‘video documentation’ of shaking, is unrealistic as this level of surety is infrequently recorded in the real-world setting.8 9 13–16 In addition, the authors of the SBU report introduce this standard of evidence without justifying or supporting (a) that the new standard is valid, (b) what would constitute a valid admission, witness account or video evidence or (c) why a new standard is necessary. To measure diagnostic accuracy, a high level of confidence is required to ascertain whether the condition is present or absent. The reference standard used does not define when the condition is absent, yet implies that in the absence of ‘admitted or witnessed traumatic shaking or video documentation’, shaking has not occurred. This is a very dangerous and erroneous assumption in the context of a potential assault by a carer where denial and obfuscation are prevalent. A non-specific category ‘other trauma’ is also included in the reference test. This is not defined, and therefore has no place in the SR.
The outcome measure (O) was ‘diagnostic accuracy’ which requires the inclusion of studies with a relevant study design, namely cross-sectional studies and studies that set out to test the index test in a representative population, such that the potential sensitivity and specificity of the stated test could be calculated.
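To make this outcome measure concrete, the following is a minimal sketch of how sensitivity, specificity and predictive values are derived from a 2×2 table of index test result against reference standard. The counts are entirely hypothetical and are not taken from the SBU report or from any included study; the names `Accuracy` and `diagnostic_accuracy` are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Accuracy:
    sensitivity: float
    specificity: float
    ppv: float  # positive predictive value
    npv: float  # negative predictive value


def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> Accuracy:
    """Accuracy measures from a 2x2 table (index test vs reference standard)."""
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0:
        # With no reference-positive or no reference-negative cases,
        # sensitivity or specificity has an empty denominator.
        raise ValueError("need both reference-positive and reference-negative cases")
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need both test-positive and test-negative cases")
    return Accuracy(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
    )


# Hypothetical counts, for illustration only.
result = diagnostic_accuracy(tp=40, fp=10, fn=20, tn=90)
print(result)
```

The check on reference-negative cases reflects the concern raised about the reference standard: if a review never defines when the condition is absent, the denominator for specificity is empty and diagnostic accuracy cannot be estimated.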
Inclusion criteria relating to study type: Together with cohort and registry studies, the authors chose to include case–control studies and qualitative studies, neither of which evaluates diagnostic test accuracy, and failed to include cross-sectional studies, other than registry studies. The authors applied a threshold of a minimum sample of 10 cases when assessing studies addressing ‘isolated traumatic shaking’; yet single case reports were deemed adequate when drawing up a tabulated list of differential conditions where the triad exists without shaking. This double standard is inadmissible.
It would thus appear that neither the review question nor the PIRO format meets the required standard for an SR. In light of the fact that the review question chosen is clinically irrelevant, it is not possible to propose a correct PIRO.17–19
The search strategy focuses on the medical literature and Cochrane reviews, which would identify most of the relevant studies. It addresses the population of interest, and to some extent the index ‘diagnostic test’. While some terminology relevant to SDH and retinal haemorrhage (RH) is included, there are no terms relating to encephalopathy, which the authors define as ‘various forms of brain symptoms’ and as ‘lethargy, seizures and dyspnoea’. Instead, ‘cerebral oedema’ and ‘brain oedema’ are used as search terms; both are radiological markers of encephalopathy, not clinical indicators. Although coexisting injuries, such as fractures or bruising, are explicitly excluded, these terms appear in the search strategy, with no rationale offered.
While it is accepted that searches for diagnostic test accuracy studies should not be narrowly focused, as studies will be missed, it seems perplexing that ‘accidental’ and ‘non-accidental injury’ feature as predominant search terms, yet neither appears in the methodology section of the report or within the PIRO format. There are no terms relevant to the reference standards, namely ‘witnessed or admitted shaking’, nor to diagnostic accuracy, such as sensitivity, specificity or predictive values. Thus, the search strategy is inadequate and inaccurate.
The authors excluded ‘AHT studies which included external injury to the head and/or fractures and other injuries’. The only rationale offered for this extraordinary omission is the statement, ‘As isolated traumatic shaking does not involve direct trauma to the head, there will be no external signs of head trauma such as swelling of soft tissues, contusions, lacerations or skull fractures’. This demonstrates an ignorance of the mechanics of ‘isolated traumatic shaking’. If an infant is shaken vigorously, flailing of the arms and legs is likely, with the recognised potential for metaphyseal fractures of the long bones;20 21 forceful gripping of the infant’s chest may result in bruising22 and/or rib fractures.23 24 Thus, a comprehensive review aiming to identify all relevant high-quality studies would not have excluded articles with other injuries. Furthermore, this exclusion criterion is not applied with rigour, as both included studies8 9 clearly describe cases with associated injuries.
Phase II: review specific tailoring
Critical appraisal tools are used to assist reviewers in identifying any relevant bias in a study. In order to ensure that reviewers consider these, the tools used to assess each study should include pertinent ‘signalling questions’, that is, specific prompts for the reviewer to consider the risk of bias.25 For example, a signalling question may be, ‘Did all of the cases and controls undergo the same testing?’ However, only three signalling questions regarding selection of the population for the study and duration of follow-up are provided (online supplementary appendix 1). The latter appears to have no relevance to this review, which does not address a prognostic test, predicting the long-term outcome of children. The critical appraisal tool should be independently tested before use, and only if agreement is achieved can the tool be applied.
However, the SBU methodology appears to fail these desired standards as there is no information regarding development of signalling questions, piloting of the novel appraisal forms nor the methods for quality rating of studies, which in turn demonstrates the lack of objective judgement applied, a crucial requirement when conducting SRs.
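As an illustration of what independent testing of an appraisal tool can involve, the sketch below computes Cohen's kappa for two reviewers who rate the same pilot set of studies. The choice of kappa is an assumption made for this example; neither the SBU report nor this critique specifies which agreement measure should be used, and the ratings shown are hypothetical.

```python
from __future__ import annotations

from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa for two raters giving categorical ratings to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical risk-of-bias ratings for eight pilot studies.
reviewer_1 = ["high", "high", "low", "low", "high", "low", "high", "low"]
reviewer_2 = ["high", "low", "low", "low", "high", "low", "high", "high"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(kappa)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is often preferred to a simple percentage when deciding whether a tool is ready for use.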
Phase III: flow diagram
The SBU authors chose not to produce a flow diagram, which is not an essential component.
Phase IV: judgements on risk and applicability
A central tenet of critical appraisal is the risk of bias assessment, which is the final arbiter as to whether studies meet the required inclusion standards and whether its conclusions are justified. It is crucial that details of the information used to support the risk of bias judgements, and their concluding summary, are provided.
The forms that were used to determine risk of bias in the SBU SR can be found in online supplementary appendices 1 and 2. As no standards are given for the confirmation of the key triad features (SDH, RH or encephalopathy), it is perhaps not surprising that there is no detailed assessment of risk of bias relating to these components. The only detailed quality assessment relates to the confirmation of shaking. There is no assessment of the applicability of the studies under review, which invalidates any conclusion on the generalisability of the findings, and disappointingly, once again the authors chose to comment on this despite no appropriate assessment having been carried out.
The SBU authors defined 28 of the 30 included studies as having ‘a high risk of bias’ due to ‘risk of circularity’, which is always a consideration.1 However, there is a complete lack of any standards to assess circularity. Furthermore, a number of these studies are criticised based on the ‘lack of confessed cases’, yet many of these studies13 15 16 and other excluded studies26 did include cases of admitted AHT, and contact with the study authors may have elucidated useful information. This was only attempted for one study8 where clarification is provided in a recent publication.27 The SBU authors chose to criticise the two included studies of moderate quality8 9 regarding the circumstances in which the confession was obtained. This is highlighted in their results as a critical flaw, despite being a completely new standard, introduced solely in the results.
The flow chart of ‘literature search retrievals’ is provided, but does not follow the international standard of a Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow chart.28 Thus, the precise numbers of excluded studies and the reasons for their exclusion are not included in the main body of the report, but only in the 110-page table of excluded studies appended to the SBU online report.1
As in many SRs based on observational studies, the authors were unable to use the Grading of Recommendations, Assessment, Development and Evaluations (GRADE) method of rating evidence and created their own assessment: ‘Quality evidence was deemed to be limited (low) when combined assessment of studies of high or moderate quality disclosed factors which markedly weaken the evidence […] was deemed to be insufficient (very low) when there was a lack of studies, when the available studies were of low quality or when studies of similar quality showed contradictory results’. While it is acceptable to define your own ranking of evidence, the authors appear to have confused levels of evidence (ie, study design), with quality of evidence. It is notable, however, that while they conclude ‘there is insufficient scientific evidence on which to assess the diagnostic accuracy of the triad in identifying traumatic shaking (very low quality evidence)’, this is inconsistent with their own findings; namely, ‘Although both studies (moderate quality) have methodological limitations, they support the hypothesis that isolated traumatic shaking can give rise to the triad’. Taking this finding together with a number of the remaining 28 studies, deemed to have a ‘high risk of bias’, which do show an association between elements of the triad and traumatic shaking, the correct conclusion would be that there is sufficient evidence that components of the triad are associated with traumatic shaking.
The value of any SR is entirely dependent on the quality and rigour with which it is conducted. We have identified significant errors in the methodological standards applied within this review. With a clinically irrelevant question, an inadequate literature search, a poorly designed PIRO format, no standards for confirmation of the key clinical features and a risk of bias assessment that relies solely on case ascertainment and confirmation of shaking, the final decision to include or exclude studies arrives at an unsupported and dangerous conclusion. Further to the detailed methodological critique, we append a commentary on the methodological limitations relating to the ophthalmological component of this review (online supplementary appendix 3).
We recognise that our critique is based on a translation of the original report, thus some details may not have appeared as the authors intended. In our critical appraisal, we have also read the publication by one of the SBU authors,2 which exhibits numerous further inconsistencies with the original SBU report.
It is very concerning that a national body, such as the SBU, would employ such a flawed methodology as identified in this analysis. There are many tools available for use in conducting SRs, including those designed specifically for reviews of observational studies such as MOOSE.29 30 Furthermore, despite neither searching for nor critically appraising biomechanical studies, the authors presented a definitive statement about the lack of biomechanical data on head injuries, drawn from three excluded studies. There is a large body of literature relating to the biomechanics of infant head injury, to inform this point, which would have required a separate SR. There are notable misconceptions expressed within the introductory section of the report, which highlight poor understanding of the associated radiological7 and ophthalmological findings (online supplementary appendix 3).
In their online supplementary appendix 3 on ‘ethical analysis of traumatic shaking’, the SBU authors discuss a potential ‘conflict of values’ in relation to coming to a conclusion about the possibility of AHT. This is disingenuous; it is not a conflict of values but a careful and considered weighing of the presentation, clinical findings and available, informed evidence. There is no ethical issue here, rather sound clinical, evidence-based practice, with the child’s interest at its core.
We are extremely concerned that this SBU report may have a detrimental impact on the health and well-being of children now and in the future. Crucial errors were made in the setting of the question, search strategy, lack of standardised definitions for terms used, inadequate inclusion/exclusion criteria (including incorrect study design choices), critical appraisal tools and synthesis of included studies. It is incumbent on healthcare agencies and journal editors to recognise their responsibilities with regard to publications and their potential impact.31 Scientific scrutiny is necessary to advance medical science, and a constant review of our practice in the light of new scientific advances has the potential to improve our practice. Contrary to the proposal that the scientific community involved in the care of infants with intracranial injury, some of whom have been abused, is not open to challenging established ideas,32 any search of the scientific literature will identify an increasing body of high-quality scientific studies that set out to explore new ways to delineate characteristics that improve the identification and understanding of AHT. However, if rigorous methodology is not followed, we are left with the dangerous situation where pseudo-science leads to critical changes in practice, with devastating consequences, as was seen with falling MMR vaccine uptake following the flawed article by Wakefield et al.3 Given that reliance cannot be placed on it, medically or legally, we propose that this report be withdrawn from publication until it passes rigorous external methodological scrutiny, for the sake of the unbiased protection of children who may have been physically assaulted and suffer from AHT.
The authors thank members of the RCPCH Standing Committee for Child Protection for their comments on this manuscript.
Competing interests None declared.
Provenance and peer review Commissioned; externally peer reviewed.