Using child reported respiratory symptoms to diagnose asthma in the community
I T S Yu, T W Wong, W Li

Department of Community & Family Medicine, The Chinese University of Hong Kong, Hong Kong SAR, China

Correspondence to: Dr I T S Yu, The Chinese University of Hong Kong, Department of Community & Family Medicine, Room 422, School of Public Health, Prince of Wales Hospital, Shatin, New Territories, Hong Kong; iyu@cuhk.edu.hk

## Abstract

Aims: To study how respiratory symptoms reported by children, with or without spirometry, could help to discriminate those with asthma from those without.

Methods: Respiratory symptoms (frequent cough, frequent phlegm, and wheezing) reported by 1646 schoolchildren (aged 8–12 years) in a respiratory questionnaire and the FEV1:FVC ratio measured with spirometry (at three different cut-off values of 0.70, 0.75, and 0.80) were compared against the criterion standard of a physician diagnosis of asthma reported by the parents.

Results: The overall prevalence of asthma was 6%; more boys had asthma. Wheezing had the best discriminating ability among the three symptoms and a cut-off point at 75% was best for the FEV1:FVC ratio. Combining wheezing with an FEV1:FVC ratio <75% gave the highest discriminating ability of 83%. If the tests were applied to hypothetical populations with higher prevalence ratios of asthma, the added value of the FEV1:FVC ratio became less apparent.

Conclusion: Respiratory symptoms, especially wheezing, reported by children had good discriminating ability for asthma and could be adopted for opportunistic screening in the primary care settings.

• asthma
• respiratory symptoms
• likelihood ratio
• predictive values
• diagnosis
• CI, confidence interval
• FEV, forced expiratory volume
• FVC, forced vital capacity
• NLR, likelihood ratio for negative result
• NPV, negative predictive value
• PLR, likelihood ratio for positive result
• PPV, positive predictive value


It has been reported that asthma is under-diagnosed and under-treated,1–4 especially in children. This leads to an increase in morbidity and, in the long term, may have a detrimental effect on the lung function and clinical state of asthmatics.5 It has been suggested that the failure to treat airway inflammation may cause airway remodelling.6,7 One can expect clinical benefits to the individual and economic benefits to society from correctly identifying and treating children with undiagnosed asthma. While a full clinical evaluation of the entire community is unjustified for economic and pragmatic reasons, a simple questionnaire, with or without simple lung function tests, may be useful to discriminate asthmatic children from those who are not.

Sistek et al evaluated the predictive values of respiratory symptoms obtained by a standardised questionnaire among adults in Switzerland8 and concluded that they were reliable predictors for a clinical diagnosis of current asthma. It would be of interest and importance to see if similarly good predictive values could be obtained among children. We therefore evaluated the utility of respiratory symptoms reported by schoolchildren in discriminating asthmatic children from others, using data from a study on air pollution and respiratory health carried out in Hong Kong in 1995. We also examined the value of adding a one-time measurement of ventilatory function that could be easily administered in the clinic or in the field.

## METHODS

The study subjects were recruited from 12 primary schools in three districts in Hong Kong by two stage cluster sampling.9 The first stage was to select three districts (out of 18) in Hong Kong based on different air quality, and the second stage was to select four schools in each district according to their proximity to the air monitoring stations of the Environmental Protection Department. All schoolchildren studying in grade 3–6 from the selected schools were invited to participate, but only those aged between 8 and 12 were included in the analysis.

A respiratory questionnaire based on the American Thoracic Society’s ATS-DLD-78-C questionnaire10 was administered to the children to gather information about their respiratory symptoms in the previous 12 months. The occurrences of three respiratory symptoms (frequent cough, frequent phlegm, and wheezing) in the 12 months before the study were identified for subsequent analysis. Frequent cough was defined as usually having a cough, whether with colds or apart from colds. Frequent phlegm was defined as usually feeling congested in the chest or bringing up phlegm, whether with colds or apart from colds. Wheezing was defined as having wheezy or whistling sounds in the chest when having a cold or occasionally apart from colds or for most days or nights. Doctor diagnosed asthma as reported independently by the parents of the children in a separate questionnaire was taken as the evidence of “true” asthma or “gold standard”. Lung function testing was carried out by a trained technician, in accordance with the American Thoracic Society’s recommendations,11 in those children whose parents had given written consent. A total of 2292 of the 2649 eligible subjects (87% participation) completed the respiratory questionnaire and 2012 of them performed lung function tests with their parents’ written consent. Among the latter group, 1646 children, of whom 863 (52%) were girls, provided parent completed questionnaires with information on doctor diagnosed asthma and were used in the current analysis.

The indices for measuring the accuracy of a diagnostic test, including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), likelihood ratio for a positive test (PLR), and likelihood ratio for a negative test (NLR), were calculated in the usual fashion12 for the three child reported symptoms and the FEV1:FVC ratio at three different cut-off values (0.70, 0.75, and 0.80) separately, as well as in various combinations (models), using doctor diagnosed asthma reported by parents as the “gold standard”. The PPV was also the post-test probability of asthma after a positive test. A likelihood ratio close to 1.0 would indicate that the test (symptom) was not discriminating.13 Ninety five per cent confidence intervals (CIs) were calculated around the likelihood ratios to assess their statistical significance.14 We also calculated the difference in post-test probabilities of having asthma between a negative result and a positive result for each model. A negative result would lower the pre-test probability of asthma to a post-test probability of (1 − NPV), whereas a positive result would raise the post-test probability up to the PPV. The difference between the two post-test probabilities (PPV + NPV − 1) was therefore attributable to the test (or combination of tests), and could be used as an indicator to reflect the discriminating ability of the test. Subgroup analyses by gender and age (8–9 years and 10–12 years) were done to check for the consistency of the indices. The FEV1:FVC ratio was not evaluated in the subgroup analyses due to small numbers. Predictive values (post-test probabilities) and the discriminating ability were also estimated for different hypothetical prevalence ratios (pre-test probabilities) of asthma using the following equations:

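$$\text{PPV} = \frac{\text{prevalence} \times \text{PLR}}{\text{prevalence} \times \text{PLR} + (1 - \text{prevalence})}$$

$$\text{NPV} = \frac{1 - \text{prevalence}}{\text{prevalence} \times \text{NLR} + (1 - \text{prevalence})}$$

$$\text{Discriminating ability} = \text{PPV} + \text{NPV} - 1$$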

All analyses were done using SAS (version 6.12).15
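As a minimal sketch of how the accuracy indices named in this section follow from a 2×2 table of test result against doctor diagnosed asthma, the calculation can be written in a few lines of Python. The counts below are hypothetical and chosen only for illustration; they are not the study data, and the names `TwoByTwo` and `accuracy_indices` are assumptions of this sketch rather than anything used in the original SAS analysis.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts of test result against the reference diagnosis."""
    tp: int  # test positive, asthma present
    fp: int  # test positive, asthma absent
    fn: int  # test negative, asthma present
    tn: int  # test negative, asthma absent


def accuracy_indices(t: TwoByTwo) -> dict:
    """Return the usual accuracy indices for one 2x2 table.

    Assumes every margin is non-zero and specificity is below 1;
    otherwise a ratio is undefined and a ZeroDivisionError is raised.
    """
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    ppv = t.tp / (t.tp + t.fp)
    npv = t.tn / (t.tn + t.fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "plr": sensitivity / (1 - specificity),
        "nlr": (1 - sensitivity) / specificity,
        # Difference between the two post-test probabilities.
        "discriminating_ability": ppv + npv - 1,
    }


# Hypothetical counts for illustration only (not taken from the study).
example = TwoByTwo(tp=60, fp=100, fn=40, tn=1400)
indices = accuracy_indices(example)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

The discriminating ability is computed directly from PPV and NPV in the same table, so it reflects the prevalence in that sample.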

## RESULTS

The overall prevalence of asthma was 6% (99 children). Boys had a higher prevalence than girls (8.2% v 4.1%), and older children (10–12 years old) had more asthma than the younger children (6.5% v 5.5%). Respiratory symptoms were more common among boys (30.5% v 25.8%, 33.8% v 29.9%, 12.8% v 8.7% for frequent cough, frequent phlegm, and wheezing respectively); the differences between boys and girls were statistically significant except for phlegm.

Table 1 summarises the indices of accuracy for 17 different models (three child reported symptoms, three cut-off points of the FEV1:FVC ratio, two combinations of symptoms, and nine combinations of symptoms with different cut-off points of FEV1:FVC ratio) for the detection of asthma. All 17 models were significantly discriminating for detecting doctor diagnosed asthma as none of the 95% confidence intervals of the likelihood ratios included 1. However, the positive likelihood ratio of the models varied considerably, ranging from 1.7 to 126.1. Single symptom models (model 1 to model 3) indicated that wheezing had the highest positive predictive value (34.1%), the highest positive likelihood ratio (8.1), and the highest discriminating ability (31%). The optimal cut-off point for the FEV1:FVC ratio was 0.75 as this resulted in the highest positive predictive value (22.9%), the highest positive likelihood ratio (4.65), and the highest discriminating ability (18%) among the single factor models (model 6 to model 8), as well as among the combination models (model 9 to model 17).

Table 1

The performance of child reported symptoms, FEV1:FVC ratio, and various combinations in predicting asthma

Models combining symptoms suggested that phlegm played a very minor part in predicting asthma. Models combining symptoms with the three cut-off points for the FEV1:FVC ratio showed marked improvements in both positive predictive value and positive likelihood ratio (models 9 to 17), as well as the discriminating ability. The presence of cough and wheezing with an FEV1:FVC ratio less than 75% (model 13) had the best performance, with a positive predictive value of 88.9%, a positive likelihood ratio of 126.1, and a discriminating ability of 83%. The presence of wheezing alone with an FEV1:FVC ratio less than 75% (model 16) had a comparable performance.

Symptoms reported by girls had higher positive likelihood ratios (table 2), although the ranking of the models by performance was similar in both groups. There was not much difference in the PPVs and the discriminating ability between boys and girls. It should be noted that the prevalence of asthma among boys (8.2%) was twice that among girls, thus compensating for the lower positive likelihood ratios. Wheezing was the most useful symptom among the single symptom models. The 2 symptoms model (cough + wheeze) performed no worse than the 3 symptoms model. There was no significant difference in the indices by the children’s age (see table 3).

Table 2

The performance of respiratory symptoms reported by boys and girls in predicting asthma*

Table 3

The performance of respiratory symptoms reported by younger and older children in predicting asthma*

We calculated the post-test probabilities (PPV and NPV) for hypothetical populations with asthma prevalence ratios of 15%, 25%, and 35% using the likelihood ratios (PLR and NLR) derived from our subjects; results are shown in table 4. Understandably, the PPV for each model increased with the increase in prevalence. The discriminating ability of the models based purely on symptoms improved as the prevalence increased, whereas that for models combining symptoms and the FEV1:FVC ratio increased slightly in general with a 15% prevalence and then decreased as the prevalence got higher. There was less improvement in discriminating ability by adding the FEV1:FVC ratio to the symptom models as the prevalence of asthma increased. At a prevalence of 35%, wheezing alone performed almost as well as various combinations of symptoms and FEV1:FVC ratio.
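The prevalence adjustment described here can be sketched in Python by converting the pre-test probability to odds, multiplying by the likelihood ratio, and converting back. The likelihood ratios used are the rounded values quoted in the text for wheezing alone (model 3: 8.1 and 0.44) and for wheezing with an FEV1:FVC ratio below 75% (model 16: 125.1 and 0.92), so the output only approximates table 4; the function names are assumptions of this sketch, not the authors' code.

```python
def post_test_probabilities(prevalence, plr, nlr):
    """Return (PPV, NPV) for a given pre-test probability and likelihood ratios."""
    pre_odds = prevalence / (1 - prevalence)
    pos_odds = pre_odds * plr  # odds of asthma after a positive result
    neg_odds = pre_odds * nlr  # odds of asthma after a negative result
    ppv = pos_odds / (1 + pos_odds)
    npv = 1 - neg_odds / (1 + neg_odds)
    return ppv, npv


def discriminating_ability(prevalence, plr, nlr):
    """Difference in post-test probabilities, PPV + NPV - 1."""
    ppv, npv = post_test_probabilities(prevalence, plr, nlr)
    return ppv + npv - 1


# Rounded likelihood ratios (PLR, NLR) as quoted in the text.
WHEEZE = (8.1, 0.44)               # model 3
WHEEZE_PLUS_RATIO = (125.1, 0.92)  # model 16


def added_value(prevalence):
    """Gain in discriminating ability from adding the FEV1:FVC ratio."""
    return (discriminating_ability(prevalence, *WHEEZE_PLUS_RATIO)
            - discriminating_ability(prevalence, *WHEEZE))


for p in (0.06, 0.15, 0.25, 0.35):
    print(f"prevalence {p:.0%}: "
          f"wheeze {discriminating_ability(p, *WHEEZE):.2f}, "
          f"wheeze + ratio {discriminating_ability(p, *WHEEZE_PLUS_RATIO):.2f}, "
          f"added value {added_value(p):.2f}")
```

With these rounded inputs the gain from adding spirometry shrinks steadily as the assumed prevalence rises, which is the pattern described above.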

Table 4

Post-test probabilities of asthma for positive and negative results in various models with different pre-test probability (prevalence) of asthma

## DISCUSSION

Sensitivity and specificity are frequently used to measure the performance of a screening or diagnostic test. However, it is not easy to evaluate a test when sensitivity and specificity point in different directions, that is, when one is relatively high and the other relatively low. For example, when comparing model 3 and model 16 from table 1, we can see that although the sensitivity for the detection of asthma was higher in model 3 than in model 16 (59.2% v 8.2%), the specificity was lower in model 3 (92.7% v 99.9%). Thus, while the number of false negative results in model 3 was low, the number of false positive results was relatively high. In this situation, which one was the “better” test? A number of authors used the Youden index (sensitivity + specificity − 1) to rate the performance of different tests,16,17 trying to take into consideration both the sensitivity and the specificity of a test. Unfortunately, the actual meaning of such an index is very difficult to interpret in clinical practice and it may not be a good measure for comparing diagnostic tests.18
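To make the comparison concrete, the Youden index $J$ can be worked out from the rounded sensitivities and specificities quoted above. This arithmetic is an illustration only and is not reported in the tables:

$$J_{\text{model 3}} = 0.592 + 0.927 - 1 = 0.519$$

$$J_{\text{model 16}} = 0.082 + 0.999 - 1 = 0.081$$

On this index model 3 would rank far above model 16, even though a positive result on model 16 carries a much higher likelihood ratio, which illustrates why a single summed index can be hard to interpret clinically.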

Sackett and associates19 described an index for assessing “how good a diagnostic test is”, the likelihood ratio. In fact, the likelihood ratio incorporates both the sensitivity and the specificity.20 From table 1 we can see that although the likelihood ratio for a positive result was higher in model 16 than in model 3 (125.1 v 8.1), the likelihood ratio for a negative result was better in model 3 than in model 16 (0.44 v 0.92), as it was farther away from the null value of 1.0. Again the question arose as to which was the “better” test. For clinical practice, the answer is determined by the post-test probabilities, which will affect clinical decision making as to whether further tests should be administered or treatment should be started.
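The way the likelihood ratios combine sensitivity and specificity can be sketched as follows, using the standard definitions PLR = sensitivity / (1 − specificity) and NLR = (1 − sensitivity) / specificity. The inputs are the rounded percentages quoted in the text for models 3 and 16; the function name is an assumption of this sketch.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (PLR, NLR) from sensitivity and specificity given as proportions."""
    plr = sensitivity / (1 - specificity)
    nlr = (1 - sensitivity) / specificity
    return plr, nlr


# Model 3 (wheezing alone), rounded values quoted in the text.
plr3, nlr3 = likelihood_ratios(0.592, 0.927)
print(f"model 3: PLR {plr3:.1f}, NLR {nlr3:.2f}")

# Model 16 (wheezing with FEV1:FVC < 75%), rounded values quoted in the text.
plr16, nlr16 = likelihood_ratios(0.082, 0.999)
print(f"model 16: NLR {nlr16:.2f}")
```

The positive likelihood ratio for model 16 is deliberately not printed. When specificity is close to 100%, the denominator (1 − specificity) is tiny, so the one-decimal percentage quoted in the text is too coarse to recover the reported value, and the ratio has to be computed from the unrounded counts.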

Post-test probability is determined not only by sensitivity and specificity, or the likelihood ratios for positive and negative results of the test, but also by the prevalence of the disease (pre-test probability), which may change from setting to setting. A positive test result increases the post-test probability of having the disease. In contrast, a negative test result decreases the post-test probability of having the disease. We propose to use the difference in post-test probabilities of having the disease between a negative result and a positive result (PPV + NPV − 1) as an indicator to reflect the discriminating ability of a test on the basis that any test should end up with either a positive or negative result. The larger the difference, the better would be the discriminating ability. A discriminating ability of 50% or above would have important clinical utility, as the test result would either increase (positive result) or decrease (negative result) the post-test probability across clinical decision thresholds commonly adopted for further testing or treatment.21 As the proposed indicator of discriminating ability depends on post-test probabilities, it in turn will be affected by the prevalence of disease.

Using the criterion of difference in post-test probabilities, wheezing had the best discriminating ability for diagnosing asthma among the single symptoms, and the cut-off point at 75% performed best among the three cut-off points of the FEV1:FVC ratio in this study. These observations were consistent throughout the whole range of prevalence tested in the current study (6% to 35%). Combining symptoms with an FEV1:FVC ratio of <75% notably raised the discriminating ability for diagnosing asthma among the study subjects to over 80%. It is interesting to note that in the combination models, adding cough and phlegm did not improve performance if wheezing was already present in the model. As the prevalence (pre-test probability) got higher (in the hypothetical populations), the discriminating ability of the best combination models dropped, although it remained very satisfactory. On the other hand, the discriminating ability of the symptom models all increased with increasing prevalence, and at a prevalence of 35% the discriminating ability of wheezing alone or wheezing in combination with frequent cough approached that of the best combination model. In such situations, the use of spirometry would have very little added value.

There might be some concern about the appropriateness of the “gold standard” used in our analysis. We used asthma diagnosed by a physician as reported by the parents as our “gold standard”. As asthma is an important disease that usually requires long term medical care and can have serious consequences, we believed that parents were unlikely to under-report or over-report it once a diagnosis was made by a medical doctor. It was possible that asthma might be under-diagnosed by physicians, and the effects on the present analysis could be two sided: as some cases of asthma were misclassified as normal subjects, the associations between asthma and symptoms and/or poor ventilatory function might be diluted and underestimated; on the other hand, the group with physician diagnosed asthma might represent the more severe end of the spectrum and hence the associations between asthma and symptoms and/or poor ventilatory function might be overestimated in the current study. In fact, there is no universally accepted “gold standard” for diagnosing asthma in epidemiological studies.22 Some investigators used bronchial hyperreactivity (BHR) as the “gold standard” for evaluating the performance of respiratory questionnaires in diagnosing childhood asthma,16,17 but BHR itself had unsatisfactory agreement with clinical diagnosis.23,24 Since asthma is essentially a clinical diagnosis, our approach appeared reasonable and has been used in other studies.24,25 An ideal approach would be to clinically examine all participants in the survey (by physicians blinded to the questionnaire and spirometry results) using a standard protocol for the diagnosis of asthma and then to use that diagnosis as the gold standard.

The sensitivity, specificity, and predictive values of respiratory symptoms in our study compared reasonably well to those for respiratory symptoms among adults in the diagnosis of asthma.8 Several studies tried to document the utility of the International Study of Asthma and Allergies in Childhood (ISAAC) questionnaires in the diagnosis of asthma,16,17,24 and the performance was found to be generally satisfactory in terms of sensitivity and specificity or the Youden index. The PPVs of single symptoms or combination of symptoms in our study were lower than that reported by Jenkins et al in Australia24 using the ISAAC questionnaire (61%), but our NPVs were higher than theirs (94%). The differences in predictive values were mainly related to the higher prevalence of asthma (33%) in the Australian study. At the hypothetical prevalence of 35%, the discriminating ability of the wheezing symptom in our study (62%) compared favourably to that of the ISAAC questionnaire (55%). Our symptom/spirometry combinations also had quite satisfactory discriminating ability when compared to the combination of questionnaire and BHR in the Australian study (66% v 58%).

We attempted to use more restrictive (specific) definitions of the respiratory symptoms in our study by including only those symptoms that were present apart from colds (data not shown here). As expected, the sensitivity generally decreased but the specificity generally increased, and this was paralleled by a general increase in PPV and a general decrease in NPV. The net result was a marginal increase in the discriminating ability at low prevalence, which disappeared as the prevalence was increased. Specific symptoms might theoretically be useful to improve the specificity and positive predictive value, but if the negative predictive value was also taken into consideration, it appeared that they had no advantage over the use of more general symptoms.

In conclusion, we have shown that common respiratory symptoms reported by children (with relatively high sensitivity of around 60%) aged 8–12 were useful initially in discriminating subjects with asthma from those without, especially when the prevalence was expected to be high (15% or above). Adding a one-time and easily administered spirometry test would be beneficial in increasing the discriminating ability if the prevalence of asthma is expected to be low in the study population. Our findings suggest that a simple respiratory questionnaire, with or without simple spirometry, could be adopted for opportunistic screening in the primary care settings. As the study was limited to apparently healthy schoolchildren in Hong Kong, further evaluation of the tests in other settings would provide more evidence on their usefulness and applications in clinical practice.

## Acknowledgments

The study was supported in part by a grant from the Environment and Conservation Fund of the Hong Kong Government.
