The discriminative value of inflammatory back pain in patients with persistent low back pain.

Objectives: To estimate the prevalence of inflammatory back pain (IBP) characteristics and analyse the discriminative value of IBP relative to axial spondyloarthritis (SpA) according to the Assessment of SpondyloArthritis international Society (ASAS) criteria. Method: Patients who had low back pain for > 3 months were selected from a cohort of secondary care patients aged 18–40 years. Data included information on SpA features, human leucocyte antigen (HLA)-B27 typing, C-reactive protein (CRP) level, magnetic resonance imaging (MRI) of the sacroiliac joints, and self-reported IBP questions covering the pain characteristics included in the Calin, Berlin, and ASAS IBP definitions. Results: Of the 759 included patients, 99% [95% confidence interval (CI) 98–100] had at least one IBP characteristic. The prevalence of the single IBP characteristics ranged from 10% (95% CI 7–12) for ‘pain worst in the morning’ to 79% (95% CI 76–82) for ‘morning stiffness’. Two-thirds of the patients (67%, 95% CI 63–70) met at least one of the three IBP definitions. In all, 86 (11%) were classified as ‘SpA according to ASAS’. All three IBP definitions were significantly associated with ‘SpA according to ASAS’; however, the discriminative value was low, with sensitivity, specificity, and balanced accuracy values of 64, 50, and 57% for Calin, 59, 60, and 60% for Berlin, and 35, 79, and 57% for ASAS IBP definitions, respectively. Conclusions: In this study population, IBP characteristics were in general common and the discriminative value was low, as IBP could not differentiate patients with SpA according to ASAS criteria from patients with other causes of back pain.

Axial spondyloarthritis (SpA) refers to a group of rheumatological disorders that lead to inflammatory and structural changes in the spine and sacroiliac joints. SpA can be a disabling disease and interest in its early diagnosis has increased, especially in the light of technical developments within magnetic resonance imaging (MRI) and the advent of new treatment possibilities. Distinction between SpA and the much more common non-specific low back pain (LBP) frequently causes difficulties for clinicians because of similarities in symptoms and lack of findings on conventional radiographic images in the early stages of SpA (1).
Pain characteristics of inflammatory back pain (IBP) are considered a key feature of SpA and several definitions of IBP have been proposed (2-4). Furthermore, IBP is considered a central part of referral recommendations to specialized rheumatological evaluation (5-7) and is also incorporated into classification criteria for both ankylosing spondylitis and SpA (8-10). Thus, clinicians may perceive IBP characteristics as highly predictive of SpA.
However, the utility of IBP characteristics in distinguishing SpA patients from patients with persistent LBP is under debate (11) and IBP was not included as an entry criterion in the most recent criteria for SpA, from the Assessment of SpondyloArthritis international Society (ASAS) (10). The majority of studies that have evaluated the association between IBP and SpA have used expert opinion of SpA as the reference standard (3,9,10,12-14,33). However, the use of expert opinion introduces a risk of circularity bias, as the assessment of IBP is often incorporated in the decision making of the SpA diagnosis. This sort of bias has been shown to produce an overestimation of measures (15). To overcome this risk of bias in the current study, we used classification of SpA based on standardized collected information from imaging and the clinical presentation. Furthermore, previous studies mainly included SpA patients with advanced disease and selected LBP patients as controls (3,9,12,13). However, if IBP is intended to screen for, or diagnose, early-stage SpA, it must be evaluated in an unselected patient population (i.e. patients with back pain of various causes), including patients with early-stage SpA (11).
In 2011, the Spines of Southern Denmark cohort was initiated, with the purpose of evaluating the diagnosis of early-stage SpA in patients with persistent LBP who had been referred to a regional secondary care spine centre. The aims of the current study were (i) to estimate the prevalence of IBP characteristics and (ii) to analyse the discriminative value of the Calin, Berlin, and ASAS IBP definitions (2-4) relative to SpA according to the ASAS criteria.

Patients
The cohort was recruited from the Spine Centre of Southern Denmark, which is an outpatient, non-surgical unit specializing in managing patients with back pain in a secondary care public hospital setting. The unit performs multidisciplinary assessment of patients with spinal pain after referral from general practitioners, chiropractors, and medical specialists in primary care. At the time of the study, from March 2011 to October 2013, the referral criterion to the Spine Centre was an episode of back pain with a duration of 2-12 months, where conservative treatment had insufficient effect. Patients with back pain under strong suspicion of an inflammatory aetiology were referred elsewhere. All patients from the cohort who reported LBP for more than three consecutive months were included in the current study.
During the study period, 16 clinicians (physiotherapists, chiropractors, and medical doctors including rheumatologists) were allocated to a multidisciplinary 'project team'. Caucasian patients aged 18-40 years referred with LBP were allocated to the project. Booking secretaries randomly allocated patients to the project team in a consecutive manner. Each week, all referrals were numbered in the random order they were received, and the patients with the lowest numbers were assigned to the project. When the consultations with clinicians in the project team were fully booked, the remaining patients were assigned to clinicians outside the project. The number of consultations in the project team was adjusted according to the capacity at the MRI scanner. The clinicians in the project team excluded patients from the project who did not understand Danish or had contraindications for MRI (see Figure 1 for details).
The study was conducted according to the Declaration of Helsinki, and before inclusion, each patient gave written informed consent for research use and publication of their de-identified data. The Regional Scientific Ethics Committee for Southern Denmark determined that, under the Danish legal framework, this study did not require formal ethics approval (reference number S-2010200-58).

Collection of clinical data and blood samples
Data on demographic and clinical characteristics were collected using electronic patient self-reported questionnaires, as part of the Spine Centre's standard procedure, and included items on back and leg pain intensity (16), activity limitation (17,18), general health (19), and present work situation. On their first visit, before consultation, patients also completed an electronic questionnaire that included single characteristics of IBP from the Calin, Berlin, and ASAS IBP definitions (2-4). The questionnaire (translated into English) about IBP characteristics used in the study is shown in Table 1.
As part of the initial consultation with the patient, the clinicians completed an electronic standardized scheme, covering SpA features (present/not present) included in the ASAS criteria for axial SpA (10). To ensure uniform interpretation of the SpA features, the clinicians in the project team received focused training by an experienced rheumatologist, prior to study start. During the study period, 'refresher' sessions were held regularly. The presence of the clinical features had to be diagnosed by a medical doctor as required in the ASAS criteria for axial SpA (10). Physiotherapists and chiropractors noted features previously diagnosed by a medical doctor, and consulted a rheumatologist in the project team in the case of a feature not previously diagnosed.

MRI
Details of the MRI protocol have been published previously (20). In brief, MRI of the whole spine and the sacroiliac joints was performed with a 1.5-T MRI system (Philips Achieva, Best, The Netherlands). The following sequences were used for the sacroiliac joints: semi-coronal T1-weighted turbo spin echo, semi-coronal T1-weighted spectral pre-saturation with inversion recovery (SPIR), and semi-axial T2-weighted short-tau inversion recovery (STIR). Three SpA expert radiologists, blinded to all clinical information except the patients' age and gender, participated in the reading of the MRI scans. Each MRI scan was evaluated by one reader and, in the case of uncertainties (6.1% of the evaluations), consensus was reached by two readers. The presence of sacroiliitis was noted according to the definition used in ASAS criteria for axial SpA (21). The reproducibility of this definition has previously been tested with kappa values > 0.8 for inter- and intra-observer agreement (20).

Data analysis
The data collected from the IBP questionnaire, the SpA feature scheme, and the MRI evaluation were entered directly into an electronic database (the SpineData database) and were analysed using STATA 11.2 (StataCorp, College Station, TX, USA).
Based on the presence or absence of SpA features, HLA-B27 and sacroiliitis on MRI, patients were classified as 'SpA according to ASAS' if they fulfilled the ASAS criteria for axial SpA (10); the rest were classified as 'non-SpA LBP'. To avoid a possible circularity caused by IBP being included in both the reference standard and the index test, 'IBP according to ASAS' was excluded from the classification of 'SpA according to ASAS'. The classification of patients as 'SpA according to ASAS' or 'non-SpA LBP' was determined after data collection and based strictly on the standardized collected items described above. Fulfilment of the Calin, Berlin, and ASAS IBP definitions was based on summation of the presence of the individual characteristics in the self-reported IBP questionnaire. Descriptive data were tabulated. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were estimated for each single item of the IBP characteristics and for each of the three definitions of IBP. Values are presented as percentages with 95% confidence intervals (CIs). Balanced accuracy was used to assess discriminative values and reported with 95% CI. For binary predictions, balanced accuracy [(sensitivity + specificity)/2] equals the area under the receiver operator characteristic (ROC) curve (22). The following general interpretation of balanced accuracy was used: < 60% 'no clinical value', 60-70% 'limited value', 70-80% 'modest value', and > 80% 'discrimination adequate for genuine clinical utility' (23).
Figure 1 (flow of patients; extracted fragments): Initially allocated to the study, n = 1619. Excluded before the first consultation: patient non-attendance, n = 60; attended clinician outside the study, n = 100. Patients attending first consultation, n = 1459.
A two-sample Wilcoxon rank-sum test for continuous and ordinal categorical variables and Pearson's χ² test for binary variables were used to test for statistically significant differences between groups, using a significance level of 5%.
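The measures described in this section can be illustrated with a minimal Python sketch. It is not the authors' STATA code: the class `TwoByTwo`, the two function names, and the example counts are hypothetical and chosen only for illustration, and the treatment of values falling exactly on a category boundary is an assumption, since the text gives only the ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-tabulation of an index test (IBP) against the reference standard."""
    tp: int  # IBP positive, 'SpA according to ASAS'
    fp: int  # IBP positive, 'non-SpA LBP'
    fn: int  # IBP negative, 'SpA according to ASAS'
    tn: int  # IBP negative, 'non-SpA LBP'


def diagnostic_measures(t: TwoByTwo) -> dict:
    """Return the measures named in the text as proportions between 0 and 1."""
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        # Balanced accuracy as defined in the text: (sensitivity + specificity) / 2
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


def interpret_balanced_accuracy(ba: float) -> str:
    """Map balanced accuracy to the interpretation categories used in the text.

    Assumption: a value exactly on 0.60 or 0.70 goes to the higher category,
    and only values strictly above 0.80 reach the top category.
    """
    if ba < 0.60:
        return "no clinical value"
    if ba < 0.70:
        return "limited value"
    if ba <= 0.80:
        return "modest value"
    return "discrimination adequate for genuine clinical utility"


# Hypothetical counts, NOT taken from the study.
example = TwoByTwo(tp=30, fp=40, fn=20, tn=110)
measures = diagnostic_measures(example)
label = interpret_balanced_accuracy(measures["balanced_accuracy"])
print(measures, label)
```

With these made-up counts, sensitivity is 30/50 and specificity is 110/150, so the balanced accuracy falls in the 'limited value' band.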

Results
Approximately 5000 patients aged 18-40 years were referred with LBP in the study period and, of these, 1619 patients were randomly selected and invited to participate in the study. Of these 1619 selected patients, 160 were excluded before, and 422 after, the first visit. In total, 1037 patients were included in the cohort; 257 had back pain for less than 3 months and 21 patients had missing clinical data. In total, 759 were included in the analysis for the current study (Figure 1).
A total of 86 patients were classified as 'SpA according to ASAS' and 673 as 'non-SpA LBP', resulting in a pre-test probability of SpA of 11%. Demographic and clinical characteristics are shown in Table 2.
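The patient flow and the pre-test probability reported above can be checked with simple arithmetic. The short Python sketch below uses only the counts stated in the text; the variable names are illustrative.

```python
# Counts as reported in the Results text.
allocated = 1619
excluded_before_first_visit = 160
excluded_after_first_visit = 422

# Patients included in the cohort.
cohort = allocated - excluded_before_first_visit - excluded_after_first_visit

# Further exclusions for the current study.
pain_less_than_3_months = 257
missing_clinical_data = 21
analysed = cohort - pain_less_than_3_months - missing_clinical_data

# Classification according to the reference standard.
spa = 86
non_spa = 673

# Pre-test probability of SpA in the analysed sample.
pretest_probability = spa / analysed

print(cohort, analysed, round(pretest_probability * 100))
```

The counts are internally consistent: the two classification groups sum to the analysed sample, and 86 of 759 rounds to the reported 11%.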
The discriminative value of IBP
Four single IBP characteristics and all the IBP definitions were positively associated with 'SpA according to ASAS' (Tables 3 and 4). The balanced accuracy values for the single characteristics and the IBP definitions were similar, all ranging between 54% and 60% (Tables 3 and 4).

Post-hoc analysis
If IBP had been included as an SpA feature, an additional 2% (18 patients) would have been classified as SpA. Including IBP as an SpA feature did not substantially change the results, and the balanced accuracy values for the IBP definitions were all below 0.64 (data shown in Appendix 1).
To test the optimal combination of the single IBP characteristics for classification of SpA, a post-hoc analysis was performed using multiple logistic regression analysis with 'SpA according to ASAS' as the outcome. Single IBP characteristics with a p-value < 0.1 in univariate logistic regression analysis were included in the multivariate model. The model was reduced with forward selection using a significance level of 5%. Five IBP characteristics had a p-value < 0.1 in the univariate analysis. However, only 'pain at night' remained in the multivariate model after reduction, and thus further analysis of new combinations of the IBP characteristics was not possible.

Discussion
The concept of IBP has been widely used in the clinical evaluation of SpA since its introduction by Calin et al in 1977 (2). The presence of IBP is often used as an entry test for more advanced and costly investigation, such as tissue typing and MRI. However, the accuracy of IBP in differentiating patients with SpA from patients with other causes of back pain has not been fully clarified.
In this study we investigated the presence of IBP in an unselected cohort of patients with persistent LBP. We found that IBP was relatively common, with 67% of patients fulfilling at least one definition of IBP.
Furthermore, the discriminative value of IBP definitions in relation to SpA was below the predefined threshold for clinical value.
These results raise the notion that clinical characteristics thought to be indicative of SpA may be characteristics of persistent back pain in general. The concept of IBP was established several decades ago as a key symptom of SpA, but our understanding of inflammation and the role of inflammatory mediators in the pain pathway have advanced extensively since then (24). Moreover, it is likely that pain in non-SpA LBP patients also has an inflammatory component (25), for example caused by Modic type 1 changes (26,27) or inflammatory reaction associated with tissue damage in disc herniation or nerve root compression (28). Thus, the dichotomy between inflammatory and non-inflammatory back pain might not be as simple as previously believed.
The current study also showed an inconsistency between the three IBP definitions; while 67% of the patients fulfilled at least one of the IBP definitions, only 16% fulfilled all three definitions. These results highlight the need for clearer definitions of IBP that can show reproducible test results in a clinical setting. To our knowledge, the reproducibility of IBP characteristics has not been investigated extensively, although the agreement between the referring physician and rheumatologist about the presence of IBP has been reported to be fairly low, with kappa values < 0.2 (29,30).
The discriminative value of IBP was evaluated in the ASAS cohort, consisting of 258 non-SpA LBP patients and 391 SpA patients. The sensitivity and specificity were, respectively, 86% and 40% for the Calin IBP criteria, 63% and 64% for the Berlin IBP criteria, and 73% and 55% for the ASAS IBP criteria (10). These results correspond to a balanced accuracy between 0.63 and 0.64, in concordance with our results. Other studies evaluating the discriminative value of IBP in relation to SpA have shown similar results of the discriminative value measured as balanced accuracy (12,31,32).
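The balanced accuracy figures quoted for the ASAS cohort follow directly from the reported sensitivity and specificity. The Python sketch below recomputes them and, for comparison, does the same for the rounded values reported in the abstract of the current study; the helper name `balanced_accuracy` and the dictionary names are illustrative.

```python
def balanced_accuracy(sensitivity_pct: float, specificity_pct: float) -> float:
    """Balanced accuracy in percent: (sensitivity + specificity) / 2."""
    return (sensitivity_pct + specificity_pct) / 2


# (sensitivity %, specificity %) as reported for the ASAS cohort (10).
asas_cohort = {"Calin": (86, 40), "Berlin": (63, 64), "ASAS": (73, 55)}

# (sensitivity %, specificity %) as reported in the abstract of the current study.
current_study = {"Calin": (64, 50), "Berlin": (59, 60), "ASAS": (35, 79)}

for label, data in (("ASAS cohort", asas_cohort), ("Current study", current_study)):
    for definition, (sens, spec) in data.items():
        print(f"{label}, {definition}: {balanced_accuracy(sens, spec):.1f}%")
```

For the ASAS cohort this gives 63.0%, 63.5%, and 64.0%. For the current study the rounded inputs give 57.0%, 59.5%, and 57.0%; the 59.5% for Berlin is consistent with the reported 60%, with the small gap presumably due to rounding of the published sensitivity and specificity.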
Some case-control studies in patients referred for specialized rheumatological care have reported better discriminative value regarding the expert's diagnosis of SpA for some single characteristics and IBP definitions, with a balanced accuracy > 0.7 for some of the evaluated characteristics and definitions (2,3,9,13). However, caution should be exercised in applying these results to daily clinical practice in the early diagnosis of SpA. First, a case-control design with evaluation of patients with long-lasting established SpA compared with LBP controls does not reflect the challenges of clinical practice (i.e. early identification of SpA) and is likely to overestimate test performance (15). Second, the use of expert opinion as the reference standard introduces a risk of bias because it restricts the possibility for proper blinding, which is endorsed by the Standards for Reporting of Diagnostic Accuracy (STARD) and Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tools (33,34). If clinicians perform the diagnosis of SpA, it is likely that the evaluation of IBP (i.e. the index test) will also be included in the expert opinion (i.e. the reference standard). This sort of bias, also called test review bias, has been shown to produce an overestimation of measures (15).
One methodological strength of the current study is that it is unbiased with regard to the fulfilment of ASAS criteria for axial SpA. The latter was based strictly on a standardized data collection, and the completion of the IBP questionnaire was performed independently of the collection of SpA features, blood samples, and MRI scans. Moreover, this use of standardized data acquisition for the reference standard allowed IBP to be excluded as an SpA feature, thereby reducing the risk of circularities. Furthermore, the interpretation of the study results was strengthened by the large study sample. Lastly, the population of unselected patients with back pain reflects the challenges of clinical practice in early identification of SpA.
The current study, however, also has important limitations that need to be considered in the interpretation of the results. First, formulating a standardized questionnaire required the creation of operational definitions for the IBP characteristics because no validated self-reported questionnaire including the assessed items existed when the study was initiated. There are also limitations regarding the choice of reference standard, as the diagnostic performance of the ASAS criteria in a clinical setting is under debate. However, they are currently the internationally accepted criteria for early classification of SpA. Using criteria for late-stage disease, such as the modified New York criteria for ankylosing spondylitis, would not be representative of the diagnostic challenges associated with the early stages.
In conclusion, IBP was in general common in this study population of patients with persistent LBP as almost all patients had at least one IBP characteristic and two-thirds met at least one of the IBP definitions. Furthermore, the discriminative value of IBP for SpA according to ASAS was low, although some positive associations were found between IBP and SpA. These results thereby support the notion that IBP is inadequate as an entry criterion for SpA.