The diagnostic value of three sacroiliac joint pain provocation tests for sacroiliitis identified by magnetic resonance imaging.

Objectives: The aim of the current study was to investigate the diagnostic value of three sacroiliac (SI) joint pain provocation tests for sacroiliitis identified by magnetic resonance imaging (MRI) and stratified by gender. Method: Patients without clinical signs of nerve root compression were selected from a cohort of patients with persistent low back pain referred to an outpatient spine clinic. Data from Gaenslen’s test, the thigh thrust test, and the long dorsal sacroiliac ligament test, together with sacroiliitis identified by MRI, were analysed. Results: The median age of the 454 included patients was 33 (range 18–40) years and 241 (53%) were women. The prevalence of SI joints with sacroiliitis was 5%. In the whole study group, only the thigh thrust test was associated with sacroiliitis: the area under the receiver operating characteristic (ROC) curve (AUC) was 0.58 [95% confidence interval (CI) 0.51–0.65], sensitivity 31% (95% CI 18–47), and specificity 85% (95% CI 82–87). In men, sacroiliitis was associated with all the SI joint tests assessed and multi-test regimens, with the greatest AUC found for at least one positive out of three tests [AUC 0.68 (95% CI 0.56–0.80), sensitivity 56% (95% CI 31–79), and specificity 81% (95% CI 77–85)]. In women, no significant associations were observed between the SI joint tests and sacroiliitis. Conclusions: Only in men were the SI joint tests found to be associated with sacroiliitis identified by MRI. Although the diagnostic value was relatively low, the results indicate that the use of SI joint tests for sacroiliitis may be optimized by gender-separate analyses.

Spondyloarthritis is a relatively common inflammatory disorder with a frequency of 1-2% in the European population (1). However, because of its insidious onset and delayed appearance of radiographic changes, the diagnosis of spondyloarthritis is often delayed 5-10 years from the onset of symptoms (2). During this period, undiagnosed patients may experience severe pain and stiffness and have potentially progressive loss of function, contributing to reduced health-related quality of life (3). As new treatment possibilities have been shown to reduce both symptoms and inflammation, efforts have been made to improve early referral and diagnosis of spondyloarthritis.
Although sacroiliac (SI) joint pain provocation tests have been shown to correlate with SI joint dysfunction (4), and their use is recommended in clinical guidelines for identification of low back pain (LBP) originating from the SI joint (5), the value of SI joint tests for the purpose of identifying patients with sacroiliitis has been sparsely investigated (6). Thus, to date, no physical examination tests have been included in referral recommendations for spondyloarthritis. The current recommendations are based primarily on pain characteristics, extraspinal symptoms, and analysis of human leucocyte antigen (HLA)-B27 (3,7,8). Sacroiliitis is considered the keystone in the diagnosis of spondyloarthritis (9,10) and magnetic resonance imaging (MRI) has been proven to be superior to scintigraphy (11,12), computed tomography (11,13), and conventional radiography (12,13) in diagnosing early sacroiliitis. However, to reduce the number of unnecessary MRI scans, efficient and feasible identification of patients with possible sacroiliitis is needed. As physical examination tests are low-cost and simple to perform, it would be ideal if SI joint provocation tests could be used to identify patients at risk of having sacroiliitis. Consequently, there is a need for further investigation of the validity of SI joint provocation tests for sacroiliitis.
The majority of studies on the diagnostic performance of SI joint pain provocation tests have been performed on mixed gender samples but generally with an overrepresentation of women (4). However, the anatomy of the pelvic ring, along with its load distribution, varies between genders (14), and thus it is possible that the SI joint pain provocation tests perform differently in men and women. Therefore, the aim of the current study was to analyse the diagnostic value of three SI joint pain provocation tests for sacroiliitis in a mixed gender sample of patients with LBP and stratified by gender.

Setting
The Spines of Southern Denmark (SSD) cohort consisted of young patients with persistent LBP referred to the Spine Centre of Southern Denmark, where their data were collected with the purpose of investigating the early diagnosis of spondyloarthritis. The Spine Centre is an outpatient, non-surgical unit specializing in managing non-inflammatory back pain in a secondary care public hospital setting. The unit performs multidisciplinary assessment of patients with spinal pain after referral from general practitioners, chiropractors, and medical specialists in primary care.

Patients
During the study period, from March 2011 to October 2013, 16 clinicians (physiotherapists, chiropractors, and medical doctors) were allocated to a multidisciplinary 'project team'. Based on the appointment availability in the project team and access to the MRI scanner, the booking secretaries randomly allocated patients to the project team in a consecutive manner. Patients aged 18 to 40 years referred with LBP as the primary complaint were allocated to the project. The clinicians in the project team excluded patients from the project who did not understand Danish or had contraindications for MRI. A study inclusion/exclusion flow chart is presented in Figure 1.
Of the patients referred with LBP in the study period, 1619 patients aged 18-40 years were randomly selected and invited to participate in the study. Other clinical teams in the Spine Centre attended to the remaining patients according to normal clinical procedures at the Spine Centre. One hundred and sixty were excluded before, and 422 after, the first visit; thus, in total, 1037 patients were included in the SSD cohort (Figure 1). For the current analysis, only patients without clinical signs of nerve root compression and with no missing data on the three SI joint pain provocation tests were included. Signs of nerve root involvement were defined as leg pain on a pain drawing, in addition to at least one of the following clinical findings associated with the site of the leg pain: muscle weakness, altered sensation to touch or pinprick, impaired tendon reflexes, a positive straight leg raise test (at 60° or less), or a positive prone knee bend test combined with pain to the anterior thigh.
Figure 1. Study inclusion/exclusion flow chart. Initially allocated to the study (n = 1619); excluded before the first consultation (n = 160): patient non-attendance (n = 60), attended clinician outside the study (n = 100); patients attending first consultation (n = 1459); excluded after the first consultation (n = 422), including declined participation (n = 94) and age less than 18 years or more than 40 years (n = 12); included in the SSD cohort (n = 1037); excluded from the current analysis (n = 583): clinical signs of nerve root compression (n = 106), missing information on sacroiliac joint test (n = 477).
The study was conducted according to the Declaration of Helsinki and, before inclusion, each patient gave written informed consent for research use and publication of their de-identified data. The Regional Scientific Ethics Committee for Southern Denmark determined that, under the Danish legal framework, this study did not require formal ethics approval (reference number S-20102000-58).

Clinical characteristics
Data on patients' demographic and clinical characteristics were collected using internet-based patient self-reported questionnaires as part of the Spine Centre's standard procedure.

Clinical investigation and SI joint tests
As a part of the standard procedure at the Spine Centre, information from the clinical examination was collected using an internet-based evaluation form. Information collected from the physical examination included neurological examination and three SI joint pain provocation tests (positive or negative): Gaenslen's test (15), the thigh thrust test (16,17), and the long dorsal sacroiliac ligament test (18). The SI joint tests were considered positive when reproducing pain in the SI joint (Table 1). All 16 members of the multidisciplinary project team participated in the collection of the SI joint test data.

MRI
Details of the scan protocol have been published previously (19). In brief, MRI of the whole spine and SI joints was performed with a 1.5-T MRI System (Philips Achieva, Best, The Netherlands). For the SI joint, the following sequences were used: semi-coronal T1-weighted turbo spin-echo (TSE), semi-coronal T1-weighted acquisition with spectral pre-saturation inversion recovery (SPIR), and semi-axial T2-weighted short-tau inversion recovery (STIR).
Three spondyloarthritis expert radiologists, blinded to all clinical information except the patients' age and gender, participated in the reading of the MRI scans. Each MRI scan was evaluated by one reader and, in the case of any uncertainty, consensus was reached by consulting another reader (6.1% of the evaluations). In the included patients, the median time between the clinical examination and the MRI scan was 15 days [interquartile range (IQR) 13-21 days]. Each joint was subdivided into four osseous locations: the cartilaginous and ligamentous portion of the iliac and the sacral bones, respectively (20). Sacroiliitis was defined as a minimum of two SI joint regions with bone marrow oedema lesions or a minimum of one SI joint region with bone marrow oedema at > 25% of the periarticular area, and fulfilling the minimum definition of bone marrow oedema used in the Assessment of SpondyloArthritis international Society's (ASAS) criteria for axial spondyloarthritis (21). The agreement of the evaluation of bone marrow oedema at the SI joint has previously been tested with kappa values > 0.8 for inter- and intra-observer agreement (19). An imaging example of sacroiliitis is shown in Figure 2.
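The sacroiliitis definition above can be read as a simple decision rule. The following Python sketch is illustrative only and is not the authors' software: the names `RegionFinding`, `is_sacroiliitis`, and `meets_asas_minimum` are hypothetical, the ASAS minimum definition of bone marrow oedema is reduced to a single yes/no flag supplied by the reader, and the rule is assumed to combine as "(two or more regions, or one region above 25%) and ASAS minimum fulfilled".

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RegionFinding:
    """Reader's finding for one SI joint region (hypothetical structure)."""
    has_bme: bool        # bone marrow oedema lesion present in this region
    bme_fraction: float  # share of the periarticular area with oedema, 0.0 to 1.0


def is_sacroiliitis(regions: Sequence[RegionFinding], meets_asas_minimum: bool) -> bool:
    """Apply the rule as read from the text (an assumption about its grouping).

    Positive when the ASAS minimum definition is fulfilled AND either
    at least two regions show oedema or one region shows oedema in
    more than 25% of the periarticular area.
    """
    regions_with_bme = sum(1 for r in regions if r.has_bme)
    extensive_region = any(r.has_bme and r.bme_fraction > 0.25 for r in regions)
    return meets_asas_minimum and (regions_with_bme >= 2 or extensive_region)


# Hypothetical example: four osseous locations, oedema in two of them.
example_regions = [
    RegionFinding(has_bme=True, bme_fraction=0.10),
    RegionFinding(has_bme=True, bme_fraction=0.15),
    RegionFinding(has_bme=False, bme_fraction=0.0),
    RegionFinding(has_bme=False, bme_fraction=0.0),
]
example_result = is_sacroiliitis(example_regions, meets_asas_minimum=True)
```

Keeping the ASAS condition as a separate flag leaves that judgement with the radiologist rather than encoding criteria that are not spelled out in this text.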

Data analysis
Information from the physical examination, the self-reported questionnaire, and the MRI evaluation was entered directly into an electronic database [the SpineData database (22)] and was analysed using STATA version 13.1 (StataCorp, College Station, TX, USA).
Descriptive data were tabulated. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were estimated and are presented as percentages with 95% confidence intervals (CIs) for each SI joint test and for ≥ 1 positive out of three tests, ≥ 2 positive out of three tests, and all three tests positive. The area under the receiver operating characteristic (ROC) curve (AUC) was used to assess the diagnostic value and is reported with 95% CI. For binary predictions, the AUC equals balanced accuracy, that is (sensitivity + specificity)/2 (23). The AUC can range from 0.5 (useless model) to 1.0 (perfect discrimination). The following general interpretation of AUC was used: < 0.6 means no clinical value, 0.6-0.7 limited value, 0.7-0.8 modest value, and > 0.8 discrimination adequate for genuine clinical utility (24).
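As an illustration of the measures described above, the following Python sketch computes sensitivity, specificity, PPV, NPV, and the AUC for a binary test as (sensitivity + specificity)/2, and applies the multi-test regimens (≥ 1, ≥ 2, or all three tests positive). It is a minimal sketch, not the Stata code used in the study: the function names and the ten example joints are hypothetical, confidence intervals are omitted, and the assignment of boundary values (for example, an AUC of exactly 0.8 counted as 'modest value') is an assumption.

```python
from typing import Dict, List, Sequence, Tuple


def regimen_positive(test_results: Sequence[bool], min_positive: int) -> bool:
    """True when at least `min_positive` of the individual tests are positive."""
    return sum(test_results) >= min_positive


def diagnostic_metrics(predicted: Sequence[bool], reference: Sequence[bool]) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, NPV, and AUC (balanced accuracy) for a binary test."""
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)
    fn = sum(1 for p, r in zip(predicted, reference) if not p and r)
    tn = sum(1 for p, r in zip(predicted, reference) if not p and not r)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # PPV and NPV are undefined when no joint is test-positive or test-negative.
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
        "npv": tn / (tn + fn) if (tn + fn) else float("nan"),
        "auc": (sensitivity + specificity) / 2,
    }


def interpret_auc(auc: float) -> str:
    """Interpretation bands from the text; boundary handling is an assumption."""
    if auc < 0.6:
        return "no clinical value"
    if auc < 0.7:
        return "limited value"
    if auc <= 0.8:
        return "modest value"
    return "adequate for genuine clinical utility"


# Hypothetical joints: (Gaenslen, thigh thrust, long dorsal ligament), sacroiliitis on MRI.
joints: List[Tuple[Tuple[bool, bool, bool], bool]] = [
    ((True, True, False), True),
    ((False, True, False), True),
    ((False, False, False), True),
    ((False, False, False), True),
    ((True, False, False), False),
    ((False, False, False), False),
    ((False, False, False), False),
    ((False, False, False), False),
    ((True, True, True), False),
    ((False, False, False), False),
]
reference = [mri for _, mri in joints]

# Evaluate the three regimens: >= 1, >= 2, and all three tests positive.
regimen_results = {
    k: diagnostic_metrics([regimen_positive(tests, k) for tests, _ in joints], reference)
    for k in (1, 2, 3)
}
for k, metrics in regimen_results.items():
    print(k, {name: round(value, 2) for name, value in metrics.items()}, interpret_auc(metrics["auc"]))
```

Raising the required number of positive tests generally trades sensitivity for specificity, which is why the regimens are evaluated side by side rather than fixing a single threshold in advance.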
To test for statistically significant differences between groups, a two-sample Wilcoxon rank-sum test was used for continuous and ordinal categorical variables, and Pearson's χ² test was used for binary variables. The significance level was set at 5%.
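For a binary variable compared between two groups, Pearson's χ² test reduces to a 2 × 2 table with one degree of freedom. The sketch below shows that calculation in plain Python, without continuity correction. It is not the Stata procedure used in the study, the function name `pearson_chi2_2x2` is hypothetical, and the counts in the example are invented for illustration.

```python
import math
from typing import Sequence, Tuple


def pearson_chi2_2x2(table: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Pearson's chi-squared statistic and p-value for a 2 x 2 table (1 df, no correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: rows are two groups, columns are positive/negative findings.
hypothetical_table = [[10, 20], [20, 10]]
chi2, p_value = pearson_chi2_2x2(hypothetical_table)
significant = p_value < 0.05  # 5% significance level, as in the text
```

The closed-form tail probability holds only for one degree of freedom, so larger tables would need a general χ² distribution function.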

Results
Of the 1037 patients included in the SSD cohort, 106 were excluded from the current analyses because of clinical signs of nerve root compression and 477 because of missing data on the SI joint pain provocation tests. The remaining 454 were included in the analysis (Figure 1). The patients excluded due to missing values were generally similar to the included patients regarding the prevalence of sacroiliitis and the descriptive and clinical data (see Supplementary Table S1 for details).
The median age of the 454 included patients was 33 (IQR 27-37) years and 241 (53%) were women. A total of 36 patients had changes observed on MRI classified as 'sacroiliitis' in one or two SI joints. The remaining 418 patients were classified as 'non-inflammatory LBP'. Descriptive and clinical data for the two groups are presented in Table 2. Patients with MRI-identified sacroiliitis were more often on sick leave compared with patients classified as non-inflammatory LBP. No differences between the groups were found on the other parameters shown in Table 2.
Women were twice as likely to have a positive test response compared with men (Table 3). Of the 908 included joints, 45 (5%) were classified with sacroiliitis. There was no statistically significant gender difference in the prevalence of sacroiliitis (4% in men vs. 6% in women, p = 0.12).

Diagnostic value of the assessed SI joint tests
The sensitivity, specificity, PPV, NPV, and AUC for each of the SI joint pain provocation tests calculated for the whole group are shown in Table 4. Only the thigh thrust test was significantly associated with sacroiliitis in the whole group.
The diagnostic parameters of the SI joint pain provocation tests for men and women, respectively, are shown in Table 5. In men, sacroiliitis was significantly associated with all the SI joint tests and multi-test regimens, with the greatest AUC found for at least one positive test out of three tests. In women, no significant associations were found between the SI joint tests assessed and sacroiliitis.
Values are percentages with 95% confidence intervals in parentheses. Difference in prevalence between men and women: * p < 0.01.

Discussion
In this study, we investigated the diagnostic value of SI joint pain provocation tests for identifying early onset sacroiliitis in a large sample of young patients with persistent LBP. In the whole study sample, only one of the SI joint tests was associated with sacroiliitis identified by MRI, and with a low diagnostic value. However, the results also revealed a gender difference in the performance of the SI joint pain provocation tests. While no associations were found between the SI joint tests and sacroiliitis in women, all the tests assessed were associated with sacroiliitis in men, albeit with a low diagnostic value. These results reveal important aspects that need to be considered with regard to the usefulness of physical examination tests for detecting sacroiliitis. Several previous studies have shown pain provocation tests to be useful in identifying patients with SI joint pain (4). The majority of these previous studies have used intra-articular injections with local anaesthetics as a reference test, aimed at identifying all patients with pain originating from the SI joint, regardless of cause (4). Patients with SI joint-related pain are, however, a heterogeneous group with multiple causes of pain. Sacroiliitis constitutes an important subgroup of SI joint pain, but patients with sacroiliitis have often been excluded in studies on SI joint pain provocation tests (4). Thus, the diagnostic value of SI joint tests for sacroiliitis identified using MRI has, to our knowledge, only been investigated in one previous study (6). That study evaluated five SI joint tests including Gaenslen's test and the thigh thrust test in 40 patients with chronic LBP, of whom 13 had sacroiliitis shown on MRI (6). The study reported a low diagnostic value of the assessed SI joint tests regarding sacroiliitis (6).
In the current study, we likewise found a low diagnostic value of the SI joint pain provocation tests. In the whole study sample, only the thigh thrust test was associated with sacroiliitis and with a low diagnostic value (AUC < 0.6). However, the evaluated tests performed very differently in the two genders. This gender difference in test performance may have multiple explanations. The anatomy of the pelvic ring and its load distribution vary between genders (14), which may compromise the comparability of the performance of the SI joint tests in men and women. Moreover, in the current study, the prevalence of positive SI joint tests was substantially higher in women than in men, while the prevalence of sacroiliitis did not differ significantly. This resulted in a higher false positive rate among women and is likely to reflect different causes of positive tests in the two genders. While ankylosing spondylitis, which is considered the prototype of spondyloarthritis, is more prevalent among men (28,29), SI joint dysfunction and pelvic stress are commonly associated with pregnancy (30). Finally, the gender differences in the prevalence of positive tests could be influenced by a gender difference in the threshold for reporting pain during the tests. Overall, there appears to be a need for further investigation of the different causes of SI joint pain among men and women. This may also be beneficial for the evaluation of SI joint tests for SI joint dysfunction in general. The majority of previous studies on SI joint tests have evaluated men and women at the same time, but women seem to be over-represented in the study populations. In a systematic review on the diagnostic validity of SI joint tests regarding patients with SI joint pain, the overall prevalence of women in the included studies was 60% (4). However, the results from the current study indicate that the diagnostic value of SI joint pain provocation tests could be optimized by conducting gender-separate analyses.
The methodological strengths of the current study are, first, the large number of patients included in the analyses, which strengthened the precision of the estimates of diagnostic value and also made gender-separate analysis possible. Second, the independent interpretation of information from the MRI evaluation and the physical examination reduced the risk of bias. Third, the time period between the clinical examination and the MRI evaluation was relatively short, reducing the risk of changes regarding onset of sacroiliitis in the time span after the examination.
One of the limitations of the current study was that the primary aim with the data collection was not to evaluate SI joint tests. Consequently, the performance of the pain provocation tests was not standardized and the reproducibility of the test procedure was not evaluated in the study. Other studies have found an acceptable reliability for two of the three included SI joint tests (Gaenslen's test and the thigh thrust test), although there is a general need for further investigation of the reliability of SI joint tests (31). Moreover, because the SI joint test data were collected and recorded as part of the daily clinical examination, there was a relatively high number of missing values in the current study. Thus, it is unknown whether the relatively low diagnostic value of the tests observed in the current study could partly be due to clinicians inconsistently interpreting the tests prior to recording. However, we expect the results to be more generalizable to daily clinical practice, as the data were collected as a part of the everyday routine in the Department.
Finally, the current study investigated only three tests, whereas a multi-test regimen with four or more tests is recommended based on results from previous studies on the validity of tests for SI joint dysfunction (4). Moreover, the pain provocation tests that correlate with SI joint dysfunction may differ from those that correlate with sacroiliitis, which an investigation of a greater variety of tests might reveal. Thus, the observed diagnostic value in the current study may be improved by investigating combinations of more tests in future studies.
In summary, the three SI joint pain provocation tests that were assessed were all positively associated with sacroiliitis among men, but with relatively low diagnostic value, while no associations were found between the SI joint tests and sacroiliitis in women. These results reveal that the diagnostic value of pain provocation tests for sacroiliitis may be optimized by gender-separate analyses. Moreover, the results indicate a need for further investigation of the different causes of SI joint pain among men and women, respectively, which also could add to knowledge regarding tests for SI joint dysfunction in general.