A trial-based economic evaluation of the CaFaSpA referral strategy for axial spondyloarthritis

Objective To assess the cost–utility from healthcare and societal perspectives of the digital CaFaSpA referral strategy (CS) for axial spondyloarthritis (axSpA) in primary care patients with chronic low back pain (CLBP). Method A cluster randomized controlled trial was performed in the Netherlands. General practice units were randomized into CS or usual care (UC). Economic evaluation was performed from the healthcare and societal perspectives within a 12-month time horizon. Outcome measures encompassed disability [Roland–Morris Disability Questionnaire (RMDQ)] and health-related quality of life (EQ-5D-3L). Direct medical (iMTA Medical Consumption Questionnaire) and indirect costs (iMTA Productivity Cost Questionnaire), including productivity loss, were evaluated. Incremental cost–utility ratios (ICURs) were calculated. Results The study included 90 GP clusters with 563 patients (CS: n = 260; UC: n = 303) (mean ± sd age 36.3 ± 7.5 years; 66% female). After 12 months, no minimal important differences in outcomes were observed for RMDQ (−0.21, 95%CI −1.52 to 1.13) or EQ-5D (−0.02, 95%CI −0.08 to 0.05). However, total costs were significantly lower in the CS group owing to lower productivity loss costs. The ICUR for RMDQ was €18,059 per point decrease and €220,457 per quality-adjusted life year increase. Conclusions Digital referral did not decrease the overall healthcare status of patients after 1 year of follow-up and appears to be more cost-effective than UC. Therefore, CS can be used as an appropriate primary care referral model for CLBP patients at risk for axSpA. This will accelerate timely provision of care by the right caregiver.

The prevalence of axial spondyloarthritis (axSpA) among patients with chronic low back pain (CLBP) ranges between 5% and 24% (1-5). Despite the high prevalence, early recognition of axSpA patients within those with CLBP is difficult for general practitioners (GPs) (3,4). The diagnostic delay of axSpA is reported to be around 8-10 years (6,7). This diagnostic delay causes an increase in disability and reduced quality of life (QoL), and affects work participation, all leading to increased healthcare costs (8). As early diagnosis and treatment can reduce the clinical burden of axSpA and reduce healthcare costs in the long term, this has necessitated the development of referral strategies for CLBP patients at risk for axSpA (7,9,10). Several referral strategies have been developed to assist GPs to identify axSpA (11). However, most referral strategies have a low specificity and/or are expensive as they require imaging and human leucocyte antigen (HLA)-B27 status (11-13). The CaFaSpA referral strategy (CS) has been developed and validated in a primary care setting for patients with CLBP. This referral strategy uses a simple algorithm in patients with low back pain (LBP) lasting for more than 3 months and age at onset under 45 years, with a sensitivity of 75% and specificity of 58% (3,4). Furthermore, the impact of implementing the referral algorithm in daily practice on functionality has been analysed (5). Since there is no international consensus yet on which referral strategy should be used, the Assessment of SpondyloArthritis international Society (ASAS) recommends that GPs refer patients with CLBP who have at least one feature of axSpA (12). In daily practice, this may result in the inappropriate referral of the majority of CLBP patients from primary to secondary care (13). Since healthcare resources are limited, the balance between innovative healthcare interventions and costs is crucial. Furthermore, cost-effectiveness or cost-utility studies of referral strategies for axSpA are necessary for decision making before implementation in daily clinical practice (14). Therefore, our aim was to assess the cost-utility of the CS for axSpA in primary care patients with CLBP.

Method
Within the Dutch healthcare system, each individual can consult a GP in case of any health issues. The GP acts as a gatekeeper in referring patients to secondary care. Rheumatology care is delivered within secondary or tertiary care through public hospitals or academic medical centres. Referrals to the rheumatologist are based on the knowledge and experience of the individual GP (usual care). Of all referrals, 95% are in digital form, and there are no standardized referral sheets incorporated by the Dutch College of General Practitioners.

Study design and population
We used data from the IMPACT study (5), and performed a trial-based economic evaluation. The IMPACT study was a cluster randomized controlled trial in the Dutch primary care setting in patients at risk for axSpA. Randomization took place at the level of the general practice. Each cluster consisted of GPs from a single primary care practice and their patients. In total, 93 practices were randomized to either the CS or usual care (UC). The block randomization schedule was computer generated and controlled by an independent person. Stratification on the number of GPs working per practice was performed to ensure an equal number of patients in both study groups. GPs in the surrounding areas of participating Dutch rheumatologists using the International Classification of Primary Care (ICPC) coding system were invited to participate. Patients of the participating GPs were recruited between September 2014 and November 2015. Patients with LBP for more than 12 weeks and aged between 18 and 45 years were recruited from participating practices using the ICPC code L03 (15). Exclusion criteria were a clear medical explanation for back pain (e.g. trauma, hernia nuclei pulposi), mental incompetence, or an insufficient understanding of the Dutch language (written).
Informed consent was obtained at the research centre before the start of the study. This study was approved by the medical ethics committee of the Maasstad Ziekenhuis, The Netherlands (trial registration: NCT01944163; Clinicaltrials.gov).

Intervention and control groups
The intervention was the use of the CS by the GP during the consultation when a patient presented with LBP complaints. The CS consists of four parameters: inflammatory back pain (IBP), a positive family history of axSpA, a positive reaction to treatment with nonsteroidal anti-inflammatory drugs (NSAIDs), and a duration of back pain of more than 5 years (3). Referral to a rheumatologist is advised if at least two out of four referral variables are present in the CS. A positive or negative scoring outcome of the CS for referral to a rheumatologist was assessed and registered by a trained research assistant.
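The referral rule described above (advice to refer when at least two of the four CS parameters are present) can be sketched in a few lines of Python. This is only an illustration of the scoring logic as stated in the text; the class and field names are hypothetical and are not taken from the study software.

```python
from dataclasses import dataclass


@dataclass
class CafaspaItems:
    """The four CS parameters (field names are illustrative assumptions)."""
    inflammatory_back_pain: bool
    positive_family_history: bool
    positive_response_to_nsaids: bool
    back_pain_over_5_years: bool


def cs_referral_advised(items: CafaspaItems, threshold: int = 2) -> bool:
    """Return True when at least `threshold` of the four parameters are present."""
    score = sum([
        items.inflammatory_back_pain,
        items.positive_family_history,
        items.positive_response_to_nsaids,
        items.back_pain_over_5_years,
    ])
    return score >= threshold


# Example: IBP plus a positive NSAID response gives a score of 2, so referral is advised.
patient = CafaspaItems(True, False, True, False)
print(cs_referral_advised(patient))
```

The threshold is kept as a parameter only to make the "two out of four" rule explicit; the text describes no other cut-off.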
In the control group, care as usual was performed in primary care based on the Dutch guideline for LBP (16). Results of the CS were provided to the UC group after 4 months. In the design phase of the IMPACT study, we aimed to provide the results of the CS to the UC group after 12 months to increase our study window. However, the medical ethics committee advised providing the CS results after 4 months, as patients might benefit from early treatment.

Estimates of effectiveness and utility
Outcome measures were disability and QoL during 12 months of follow-up. Disability is an important patient outcome measure in patients with LBP and has previously been used in cost analysis (17). Therefore, it was included in our cost-utility analysis as well. Disability was measured by the Roland-Morris Disability Questionnaire (RMDQ) (18). The score ranges from 0 (no disability) to 24 (maximum disability). QoL was measured with the 3-level EuroQol 5 Dimensions (EQ-5D-3L) (19). The EQ-5D scores were transformed into utilities using the Dutch values and time trade-off methods (20). The utilities were then multiplied by the amount of time a patient spent in this particular health state. This resulted in quality-adjusted life years (QALYs) ranging between 1.0 (full health) and 0 (death). Both disability and QoL were assessed at baseline and after 4 and 12 months.
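As an illustration of how utilities measured at baseline, 4 months, and 12 months can be turned into QALYs, the sketch below weights each utility by the time spent in that state. The text does not say how utility was assumed to change between measurements; linear interpolation between visits (the trapezoidal rule) is an assumption made here, and the function name is hypothetical.

```python
def qaly_from_utilities(times_months, utilities):
    """Area under the utility curve, expressed in years.

    Assumes utility changes linearly between consecutive measurements
    (an assumption of this sketch, not a detail given in the text).
    """
    if len(times_months) != len(utilities) or len(times_months) < 2:
        raise ValueError("need matching lists with at least two time points")
    total = 0.0
    for i in range(len(times_months) - 1):
        years = (times_months[i + 1] - times_months[i]) / 12.0
        mean_utility = (utilities[i] + utilities[i + 1]) / 2.0
        total += years * mean_utility
    return total


# Illustrative patient: utility 0.6 at baseline, 0.8 at 4 and 12 months.
print(round(qaly_from_utilities([0, 4, 12], [0.6, 0.8, 0.8]), 3))
```

With a 12-month horizon the result cannot exceed 1.0, which matches the QALY range given above.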

Estimates of cost
The economic evaluation was performed from a societal and healthcare perspective. Cost estimates were assessed at baseline and after 4 and 12 months of follow-up. Direct costs are the costs of all medical consumption (inside and outside hospital costs) and medications. Indirect costs are costs due to productivity loss (PL) (absenteeism and presenteeism). Medical consumption was measured with the institute for Medical Technology Assessment (iMTA) Medical Consumption Questionnaire (iMCQ) (21). The iMCQ is a non-disease-specific questionnaire which gathers information in a consistent and standardized way for medical consumption through self-reporting. This questionnaire contains information about contacts with healthcare providers, hospitalizations, and medication use. The cost guideline by van Roijen et al was followed, including the mentioned reference prices (22). Medication costs were calculated from dosages reported in the iMCQ and prices were estimated using unit prices from the Dutch care institute pharmacy database (23).
Indirect costs are costs due to sick leave, unpaid work, and reduction in work time, and were measured with the iMTA Productivity Cost Questionnaire (iPCQ) (24). We applied the friction cost method to estimate indirect costs due to PL (25). All prices were adjusted to the year 2019 using consumer price indices and calculated in Euros (€) (26). Since the time horizon of this study was 1 year, discounting of costs and effects was not required.
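The two costing steps mentioned above, valuing absenteeism with the friction cost method and indexing prices to a common year, can be sketched as follows. This is a simplified illustration covering absenteeism only; the function names are hypothetical, and the friction period, working hours, hourly cost, and index values used in the usage lines are illustrative inputs, not figures from the study.

```python
def friction_cost(days_absent, friction_period_days, hours_per_day, hourly_cost):
    """Absenteeism cost under the friction cost method.

    Only days up to the friction period are counted as lost, since the
    method assumes the work is taken over by someone else after that.
    """
    counted_days = min(days_absent, friction_period_days)
    return counted_days * hours_per_day * hourly_cost


def index_price(price, cpi_source_year, cpi_target_year):
    """Convert a price to the target year using consumer price indices."""
    return price * cpi_target_year / cpi_source_year


# Illustrative inputs only: 100 days absent, friction period of 85 days.
print(friction_cost(100, 85, 8, 30.0))
print(round(index_price(100.0, 95.0, 100.0), 2))
```

Under a human-capital valuation the cap in `friction_cost` would simply be dropped, which is why the two methods can diverge for long absences.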

Secondary outcome: axSpA diagnosis
The secondary outcome was an axSpA diagnosis by a rheumatologist. After 12 months of follow-up, all patients were asked to fill in a questionnaire on whether they were receiving care in the rheumatology setting and for which condition. The self-reported diagnoses were verified by retrieving hospital records after informed consent had been given by the patient. When hospital records could not be verified, the self-reported diagnosis was reported as a proxy.

Statistical analysis
Descriptive statistics were used to describe the patient characteristics. Clinical outcomes and total costs were analysed for the CS and UC groups. We performed an intention-to-treat (ITT) analysis. As mentioned in the subsection 'Intervention and control groups', following advice from the medical ethics committee, after 4 months of follow-up, patients in the control group may have received delayed referral advice. In the ITT analysis, they were analysed as though they remained in the control group. This may lead to an underestimation of the intervention effect.
Incremental cost-utility ratios (ICURs) were calculated by dividing the mean difference in total costs (CS minus UC) by the mean difference in improvement on the RMDQ and by the mean difference in QALYs.
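The ratio described above can be written out as a short Python sketch. It assumes that per-patient total costs and per-patient effects (improvement on the RMDQ, or QALYs) are available for each group; the function name and the example values are hypothetical.

```python
from statistics import fmean


def icur(costs_cs, costs_uc, effects_cs, effects_uc):
    """Incremental cost-utility ratio: (mean cost CS - mean cost UC)
    divided by (mean effect CS - mean effect UC)."""
    delta_cost = fmean(costs_cs) - fmean(costs_uc)
    delta_effect = fmean(effects_cs) - fmean(effects_uc)
    if delta_effect == 0:
        raise ValueError("ICUR is undefined when the effect difference is zero")
    return delta_cost / delta_effect


# Illustrative values only: CS is cheaper and slightly more effective,
# so the ratio is negative (savings per unit of effect gained).
print(round(icur([100, 200], [300, 500], [0.8, 0.9], [0.7, 0.8]), 1))
```

Because the sign of a ratio alone does not show which quadrant of the cost-utility plane a result falls in, the cost and effect differences are usually inspected alongside it.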
To account for uncertainties in ICUR estimates, we used a two-stage bootstrapping approach, combined with single imputation to account for missing data (27). In this approach, a bootstrap sample is first taken from the clusters, after which bootstrap samples of the individual patients within each bootstrapped cluster are taken. Subsequently, missing data are completed by performing a single imputation on the doubly bootstrapped sample, after which the estimates of interest are calculated by taking the means over the imputed bootstrap sample. This process was repeated until 1000 bootstrap estimates were obtained, which were used to construct a cost-utility plane.
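A minimal sketch of the two-stage resampling is given below, using only the Python standard library. It makes several simplifying assumptions that are not taken from the study: clusters are resampled separately within each arm, every patient has complete cost and effect data (so the single-imputation step is omitted), and the data layout and function names are hypothetical.

```python
import random
from statistics import fmean


def _resample_arm(clusters, rng):
    """Stage 1: resample clusters; stage 2: resample patients within each."""
    costs, effects = [], []
    for cluster in rng.choices(clusters, k=len(clusters)):
        for cost, effect in rng.choices(cluster, k=len(cluster)):
            costs.append(cost)
            effects.append(effect)
    return fmean(costs), fmean(effects)


def two_stage_bootstrap(data, n_boot=1000, seed=0):
    """Return n_boot pairs (delta_cost, delta_effect), CS minus UC.

    `data` maps "CS" and "UC" to a list of clusters; each cluster is a
    non-empty list of (cost, effect) tuples, one per patient.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_boot):
        cost_cs, effect_cs = _resample_arm(data["CS"], rng)
        cost_uc, effect_uc = _resample_arm(data["UC"], rng)
        pairs.append((cost_cs - cost_uc, effect_cs - effect_uc))
    return pairs


# Tiny illustrative data set: two clusters per arm.
example = {
    "CS": [[(100.0, 0.80), (120.0, 0.90)], [(90.0, 0.70)]],
    "UC": [[(200.0, 0.80)], [(220.0, 0.75), (210.0, 0.70)]],
}
replicates = two_stage_bootstrap(example, n_boot=50, seed=1)
print(len(replicates))
```

Each returned pair corresponds to one point on a cost-utility plane. In the approach described in the text, imputation would be applied to each doubly bootstrapped sample before the means are taken.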
Cost-effectiveness acceptability curves (CEACs) were derived for different willingness-to-pay thresholds. The required threshold in the Netherlands for a screening approach for LBP is ≤ €20 000/QALY (28). CEACs were constructed by plotting the proportion of the incremental cost-effect pairs that lay in the south and east of a ray in the cost-effectiveness plane through the origin with a slope equivalent to the x-axis (i.e. λ = 0). This was repeated until the slope of the line was equivalent to the y-axis (29). Sensitivity analyses were performed comparing complete case data and imputed data.
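The CEAC construction can be illustrated with a short sketch. A bootstrap pair lies to the south-east of the ray with slope λ exactly when its net monetary benefit, λ × Δeffect − Δcost, is positive, so the curve is the proportion of pairs meeting that condition at each λ. The function name and the example pairs are hypothetical.

```python
def ceac(pairs, thresholds):
    """Proportion of (delta_cost, delta_effect) pairs that are cost-effective
    at each willingness-to-pay threshold, i.e. lam * delta_effect - delta_cost > 0."""
    if not pairs:
        raise ValueError("at least one bootstrap pair is required")
    curve = []
    for lam in thresholds:
        favourable = sum(1 for d_cost, d_effect in pairs
                         if lam * d_effect - d_cost > 0)
        curve.append(favourable / len(pairs))
    return curve


# Illustrative bootstrap pairs (delta_cost in euros, delta_effect in QALYs).
pairs = [(-100, 0.01), (-100, -0.001), (100, 0.01), (300, 0.01)]
print(ceac(pairs, [0, 20000]))
```

At λ = 0 only pairs with lower costs count as cost-effective. As λ increases, pairs with positive effect differences are weighted more heavily, which corresponds to rotating the ray from the x-axis towards the y-axis.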
To explore the group of patients with missing data, we investigated differences in case mix between responders and non-responders by patient characteristics and clinical outcomes. Additional sensitivity analyses were performed excluding patients who reported absenteeism at baseline in both the CS and UC groups to rule out potential effects due to baseline imbalance. All statistical analyses were carried out using STATA version 14.2. A two-sided p-value < 0.05 was considered statistically significant.

Results
In total, 679 patients were included (Figure 1), of whom 563 patients filled in at least one questionnaire (RMDQ, EQ-5D, iMCQ, or iPCQ) at any visit and were included in the analyses. Of these, 260 patients were in the CS group and 303 patients were in the UC group (Figure 1).

Baseline characteristics
The patient characteristics of both groups at baseline are shown in Table 1.
The average response rate for all questionnaires at 12 months was 55%. Complete data on all costs at baseline and after 4 and 12 months were available for 35.5% of patients in both groups. Missing values occurred as the result of patients not filling in the questionnaires. The percentage of missing values at 12 months was comparable between the CS and UC groups (p = 0.14). At baseline, QoL, disability score, and duration of CLBP were comparable between responders and non-responders (p > 0.05).

Healthcare resources
Table 2 lists the mean resource utilization per patient at 12 months. Percentage healthcare utilization was not statistically different between the two groups.

Associated costs
The mean difference in total costs during the 12-month follow-up was €5866 (p < 0.05), favouring the CS (online supplementary file Table S1). Mean total costs at baseline, 4 and 12 months are shown in the online supplementary file Table S2. When excluding patients who reported absenteeism at baseline, higher absenteeism costs were still found in the UC group at 4 months (mean difference €194, p = 0.03) and 12 months (mean difference €245, p = 0.03).

Cost-utility analysis
Following the combined cluster bootstrap and single imputation procedure, all n = 563 participants were included in the base-case analysis. No significant differences in adjusted (for clustering effect) mean difference were found between the CS and UC groups for RMDQ (−0.21, 95% CI: −1.52 to 1.13) and EQ-5D (−0.02, 95% CI: −0.08 to 0.05) (Table 3). Total costs (direct and indirect) were significantly higher in the UC group (mean difference: €−3867, 95% CI: €−7074 to €−765). Ninety-nine per cent of the imputed bootstrapped ICURs were located in the two southern quadrants of the cost-effectiveness planes (Figure 2(A) and (B)), indicating that the costs of the CS were lower. The ICUR for RMDQ was €25 716, indicating that for each point improvement on the RMDQ, the CS saved €25 716. The difference in QALYs between the CS and UC groups was very small, resulting in a large ICUR of €220 457.
The sensitivity analysis of complete cases showed similar results, where in all estimated bootstrap samples the ICURs were located in the southern hemisphere, indicating that the CS is associated with lower costs (online supplementary file Figure S1a and b). Excluding patients who reported absenteeism at baseline in both the CS and UC groups also showed that in approximately 97% of the bootstrap samples of complete case data and imputed data, the ICURs were located in the southern quadrants (online supplementary file Figure S2a and b).

Cost-effectiveness acceptability curves
At a willingness-to-pay level of ≤ €20 000, the CS had a probability of being cost-effective in comparison with UC of approximately 98% per QALY gained (Figure 3(A)).
For each 1-point reduction on the RMDQ, the CS had a probability of being cost-effective in comparison with UC of approximately 48% at a willingness-to-pay level of ≤ €20 000 (Figure 3(B)).

Secondary outcome: axSpA diagnosis
The number of self-reported axSpA diagnoses during the 12-month follow-up was 12/260 (4.6%) in the CS group and 14/303 (4.6%) in the UC group. Owing to a low response rate from hospitals in giving information on the diagnosis, we could only verify the diagnosis of eight out of the 32 referred patients in the UC group compared with 59 of the 68 referred patients in the CS group. The verified number of axSpA diagnoses was finally three in the UC group and four in the CS group. For details, see online supplementary file Figure S3.

Discussion
Our economic evaluation showed no difference in costs between the CS and UC groups from the healthcare perspective, but a significant difference in costs from the societal perspective, which was in favour of the CS. After 1 year of follow-up, total costs were higher in the UC group despite the similarities in disability and QoL between the CS and UC groups. Costs were mainly driven by lower costs due to PL in the CS group at 4 and 12 months, irrespective of baseline differences in costs. This could possibly be explained by rheumatologists offering lifestyle advice, education, or physiotherapy to CLBP patients in the CS group, to improve their CLBP complaints, which may have resulted in lower PL costs.
To the best of our knowledge, this is the first study to investigate the cost-effectiveness of a referral strategy for axSpA in patients with CLBP. The total costs for patients with CLBP during 1 year of follow-up were higher in our study compared to those in a study by Jellema et al (17). This difference in costs may be explained by the ways in which the data were collected. We used the widely adopted iMCQ questionnaire, while the study by Jellema et al used cost diaries to document the consumption of healthcare resources. The main difference between these two approaches is that the iMCQ covers visits to all healthcare providers as well as other health issues besides CLBP. Moreover, an advantage of the iMCQ is that it reflects real life, as the total costs of patients with LBP, including comorbidities and mental healthcare, are taken into account. Nevertheless, both the iMCQ and cost diaries are self-reported methods to measure healthcare utilization. Although self-reported questionnaires are reliable (30), actual costs are best captured by medical records and disease registries.
PL costs were significantly higher in the UC group. However, it could not be verified whether the reported sick leave occurred because of CLBP complaints, since the iPCQ is a standardized instrument for measuring overall PL. Therefore, the costs of illness for CLBP patients in this study could be overestimated. A study performed in the Netherlands also showed lower PL costs due to LBP (31).
The current economic evaluation showed no difference in outcomes on effectiveness (i.e. disability and health-related QoL) between the CS and UC groups after 1 year of follow-up. This lack of difference could be a consequence of the low prevalence of axSpA compared to the previous CaFaSpA 1 and CaFaSpA 2 studies (3,4,14). The axSpA diagnosis in this study was, however, reached through the work-up of a rheumatologist, which reflects daily clinical practice, and not by a predefined research protocol, as was performed in the published CaFaSpA studies. This could partly explain the low prevalence of axSpA, as not all rheumatologists might have performed the advised diagnostic work-up in all cases. Also, many patients in the CS group, despite receiving positive referral advice, did not visit a rheumatologist. This is unfortunate and could have led to an underestimation of the observed effect of the CS, as we expect a higher QoL and less disability in patients in the CS group who receive an axSpA diagnosis and receive appropriate treatment. In addition, participation in the study may have led to increased awareness among GPs regarding axSpA or LBP complaints in the UC group.
Furthermore, although the current economic evaluation showed no difference in effectiveness, incremental cost-utility planes indicated lower costs in the CS group and therefore added value in terms of value-based healthcare (32), which is the reforming strategy of Dutch healthcare (33). Moreover, the CEACs showed that the CS is cost-effective. The likelihood that the CS is cost-effective exceeds 90% at willingness-to-pay thresholds of ≤ €20 000 per additional QALY. Although additional research is required, we may speculate that the introduction of the fit-for-work platforms may have encouraged rheumatologists in the CS group to provide advice regarding productivity, which resulted in lower PL costs. Fit-for-work programmes have been developed to improve healthcare providers' knowledge and skills to support work-related challenges (34). In this way, more people with a chronic condition can continue to work.
This study has several strengths and limitations that are worth mentioning. The first strength is that we assessed the impact of an innovative referral strategy in terms of health effects and costs, as a crucial step before implementation in daily clinical practice. Unfortunately, these types of analyses are generally lacking in the majority of implemented disease management strategies, while health resources are scarce and can only be spent once. Secondly, we used a cluster randomized trial to assess the cost-effectiveness of the CS versus UC from a societal and healthcare perspective. Thirdly, we used disability in addition to QoL to investigate the cost-utility of the CS, as disability is an important patient-reported outcome among patients with CLBP (35). Although the EQ-5D is less sensitive in evaluating the change in score over time on a patient level, it is useful as a benchmark between disease indications and countries. Fourthly, our study has good generalizability, as our baseline characteristics, including age, gender, LBP duration, and RMDQ scores, are comparable with other studies performed in the Dutch primary care setting (4,36). Finally, we included presenteeism costs and informal care costs to give a more accurate representation of the true costs related to PL (37).
Study limitations should be noted as well. First, as in most cost-effectiveness trials, sample size calculations were not based on demonstrating cost-effectiveness, but rather on demonstrating a clinically relevant difference of 2.5 points on the RMDQ, which was the primary endpoint of the original trial (5). The required sample size for the cost-effectiveness analysis is therefore expected to be higher than in the clinical effectiveness study (38). Secondly, the level of missing data was high. However, in addition to a complete case analysis, we performed bootstrap sampling combined with imputation to evaluate the main outcomes of this study. Thirdly, we used the friction-cost method instead of the human-capital method to value productivity. The human-capital method takes the patient's perspective and counts any hour not worked as an hour lost. By contrast, the friction-cost method takes the employer's perspective, and only counts as lost those hours not worked until another employee takes over the patient's work. Productivity costs have the potential to compensate for the costs of expensive biological agents, but only in early-onset disease when patients still have jobs and if productivity is given full weight using the human-capital method. If productivity is given less weight by excluding productivity costs or by using the friction-cost method, biological agents are probably too expensive. Although the friction-cost and the human-capital methods can produce widely different results, we believe that this would not have led to a different conclusion, since despite the use of the friction-cost method, PL costs were still lower in the CS group.
With respect to generalizability, the results of this study are likely to be representative of the Dutch situation, since our RMDQ scores and baseline characteristics are comparable with other studies performed in the Dutch primary care setting. Although we do not expect great variability in EQ-5D and RMDQ scores among young primary care patients with CLBP in other countries, healthcare systems and the volume and costs of resource use can be expected to differ.

Conclusion
The digital CS referral algorithm did not decrease the overall healthcare status of the patients after 1 year of follow-up, but appears to be more cost-effective than usual GP referral. Therefore, the digital CS can be used as an appropriate primary care referral model for CLBP patients at risk for axSpA. This will accelerate the provision of care at the right time by the right caregiver. For the future, we recommend investigation of the cost-effectiveness of referral strategies as a crucial step before implementation in daily clinical practice. Relevant patient-reported outcome measures should be included when investigating the cost-effectiveness of a referral strategy.

Figure 2. Cost-utility planes for (A) quality of life and (B) disability. Dark grey (blue) dots indicate the estimated incremental cost-utility ratios (ICURs) for each bootstrap sample. The light grey (red) dot indicates the overall mean ICUR over all bootstrap samples. EQ-5D, EuroQol 5 Dimensions; RMDQ, Roland-Morris Disability Questionnaire.

Table 2. Healthcare consumption in the CaFaSpA strategy (CS) and usual care (UC) groups at the 12-month follow-up.