Disease activity-guided tapering of biologics in patients with inflammatory arthritis: a pragmatic, randomized, open-label, equivalence trial

Objective: To evaluate whether disease activity-guided tapering of biologics, compared with continuation as usual care, enables a substantial dose reduction while disease activity remains equivalent.
Method: In this pragmatic, randomized, open-label, equivalence trial, adults with rheumatoid arthritis, psoriatic arthritis, or axial spondyloarthritis in low disease activity on stable-dose biologics for ≥ 12 months were randomized 2:1 into either the tapering group, i.e. disease activity-guided prolongation of the biologic dosing interval until flare or withdrawal, or the control group, i.e. maintenance of baseline biologics with a possible small interval increase at the patient's request. The co-primary outcome in the intention-to-treat population was met if superiority in ≥ 50% biologic reduction at 18 months was demonstrated and disease activity was equivalent (equivalence margins ± 0.5).
Results: Ninety-five patients were randomized to tapering and 47 to control, of whom 37% (35/95) versus 2% (1/47) achieved ≥ 50% biologic reduction at 18 months. The risk difference was statistically significant [35%, 95% confidence interval (CI) 24%–45%], while disease activity remained equivalent [mean difference 0.08, 95% CI −0.12–0.29]. The flare risk was statistically significantly higher in the tapering group [tapering 41% (39/95) vs control 21% (10/47), risk difference 20%, 95% CI 4%–35%]; however, only 1% (1/95) and 6% (3/47), respectively, had persistent flare and needed to switch to another biological drug.
Conclusions: Disease activity-guided tapering of biologics in patients with inflammatory arthritis enabled one-third to achieve ≥ 50% biologic reduction, while disease activity between groups remained equivalent. Flares were more frequent in the tapering group but were managed with rescue therapy.

Biological therapies and disease activity-guided monitoring have improved the management of patients with inflammatory arthritis (IA), i.e. rheumatoid arthritis (RA), psoriatic arthritis (PsA), and axial spondyloarthritis (axSpA), as large proportions of patients reach low disease activity (LDA) (1)(2)(3). But what is the more appropriate maintenance treatment: continuation of standard-dose biologics lifelong or tapering to the lowest possible dose?
In clinical trials, biologic tapering is carried out as either fixed dose reduction, often in one step, e.g. 50% interval prolongation/dose reduction, or disease activity-guided tapering according to an algorithm until flare or withdrawal. Fixed dose reduction is more frequently evaluated; the majority of studies have an RA population (4)(5)(6)(7)(8), fewer studies have an axSpA population (6,7,9), and limited studies have a PsA population (7,10). Disease activity-guided tapering of biologics is generally the more aggressive approach as it allows maximal tapering. Evidence is limited to a few randomized controlled trials (11)(12)(13) and some observational studies (14)(15)(16)(17)(18). Both fixed dose reduction and disease activity-guided tapering seem to allow a considerable biologic dose reduction without persistent disease activity deterioration (5–12, 14–16, 18). However, the STRASS trial did not observe equivalent disease activity at 18 months (13).
International treatment guidelines recommend tapering biologics in patients with IA in sustained remission (19)(20)(21), but not how this should be implemented. The BIOlogical Dose OPTimisation (BIODOPT) trial compares a disease activity-guided tapering algorithm to continuation of biologics as usual care in patients with IA in sustained LDA. The primary aim is to evaluate whether the algorithm enables a ≥ 50% biologic dose reduction at 18 months while disease activity remains equivalent.

Patient and public involvement
Two patient research partners were involved in the initial trial protocol development; one provided additional advice on the intervention burden and approved the trial protocol and participant information.

Study design and participants
This investigator-initiated, pragmatic, multicentre, randomized, open-label, equivalence trial was conducted at four rheumatology outpatient clinics in Denmark. The study complied with the Declaration of Helsinki; approval was obtained from the North Denmark Region Committee on Health Research Ethics (N-20170073), the Danish Medicines Agency (2017091722), and the Danish Data Protection Agency. The trial was registered at EudraCT (2017-001970-41) on 21 December 2017.
The study setting has been described in detail elsewhere (22). Eligible patients were adults diagnosed with RA, PsA (peripheral), or axSpA (including axial PsA) treated with abatacept, adalimumab, certolizumab pegol, etanercept, golimumab, infliximab, or tocilizumab, including biosimilars, on a stable dosage for ≥ 12 months. Other biologic modes of action were not included owing to limited use or availability (approval) when the protocol was drafted. Patients who had been treated with oral, intra-articular, or subcutaneous glucocorticoids within the past 12 months were excluded. A revision to allow for inclusion of patients in LDA and not solely remission was approved on 12 March 2019. LDA at enrolment was defined as: no swollen joints and (i) RA: Disease Activity Score based on 28-joint count-C-reactive protein (DAS28-CRP) ≤ 3.2 (23), (ii) PsA: Disease Activity in PSoriatic Arthritis (DAPSA) ≤ 14 (24), and (iii) axSpA: Ankylosing Spondylitis Disease Activity Score (ASDAS) < 2.1 (25). Written informed consent was obtained before inclusion. The trial protocol (version 7, 16 December 2017) is available as supplementary material. Supplementary Data S1 provides an overview of important protocol modifications.
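The enrolment definition of LDA above can be expressed as a small check. The following is a minimal sketch, not the trial's own software; the function name `in_lda_at_enrolment` and its arguments are hypothetical, and the caller is assumed to pass the score that matches the diagnosis (DAS28-CRP for RA, DAPSA for PsA, ASDAS for axSpA).

```python
def in_lda_at_enrolment(diagnosis: str, score: float, swollen_joints: int) -> bool:
    """Return True if a patient meets the enrolment LDA definition.

    Hypothetical helper: `score` must be the diagnosis-specific measure
    (DAS28-CRP for RA, DAPSA for PsA, ASDAS for axSpA).
    """
    # All diagnoses additionally require no swollen joints.
    if swollen_joints > 0:
        return False
    if diagnosis == "RA":
        return score <= 3.2   # DAS28-CRP <= 3.2
    if diagnosis == "PsA":
        return score <= 14    # DAPSA <= 14
    if diagnosis == "axSpA":
        return score < 2.1    # ASDAS < 2.1 (strict inequality)
    raise ValueError(f"Unknown diagnosis: {diagnosis!r}")
```

Note that the axSpA threshold is a strict inequality, whereas the RA and PsA thresholds are inclusive, as stated in the text.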

Procedures
Patients were randomized 2:1 (tapering:control), stratified by trial site, diagnosis, and biologic failure history (on biologic number ≤ 2 or ≥ 3) by a computer-generated allocation sequence in a dedicated electronic case report form in Research Electronic Data Capture (REDCap) (26,27) (Supplementary Data S2). A disease activity-guided tapering strategy for biologics was applied to the tapering group; the dosing interval was prolonged by approximately 25% every 4 months (Supplementary Data, Figure S1) until flare or discontinuation, except for infliximab, which was spaced by 2 weeks for each infusion, as previously described (22). A pragmatic usual care practice was applied to the control group, i.e. the biologic dosing interval was unchanged, but a small increase was allowed if requested by the patient. Disease activity was monitored every 4 months during the first year and at 18 months by DAS28-CRP (28) (RA and PsA) or ASDAS (29) (axSpA). Furthermore, enthesitis, dactylitis, skin and nail psoriasis, uveitis, and symptoms of inflammatory bowel disease (IBD) were assessed for patients with PsA or axSpA.
Flare was defined as ∆DAS28-CRP > 1.2, or ∆DAS28-CRP > 0.6 AND current DAS28-CRP ≥ 3.2 (30) (RA and PsA), or inflammatory back pain AND ∆ASDAS ≥ 0.9 (31) and/or ≥ 1 swollen joint (axSpA). Patients with symptoms of flare (e.g. increasing arthralgia/inflammatory back pain or swollen joints) who did not fulfil the flare criteria were noted. Patients with flare due to tapering went back to the last effective biologic dosing interval, or if necessary, to the standard dose, and could receive glucocorticoids and/or non-steroidal anti-inflammatory drugs. Patients with persistent flare despite stepping back to the standard-dose biologic were switched to another biological therapy. Patients with symptoms of flare whom the physician judged to be in LDA were advised to continue tapering but could remain on the current dosing interval or even go back one algorithm step.
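The flare criteria above can be sketched as a single decision function. This is an illustrative sketch only: the function name `is_flare` and its parameters are hypothetical, and the axSpA rule is read here as "(inflammatory back pain AND ∆ASDAS ≥ 0.9) OR ≥ 1 swollen joint", which is one interpretation of the "and/or" in the text.

```python
from typing import Optional


def is_flare(
    diagnosis: str,
    delta_score: float,
    current_score: Optional[float] = None,
    inflammatory_back_pain: bool = False,
    swollen_joints: int = 0,
) -> bool:
    """Apply the flare criteria described in the text (hypothetical helper).

    `delta_score` is the change in DAS28-CRP (RA, PsA) or ASDAS (axSpA)
    relative to the reference visit; `current_score` is the current
    DAS28-CRP and is only used for RA and PsA.
    """
    if diagnosis in ("RA", "PsA"):
        # Criterion 1: large increase in DAS28-CRP on its own.
        if delta_score > 1.2:
            return True
        # Criterion 2: moderate increase AND current DAS28-CRP >= 3.2.
        return (
            delta_score > 0.6
            and current_score is not None
            and current_score >= 3.2
        )
    if diagnosis == "axSpA":
        # Assumed reading of "and/or": either branch is sufficient.
        clinical = inflammatory_back_pain and delta_score >= 0.9
        return clinical or swollen_joints >= 1
    raise ValueError(f"Unknown diagnosis: {diagnosis!r}")
```

Patients with symptoms who return `False` here correspond to the "symptoms of flare not fulfilling the flare criteria" category, which the trial recorded separately.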

Outcomes
The co-primary outcome was met if superiority in the proportion of patients achieving ≥ 50% biologic reduction at 18 months was demonstrated and mean disease activity was equivalent (hierarchical order). Thus, the co-primary endpoint reflects the importance of not only achieving a considerable biologic dose reduction with tapering but also maintaining equivalent disease activity. Key secondary outcomes were rates of remission and LDA at 18 months. Other secondary outcomes included changes from baseline to month 18: Health Assessment Questionnaire Disability Index (HAQ-DI), Pain visual analogue scale (VAS), Fatigue VAS, Patient Global Health VAS, 36-item Short Form Health Survey (SF-36) version 1.0 physical component summary (PCS) and mental component summary (MCS), Physician Global Health VAS, tender joint count, swollen joint count, and C-reactive protein (CRP). Harm and safety outcomes were serious adverse events (SAEs), serious infections, non-serious infections, cardiovascular events, malignancy, death, uveitis, skin or nail psoriasis flare, IBD flare, and biologic discontinuation due to any adverse event (AE). Furthermore, arthritis flare as well as the need for rescue therapy were summarized.

Statistical analyses
The sample size calculation was performed in accordance with the DELTA2 guideline (32) and was based on the primary efficacy endpoint 'disease activity', with an equivalence margin of ± 0.5 disease activity points and a requirement of at least 85% power (Supplementary Data S3).
All analyses were performed and reported in accordance with the prespecified statistical analysis plan (available as supplementary material) and the Consolidated Standards of Reporting Trials (CONSORT) statements (33,34). The primary analyses were based on intention to treat (ITT), i.e. all randomized patients, independent of protocol deviations (35). Repeated-measures linear mixed-effects models were applied to evaluate contrasts between groups for continuous outcomes. The fixed factors were group, diagnosis, biologics failure history, centre, and time-point, as well as the interaction between group and time. To reduce random variation, the baseline value of the relevant variable was included as a covariate. Missing data were handled indirectly via the mixed-effects model framework (36). Categorical outcomes were analysed using generalized linear models for binomial data; missing data were handled as trial failures. Sensitivity analyses on the primary and secondary outcomes, which included a per-protocol analysis, were performed to explore the robustness of our findings. The per-protocol population consisted of patients who adhered to the terms of eligibility, interventions, and scheduled trial visits. Subgroup analyses on disease-specific outcomes according to IA diagnosis were also performed. To explore potentially relevant effect modifiers, stratified analyses on the co-primary endpoints were performed (37).
Flares were visualized by Kaplan-Meier cumulative incidence curves. Flare rates were assessed with a univariate Cox proportional hazards model; thereafter, a multivariable Cox model evaluated whether diagnosis and biologics failure history were potential explanatory variables. Safety and harm outcomes were summarized as numbers and percentages with an estimate of the between-group difference. All analyses were performed in SAS (version 9.4) or Stata (version 16).
Disease activity was equivalent between the two groups at 18 months (mean difference 0.08, 95% CI −0.12 to 0.29), as the 95% CI was within the prespecified equivalence margins of ± 0.5 (Table 2). Thus, the second part of the co-primary endpoint was also met. Figure 2 and Supplementary Data, Figure S2 illustrate disease activity by diagnosis during the study period. At 18 months, no statistically significant differences in disease activity per diagnosis were observed between the tapering group and the control group (RA −0.04, 95% CI −0.49 to 0.41; PsA −0.09, 95% CI −0.56 to 0.38; axSpA 0.24, 95% CI −0.19 to 0.67). However, the trial was not powered for analyses per diagnosis; therefore, the results must be interpreted with caution.
As presented in Table 2, no differences in rates of remission or LDA were observed. Noteworthy secondary findings were that the tapering group had a slightly higher increase in Pain VAS between baseline and 18 months compared to the control group (mean difference 4.7 mm, 95% CI 0.01 to 9.3). Moreover, CRP increased slightly more in the tapering group between baseline and 18 months compared to the control group (mean difference 2.4 mg/L, 95% CI 0.2 to 4.7). In contrast, a small increase in tender joint count was observed in the control group compared to the tapering group (mean difference −0.6, 95% CI −1.2 to 0.00).
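The equivalence decision described above reduces to checking that the whole 95% CI lies inside the prespecified margins, evaluated only after superiority on dose reduction has been shown. The sketch below illustrates that logic; the function names are hypothetical, and the use of strict inequalities at the margin is an assumption rather than something the text specifies.

```python
def within_equivalence_margin(ci_low: float, ci_high: float, margin: float = 0.5) -> bool:
    """True if the whole confidence interval lies inside (-margin, +margin).

    Strict inequalities are an assumption; the text only says the CI
    was 'within' the margins of +/- 0.5.
    """
    return -margin < ci_low and ci_high < margin


def co_primary_met(superiority_shown: bool, ci_low: float, ci_high: float) -> bool:
    """Hierarchical co-primary outcome: superiority first, then equivalence."""
    if not superiority_shown:
        # Equivalence is not evaluated if the first step fails.
        return False
    return within_equivalence_margin(ci_low, ci_high)
```

Applied to the overall result (−0.12 to 0.29), the check passes. Applied to the axSpA subgroup interval (−0.19 to 0.67), it would not, which illustrates why the per-diagnosis results, for which the trial was not powered, are reported only as absence of a statistically significant difference.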
Sensitivity analyses on the primary and secondary outcomes showed similar results (Supplementary Data, Tables S4 and S5), with the exception of Pain VAS in the per-protocol population, as the difference was no longer statistically significant (mean difference 3.8, 95% CI −0.7 to 8.4).
The only potential effect modifier for the co-primary endpoints was sex, as females in the tapering group had lower disease activity at 18 months compared to the control group, whereas the opposite was seen for males (between subgroup difference −0.49, 95% CI −0.97 to 0.001) (Supplementary Data, Tables S6 and S7). Baseline remission or LDA status was not an effect modifier. However, conclusions must be drawn with caution as the trial was not powered for the subgroup analyses.
Statistically significantly more flares were observed in the tapering group [fulfilment of the flare criteria: 41% (39/95) vs 21% (10/47), risk difference 20%, 95% CI 4% to 35%; and symptoms of flare when not fulfilling the flare criteria: 40% (38/95) vs 13% (6/47), risk difference 27%, 95% CI 14% to 41%] (Table 3). Flares were treated with rescue therapy; only one patient (1%) in the tapering group and three patients (6%) in the control group had persistent flare. These patients were switched to another biological agent, which managed the flare for the patient in the tapering group, whereas one patient in the control group had persistent flare despite the switch and was switched to a third biological drug. The other two patients switched therapy shortly before or at the 18 month visit; thus, data on flare reversibility were not available. Figure 3 illustrates the cumulative incidence of flare. Hazard ratios (HRs) were high for all flare definitions (fulfilling flare criteria: crude HR 2.19, 95% CI 1.09 to 4.40; symptoms of flare: crude HR 3.92, 95% CI 1.66 to
AEs and SAEs were similar between the two groups (Table 3). Non-serious infections were the most frequent AE and were observed in 55% (52/95) versus 51% (24/47). SAEs in the tapering group were ocular thrombosis (n = 1), breast cancer (n = 1), hospitalization due to stomach pain (n = 1), hospitalization due to chest pain (n = 1), and hospitalization due to surgery (n = 1). SAEs in the control group were pulmonary infection (n = 1), erysipelas (n = 1), a diagnosis of heart failure (n = 1), paraesthesia in the face (n = 1), and hospitalization due to symptoms of stroke (n = 1). No deaths occurred.
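The flare risk differences reported above can be approximated from the raw counts. The sketch below uses an unadjusted Wald interval for a difference in proportions; this is an assumption made for illustration, since the trial analysed categorical outcomes with generalized linear models for binomial data, and the function name `risk_difference_wald` is hypothetical.

```python
import math
from typing import Tuple


def risk_difference_wald(
    events_a: int, n_a: int, events_b: int, n_b: int, z: float = 1.96
) -> Tuple[float, float, float]:
    """Risk difference (group A minus group B) with an unadjusted Wald CI.

    Illustrative only: not the trial's model-based estimate.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    rd = p_a - p_b
    # Standard error of a difference between two independent proportions.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return rd, rd - z * se, rd + z * se


# Flares fulfilling the flare criteria: tapering 39/95 vs control 10/47.
rd, low, high = risk_difference_wald(39, 95, 10, 47)
print(f"{rd:.0%} ({low:.0%} to {high:.0%})")
```

For these counts the unadjusted interval rounds to the same values as reported in the text (20%, 4% to 35%), but agreement with the model-based estimates should not be assumed for other outcomes.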

Discussion
The BIODOPT trial demonstrated that a disease activity-guided tapering algorithm for biologics is an effective tool to achieve a substantial dose reduction in patients with IA while maintaining LDA. Subgroup analyses identified sex as a potential effect modifier, as females in the tapering group had lower disease activity at 18 months compared to the control group, whereas the opposite was seen for males. Just as importantly, baseline remission or LDA was not an effect modifier, thus providing evidence for the debate on when biologic tapering should be initiated (38,39). However, as the subgroup analyses are insufficiently powered, conclusions can only be made with caution.
Strengths of the BIODOPT trial are that it was investigator initiated with no pharmaceutical industry involvement, and that it had a randomized design, comparable monitoring of both groups, and few patients lost to follow-up. Exclusion criteria were kept to a minimum to allow inclusion of a broad spectrum of patients, e.g. with previous biologic failure, various lengths of LDA, and various comorbidities. Thus, the study population is judged to resemble the real-life outpatient population. Another strength is the tapering strategy, which can easily be implemented in routine care.
Recent meta-analyses did not find any statistically significant impact of biologic or Janus kinase inhibitor (JAKi) tapering versus continuation in patients with RA or axSpA when evaluating serious infection, SAEs, malignancy, cardiovascular disease, or death (6,40). The BIODOPT safety results support these findings. Moreover, no increased risk of persistent flare or of flare in nail psoriasis, skin psoriasis, uveitis, or symptoms of IBD was observed, which is an aspect of great value for physicians and patients.
An equivalence approach for the co-primary endpoint 'disease activity' was chosen to allow for two-sided testing of the research hypothesis. Potentially, tapering could improve patient satisfaction and quality of life as the disease is controlled with fewer drug doses (41), which could result in lower Patient Global Health VAS scores and thereby lower disease activity. However, as the sample size was not met, a non-inferiority approach, which requires fewer participants, would have been more appropriate. Attempts to reach the target sample size were made by including additional sites and enrolling patients in LDA. The latter strategy was based on evidence from Tweehuysen et al, who found that baseline disease activity was not an effect modifier for successful tapering of biologics in patients with RA (38). Our study supports this conclusion, as subgroup analyses did not find any statistically significant differences between patients in baseline remission and LDA when evaluating the primary efficacy outcomes (Supplementary Data, Tables S6 and S7). However, the subgroup analyses are insufficiently powered; therefore, conclusions are made with caution.
BIODOPT is not a strict equivalence study, as the first part of the co-primary endpoint aims for superiority and the second for equivalence. As prespecified in the statistical analysis plan, the primary analyses were performed as ITT, and a per-protocol sensitivity analysis was conducted to assess the implication of protocol violations (Supplementary Data, Table S5). Similar results were observed with ITT and per-protocol analyses, thus strengthening the robustness of our findings.
Another possible limitation to discuss is the open-label design, which potentially could lead to bias. However, as tapering is expected to increase the risk of flare, potential bias due to an unblinded trial design would be likely to result in overestimation of flare in the tapering group, thereby underestimating the proportion of patients able to taper their biological therapy.
The BIODOPT study population consisted of patients with different IA diagnoses; this was chosen to evaluate the tapering algorithm in a patient population similar to the one most rheumatologists see in daily clinical practice. However, pooling data across IA diagnoses could introduce noise and complicate the interpretation of results, with the risk of overlooking important findings in one subgroup. Nonetheless, as presented in Supplementary Tables S1, S3, S6, and S7, no substantial differences according to diagnosis were identified, but as the subgroup analyses are not adequately powered, caution must be applied.
PsA-/axSpA-specific biologics (e.g. secukinumab, ixekizumab, or JAKi) were not included owing to their limited use or availability (approval) when the protocol was drafted. Only a few patients were treated with a non-TNFi, and these patients more frequently were female, were older, had a higher HAQ-DI (possibly due to older age), and experienced repeated biologic failure. Thus, this limits the external validity of BIODOPT to patients with IA treated with a TNFi.
Another limitation is that patients with PsA were monitored by DAS28-CRP and not a PsA-specific disease activity measure (e.g. DAPSA or MDA), as no flare criteria had been defined for PsA when the protocol was drafted. However, the authors acknowledge that a PsA-specific disease activity measure would have been preferable. The co-primary endpoint 'disease activity' was assessed by DAS28-CRP in patients with RA or PsA and by ASDAS in patients with axSpA. The difference in disease activity measurement was handled by the linear mixed-effects model for repeated-measures framework as diagnosis was included as a fixed factor, which enables the comparison within each condition, i.e. as if it were three independent disease-specific trials.
At 18 months, a small but statistically significant difference in Pain VAS was observed between the trial groups; however, the difference was not judged to be of clinical relevance as the 95% CI of 0.01 to 9.3 was within the limits of the minimal clinically important difference of ± 10 (42). Similarly, the observed differences in CRP and tender joint count, including their 95% CIs, were very small and were judged not to have clinical relevance.
‡During the study period, three patients had an infection and an episode of uveitis, one patient had an infection and a psoriasis skin flare, one patient had an infection and a psoriasis nail flare, one patient had an infection and flare in symptoms of IBD, and one patient discontinued the bDMARD due to any AE and experienced flare in symptoms of IBD.
§During the study period, four patients discontinued their bDMARD due to any AE and had an infection, and two patients had an infection and an episode of uveitis.
||During the study period, one patient had both a psoriasis nail flare and an episode of uveitis.
No radiographs were performed in BIODOPT. The DRESS-PS trial reported similar radiographic progression in patients with PsA when comparing tapering to control (11). Moreover, a systematic review in patients with axSpA did not find a significant risk for radiographic progression when comparing TNFi treatment to no TNFi or to TNFi tapering (43). The DOBIS trial showed limited add-on value of magnetic resonance imaging (MRI) during TNFi tapering in patients with axSpA, as the clinical flare criteria identified 106 out of 107 flaring patients (15). Furthermore, only minimal changes in radiographic outcomes not considered clinically relevant were found. Similarly, the ADOPT trial showed that 102 out of 104 patients with RA who flared while tapering biologics were identified by clinical flare criteria; one patient had solely MRI flare and another only progressed radiographically (14). However, a Cochrane review based on low-certainty evidence concluded that disease activity-guided tapering in RA may slightly increase the rate of minimal radiographic progression (relative risk 1.45, 95% CI 0.77 to 2.73) (5). Thus, radiographs may be valuable for identifying patients with RA who progress despite being judged not to have clinical flare.
In BIODOPT, more patients in the tapering group experienced a flare but, reassuringly, no increased risk of persistent flare was observed. Nonetheless, a flare can considerably impact the individual patient's quality of life, including social activities and work life, if it is not easily managed. Therefore, shared decision making between the physicians and the patients should take all aspects of tapering or continuation into consideration, as the potential benefits (minimizing drug dose/potential side effects) for some patients may be outweighed by the risk of flare.

Conclusion
One-third of patients with IA achieved ≥ 50% biologic reduction at 18 months with disease activity-guided tapering, whereas such a reduction was rarely achieved with usual care. Furthermore, disease activity was equivalent between the groups. Flares were more frequent in the tapering group but were managed with rescue therapy; no increased risk of persistent flare was observed. Shared decision making between the physician and the patient, taking all aspects of tapering into consideration, is encouraged before initiating tapering.

Figure 3. Flare by trial arm visualized in Kaplan-Meier cumulative incidence curves: (A) fulfilling flare criteria; (B) symptoms of flare; and (C) total flares. Patients censored owing to loss to follow-up are presented in parentheses after the number of patients with flare.

Table 1. Baseline demographics and disease characteristics analysed 'as observed', based on the intention-to-treat population.

Table 2. Comparison between groups at the 18 month follow-up, based on the intention-to-treat population.

Table 3. Safety and harm summary analysed 'as observed', based on the intention-to-treat population.
CI, confidence interval; bDMARD, biological disease-modifying anti-rheumatic drug; NSAID, non-steroidal anti-inflammatory drug; IBD, inflammatory bowel disease; AE, adverse event.
*Analysed as tapering group − control group.
†Intra-articular glucocorticoid treatment or, if judged necessary by the physician, intramuscular or oral administration.