Cost-effectiveness of a telehealth intervention in rheumatoid arthritis: economic evaluation of the Telehealth in RA (TeRA) randomized controlled trial

Objective Telehealth is rapidly gaining ground as an alternative to usual care, not least because of coronavirus disease 2019 (COVID-19) measures. Within rheumatology, telehealth has been used for, inter alia, follow-up of patients with rheumatoid arthritis (RA) with low disease activity or in remission. This study aims to assess the cost-effectiveness of such a telehealth intervention. Method In a randomized controlled trial, 294 patients were randomized into patient-reported outcome-based telehealth follow-up by either a nurse (PRO-TN) or a rheumatologist (PRO-TR) or to conventional outpatient follow-up (control). Cost-effectiveness was evaluated using costs per quality-adjusted life-year (QALY) gained. Individual-level healthcare and productivity costs were retrieved from national Danish registers. Incremental cost-effectiveness ratios were calculated for the intervention groups compared to the control group. Bootstrapping with 10 000 replications was used to obtain confidence intervals. Furthermore, cost-effectiveness acceptability curves were generated. Results The cost comparison showed that PRO-TR was significantly less costly than the control group, whereas the relative reduction in costs for PRO-TN was not significant. The telehealth groups experienced minor, non-significant declines in QALYs, whereas the control group experienced a slight, non-significant increase. The cost-effectiveness analysis showed that for PRO-TR, the cost saved per QALY lost was 89 328 EUR. A similar but smaller and non-significant result was seen for PRO-TN. Conclusion PRO-TR and PRO-TN seem to cost less but provide broadly similar health outcomes compared with conventional follow-up. Between the intervention groups, PRO-TR was significantly less costly. More studies are needed to conclude whether rheumatologist- or nurse-led telehealth is more cost-effective than conventional follow-up.

Treatment of rheumatoid arthritis (RA) requires monitoring throughout the course of the disease to prevent permanent joint destruction and consequently reduced functional ability and quality of life (1,2). The anticipated increase in RA prevalence (3) will increase the pressure on healthcare systems. Telehealth follow-up is a promising alternative to conventional follow-up, which has become even more relevant with the coronavirus disease 2019 (COVID-19) pandemic.
The Telehealth RA (TeRA) study was a pragmatic noninferiority randomized controlled trial (RCT) which tested the effectiveness of telehealth-based disease monitoring among RA patients with low disease activity or in remission at two rheumatology outpatient clinics in Denmark (4). Patients were included for a period of 1 year starting in May 2014. They were randomized into three groups: (i) conventional outpatient follow-up by a physician (control); (ii) patient-reported outcome-based telehealth follow-up by a rheumatologist (PRO-TR); and (iii) patient-reported outcome-based telehealth follow-up by a rheumatology nurse (PRO-TN) (4). The study confirmed non-inferiority in disease activity, as measured by the Disease Activity Score based on 28-joint count-C-reactive protein (DAS28-CRP) (5, 6) after 1 year (4), and patients' experiences with telehealth follow-up were mainly positive (7).
There seems to be a gap in the literature regarding the cost-effectiveness of telehealth interventions, as only a few high-quality RCTs have demonstrated the effectiveness of telehealth interventions (8,9). We aim to fill this gap by applying the cost-effectiveness framework (10) to the data from the TeRA study. Thus, the objective of the current study was to evaluate the cost-effectiveness of a patient-reported outcome (PRO)-based telehealth follow-up intervention for patients with RA with low disease activity or in remission.

Method
Interventions
RA patients were randomized 1:1:1 using a computer-generated random number sequence (4) if they fulfilled the American College of Rheumatology (ACR)/European League Against Rheumatism (EULAR) 1987/2010 criteria for RA (11,12), were aged above 18 years, were able to speak and understand Danish, and had a disease duration of at least 2 years. Every fourth month, telehealth patients filled out a PRO questionnaire through the generic configurable tele-PRO system, AmbuFlex (13). The questionnaire included the Danish version of the Flare-RA instrument as a screening tool to assess the need for a physical consultation (6,14,15). Patients in the control group had no access to the tele-PRO system but continued conventional treatment with pre-scheduled consultations by a physician (i.e. a rheumatologist or rheumatologist in training) every fourth month (4). All patients had access to acute outpatient clinic visits whenever needed, and had their blood tests taken at the hospital laboratory every 8-10 weeks to check for disease-modifying antirheumatic drug (DMARD) side effects and inflammation (2). All patients were assessed at baseline and at the end of follow-up, as well as at every acute visit at the clinic.

Outcomes
The primary outcome of the present study was utility, as measured by health-related quality of life (HRQoL), indicated by the 5-level EuroQol 5 Dimensions (EQ-5D-5L) questionnaire (16), at baseline and at end of follow-up, using preference-based Danish weights, developed using the time-trade-off method (17,18). The transformation of weights from EQ-5D-3L to EQ-5D-5L was carried out by the authors as linear interpolation. We equalized HRQoL with quality-adjusted life-years (QALYs) for a 1 year period, based on the notion that the time horizon of the study was only 1 year and therefore was not expected to affect life expectancy. QALYs are applied in health economic evaluations and are comparable across diseases. The index ranges between −0.624 (worst quality of life) and 1 (best quality of life).
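The 1 year QALY convention described above can be illustrated with a minimal sketch. The function names `qaly_level` and `qaly_change` are hypothetical and are not taken from the study's analysis code; the sketch only assumes that a utility held for 1 year equals the same number of QALYs, and that the index lies within the range stated above.

```python
# Minimal sketch (hypothetical helper names, not the authors' code).
UTILITY_MIN = -0.624  # worst quality of life on the Danish index, as stated in the text
UTILITY_MAX = 1.0     # best quality of life


def qaly_level(utility: float, years: float = 1.0) -> float:
    """QALYs accrued if `utility` were held constant for `years` years."""
    if not UTILITY_MIN <= utility <= UTILITY_MAX:
        raise ValueError(f"utility {utility} is outside the index range")
    return utility * years


def qaly_change(baseline_utility: float, end_utility: float) -> float:
    """Change in the 1 year QALY level from baseline to end of follow-up."""
    return qaly_level(end_utility) - qaly_level(baseline_utility)


if __name__ == "__main__":
    # Illustrative values only; these are not trial results.
    print(qaly_change(baseline_utility=0.80, end_utility=0.78))
```

With a 1 year horizon the QALY level is numerically identical to the utility index, which is why the two terms are used interchangeably in the remainder of the paper.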
A substantial number of participants did not answer the questionnaire. Hence, as a sensitivity analysis, EQ-5D levels were imputed for those patients lacking information in only one of the two questionnaires (59 patients). Baseline EQ-5D was imputed based on age, gender, Charlson Comorbidity Index (CCI) (19), and baseline DAS28. End-of-follow-up EQ-5D was imputed based on baseline EQ-5D, age, gender, CCI, end-of-follow-up DAS28, and intervention group. The last of these was included to account for changes in EQ-5D due to the intervention.

Resource and cost measurements
Costs were assessed from a societal perspective, although only healthcare costs were included in the cost-effectiveness model, because social care and productivity costs were found not to differ between the groups (see also Discussion). All costs were measured at the individual level, and obtained from national Danish registers using the unique personal identification number assigned to all Danish residents (20). The National Patient Register (NPR) (21) was used to calculate the CCI covering the period 1-4 years prior to enrolment.
Healthcare costs comprised all somatic inpatient and outpatient hospital activity, primary healthcare costs, and costs of prescription pharmaceuticals. Data on hospital costs were retrieved from the NPR and valued using national diagnosis-related group (DRG) tariffs (22). Fixed costs and one-time costs related to the intervention (e.g. randomization costs) were not included, because the evaluation applied a marginal cost perspective (the cost of one additional patient).
Primary healthcare costs comprised services at general practitioners, practising specialists, physiotherapists, etc. We retrieved data on costs in primary healthcare from the Danish National Health Service Register for Primary Care (NHSR) (23). Only activities fully or partly financed under the public healthcare sector were included. Hence, visits to physiotherapists, for example, that were paid entirely out of pocket were disregarded.
Costs of prescription pharmaceuticals outside hospitals are registered in the National Prescription Register (24) and comprise the total costs of patients' transactions at pharmacies, regardless of payment scheme (subsidy, health insurance, or out of pocket). Only prescription medicine was included. As biological medicines are registered directly in the NPR, these were included in the costs of outpatient visits.
For the descriptive statistics, healthcare costs were measured during the intervention year and the year before the intervention. All costs were measured in 2019 Euros (EUR).

Data
All register data were obtained from the research server at Statistics Denmark (further details are provided in Table A in the supplementary material). Data from the randomized trial were uploaded to the server, ensuring compliance with Danish data protection legislation.

Analysis
In the economic evaluation, the change in costs and change in effect were compared pairwise between each intervention group and the control group, and finally between the intervention groups. We conducted the comparisons as incremental cost-effectiveness ratios (ICERs) (10,25):
$$\mathrm{ICER} = \frac{\Delta C_i - \Delta C_c}{\Delta E_i - \Delta E_c}$$
where $\Delta E_i$ and $\Delta E_c$ denote the average change in the effect measure (QALYs) from baseline to the end of follow-up for an intervention group $i$ and the control group $c$, respectively. $\Delta C_i$ and $\Delta C_c$ denote the average change in costs from the year before the intervention to the intervention year for an intervention group and the control group, respectively. Hence, the ICER expresses the cost of gaining one additional QALY (10,25). In a sensitivity analysis, the average change in the effect measure was calculated as $\Delta E / 2$ to test the implicit assumption that the effect of the intervention took place immediately at baseline and continued throughout the whole intervention year.
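The ICER definition above can be sketched in a few lines of code. This is an illustration under stated assumptions, not the study's implementation: the function name `icer` and its argument names are hypothetical, and each argument is assumed to be a list of per-patient changes (costs in EUR, effects in QALYs).

```python
from statistics import mean


def icer(cost_change_int, cost_change_ctrl, effect_change_int, effect_change_ctrl):
    """Incremental cost-effectiveness ratio from per-patient changes.

    Numerator: difference between groups in mean cost change.
    Denominator: difference between groups in mean QALY change.
    """
    d_cost = mean(cost_change_int) - mean(cost_change_ctrl)
    d_effect = mean(effect_change_int) - mean(effect_change_ctrl)
    if d_effect == 0:
        raise ZeroDivisionError("no difference in effect; the ICER is undefined")
    return d_cost / d_effect


if __name__ == "__main__":
    # Toy numbers for illustration only; these are not trial data.
    value = icer(
        cost_change_int=[-100, -300],
        cost_change_ctrl=[100, 300],
        effect_change_int=[-0.02, 0.0],
        effect_change_ctrl=[0.01, 0.01],
    )
    print(round(value))
```

When both the numerator and the denominator are negative, as in the toy numbers, the ratio is positive and is read as the cost saved per QALY lost rather than the cost per QALY gained.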
Resampling through bootstrapping with 10 000 replications was used to obtain the 95% confidence intervals. We displayed the results of the bootstrapping graphically in cost-effectiveness planes, illustrating the probabilities of the interventions being dominant (better and cheaper), dominated (worse and more expensive), better and more expensive, or worse and cheaper. To assess cost-effectiveness, the magnitude of the ICER was compared with the societal willingness to pay for one additional unit of 'health' (10,25). There is no official threshold for the societal willingness to pay or accept in Denmark; however, historically, a threshold around 50 000 EUR has been applied (25).
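The resampling step and the four quadrants of the cost-effectiveness plane can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, patients are assumed to be resampled with replacement within each group, and a simple percentile interval is assumed because the text does not state which bootstrap interval was used.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def bootstrap_increments(intervention, control, n_rep=10_000, seed=0):
    """Resample patients with replacement within each group.

    Each patient is a (cost_change, qaly_change) tuple. Returns one
    (incremental cost, incremental effect) pair per replicate.
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_rep):
        boot_i = rng.choices(intervention, k=len(intervention))
        boot_c = rng.choices(control, k=len(control))
        d_cost = _mean([c for c, e in boot_i]) - _mean([c for c, e in boot_c])
        d_effect = _mean([e for c, e in boot_i]) - _mean([e for c, e in boot_c])
        replicates.append((d_cost, d_effect))
    return replicates


def percentile_interval(values, level=0.95):
    """Percentile bootstrap interval (an assumed choice of interval type)."""
    ordered = sorted(values)
    lower = ordered[int((1 - level) / 2 * (len(ordered) - 1))]
    upper = ordered[int((1 + level) / 2 * (len(ordered) - 1))]
    return lower, upper


def quadrant_shares(replicates):
    """Share of replicates in each quadrant of the cost-effectiveness plane.

    Ties on an axis are assigned to the favourable side, an arbitrary choice.
    """
    counts = {"dominant": 0, "dominated": 0, "better_costlier": 0, "worse_cheaper": 0}
    for d_cost, d_effect in replicates:
        if d_effect >= 0 and d_cost <= 0:
            counts["dominant"] += 1
        elif d_effect < 0 and d_cost > 0:
            counts["dominated"] += 1
        elif d_effect >= 0:
            counts["better_costlier"] += 1
        else:
            counts["worse_cheaper"] += 1
    return {name: n / len(replicates) for name, n in counts.items()}
```

Plotting the replicate pairs with the incremental effect on the x-axis and the incremental cost on the y-axis gives the scatterplots of the cost-effectiveness plane; `quadrant_shares` summarizes the same information as probabilities.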
Finally, we generated cost-effectiveness acceptability curves based on the bootstrapping results. Probabilities of cost-effectiveness were calculated for each 1000 EUR between 0 and 150 000 EUR for each group comparison in the main results.
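A cost-effectiveness acceptability curve on the grid described above can be sketched from the bootstrap replicates. The sketch assumes the common net monetary benefit criterion, under which a replicate counts as cost-effective at threshold λ if λ × (incremental effect) − (incremental cost) is positive; the text does not state the exact criterion used, and the function name `acceptability_curve` is hypothetical.

```python
def acceptability_curve(replicates, max_threshold=150_000, step=1_000):
    """Probability of cost-effectiveness at each threshold (EUR per QALY).

    `replicates` is a list of (incremental cost, incremental effect) pairs,
    for example one pair per bootstrap replication.
    """
    curve = []
    for threshold in range(0, max_threshold + 1, step):
        n_effective = sum(
            1
            for d_cost, d_effect in replicates
            if threshold * d_effect - d_cost > 0  # positive net monetary benefit
        )
        curve.append((threshold, n_effective / len(replicates)))
    return curve


if __name__ == "__main__":
    # Three made-up replicates for illustration only.
    toy = [(-2000.0, -0.02), (-1000.0, 0.01), (500.0, -0.01)]
    for threshold, probability in acceptability_curve(toy)[::50]:
        print(threshold, round(probability, 2))
```

A replicate that is cheaper but less effective counts as cost-effective only while the threshold stays below its saving per QALY lost, so curves dominated by such replicates fall as the threshold rises.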

Results
In total, 294 patients were randomized into the three groups: PRO-TR (n = 99), PRO-TN (n = 88), and control (n = 94). No statistically significant differences in baseline characteristics were found between the three groups (4) (see supplementary Table B). Nineteen patients left the study during the intervention; these patients were not significantly different in terms of age or gender, but had higher disease activity (4). Furthermore, 72 patients did not complete the EQ-5D questionnaire at both baseline and end of follow-up (Figure 1). These patients were otherwise similar to the 203 patients who remained in the study population, except that they were more likely to live alone (34% vs 22%, p = 0.035) (see supplementary Table B).
[Figure 1 caption, beginning lost in extraction] ... and a further 72 patients did not fill out the EuroQol 5 Dimensions (EQ-5D) survey at both the baseline and the end of follow-up. In the imputed sample, 262 patients were included. This total consisted of the 275 patients who did not leave the study, omitting 13 patients for whom there was insufficient information to impute EQ-5D. PRO-TR, patient-reported outcome-based telehealth follow-up by a rheumatologist; PRO-TN, patient-reported outcome-based telehealth follow-up by a rheumatology nurse.
The study population mainly consisted of women (69%) and the average age was 60 years (Table 1). Around 20-25% were living alone and 50-60% had retired. The proportion of retirees was lower in the control group, although not significantly so. As a result, the control group had a higher annual income, although this difference was also not statistically significant. No between-group differences were found in comorbidity or length of RA disease history. The PRO-TN intervention group had a significantly lower proportion of patients who were rheumatoid factor positive (40.4% vs 58.9% in the control group, p = 0.049). However, the proportion of anti-cyclic citrullinated peptide antibody (ACPA)-positive patients was similar in all three groups.

Effect measures
Table 1(B) and (C) show the QALY levels and DAS28 scores at baseline and end of follow-up. At the end of follow-up, the QALY level was almost identical for all three groups. However, because the QALY level was higher in the intervention groups at baseline, the intervention groups experienced a small reduction in QALY compared with the control group. None of these differences was statistically significant. Furthermore, no clinically relevant difference in DAS28 was found between the groups.
For the DAS28 outcome, the PRO-TR intervention group was similar to the control group, whereas the PRO-TN intervention group experienced a small improvement of 0.2, which was not statistically significant. The threshold for a clinically relevant difference is 0.6 or above (26).

Cost measures
We found no significant differences in the mean number of visits or contacts, although the number of inpatient admissions tended to increase more in the control group than in the PRO-TR group (p = 0.066) (Table 2A). The difference in total costs during the intervention year compared with the year before was 699, -1601, and 348 EUR for the control group, PRO-TR group, and PRO-TN group, respectively (Table 2B). The difference between the control group and the PRO-TR group was statistically significant (p = 0.047), whereas the difference between the control group and PRO-TN group was not significant (p = 0.700). The three groups had very similar developments in terms of primary care, pharmaceutical, and social care costs. The difference in overall costs mainly relates to inpatient costs. Hence, as a sensitivity analysis, we compared the costs when excluding inpatient activity. Table 2(C) shows that without inpatient activity, the PRO-TN group experienced only slightly lower relative costs. The difference for the PRO-TR group was still larger, but it became non-significant.
Finally, the cost trajectories between the two intervention groups were compared. The PRO-TR group had decreasing costs (see last column of Table 3A); this finding was mainly explained by higher costs in the PRO-TR group in the year leading up to the intervention, which in turn was mainly explained by larger inpatient costs in that year. When inpatient activity was left out, the PRO-TR group still experienced lower costs compared with the PRO-TN group; however, the difference became non-significant (last column of Table 3B).
Table 2 notes. (A) Mean number of visits or contacts in the control group, the two intervention groups separately, and the intervention groups combined. (B) Mean of the cost measures in the control group, the two intervention groups separately, and the intervention groups combined. (C) Total healthcare costs excluding costs related to inpatient activity. Mean difference: mean change in visits from the pre-intervention year to the intervention year; Diff-in-diff: mean change from the pre-intervention year to the intervention year in the intervention groups relative to the control group. Pharmaceuticals include all medication (not only rheumatoid arthritis related), except biological medication, which is included in outpatient activity. Social care visits (home care and preventive visits) are not included in the table as the number of visits was very low. PRO-TR, patient-reported outcome-based telehealth follow-up by a rheumatologist; PRO-TN, patient-reported outcome-based telehealth follow-up by a rheumatology nurse. *Significant difference-in-differences (Diff-in-diffs) at the 5% level.

Incremental cost-effectiveness ratios
With lower costs and a negative effect, the value of the ICER can be interpreted as the cost saved when losing a QALY. Relative to the control group, the saving per QALY lost with the PRO-TR intervention was 89 328 EUR, which was statistically significant. For the PRO-TN intervention, the saving was much smaller (10 842 EUR) and not statistically significant.
When combining the two intervention groups, the cost saving compared with the control group was 48 511 EUR and, as for PRO-TN, non-significant. Scatterplots based on bootstrapping with 10 000 replications confirmed small changes in HRQoL, and the PRO-TR plot (Figure 2A) clearly shows a decrease in costs. The last column of Table 3A and Figure 2D, respectively, show the equivalent ICER calculations and scatterplots for a comparison between the intervention groups. Patients in the PRO-TR group were significantly less costly and experienced slightly (not statistically significantly) better quality of life, leading the PRO-TR group to dominate the PRO-TN group.
The sensitivity analysis of leaving out inpatient activity and recalculating the ICERs showed that the ICERs were not significant and were much lower for the PRO-TR intervention (15 287 EUR) and the PRO-TN intervention (1161 EUR) (Table 3B). When leaving out inpatient activity, PRO-TR was still dominant over PRO-TN (see last column of Table 3B).
The second sensitivity analysis looked at the change in QALYs being estimated as the average change rather than the full change from baseline to the end of follow-up. By doing so, the ICERs were doubled, whereas significance did not change (Table 3C).
The final sensitivity analysis in Table 3 recalculated the ICERs after including a larger sample of patients after having imputed missing QALYs for 59 patients. With this larger sample, the reductions in QALYs were smaller, and with a smaller denominator the ICERs were increased by approximately 50%. For the combined intervention group (PRO-TR + PRO-TN), the ICER became significant, with a value of 62 208 EUR (Table 3D).
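The arithmetic linking the reported cost changes to the main ICERs, and the doubling seen in the second sensitivity analysis, can be traced with a short worked example. It uses only figures quoted in the text. The QALY differences are back-calculated under the assumption that each reported ICER is the plain ratio of the point estimates, which may not hold exactly if costs and effects were estimated on different samples, so they are approximations rather than reported results. All names in the code are hypothetical.

```python
# Mean change in total healthcare costs (EUR), intervention year minus the
# year before, as quoted in the text (Table 2B).
COST_CHANGE = {"control": 699, "PRO-TR": -1601, "PRO-TN": 348}

# ICERs quoted in the text for the main analysis (EUR per QALY).
REPORTED_ICER = {"PRO-TR": 89_328, "PRO-TN": 10_842}


def cost_diff_in_diff(group):
    """Cost change in `group` relative to the control group (EUR)."""
    return COST_CHANGE[group] - COST_CHANGE["control"]


def implied_effect_difference(group):
    """Back-calculated QALY difference (an approximation, see lead-in)."""
    return cost_diff_in_diff(group) / REPORTED_ICER[group]


def icer_with_halved_effect(group):
    """ICER when the effect difference is halved, as in the second sensitivity analysis."""
    return cost_diff_in_diff(group) / (implied_effect_difference(group) / 2)


if __name__ == "__main__":
    for group in REPORTED_ICER:
        print(
            group,
            cost_diff_in_diff(group),
            round(implied_effect_difference(group), 4),
            round(icer_with_halved_effect(group)),
        )
```

Under these assumptions the relative cost changes are -2300 EUR for PRO-TR and -351 EUR for PRO-TN, and the implied QALY differences are roughly -0.026 and -0.032. Halving the denominator doubles the ratio by construction, which matches the doubling described for the second sensitivity analysis.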

Cost-effectiveness acceptability curves
With mainly reductions in both costs and effect being found, the cost-effectiveness acceptability curves were the 'inverse' of traditional acceptability curves (starting at 0 and asymptotically approaching 1) (27). All curves except for PRO-TN intersected the y-axis around 1, as most combinations were cost-saving (Figure 3). All curves approached the asymptote at values between 0.25 and 0.75, since not all combinations involved health gains. PRO-TR involved the largest share of health gains relative to PRO-TN (dotted curve) and hence approached the asymptote at the largest value. In contrast, PRO-TN involved the smallest share of health gains relative to the control (grey curve) and hence approached the asymptote at the lowest value.
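The behaviour of the curves at the y-axis and at the asymptote follows from how an acceptability curve is built. As a sketch, assume the usual net monetary benefit formulation, which the text does not state explicitly, and let $\Delta C$ and $\Delta E$ denote the between-group differences in cost change and QALY change in a bootstrap replicate, with $\lambda$ the threshold value per QALY:

```latex
\begin{align}
\mathrm{CEAC}(\lambda) &= \Pr\left(\lambda\,\Delta E - \Delta C > 0\right), \\
\mathrm{CEAC}(0) &= \Pr\left(\Delta C < 0\right), \\
\lim_{\lambda \to \infty} \mathrm{CEAC}(\lambda) &= \Pr\left(\Delta E > 0\right).
\end{align}
```

The intercept is thus the share of replicates that are cost-saving, and the asymptote is the share of replicates that involve a health gain, which is the reading given in the paragraph above.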

Discussion
This cost-effectiveness analysis of PRO-based telehealth follow-up for RA patients with low disease activity or in remission showed that the PRO-TR intervention was cost-effective compared with conventional treatment. Furthermore, the PRO-TR dominated the PRO-TN intervention.
In general, the evidence of the effect of telehealth follow-up offered to patients with RA is limited to a few RCTs (4,8,28,29). Previously, cost savings have been evaluated in a prospective non-RCT among 85 RA patients (30), which concluded that PRO-based care delivered through telehealth was equally effective but less costly compared with usual care. This is overall in line with our findings; however, to the best of our knowledge, the present study was the first to assess the cost-effectiveness of a telehealth intervention targeting tight control of disease activity in RA based on an RCT.
Among the strengths of the present study was the use of comprehensive population-based register data, which is used throughout the Danish healthcare sector and for health services research (31).
Information regarding transportation costs and time was not available. Including such information would further benefit telehealth solutions from both a societal and a patient perspective, because telehealth saves transportation costs. Information on social care services is provided and registered at the local government level. We retrieved these data from the registers on care for the elderly (32). However, the use of home care and nursing homes was small, and there were no differences between the groups. We therefore disregarded these costs for the cost-effectiveness analysis. Information on productivity costs was retrieved from the Danish Register for Evaluation of Marginalization (DREAM) (33), which contains weekly information on social transfers such as unemployment benefits, supported employment, early retirement, sickness benefits beyond four consecutive weeks of illness, and age and disability pensions. We measured the difference in labour market affiliation, including supported employment positions, between the control and intervention groups in the 12 months following the intervention relative to the previous 12 months in a descriptive analysis. The analysis found no between-group difference, partly due to a high proportion (> 50%) of retired patients. Therefore, productivity costs and sickness absence were not included in the cost-effectiveness analysis.
The short time frame was a major limitation. With no information on disease progression and cost development beyond the 12 month follow-up, we applied a time frame of only 1 year. The utility measure did not adjust for future life expectancy, because the intervention had no impact on longevity. Hence, the study would benefit from a longer follow-up period, which would, inter alia, allow for inclusion of life expectancy.
Another limitation relates to sample size. The cost-effectiveness analysis was not planned and the TeRA trial (4) was powered to test non-inferiority in DAS28. Hence, to some degree, the study was underpowered for a cost-effectiveness analysis. Indeed, the small sample proved to be a problem for the analysis, rendering mostly small, not statistically significant differences. In particular, we found a large variation in inpatient costs, seemingly unrelated to the intervention. In a sensitivity analysis, we found that the results were somewhat sensitive to the inclusion of inpatient activity. If the baseline cost measurement is expanded to 2 years, the problem persists but decreases in magnitude. This raises the question of the appropriateness of including inpatient activity. One might argue for keeping only inpatient activity at rheumatology departments. Still, multi-morbidity is well known in RA (34), making it difficult to restrict inpatient activity to only one speciality. Future studies are needed to determine whether and how inpatient activities are affected by telehealth solutions.
The PRO-TR intervention seemed to be cost-effective relative to PRO-TN. This finding was primarily driven by lower inpatient costs in the intervention year, but for other types of costs the PRO-TR also seemed to be cheaper. Since non-inferiority was achieved in the clinical trial (4), more studies are needed to determine whether this was due to more efficient treatment by the rheumatologists than by nurses. For example, it may be that rheumatologists were able to handle questions about comorbidity better than nurses, leading to less patient contact with the hospital and the general practitioner.
The relatively small numbers, in combination with small and non-significant changes in utility, make the uncertainty of the estimates important. Thus, we conducted a probabilistic sensitivity analysis, which confirmed the need for more information.
Finally, the small sample was further reduced by attrition and missing information on quality of life. In general, patients who dropped out had higher disease activity than patients who remained in the study (4), and this selection bias may have led to an overestimation of the cost-effectiveness in our study. However, when we addressed this in a sensitivity analysis by imputing missing information, it did not alter the overall conclusion of the study.
The TeRA study population was selected as patients with low disease activity or in remission, and, according to guidelines (2), they should only be offered control visits every 6-12 months. In the study, patients were, however, seen every fourth month, which may have affected the generalizability and may have led to overestimation of the cost-effectiveness of the telehealth interventions.
The allocation into treatment and control groups generally worked well. More patients dropped out from the intervention groups; however, the groups remained very similar, except for a lower fraction of immunoglobulin M rheumatoid factor-positive patients in the PRO-TN intervention group. We found no difference in ACPA between the groups, and thus, the included patients fulfilled the ACR/EULAR 2010 classification criteria for RA (12). In addition, the study design accounts for this potential bias because baseline values are included in the analysis; hence, adjustment is made for a lower need for care, which also existed at baseline.
Outside the scope of this study, other outcome measures of equal importance remain. For instance, patients receiving telehealth follow-up via the TeRA study mainly showed positive perceptions towards this mode of control, not least in terms of flexibility (7).

Conclusion
Telehealth, with the right set-up and for relevant patients, seems to be a cost-effective solution. Between the intervention groups, PRO-TR dominated PRO-TN. Other considerations, such as patient satisfaction and organizational issues, should also be taken into account when organizing RA disease management.