Review of economic modeling evidence from NICE appraisals of rare disease treatments for spinal muscular atrophy

ABSTRACT Introduction The National Institute for Health and Care Excellence (NICE) in England has appraised three treatments for spinal muscular atrophy (SMA), namely, nusinersen, onasemnogene abeparvovec, and risdiplam. As rare disease treatments (RDTs) commonly face challenges in health technology assessment (HTA) processes due to their clinical and economic uncertainties, an in-depth review of these appraisals is useful to enable a deeper understanding of economic modeling considerations for SMA. Areas covered This review is a detailed analysis of NICE appraisals for SMA and aims to compare the economic modeling evidence from the three RDTs. This is done by examining differences and similarities and by discussing critical outstanding issues across the economic evaluations of the appraisals. Expert opinion This article aims to contribute to the development of evidence that can be used as guidance to inform resource allocation decisions for RDTs for SMA, but also to be a resource about approaches for the generation, analysis and interpretation of economic modeling evidence for RDTs more broadly.


Introduction
Health Technology Assessment (HTA) has been used to evaluate the properties and effects of health technologies and to establish their value in terms of benefits, risks and costs [1,2]. In England, HTAs are conducted by the National Institute for Health and Care Excellence (NICE) to inform resource allocation decisions in the healthcare system. Treatments for rare diseases (RDTs) are also appraised by NICE but in contrast to treatments for diseases which are more prevalent, RDTs pose significant challenges to HTA processes [3]. This is because they are typically associated with clinical and economic uncertainties [3], which complicate judgments about their benefits in comparison to other alternatives. At the same time, there is no approved treatment for the vast majority (95%) of rare diseases, leaving a large unmet medical need [4].
In this context, NICE appraisals for spinal muscular atrophy (SMA) are a relevant case study for analysis. SMA is a severe neuromuscular disease which affects motor neurons in the spinal cord [5]. The disease is caused by deletion, conversion or mutation of the survival motor neuron (SMN) 1 gene which limits expression of the SMN protein [6]. The resulting degeneration of motor neurons leads to progressive muscle weakness, paralysis and death, with SMA being the leading genetic cause of death in infancy [6]. Much information is available for SMA from the NICE appraisals, compared with the vast majority of rare diseases where no authorized treatments exist. This merits a detailed, comparative assessment of these appraisals. Moreover, SMA as a disease displays important characteristics which are also present in other rare diseases, including genetic origin, childhood-onset, a chronically debilitating and life-threatening nature and unmet need [7]. Lastly, the three treatments for SMA reflect common challenges of RDTs at the time of product launch: there is only limited clinical evidence available and they are very expensive in terms of cost per treatment for healthcare systems, resulting in high cost-effectiveness ratios [8,9]. Thus, SMA appraisals can provide insight into resource allocation decisions for RDTs in England.
To the best of the authors' knowledge, this is the first detailed analysis of the economic modeling evidence of the three RDTs in NICE SMA appraisals. Thus, by comparatively analyzing the appraisals with a focus on the economic model, survival modeling, cost and healthcare resource use, the measurement and valuation of health effects, and the committee recommendation, this review aims to enable a deeper understanding of economic modeling considerations for SMA. This is done by examining differences and similarities, and by discussing critical outstanding issues across the economic evaluations of the appraisals.

Economic uncertainties
Economic uncertainties in the evidence base of RDTs often relate to economic modeling considerations regarding cost and health benefit. Typically, only limited information about the direct and indirect costs of rare diseases is available [10], which complicates quantifying resource use. For SMA it has been demonstrated that only limited evidence for the cost of illness exists and that costs across settings and disease phenotypes are variable [11,12]. Moreover, prices of RDTs are typically high because 1) manufacturers must recoup their incurred costs from a limited number of patients which results in high per-patient acquisition costs [8], 2) RDTs often cover conditions where unmet need is usually high and so they are considered to have high value [13], and 3) some RDTs represent innovative breakthrough therapies, for example gene therapies, and thus promise major, potentially life-long clinical benefits [14]. These considerations are also likely to be reflected in the high prices of the three RDTs for SMA; for example, the UK list price for onasemnogene abeparvovec was given at £1.79 million per injection upon approval, and as such, this RDT was labeled as the most expensive pharmaceutical worldwide at that time [15]. Further, due to the myriad of challenges associated with the use and development of patient-reported outcome measures (PROMs) for rare diseases in HTA [16], generating robust health state utility values to inform economic models is often difficult. There is also no agreement on the most appropriate means by which to estimate utility values [17]. Similarly, it has been argued that for SMA robust utility data are absent and that the available utility data often fail to meet reference cases of HTA bodies [18]. Lastly, modeling survival for rare disease patients on new treatments is often associated with uncertainty, including for SMA [19], primarily due to small sample sizes and limited long-term follow-up of patients.

Available treatments for SMA
Nusinersen was the first treatment for SMA recommended by NICE in the single technology appraisal (TA) guidance in July 2019 [20]. It is an antisense oligonucleotide that targets the SMN2 gene so that it produces higher levels of functional survival motor neuron (SMN) proteins [21]. It is administered by intrathecal injection, which largely limits its effect to central nervous system (CNS) tissue [21]. Onasemnogene abeparvovec is a gene therapy recommended by NICE in the highly specialized technology (HST) guidance in July 2021 [22]. This was followed by a partial review of HST15 and the subsequent draft recommendation of onasemnogene abeparvovec for pre-symptomatic SMA patients in March 2023 [25]. Onasemnogene abeparvovec uses an adeno-associated virus (AAV) serotype 9 vector to introduce a copy of the SMN1 gene into motor neurons, which supplements them with SMN protein [23]. The gene therapy is administered by a one-time, intravenous infusion resulting in a systemic expression of SMN protein [21]. Risdiplam is a small molecule and the most recently authorized treatment for SMA recommended by NICE in the TA guidance in December 2021 [26]. Risdiplam and nusinersen both function as splicing modifiers, which ultimately leads to a higher amount of functional SMN protein created by the SMN2 gene [24]. One key difference between nusinersen and risdiplam is the oral administration of the latter, which enables the treatment to affect SMN levels in other systemic tissues beyond the motor neurons in the CNS [23]. Figure 1 shows the indication approved by the European Medicines Agency (EMA) and the subsequent NICE recommendation for the three RDTs.

Methods
The documents available for the NICE appraisals of the three RDTs for SMA were reviewed. The relevant documents for each appraisal, including the final scope, the company evidence submission, the report by the External Assessment Group (EAG), the final appraisal or evaluation document, and documents relating to the respective managed access agreements (MAAs), are all publicly available and were retrieved from the NICE website.
Data for different aspects of the cost-effectiveness evidence were extracted from the respective HTA reports. Data extraction was based on NICE's evidence submission templates for manufacturers and reports by the EAG, and it focused primarily on aspects that were discussed by NICE in the final appraisal or evaluation document of the respective RDT. Therefore, data was extracted for five categories: 1) the economic model, 2) survival modeling, 3) cost and healthcare resource use, 4) measurement and valuation of health effects, and 5) the committee recommendation. For each category, data sources, assumptions by the manufacturer, and comments by the appraisal committee and the EAG were extracted. Data was extracted by a single person (LW), which is a potential limitation.

Economic models
For all RDTs Markov models were submitted to NICE modeling costs and health benefits. Model health states were based on motor function milestones measured by a range of assessment scales to evaluate different types of SMA. All models were updated several times throughout the appraisal process. Table 1 summarizes the main aspects of the economic analysis for each RDT.
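To make the general mechanics of such a model concrete, the following is a minimal sketch of a Markov cohort calculation. It is not a reconstruction of any submitted model: the four health states, the transition probabilities, the costs and the utilities are all hypothetical values chosen purely for illustration, and the 3.5% default discount rate is used only because that rate is mentioned elsewhere in this review.

```python
# Hypothetical health states loosely following motor milestones (illustrative only).
STATES = ["non_sitting", "sitting", "walking", "dead"]

# Assumed per-cycle transition probabilities; each row sums to 1.
P = [
    [0.80, 0.10, 0.00, 0.10],
    [0.05, 0.80, 0.10, 0.05],
    [0.00, 0.05, 0.93, 0.02],
    [0.00, 0.00, 0.00, 1.00],
]

COST = [60000.0, 30000.0, 10000.0, 0.0]  # assumed cost per cycle in each state
UTILITY = [0.1, 0.5, 0.8, 0.0]           # assumed utility per cycle in each state


def step(dist, matrix):
    """Advance the cohort distribution by one cycle."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def run_cohort(matrix, cost, utility, start, n_cycles, discount_rate=0.035):
    """Return discounted cost, discounted QALYs and the final state distribution."""
    dist = list(start)
    total_cost = 0.0
    total_qalys = 0.0
    for cycle in range(n_cycles):
        df = 1.0 / (1.0 + discount_rate) ** cycle  # discount factor for this cycle
        total_cost += df * sum(d * c for d, c in zip(dist, cost))
        total_qalys += df * sum(d * u for d, u in zip(dist, utility))
        dist = step(dist, matrix)
    return total_cost, total_qalys, dist


if __name__ == "__main__":
    cost, qalys, final = run_cohort(P, COST, UTILITY, [1, 0, 0, 0], n_cycles=50)
    print(round(cost), round(qalys, 2), [round(x, 3) for x in final])
```

Running the same function with a second transition matrix for the comparator arm would give the incremental costs and QALYs from which a cost-effectiveness ratio is derived.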
All models employed different structural assumptions. The model structure for nusinersen was very complex, which hindered a thorough understanding [27], and the nusinersen and risdiplam models were limited by the inability to reflect appropriate stopping rules accurately [27,32]. Particularly for risdiplam the committee did not accept the assumption of continued benefits, including no change in overall survival (OS) and additional on-treatment benefits, after stopping therapy [32]. Overall, the limitations in the model structure contributed to increased uncertainty of cost-effectiveness results (nusinersen) [27], and led to the requirement for an updated model structure for the guidance review (risdiplam) [32]. Further, the conclusions by the committee regarding the model structure of the onasemnogene abeparvovec models in HST15 highlighted that the modeling for the pre-symptomatic population was not appropriate because the manufacturer erroneously assumed that all pre-symptomatic patients would develop type 1 SMA [33]. In the partial review of HST15, the manufacturer provided a new model for this population, including an analysis assuming an equal chance of developing type 2 and 3 SMA for pre-symptomatic patients aged 12 months, which was accepted by the committee [25,34].
Overall, it was likely that there were benefits not captured by the models [27,32,33]. References were made to other factors that also affect health-related quality of life (HRQoL), including participating in activities, respiratory function, pain, physical impairment, and the benefits of gaining specific motor skills such as independence, the ability to self-care, learning to write, or going to school (nusinersen) [27], interim motor milestones, speech and non-verbal communication, reduced fatigue, increased stamina, better respiratory function, ability to swallow and fine motor skills (onasemnogene abeparvovec) [33,35], and fine motor skills, and respiratory and bulbar function such as the ability to swallow, speak and communicate (risdiplam) [32]. Despite these limitations, all RDTs were recommended by the committee (see section 5.5).

Survival modeling
While clinical data demonstrated improvements in survival and motor function of SMA patients for all three RDTs [34][35][36][37], long-term survival outcomes remained a key area of uncertainty [25,27,32,33]. Table 2 gives an overview of key assumptions for modeled survival for the three RDTs.
The modeling of survival proved challenging in all appraisals, primarily owing to a lack of data. In the nusinersen and risdiplam appraisals the modeling included a mortality adjustment, whereby those attaining better health states (such as sitting and walking) were assigned a weighted survival probability, 75% based on the mortality risk of type 2/3 SMA patients and 25% based on a higher mortality risk observed in type 1 SMA patients in the ENDEAR and SHINE trials [36,76]. The 0.75 adjustment factor was based on clinical opinion which suggested a range from 0.5 to 1.0, again highlighting the great uncertainty [38]. No mortality adjustment factor was applied in the case of onasemnogene abeparvovec, but in a broadly similar fashion patients reaching the sitting and walking health states were assumed to have the same life expectancy as type 2 and 3 SMA patients respectively [34,35].
Table 1 (fragment recovered from the text; model structure and stopping rules):
- additional state for patients experiencing later onset SMA and health sub-states for sitters who lose the ability to sit, and walkers and patients experiencing later onset SMA who lose the ability to walk
- These sub-states assume differential cost and QoL inputs applicable to the BSC arm in the base case analysis
- Type 1: 6 states based on HINE-2; plateau at 66 months
- Type 2/3: 6 states based on MFM-32 and HFMSE; plateau at 26 months
- Stopping rule: restriction of risdiplam use to a maximum of 50 years (type 1) and 30 years (type 2/3)
- After the treatment plateau, patients in the non-sitting and permanent ventilation states stop treatment with no effect on OS, utility values or transition probabilities (type 1), and patients in the non-sitting and sitting supported states stop treatment with a linear loss of motor milestones so that transition probabilities equal those for BSC after 120 months, but with no effect on OS and utility values
The final models in both the nusinersen and risdiplam appraisals also assumed that a proportion of patients 'plateau,' that is, they no longer further improve although they continue to receive treatment [39,76]. While accepting that this is clinically plausible, the committee considered this to be a further source of uncertainty, because there was no robust data by which to determine the timing of the plateau, nor whether a patient's condition worsens after the plateau [27]. Further comments by the EAG and the committee related to survival modeling can be found in the online supplementary material.
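The weighted mortality adjustment used in the nusinersen and risdiplam models can be illustrated with a short sketch. It assumes that the weighting is applied to per-cycle survival probabilities, which the appraisal documents summarized here do not spell out, and the two input probabilities are hypothetical. Only the 0.75 weight and the 0.5 to 1.0 range suggested by clinical opinion come from the text above.

```python
def adjusted_survival(s_type23, s_type1, weight=0.75):
    """Blend two per-cycle survival probabilities.

    weight is the share attributed to the type 2/3 mortality risk;
    the remainder (1 - weight) is attributed to the type 1 risk.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * s_type23 + (1.0 - weight) * s_type1


# Hypothetical per-cycle survival probabilities (not taken from any trial).
S_TYPE23 = 0.99
S_TYPE1 = 0.90

if __name__ == "__main__":
    # Sensitivity over the range of weights suggested by clinical opinion.
    for w in (0.5, 0.75, 1.0):
        print(w, round(adjusted_survival(S_TYPE23, S_TYPE1, w), 4))
```

Varying the weight across its plausible range in this way shows how strongly modeled survival for patients in the better health states depends on a single parameter elicited from clinical opinion.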

Cost and healthcare resource use
For the nusinersen models, the manufacturer conducted a real-world-evidence (RWE) survey [76], but only included a fraction of the data collected in the final models due to concerns about the representativeness of the data [40]. Also, total care costs might have been underestimated [27]. For the risdiplam models, the manufacturer conducted a UK Burden of Illness study to collect cost data from SMA patients and their caregivers through online surveys [36]. However, resulting healthcare costs were lower than those used in the final nusinersen models and thus cost data from the RWE survey were used for modeling [36], alongside additional costs for permanent ventilation (type 1) [36] and SMA complications [41]. The EAG questioned the appropriateness of these assumptions and noted that assuming additional costs was not in line with the assumptions in TA588 [42,43]. For the onasemnogene abeparvovec type 1 model, the EAG considered the manufacturer's approach to convert estimates from the UK healthcare resource utilization (UK HCRU) into cost categories for the model as being complex, also noting that cost categories could have been designed at the start of the study [77]. Instead, the EAG proposed the SHELF methodology as an alternative study design in which clinical experts reach a consensus about the true value of each cost category to prevent having substantial outliers [77]. Furthermore, the EAG noted a substantial difference between cost estimates used by the manufacturer and those used in a report of the US Institute for Clinical and Economic Review (US ICER) which reviewed the clinical evidence and cost-effectiveness of nusinersen and onasemnogene abeparvovec [77]. However, as these differences were assumed to be due to differences in setting and perspective (US vs UK NHS), the EAG agreed with the manufacturer to not use the estimates by the US ICER [77]. None of the models explicitly included age-dependent health state costs.
Table 3 gives an overview of the sources and assumptions of the health state costs for the three RDTs.

Measurement and valuation of health effects
For type 1 patients, no HRQoL data was collected in clinical trials across appraisals. For type 2/3 patients, PedsQL data was collected in the CHERISH trial (nusinersen) and the SUNFISH trial (risdiplam) and mapped to the EQ-5D-3L format [36,37]. However, due to limited face validity, these mapped utility values were only included in the original models for nusinersen and were excluded a priori by the manufacturer from the risdiplam models [36,79]. All economic models excluded adverse events due to treatment [34][35][36][37], but models accounted for complications, including scoliosis, associated with SMA as a disease [35,41,76].
In each iteration of the nusinersen models different utility values were used. This included utility values obtained by mapping PedsQL data from CHERISH to EQ-5D-3L, utility values from Lloyd et al. [45], a vignette study based on clinician-proxy EQ-5D assessments, and non-preference-based utility values estimated by the manufacturer's clinical experts in the original [37], the post-ACD [46], and the final models [76] respectively. The values of the manufacturer's clinical experts represent mid-points between the estimates provided by the EAG's clinical advisors and Lloyd et al. [45] and were considered more appropriate by clinical experts [76]. The EAG considered it appropriate to use estimates derived from clinical experts due to the limited face validity of existing preference-based utility estimates [38]. Caregiver utility values were defined on a range between the average utility from Spanish caregivers in López-Bastida et al. [47] and the EQ-5D score for the general population of the UK [46]; they were implemented as disutilities dependent on patient health status, assuming a smaller caregiver disutility in better patient health states. The EAG and the committee noted that the inclusion of caregivers increased the incremental cost-effectiveness ratio (ICER) in the type 1 model and decreased the ICER in the type 2/3 model [27]. These ICERs were counterintuitive as they suggested that it would be more cost-effective to treat type 2/3 than type 1 [27]. Thus, a life-extending treatment, particularly for type 1, seemed to be less cost-effective [27]. Lastly, the nusinersen models did not apply additional utility values for patients on treatment compared to the control group. Overall, the committee concluded that utility values in the nusinersen models were uncertain and might not have captured all benefits related to gaining specific motor function skills [27].
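One plausible mechanism behind the counterintuitive effect of caregiver disutility on the type 1 ICER can be shown with a simple worked example. The sketch assumes that caregiver disutility accrues only while the patient is alive, so that a life-extending treatment also extends the period over which caregiver disutility is counted. All inputs (life years, utilities, disutilities and the incremental cost) are hypothetical, and discounting is omitted for simplicity.

```python
def arm_qalys(life_years, patient_utility, caregiver_disutility=0.0):
    """Undiscounted QALYs for one arm, net of caregiver disutility while the patient is alive."""
    return life_years * (patient_utility - caregiver_disutility)


def icer(delta_cost, delta_qalys):
    """Incremental cost-effectiveness ratio; requires a positive QALY gain."""
    if delta_qalys <= 0:
        raise ValueError("ICER is not meaningful for a non-positive QALY gain")
    return delta_cost / delta_qalys


# Hypothetical inputs: treatment extends life from 2 to 12 years and improves utility.
DELTA_COST = 1_000_000.0
BSC = {"life_years": 2.0, "patient_utility": 0.2}
TREATED = {"life_years": 12.0, "patient_utility": 0.4}

# Patient-only analysis.
gain_without = arm_qalys(**TREATED) - arm_qalys(**BSC)
icer_without = icer(DELTA_COST, gain_without)

# Caregiver disutility is smaller in the better (treated) health state,
# but it is counted for 12 years instead of 2.
gain_with = (arm_qalys(**TREATED, caregiver_disutility=0.10)
             - arm_qalys(**BSC, caregiver_disutility=0.15))
icer_with = icer(DELTA_COST, gain_with)

if __name__ == "__main__":
    print(round(icer_without), round(icer_with))
```

In this example the ICER rises once caregivers are included, even though the per-year caregiver disutility is lower in the treated arm, because the disutility is accumulated over a much longer survival time.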
For the onasemnogene abeparvovec models, a mix of preference-based and non-preference-based utility values was chosen [34,35]. The utility values generated by Novartis' de novo UK utilities study, in which 100 UK adults from the general population valued four health state vignettes representing the model health states, were not used in the model because they resulted in negative QALYs, which was considered to lack face validity [48]. Despite the uncertainty around the chosen utility values, the committee accepted them for decision-making [33]. In contrast to the risdiplam and nusinersen models, the onasemnogene abeparvovec models did not include caregiver utility values because no robust estimates were available according to the manufacturer [35]. The EAG tested the inclusion of caregiver disutility in a scenario analysis for the onasemnogene abeparvovec appraisal and noted that this increased the ICER substantially [33,77]. Lastly, the type 1 model applied additional utility values for patients on treatment compared to BSC to capture interim motor milestones within health states, based on the US ICER report and following EAG preferences [35]. The committee accepted that the additional on-treatment utility values reflected benefits demonstrated in the clinical studies but not captured in the modeled health states [33].
In the original risdiplam models, non-preference-based utility values from the EAG's clinical experts in TA588 and values from Lloyd et al. [49] were used for the type 1 and type 2/3 population respectively [36]. The EAG noted that the available preference-based and non-preference-based utility values tended to lack face validity and scientific rigor respectively [43]. The EAG also commented that it was inconsistent to use preference-based utility values for one SMA population (type 2/3) and non-preference-based values for the other (type 1) [43]. Eventually, the EAG asserted that the utility values obtained from the manufacturer's clinical experts used in TA588 represented the most appropriate source for patient utility values and the manufacturer subsequently updated the type 1 and type 2/3 models accordingly [43,50]. To account for caregiver utility, the original risdiplam models adopted an additive approach in which caregiver HRQoL increased with patient motor milestone achievement [36]. This approach also assumed that caregiver HRQoL returned to zero after the patient died and did not consider effects of bereavement on caregiver HRQoL, which led the EAG to reject this approach [43,50]. Eventually, the final risdiplam models adopted the EAG's disutility approach for type 2/3 patients and an amended disutility approach for type 1 patients [41,42,50]. The committee considered that the EAG's disutility approach increased the ICER in the type 1 model substantially [32], suggesting counterintuitive results with a similar effect as in the nusinersen models. The EAG did not accept the manufacturer's amended disutility approach for type 1 patients, partly because it was inconsistent to assume that caregivers are affected only up to a specific timepoint (10.2 years) but not beyond [32,42]. The committee noted the limitations of these approaches and concluded that the amended disutility approach for type 1 patients was not appropriate [32].
Lastly, to account for the benefits of risdiplam in fine motor skills, additional utility values for patients on treatment compared to BSC were added after technical engagement [39] and increased after consultation [42]. The EAG did not support the proposed additional utility values as they were based on assumptions, and there was uncertainty around the number of patients receiving risdiplam who incur these utility gains and around the duration thereof [32]. Moreover, the manufacturer added disutilities due to SMA complications [41]. The EAG noted that this approach was limited due to possible double-counting and implausible clinical assumptions and net utility values [32]. While the committee was sympathetic to the argument that some benefits might not have been captured in the economic modeling, it concluded that the approach to account for additional benefits of risdiplam from fine motor skills and fewer complications was not appropriate due to the resulting implausible utility values [32]. Eventually, the committee highlighted its preference for an elicitation approach, similar to the approach used in TA588, to generate more robust utility values [32]. Table 4 summarizes the sources and assumptions of the measurement and valuation of health effects for the three RDTs.

Cost-effectiveness estimates and committee recommendations
All cost-effectiveness analysis (CEA) estimates were associated with uncertainty. For nusinersen and risdiplam, the CEA estimates were above the range of what NICE usually considers a cost-effective use of NHS resources [27,32]. However, both RDTs were recommended for all populations with an MAA [27,32]. Following the publication of the original MAA for nusinersen, three variations to the MAA have been introduced. They specify broader treatment eligibility criteria considering the risdiplam guidance and the availability of onasemnogene abeparvovec, and extend the duration of the MAA by 12 months [84].
In both appraisals the committee was confident that the end-of-life (EoL) criteria were fulfilled for type 1 SMA: life expectancy for patients without treatment was less than two years and both RDTs extended life by three or more months [27,32]. Meeting the EoL criteria allowed treatments priced above the standard NICE cost-effectiveness threshold (£20,000-£30,000 per QALY) to be recommended up to a £50,000 per QALY threshold. This is important because aside from SMA, only advanced oncology treatments have ever met these criteria [51].
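The decision rule described in this paragraph can be written down as a small sketch. It encodes only the two EoL criteria and the thresholds quoted above; the function name and the choice to return the upper bound of the standard range when the criteria are not met are illustrative assumptions, and actual committee decisions involve judgment beyond such a rule.

```python
def applicable_threshold(life_expectancy_months, life_extension_months,
                         standard_upper=30_000, eol_threshold=50_000):
    """Return the cost-per-QALY threshold (in GBP) implied by the EoL criteria.

    Criteria as described in the text: life expectancy without treatment
    below two years, and a life extension of at least three months.
    """
    meets_eol = life_expectancy_months < 24 and life_extension_months >= 3
    return eol_threshold if meets_eol else standard_upper


if __name__ == "__main__":
    print(applicable_threshold(18, 6))   # both criteria met
    print(applicable_threshold(36, 6))   # life expectancy too long
```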
For nusinersen, the committee also noted that there were substantial differences in the CEA estimates, with higher ICERs for type 1 and lower ICERs for type 2/3 patients [27]. In addition, the modification of parameters, such as assuming higher resource costs or including caregiver utility, increased the inconsistency between the ICERs for the two models [27]. Such counterintuitive ICERs were also an issue in the EAG's disutility approach for caregivers proposed in the risdiplam appraisal and in the EAG's scenario analysis including caregiver disutility in the onasemnogene abeparvovec appraisal [32,77].
For onasemnogene abeparvovec, the committee considered the undiscounted QALY gain of 18.62 as the most plausible scenario but agreed to a lower QALY weight than 1.86 due to uncertainties in the modeling and limited evidence for long-term effectiveness [33]. No QALY weighting was applied in the partial review of HST15 as all ICER estimates were below £100,000 per QALY gained [25]. Moreover, the committee agreed to reduce the discount rate for benefits and costs from 3.5% to 1.5% for the type 1 model, despite noting that in this case, the uncertainties around costs and long-term outcomes would have a greater impact on the ICER [33]. Further, the committee confirmed that a 1.5% discount rate may be applied when treatments were associated with very high up-front costs, but benefits of the treatment likely accrued over the long-term potentially restoring patients to normal or near-full health [33]. Eventually, treatment for type 1 SMA patients was recommended without an MAA but the committee noted that a key limitation of the evidence base was that it included only babies younger than 6 months [33]. Treatment for the pre-symptomatic population was first recommended under an MAA [33], and following the partial review of HST15 without an MAA [25]. This was because all economic modeling results, including scenario analyses varying the age at treatment, suggested the likely cost-effectiveness of onasemnogene abeparvovec for pre-symptomatic patients [25]. Additionally, treating pre-symptomatic patients with onasemnogene abeparvovec dominated treating type 1 patients with onasemnogene abeparvovec and type 2/3 patients with BSC [25]. Table 5 gives an overview of the committee recommendations for the three RDTs.
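Two of the quantities in this paragraph lend themselves to a short worked example. The sketch below assumes that the QALY weight is the undiscounted QALY gain divided by 10, bounded between 1 and 3; this rule is not stated in the text, but it reproduces the weight of 1.86 quoted for a gain of 18.62. It also shows why a lower discount rate matters for treatments whose benefits accrue over a long horizon, using a hypothetical constant stream of one QALY per year over 50 years.

```python
def qaly_weight(undiscounted_gain):
    """Assumed weighting rule: gain / 10, bounded between 1 and 3."""
    return min(3.0, max(1.0, undiscounted_gain / 10.0))


def present_value(annual_amount, years, rate):
    """Discounted sum of a constant annual amount, first year undiscounted."""
    return sum(annual_amount / (1.0 + rate) ** t for t in range(years))


if __name__ == "__main__":
    print(round(qaly_weight(18.62), 2))
    # Hypothetical benefit stream: 1 QALY per year for 50 years.
    for rate in (0.035, 0.015):
        print(rate, round(present_value(1.0, 50, rate), 2))
```

Because the discounted total is noticeably larger at 1.5% than at 3.5% for a long benefit stream, the choice of rate has a material effect on the ICER of a one-time therapy with high up-front costs.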

Discussion
This review demonstrated that the appraisals of nusinersen, onasemnogene abeparvovec, and risdiplam have mostly captured relevant costs and benefits. Nonetheless, there are critical outstanding issues that relate to the classification of SMA health states, survival modeling, resource use data, patient utility values, caregiver utility values, and additional utility values for patients on treatment compared to BSC. Achieving a consensus on how these issues should be approached in economic evaluations for SMA can enable more consistency across appraisals.

Classification of SMA health states
The current SMA classification system assigns patients to different SMA types (0-4) based on the age of symptom onset and the attainment of motor milestones [52]. Disease severity and life expectancy differ by SMA type [53]. This classification system was used in all appraisals. However, due to the availability of treatments and advances in technologies used for supportive medical care, extended survival and improved motor function of SMA patients result in a changing natural history of SMA and new phenotypes, particularly if patients are treated pre-symptomatically [52,54,55]. Similarly, the possibility of treating type 1 patients may lead to an increase in prevalence of potentially milder SMA phenotypes, which has implications for resource use in healthcare systems, including in relation to the type and amount of medical care needed by patients [53,54,56]. The committee acknowledged the limitations of the current classification system in the appraisals and was aware that due to the blurry and subjective boundaries delimiting SMA types the full extent of the disease might not be reflected [27,32,33]. Given these limitations, it has been proposed to classify SMA phenotypes according to motor function status and their response to therapy, with patients being considered non-sitters, sitters and walkers [52]. Thereby, disease severity is considered on a continuum on which both improvement and deterioration are possible [52,57]. While the adoption of a revised SMA classification system has the potential to reflect a patient's motor function status more accurately, it also may have implications for economic modeling. This issue is particularly interesting because NICE's Managed Access Oversight Committee (MAOC) has extended nusinersen treatment from ambulant type 3 SMA patients to include non-ambulant type 3 SMA patients, thereby overturning the initial negative recommendation of the External Assessment Center (EAC) [58].
One reason for this decision was that, given that the biology of SMA is the same for all patients, the SMA classification system was not created with the intention to differentiate between patients to inform commissioning decisions [58]. Rather than being a barrier for patients to access treatment, it was argued that the classification system should help improve understanding of SMA states [58]. Further, it was acknowledged that despite the heterogeneous phenotype of type 3 SMA, creating further smaller subgroups within SMA types would lead to challenges in future re-appraisals as the size of the relevant patient group reduces [58]. Against this background, reaching a consensus on a revised classification system that can be used as a basis for economic evaluations would be beneficial. Similarly, a consensus on model structure, including relevant health states and appropriate approaches to assess motor milestones, would improve consistency and comparability of economic evaluations for SMA across different settings.

Long-term survival
This review highlighted the uncertainty associated with modeled long-term survival outcomes for conditions with relatively small patient populations. It confirms that predicting survival outcomes for rare disease patients is challenging because data available for modeling is typically limited to small-sample evidence from short-term clinical trials. For example, in the original risdiplam model for type 1 patients, the choice of the survival distribution and associated parameter estimates was based on only eight events for event-free survival and five events for OS from the FIREFISH trial [43]. Additionally, follow-up of patients in the pivotal studies for all three treatments was short and thus long-term survival estimates remained uncertain [25,27,32,33]. It is not uncommon in such circumstances to look to real-world data (RWD) for assistance, particularly when MAAs are an increasingly important feature of the commissioning of new technologies. An integral part of the recommendation for nusinersen, onasemnogene abeparvovec and risdiplam was the requirement to follow the conditions of the MAA. While observational data collection in MAAs will contribute to improved knowledge of SMA as a disease, its management, and the disease-modifying treatments available, further data collection from ongoing clinical trials as required in the MAA for onasemnogene abeparvovec and risdiplam may potentially provide more robust survival evidence. Experience with the Cancer Drugs Fund (CDF) in England also suggests that longer follow-up of trial participants contributes much more to reducing uncertainty regarding survival than real-world data collection [59].

Collection and quantification of resource use data
RWE studies, such as the RWE costing survey conducted for the nusinersen models, are useful to inform HTA decisions due to their potential to reduce uncertainties in the evidence base. Recently, the growing interest in the use of RWD to generate RWE to confirm the value of a drug has led NICE to develop a RWE framework as guidance for the development and use of different RWD types, including resource use and costs [60]. Thus, this may be taken into consideration when devising further studies to estimate resource use of SMA patients.
Another costing issue was the absence of age-adjusted health state costs in the models. In the onasemnogene abeparvovec appraisal, the EAG's clinical experts stated that the assumption of constant costs over a lifetime horizon was not reasonable; rather, health state costs would potentially increase with age due to poorer mobility [77]. However, the manufacturer stated that age-dependent costs were not included due to the lack of evidence for variation in costs by age and the EAG agreed [77]. Nonetheless, a significant economic burden is associated with rare diseases [61], and it is estimated that average per person per year costs for rare diseases are approximately 3-5 times higher than for a healthy age-matched control [62]. Even though per person per year direct medical costs are higher for type 1 patients than for other SMA types [11], it remains unclear to what extent SMA patients who already received treatment, for example with onasemnogene abeparvovec, will incur costs as they age and how the burden on the healthcare system changes. The absence of age-adjusted costs also exemplifies the lack of robust longer-term data, which adds uncertainty to the model results.

Patient health state utility values
The appraisals for SMA reflect some of the challenges associated with measuring robust utility values for rare conditions. This review supports the findings of a recent systematic literature review demonstrating the absence of robust utility data for SMA [18]. Moreover, as utility measurement instruments are typically designed for adults, validated measures for pediatric patients are often lacking [63]. While NICE recommends the generic EQ-5D measure to estimate HRQoL in adults, no specific measure is recommended for children and adolescents [64]. Nonetheless, NICE stipulates that a validated generic preference-based measure should be used if suitable [64]. It has been argued that the value set of the EQ-5D-Y (the version of the EQ-5D for children and adolescents) should be used in future SMA utility studies [18]. While the EQ-5D-Y uses more child-friendly wording, its five dimensions are still the same as those in the EQ-5D-3L, which was developed for adults [65]. Thus, it can be questioned whether it is appropriate for children and adolescents to indicate their health using dimensions which were developed for adults, as the way children describe their health may be different. Moreover, at least for decision-making in England, it is a limitation that the available value sets for the EQ-5D-Y are based on how adults in Japan, Slovenia, and Spain value the health of a hypothetical ten-year-old child [66].
For the model iterations of the three RDTs analyzed in this study, HRQoL data from clinical trials, from different published studies, estimates by the EAG's and the manufacturer's clinical advisors, and a newly conducted study (de novo UK utilities study for onasemnogene abeparvovec) were considered as potential sources for patient utility values. This reflects that even though different techniques to estimate utility values for rare disease patients exist, there is no consensus about the most appropriate method [17], including for SMA.
Further, there is a tendency to accept the use of non-preference-based utility values for modeling SMA. Utility values proposed by the manufacturer's clinical experts in TA588 were also used in the risdiplam models [39,76]. Further, a non-preference-based utility value based on the opinion of the EAG's clinical experts in TA588 was also used in the onasemnogene abeparvovec model for the sitting state, while preference-based utility values were used in other states [35]. While the EAG preferred using either preference-based or non-preference-based utility values for type 1 and type 2/3 patients in the risdiplam models [43], the use of both preference-based and non-preference-based utility values in different states in the onasemnogene abeparvovec model was not highlighted by the EAG. However, utility values for all model health states should usually be derived from the same data source and collected by the same measurement instrument [67]. With regard to the utility values generated by clinical experts, it is important to note that NICE stipulates that if it is not possible to measure HRQoL in patients, data should be obtained from caregivers rather than clinicians [64]. In addition, NICE recommends that the valuation of health states should reflect public preferences [64]. Neither requirement is fulfilled when estimates proposed by clinical experts are used in economic modeling.

Caregiver health state utility values
Based on the results of this review, it continues to be unclear whether and how caregiver utility, including the impact of bereavement, should be valued in economic evaluations assessed by NICE. A recent systematic literature review to assess economic evaluations in SMA also identified differing approaches regarding the inclusion of caregiver utility values across economic evaluations [57]. Among the three RDTs analyzed here, caregiver utility (implemented as a disutility) was included in the final models for nusinersen [76] and risdiplam [41], but excluded in the onasemnogene abeparvovec models [35]. So far, NICE guidelines stipulate that caregiver utility can be included in the analyses but is not necessarily required [64]. However, given that the inclusion of caregiver utility can substantially increase the ICER as shown in the present appraisals for SMA, the issue of how caregiver utility should be valued can be decisive in determining whether treatments are cost-effective or not. Thus, while the importance of caregiver utility was acknowledged in the appraisals, it remains unclear whether NICE has a preferred approach. As such, uncertainty remains about how this issue should be approached by manufacturers developing NICE appraisal submissions for SMA, a disease which has a severe effect on individuals surrounding the patient. Therefore, to increase consistency among appraisals, it could be useful to provide guidance on whether, and if so how, caregiver utility should be valued in SMA, severe diseases more broadly, or in the context of pediatric populations where a larger caregiver burden may be expected. Alternatively, reference cases could be specified that include analyses conducted both with and without caregiver utility.
Such guidance should be evidence-based, and ideally preceded by further research examining, for example, the potential caregiver HRQoL improvement from already funded interventions, the evidence and impact of caregiver HRQoL across diseases and clinical areas, and the change of caregiver HRQoL over time [68,69]. In the absence of official guidance by HTA bodies, recommendations by Pennington et al. [85] may be considered.
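The sensitivity of the ICER to caregiver utility can be made concrete with a small worked example. The sketch below compares an ICER computed from patient QALYs only with one that also counts a caregiver QALY change. All numbers are hypothetical and are not taken from any of the three appraisals; the negative caregiver term reflects one assumed mechanism, namely that a treatment extending survival also extends the period over which a caregiver disutility applies.

```python
def icer(incremental_cost, incremental_qalys):
    """Incremental cost-effectiveness ratio (cost per QALY gained)."""
    if incremental_qalys <= 0:
        raise ValueError("ICER is not meaningful for non-positive QALY gains")
    return incremental_cost / incremental_qalys


# Hypothetical inputs for treatment versus best supportive care.
INCREMENTAL_COST = 500_000.0   # additional cost of treatment (GBP)
PATIENT_QALY_GAIN = 10.0       # incremental patient QALYs
CAREGIVER_QALY_CHANGE = -2.0   # assumed net caregiver QALY change

icer_patient_only = icer(INCREMENTAL_COST, PATIENT_QALY_GAIN)
icer_with_caregiver = icer(
    INCREMENTAL_COST, PATIENT_QALY_GAIN + CAREGIVER_QALY_CHANGE
)

print(f"ICER, patient QALYs only:       {icer_patient_only:,.0f} per QALY")
print(f"ICER, incl. caregiver disutility: {icer_with_caregiver:,.0f} per QALY")
```

With these illustrative inputs the ICER rises from 50,000 to 62,500 per QALY once the caregiver term is included. Reporting both figures side by side corresponds to the suggestion above of reference cases with and without caregiver utility.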

Incorporation of additional utility values for patients on treatment compared to BSC
For all RDTs the potential exists for benefits which are not reflected in the model structure, but there is no consensus on how additional utility for patients on treatment should be modeled to account for these benefits. For the risdiplam models and the onasemnogene abeparvovec type 1 model, additional utility gains were applied [35,41]. Nonetheless, the manufacturer's approach used to model these gains was only accepted by the committee in the onasemnogene abeparvovec appraisal [33] but not in the risdiplam appraisal [32]. In the nusinersen models no additional on-treatment utility was added, even though the committee agreed that certain benefits of gaining specific motor skills might not have been captured in the utility values [27]. Therefore, it could be useful to provide guidance on how additional on-treatment benefits should be modeled, particularly when the model health states and associated utility values are not able to reflect achievement of these benefits. This could prevent situations such as with the risdiplam models, in which utility values were amended to account for uncaptured benefits of risdiplam, but the resulting utility values for each health state were eventually considered implausible by the committee [32].

Conclusion
This study has analyzed the differences and similarities in the NICE appraisals for nusinersen, onasemnogene abeparvovec and risdiplam, and discussed critical outstanding issues across the economic evaluations. It sought to contribute to the development of evidence that can be used as guidance for resource allocation decisions for rare diseases. The findings can inform HTA bodies about approaches for the generation, analysis, and interpretation of economic modeling evidence for RDTs for SMA specifically. As many issues discussed here are also recurring across appraisals for other rare diseases, this review may also be useful for stakeholders in the rare disease appraisal space more generally. To facilitate decision-making for RDTs for SMA, increased consistency in economic modeling is needed. In this context, further analyses could focus on the extent to which new evidence, for example from respective MAAs, reduces uncertainties in economic modeling. In addition, comparative studies of how uncertainties in economic modeling for SMA are considered in HTA processes in different countries merit investigation.

Expert opinion
The advances in the development of disease-modifying treatments provide SMA patients with different active treatment options. As clinical evidence has demonstrated that early treatment, ideally pre-symptomatically, results in better outcomes, patients eligible for treatment could be identified using newborn screening. While newborn screening is currently not routinely available in England [70], genetic testing is offered to siblings of a child that has received a diagnosis of symptomatic SMA [33]. However, a population-based newborn screening study was initiated in 2022 [86]. The availability of a screening program and the possibility for subsequent treatment may also reduce the prevalence of severe SMA types and result in more SMA patients with potentially milder phenotypes and a longer lifespan. This may also change the nature of the demand for healthcare resources required by these patients. However, due to the uncertainty surrounding long-term outcomes in all available treatments, the implications for patient health and resource use in healthcare systems in the future remain unclear.
Moreover, in its updated manuals covering methods, processes and topic selection, published in 2022, NICE has introduced a new severity modifier replacing the EoL criteria [64]. It remains to be seen how the severity modifier will be used in guidance reviews or future appraisals of SMA treatments. Currently, clinical research focuses on how the consequences of SMN loss in patients can be addressed, particularly with therapeutic agents that are in development or are already approved for other neuromuscular diseases [73]. This also includes therapies for milder phenotypes, for example the SMN-independent asset SRK-015, which is currently being tested in a phase 3 trial for later-onset SMA patients receiving nusinersen or risdiplam [88,89]. Thus, as future treatment options may include combinations of both SMN-based and SMN-independent treatments, HTA bodies will most likely face more complex economic modeling and appraisals for SMA in the future. Moreover, it is possible that future SMA treatments may qualify for managed access through the recently launched Innovative Medicines Fund (IMF). Having a similar set-up to the CDF, the IMF aims to fund innovative, non-oncology health technologies while further data is collected [87]. Lastly, there is an ongoing debate about how gene therapies, some of which promise potentially life-long benefits, should best be evaluated by HTA bodies. In the case of onasemnogene abeparvovec the committee decided to apply a reduced discount rate of 1.5% to benefits and costs to reflect the impact of the potential treatment benefits. However, whether this approach is also taken for future appraisals of gene therapies most likely depends on the strength of the evidence submitted by the manufacturer.
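The effect of the reduced discount rate mentioned above can be illustrated with a minimal sketch. It compares the discounted value of a constant stream of one QALY per year at the 1.5% rate with the 3.5% rate of the standard NICE reference case. The 60-year horizon, the constant benefit stream, and the convention of leaving the first year undiscounted are assumptions chosen for illustration only; they do not reproduce the onasemnogene abeparvovec model.

```python
def discounted_total(yearly_values, rate):
    """Sum a stream of yearly values, discounting year t by (1 + rate) ** t.

    The first entry (t = 0) is left undiscounted, which is one common
    convention and an assumption of this sketch.
    """
    return sum(v / (1.0 + rate) ** t for t, v in enumerate(yearly_values))


# Hypothetical lifetime benefit stream: one QALY per year for 60 years.
HORIZON_YEARS = 60
yearly_qalys = [1.0] * HORIZON_YEARS

for rate in (0.035, 0.015):
    total = discounted_total(yearly_qalys, rate)
    print(f"discount rate {rate:.1%}: discounted QALYs = {total:.1f}")
```

Because benefits accrued decades after a one-off treatment are weighted more heavily at the lower rate, the choice of discount rate can matter considerably for therapies whose claimed benefits are long-lasting while most costs are incurred upfront.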
As robust HTA processes can facilitate an efficient and equitable use of scarce healthcare resources, HTA can help maximize health outcomes of rare disease patients in the context of budget constraints and ultimately contribute to better health and wellbeing overall.

Funding
L Wiedmann has received general support through the exposé scholarship scheme funded by the German Academic Scholarship Foundation. The funder was not involved in any aspect of the study conduct or the decision to submit the paper for publication.

Declaration of interest
The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

Reviewer disclosures
Peer reviewers on this manuscript have no relevant financial or other relationships to disclose.