Consensus for Statements Regarding a Definition for Spinal Osteoarthritis for Use in Research and Clinical Practice: A Delphi Study

To determine consensus among an international, multidisciplinary group of experts regarding definitions of spinal osteoarthritis for research and for clinical practice.


INTRODUCTION
The spine is a structure composed of multiple motion segments connected in series, with its total motion being a composite of that in the individual segments. Degenerative changes similar to knee osteoarthritis (OA) have been visualized in the synovial spinal facet joints (1), and degenerative OA-like changes of the broader spinal segment such as disc space narrowing and vertebral osteophytes are common individual radiographic features of spinal degeneration (2). It is estimated that 57% of adults age 65 years and older have cervical facet joint OA (3) and that 89% of adults age 65 years and older have moderate to severe lumbar facet joint OA (4). Estimates of the prevalence of lumbar spinal segment OA range from 40% to 85%, with the wide range reflecting a lack of consensus in definitions and differences in distribution across ages (5). Although there is preliminary evidence of an association between symptoms and the structural aspects of spinal degeneration, without a standardized and accepted definition, spinal degeneration will continue to receive far less critical study than appendicular OA (hip, knee, and hand), which precludes the preclinical development of novel therapies. Furthermore, spinal degeneration remains a complex and challenging condition to treat, often leading to suboptimal treatment and poor outcomes for patients.
Genome-wide association has identified OA-associated variants that are common to both appendicular and spinal degeneration, as well as those that are joint specific (6). Similarly, serum biomarkers reflecting turnover of different joint tissues and synovial inflammation that are elevated in appendicular OA were also increased in spine degeneration, but the correlations were specific to facet joint OA versus disc-narrowing (7). These findings support similarities in pathophysiology of appendicular and spinal degeneration but also the need for better defining the latter and its different phenotypes. For appendicular OA, there is a presumed pathology as measured by structural and symptomatic changes, and the diagnostic label "OA" is recognized. Although similar to appendicular OA, degenerative changes at the functional spinal unit and the cascade of structural and symptomatic manifestations are not uniformly recognized as OA. Various terms relating to spinal degeneration exist, including spondylosis, osteoarthritis, osteophytosis, spondylolisthesis, spinal stenosis, and others, highlighting the need for a definition of spinal OA (1,2,8). Recently, there have been calls to define spinal degeneration for the purposes of targeting prevention, disease modification, characterizing phenotypes, treatment, and ultimately improving clinical outcomes (9).
Delphi studies collect and aggregate informed judgments from a group of experts via multiple iterations. Consensus techniques are useful for research when generating evidence is not possible or feasible and for establishing research priorities (10-12). Furthermore, consensus techniques can inform clinical practice based on the collective clinical experience and knowledge available from participants. Therefore, the aim of this study was to seek consensus for statements regarding definitions of spinal OA for use in research and for clinical practice.

MATERIALS AND METHODS
The process to establish consensus on statements for the definition of spinal OA, for research and clinical practice, was composed of the following steps: 1) incorporation of experts' perspectives on spinal OA at the 2017 Osteoarthritis Research Society International (OARSI) World Congress and subsequent establishment of a spinal OA discussion group; 2) generation of spinal OA definition statements; 3) subject matter expert selection; and 4) an online, modified Delphi study. This study received human research ethics approval through Macquarie University in Australia (Project ID: 7835; 52020783517249). A DelphiManager license (£850 plus VAT) was purchased by the Department of General Practice, Erasmus MC-University Medical Centre Rotterdam.
Incorporation of experts' perspectives and establishment of a spinal OA discussion group. In 2017, an in-person meeting was held at the OARSI World Congress to discuss existing gaps in the fields of spinal OA research and treatment. Three authors (KdL, SMAB-Z, and MLF) met to establish a working group for a spinal OA Delphi study and, together with 12 international multidisciplinary clinicians and scientists, formed a steering committee. The steering committee represents the areas of rheumatology, rehabilitation medicine (physiatry), musculoskeletal radiology, biology, neurosurgery, epidemiology, general practice, physical therapy, and chiropractic.
[...] clinicians and scientists working in areas of back pain and/or OA. First, experts who had most extensively published in spinal OA within the last 10 years (2010-2020) were identified by one reviewer (AC) through a search in Web of Science (accessed March 2nd, 2020). Second, experts were recommended by the steering committee to try to ensure a broad range of relevant professional disciplines and geographic regions. Members of the steering committee were also invited to participate in the Delphi study so that they could vote on statements for definitions. Finally, experts invited to round 1 were asked to identify clinicians and scientists in their respective fields for invitation to the Delphi study.
The Delphi study. From July 2020, a 3-round, online, modified Delphi study was conducted using the software program DelphiManager (13). A similar Delphi approach has been used by international studies aiming to reach consensus on core outcome domains or measures in patients with low back pain and OA (14,15). Eligible participants were invited by an email containing information on the study background, the need for engagement in each of the 3 rounds, the likely time to complete each round, and a link to the online survey. Each round was open for approximately 4 weeks, and email reminders were sent at 1 week, 2 weeks, and 48 hours before closing.
Upon registering for the study, providing consent, and entering round 1, participants answered questions on their personal and professional attributes. They were then directed to rate statements. Participants indicated their agreement on a 9-point Likert scale, where 1-3 represented "disagree," 4-6 represented "neutral," and 7-9 represented "agree," as well as the option "unable to score." In round 1, an open-ended question provided the opportunity for participants to add statements for consensus in round 2.
In round 2, participants were provided with all 117 statements, as well as additional statements recommended from round 1. Frequencies of responses together with participants' previous ratings were distributed back to each individual participant so they could reconsider their opinion based upon the responses of the group. In round 3, frequencies of responses and the participants' previous ratings of statements that had not achieved consensus were again distributed back to each individual participant. In all rounds, an open-ended question provided the opportunity for participants to add feedback on the statements for consideration by the steering committee.
Statistical analysis. Arbitrary consensus cutoffs were set a priori at ≥70% of participants agreeing or disagreeing with the statement, with <15% of participants providing the opposite response for the same statement (16). Descriptive statistics (counts and proportions) were used to analyze data from each of the 3 Delphi rounds. The data are presented as frequency distributions as appropriate. Open-ended responses provided by participants were used to create a Word Cloud (17), i.e., a graphic representation of words, concepts, and phrases meaningful to the proposed statements for a definition of spinal OA.
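The consensus rule described above can be illustrated with a minimal sketch. It is not the study's analysis code: the function name `classify_consensus` and its parameters are hypothetical, and the sketch assumes that "unable to score" responses (represented here as `None`) are excluded from the denominator, a detail the text does not specify. Ratings of 7-9 count as agreement and 1-3 as disagreement, following the 9-point scale described for round 1.

```python
from typing import Optional, Sequence


def classify_consensus(
    ratings: Sequence[Optional[int]],
    threshold_pct: int = 70,  # >=70% must agree (or disagree)
    opposite_pct: int = 15,   # <15% may give the opposite response
) -> str:
    """Return 'agree', 'disagree', or 'none' for one statement.

    ratings: 9-point Likert scores (1-9); None stands for
    "unable to score" and is dropped (an assumption of this sketch).
    """
    scored = [r for r in ratings if r is not None]
    n = len(scored)
    if n == 0:
        return "none"

    agree = sum(1 for r in scored if 7 <= r <= 9)
    disagree = sum(1 for r in scored if 1 <= r <= 3)

    # Integer arithmetic avoids floating-point edge cases at the cutoffs.
    if agree * 100 >= threshold_pct * n and disagree * 100 < opposite_pct * n:
        return "agree"
    if disagree * 100 >= threshold_pct * n and agree * 100 < opposite_pct * n:
        return "disagree"
    return "none"


if __name__ == "__main__":
    # 8 of 10 agree, 1 neutral, 1 disagrees: consensus for agreement.
    print(classify_consensus([8] * 8 + [5] + [2]))
    # 7 of 10 agree but 2 of 10 disagree (20%): no consensus.
    print(classify_consensus([8] * 7 + [2] * 2 + [5]))
```

Under this rule, a statement with 70% agreement still fails to reach consensus when 15% or more of scored responses fall in the opposite category, which is why both conditions are checked together.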

RESULTS
Delphi study design and participants. A flow chart of Delphi study participation is presented in Figure 1. In total, 255 experts were identified and invited via email to participate. Out-of-office replies were received from 64 experts and notices of failure to deliver email from 18. Round 1 opened on July 7th, 2020 and ran until August 20th, 2020. A total of 173 experts were invited by email, and after 116 consented to participate, 103 participants completed the survey (89% response rate). Round 2 opened on August 24th, 2020 and ran until September 20th, 2020. Round 2 consisted of the original 117 statements and 14 additional statements. After 5 participants withdrew from the study, of the 111 participants who were invited by email, 85 completed the survey (77% response rate). Round 3 opened on October 12th, 2020 and ran until November 11th, 2020. A further 10 participants withdrew from the study, and of the 101 participants invited by email, 87 completed the survey (86% response rate). The characteristics of participants in each round can be seen in Table 1, and the Word Cloud of common words provided by Delphi participants is shown in Supplementary Figure 1.
The highest consensus proportions were for the statements "there should be international agreement for a scale of severity of structural changes of the functional spinal unit on imaging, for research" (97%), "An internationally recognized definition of spinal osteoarthritis is important, for research" (97%), and "MRI is the preferred imaging method for intervertebral disc changes, for research" (96%). Also achieving consensus and a high proportion for agreement was the statement "spinal osteoarthritis should be considered across a spectrum of structural changes and patient symptoms, for clinical practice" (95%).
Although 75% of respondents agreed that "CT (computed tomography) scan is the preferred method of imaging for facet joint changes," 70% also agreed that "MRI (magnetic resonance imaging) is the preferred method of imaging for facet joint changes." Regarding individual parameters for a structural definition of spinal OA, 2 of 12 (loss of intervertebral disc height and osteophyte formation) achieved consensus for disagreement in being isolated pathologies and to be considered on their own for clinical practice (Figure 2).
[...] consensus with the highest proportion of agreement (95%). Higher consensus proportions were found for the statements that symptomatic spinal OA should be expressed in a scale of progressive deterioration "across several symptoms" for research and clinical practice (90% and 83%, respectively) than "for each symptom" for research and clinical practice (89% and 81%, respectively). Of the 20 statements regarding individual parameters to be considered in a symptomatic definition of spinal OA, 10 achieved consensus, 8 for agreement and 2 for disagreement (Figure 3).

DISCUSSION
An international group of multidisciplinary subject matter experts agreed that a future definition of spinal OA should be considered across a spectrum of structural changes and patient symptoms and expressed on a progressive scale. For research, a definition of spinal OA should be considered across a collection of different phenotypes for which pathology is present in some, but not necessarily all, anatomical structures. Although clinically distinct phenotypes have been identified in knee OA (18), concerted efforts to identify spinal OA phenotypes and endotypes (19) may assist in faster progress toward targeted therapeutic developments to improve patient care. Finally, a high proportion of agreement was achieved for the statement that severe structural spinal OA on imaging can be present without significant symptoms.
Statements on structural pathology and progression at the functional spinal unit, as well as symptomatic manifestations such as pain and loss of function, were offered. Often there were conflicting results between statements that achieved consensus. For example, although there was consensus for agreement that limitations in physical function were the most important symptom for a symptomatic definition of spinal OA, 3 proposed definitions for which there was consensus did not include physical function limitations. Importantly, participants agreed that, for a definition of structural spinal OA, there should be a scale of severity of pathoanatomical changes at the functional spinal unit (on imaging) for research and clinical practice. Similarly, there was consensus for agreement that symptomatic spinal OA should be expressed on a scale of progressive deterioration across several symptoms for research and clinical practice.
Symptoms that met consensus for agreement to be considered in a symptomatic definition of spinal OA included limitations in physical function, chronic and/or recurrent spinal pain and its intensity, and self-reported morning stiffness. Although the existence of a weak to moderate relationship between pain and structural features of spinal OA is well-known (20,21), the relationship between other symptoms and structural manifestations of spinal OA is less extensively studied. One population-based cohort and one primary care cohort reported that self-reported morning stiffness is clearly associated with features of spinal OA (22,23). Interestingly, whereas painful range of motion has recently been found to be associated with poor prognosis of back pain (24), it did not reach consensus in this Delphi process to be included in a symptomatic definition of spinal OA. These results suggest discerning levels of pain severity for both the structural aspects and symptoms. Future observational and diagnostic accuracy research should further explore and/or confirm which symptoms are most strongly related to structural spinal OA.
A limitation of plain radiographic imaging is that it only captures late-stage disease that may not be amenable to therapeutic interventions other than surgery. As a result, there has been greater interest in the use of MRI for detection and assessment of early degenerative changes and the capacity to visualize joint structural abnormalities beyond gross changes in bone and joint morphology (25). Our results demonstrate consensus for disagreement that radiographs are the preferred method of imaging for facet joint changes. Debate arises, however, over the preferred method of imaging for facet joint changes because both computed tomography (CT) scans and magnetic resonance imaging (MRI) met consensus for agreement. There was consensus for agreement that MRI is the preferred imaging method for intervertebral disc changes for research, with a smaller but still large proportion agreeing that MRI is also the preferred imaging method for intervertebral disc changes in clinical practice. Although a definition for knee OA on MRI has been developed by Hunter et al (26), further consensus is required to develop a preferred, core set of propositions for the definition of structural spinal OA on MRI and/or CT. An accepted definition may guide and facilitate the application of appropriate advanced imaging for diagnosis, staging, and assessing disease progression in spinal OA. Although consensus was achieved, the proportion of agreement was low for any form of imaging for a structural definition of spinal OA in clinical practice. Experts may have concerns that inappropriate imaging, including overuse when imaging is not indicated and underuse when imaging is indicated, as well as findings of limited clinical significance, may lead to overdiagnosis, increase downstream health care use, and create unnecessary patient concern (27,28).
Concurring with the statements for structural spinal OA, there was consensus for agreement that imaging is completely necessary for a definition of symptomatic spinal OA for research. This would suggest that a patient included in a research study on symptomatic spinal OA would undergo some imaging, reflecting the need for study designs to align research questions with outcome measures that may not be required in a clinical setting. Although there is evidence that routine imaging should not be performed on people with spinal pain in clinical practice, diagnostic imaging may be useful in identifying specific pathology and structural phenotypes. The role of imaging as necessary for a definition of symptomatic spinal OA for research purposes needs further clarification and should have a broad focus across pathophysiology, structural and symptomatic manifestations, clinical outcomes, costs, and adherence.
Consensus was not achieved that imaging is completely necessary for a definition of symptomatic spinal OA in clinical practice, with over half of the participants disagreeing with this statement. Interestingly, qualitative research has shown that clinicians believe that diagnostic imaging is an important tool with which to try to locate the source of low back pain (29). Although the majority of participants' responses align with world-wide consensus that a diagnosis for clinical knee and hip OA can be based purely on symptoms (30), going forward, it needs to be considered whether spinal OA management should align to other guidelines, for instance, those used for appendicular OA (31), spinal stenosis (32), and back pain (33).
Consensus was achieved, with a high proportion of agreement, that age is an important consideration when defining structural and symptomatic spinal OA, and that structural changes may be present in asymptomatic individuals. Although there is still uncertainty regarding the prognostic value of imaging in patients with low back pain (34), a recent report highlighted that multilevel osteophyte formation and disc space narrowing may contribute to the prediction of long-term persistence of back pain in older adults (35). More research is needed to identify the structural features that may be most strongly related to long-term trajectories of back pain because, at the moment, there is still a lack of high-quality evidence on this subject.
Consensus was rarely achieved for statements that equated spinal canal stenosis with spinal OA. For example, approximately three quarters of participants agreed that spinal stenosis is a standalone condition that should not be considered in a symptomatic definition of spinal OA, yet more than half of the participants also agreed that neurogenic claudication associated with spinal stenosis should be considered in a symptomatic spinal OA definition. Interestingly, for structural spinal OA, consensus for agreement that all tissues of the functional spinal unit should be considered in a definition was achieved. However, there was also agreement that degeneration of the facet joint should be considered in a definition for spinal OA in the absence of pathology in other tissues. This may be because the facet joint is a true synovial joint, like appendicular OA sites, and therefore it may seem more worthy of the term OA. The etiologic similarities and differences among symptomatic spinal stenosis, facet joint OA, and back pain require further investigation.
First, regarding appropriate labeling for spinal OA, Bedson et al debate whether labeling a chronic illness is beneficial or detrimental to patient outcomes (36), a debate pertinent to this discussion and to future efforts to define spinal OA. Knee OA, for example, represents a structural pathologic process of the synovial joint from a biomedical viewpoint; however, in clinical practice, knee OA is a syndrome of persistent joint pain. There was consensus for agreement for the participant-offered statement that severe structural spinal OA on imaging can be present without significant symptoms. Because there is no clear evidence that imaging improves patient outcomes, caution is needed because labeling a patient as having "spinal OA" may, without appropriate health care management, lead to negative attitudes and inappropriate behaviors, as well as increased health care costs due to unnecessary and/or harmful interventions (36). Second, because 12% of participants were from low- and middle-income countries, it is important to consider that advanced radiology services such as MRI and CT scans are not widely available in such settings. Therefore, an emphasis on a definition for neck or back pain in clinical practice, without imaging, might be most helpful in these countries.
Delphi study methodology is appropriate for obtaining consensus by using a series of questionnaires delivered over multiple iterations (37). The strengths of this Delphi study include a systematic approach to collecting data on opinions from objectively identified, international experts with expertise in back pain and/or OA and the short time frame within and between Delphi study rounds to maintain participant engagement. This success was reflected by a 77% response rate in round 2 and an 86% response rate in round 3, demonstrating the importance of the topic of the Delphi study to the group of multidisciplinary experts. The steering committee was assembled based on workshops at OARSI and at the International Forum on Neck and Back Pain meetings. DelphiManager provided participants with appropriate feedback from others, allowing them to reflect on their scores in light of other participants' scores. Limitations of the study included the length of each round of the Delphi study, with an initial 131 statements offered for agreement, and the lack of randomization in the order of presentation of the statements. The purposive sampling identified experts who had extensively published on back pain and/or OA over the last 10 years. As a result, opinions from clinicians in the field, consumers of health care, back pain and/or OA patients, and back pain and/or OA stakeholders may not have been fully captured. In future research that will determine consensus for a definition of spinal OA, it is essential that these groups be more heavily involved. A further limitation is the introduction of participant bias due to the large proportion of physiotherapists and chiropractors as participants (40%).
Although there was consensus for statements for definitions that were analogous to definitions for OA in appendicular joints, elements of a future definition for spinal OA still need to be refined. Importantly, this Delphi study highlighted that future definitions should be considered across a spectrum of structural changes and patient symptoms and expressed on a progressive scale. In addition, whereas consensus proportions for the use of imaging for structural and symptomatic spinal OA for research were often high, there is lack of consensus for the use of imaging for a definition of spinal OA in clinical practice. A research focus on developing a symptomatic definition of spinal OA may have the highest priority, as this would be applicable to both research and clinical settings.