ASSESSMENT PROCEDURE

Cross cultural adaptation, validity, and reliability of Central Sensitization Inventory in Arabic language

Purpose: The Central Sensitization Inventory (CSI) is a tool that aids in identifying symptoms associated with nociplastic pain. The aim of this study was to adapt the CSI to the Arabic language and to examine its psychometric properties. Methods: The adaptation process followed recommended guidelines. Participants with self-reported chronic pain completed a web-based survey. Internal consistency was calculated. Test–retest reliability was examined by allowing a 7–9 day gap between two rounds of measurements. Convergent validity was examined by measuring the correlation with the Pain Catastrophizing Scale (PCS), EQ-VAS, and EQ-5D-3L. Discriminant validity was examined by testing four a priori hypotheses. Factor analysis with principal components extraction was conducted. Results: The CSI-Arabic (CSI-Ar) was successfully produced. Its internal consistency and test–retest reliability were excellent (Cronbach's α = 0.88 and ICC(2,1) = 0.94). The standard error of measurement and the minimal detectable change at 95% confidence were 3.45 and 9.57, respectively. The correlation of the CSI total score with PCS, EQ-5D-3L, and EQ-VAS was moderate. The results lend support to the four hypotheses related to discriminant validity. Factor analysis revealed a four-factor structure of CSI-Ar. Conclusions: CSI-Ar showed internal consistency, test–retest reliability, and validity that are comparable to those of similar studies. The results support the use of CSI-Ar in assessing chronic pain in Arabic-speaking populations.

IMPLICATIONS FOR REHABILITATION
• Central sensitization (CS) mechanisms are thought to contribute to chronic pain.
• Identifying the presence of CS would personalize management.
• The Central Sensitization Inventory (CSI) is a valid and reliable tool to aid in identifying symptoms associated with CS.
• The Arabic version of the CSI is valid and reliable to use in Arabic-speaking patients suffering from chronic pain.

ARTICLE HISTORY Received 31 March 2021 Revised 6 November 2021 Accepted 10 November 2021


Introduction
Chronic pain is a condition that affects approximately one in five adults worldwide [1][2][3]. It is defined as any pain that persists for longer than three months [4]. It can adversely affect the mobility, employment, and quality of life of sufferers [5][6][7]. It is also associated with other clinical symptoms, including fatigue, poor sleep, cognitive deficits, headaches, depression, and anxiety [8]. Among the mechanisms that contribute to pain chronicity is central sensitization (CS) [9]. CS is defined by the IASP as "Increased responsiveness of nociceptive neurons in the central nervous system to their normal or subthreshold afferent input" [10]. Recently, the term nociplastic pain has been proposed to replace CS. In this mechanism-based classification, there is disturbed neural processing without actual or potential tissue damage [11]. This involves impairments in pain circuits in which the central nervous system amplifies neural signaling, leading to increased responsiveness to varied stimuli such as pressure, light, sound, cold, heat, and stress [12].
These changes have been identified in a spectrum of chronic pain conditions including low back pain, neck pain, osteoarthritis, fibromyalgia, headache, whiplash, and temporomandibular disorders [13]. Relatedly, the term "central sensitivity syndrome" (CSS) has been proposed to describe a group of disorders that have no clear structural pathology and share overlapping features and symptoms such as pain, fatigue, poor sleep, and hypersensitivity, in which CS is considered one of the significant mechanisms [14]. Identifying those patients would guide clinicians to select appropriate treatment strategies that target the expected neurophysiological changes [15].
While there are objective tools to assess the presence of CS, such as imaging techniques [16] and Quantitative Sensory Testing [17], these tools lack feasibility in clinical practice because they are expensive or time-consuming. The Central Sensitization Inventory (CSI) is a relatively cheap, effective, and time-saving screening tool that was recently developed to help identify patients with symptoms associated with CS and to quantify the degree of their symptoms [18]. It showed satisfactory psychometric properties (test-retest reliability and internal consistency) in healthy participants and in patients with multiple chronic pain conditions including fibromyalgia, chronic widespread pain, and chronic low back pain [19]. It was also found to be valid and reliable in other conditions such as migraine, temporomandibular joint pain, osteoarthritis, whiplash, and cancer [20,21]. The CSI is composed of two parts: A and B. Part A comprises 25 questions pertaining to CS symptoms. Patients are asked to rate the frequency of experiencing each symptom on a five-point scale, with 0 indicating "never" and 4 indicating "always". The points of all items are then summed into a score ranging from 0 to 100. Part B asks patients whether they have previously been diagnosed with one or more specific disorders related to CS. A total score of 40 was identified as a cut-off score that indicates the presence of CS or CSS (sensitivity: 81%, specificity: 75%) [22]. The original English version showed excellent internal consistency (Cronbach's alpha = 0.879) and test-retest reliability (r = 0.817) [18]. CSI total scores are used to classify CS into five levels of severity, namely: sub-clinical (0-29), mild (30-39), moderate (40-49), severe (50-59), and extreme (60-100) [19].
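The scoring rules described above can be sketched in a few lines of Python. This is only an illustration, not code from the study: the function names are hypothetical, each item is assumed to be rated 0 to 4 (which is what a 25-item total of 0 to 100 implies), and the cut-off is assumed to mean "40 or more".

```python
from typing import Sequence

# Severity bands as (inclusive upper bound, label), taken from the text.
SEVERITY_BANDS = [
    (29, "sub-clinical"),
    (39, "mild"),
    (49, "moderate"),
    (59, "severe"),
    (100, "extreme"),
]
CUTOFF = 40  # assumption: a total of 40 or more is treated as "at or above cut-off"


def csi_total(item_ratings: Sequence[int]) -> int:
    """Sum the 25 Part A ratings (each assumed to be an integer from 0 to 4)."""
    if len(item_ratings) != 25:
        raise ValueError("Part A of the CSI has 25 items")
    if any(r not in (0, 1, 2, 3, 4) for r in item_ratings):
        raise ValueError("each rating must be an integer from 0 to 4")
    return sum(item_ratings)


def csi_severity(total: int) -> str:
    """Map a total score (0-100) to its severity band."""
    if not 0 <= total <= 100:
        raise ValueError("total score must lie between 0 and 100")
    for upper, label in SEVERITY_BANDS:
        if total <= upper:
            return label
    raise AssertionError("unreachable: bands cover 0-100")


def at_or_above_cutoff(total: int) -> bool:
    """True when the total reaches the cut-off of 40."""
    return total >= CUTOFF


if __name__ == "__main__":
    ratings = [2] * 25  # hypothetical respondent answering the middle option throughout
    total = csi_total(ratings)
    print(total, csi_severity(total), at_or_above_cutoff(total))
```

A respondent who rates every item 2 scores 50, which falls in the severe band and above the cut-off.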
The original English version of the CSI has been cross-culturally adapted to more than 15 languages. An Arabic version of this tool is currently not available. Arabic is spoken by 467 million individuals in 60 countries and is ranked the fourth most widely spoken language [23]. Translation, cross-cultural adaptation, and validation of the CSI in Arabic will benefit health care professionals and researchers who assess and treat Arabic-speaking patients. It can also expand the understanding of CS in the Arabic population. Therefore, the aim of this study was to cross-culturally adapt the CSI into Arabic and to assess the psychometric properties (internal consistency; test-retest reliability; convergent, discriminant, and structural validity) of the adapted version in a sample of patients with various chronic pain conditions.

Methods
The study was given ethical approval by the Institutional Review Board at the Hashemite University (ref: 14/15/2009/2020). It consisted of two phases. Phase 1 involved the adaptation process of CSI into standard Arabic language. Phase 2 focused on psychometric properties analysis of the adapted inventory.

Phase 1: CSI adaptation process
The adaptation process was conducted according to the guidelines of the American Association of Orthopaedic Surgeons [24] where: (1) two native Arabic professional translators independently translated the English version of the CSI, producing the T1 and T2 versions. One translator did not have a medical background. (2) Synthesis: a bilingual physiotherapist compared and integrated the two translations. Discrepancies were discussed with the translators. Consensus led to the synthesized version T12. (3) Two bilingual physiotherapists, who were not familiar with the CSI, independently translated the T12 version into English, producing the BT1 and BT2 versions. (4) Feedback on BT1 was sought from, and given by, one of the original developers of the CSI. During this stage of adaptation, it was suggested to edit two words in the BT1 version: the word "unrefreshed" in item 1 and the word "restless" in item 22. In the T12 version, the word "tired" was used instead of "unrefreshed". The literal translation of "unrefreshed" exists and is understood in Arabic, but it was thought not to be commonly used in clinical practice. The word "discomfort" was used instead of "restless" for the same reason. Both of the suggested words were kept for discussion. (5) A committee comprising two of the original translators, the first two authors, a patient with chronic low back pain, a patient with chronic carpal tunnel syndrome, neck and low back pain, a therapist working in a university hospital, and a therapist working in a governmental hospital reviewed the various translations and the feedback of the CSI developer. The committee produced the prefinal version of the CSI-Arabic (CSI-Ar), which was tested for clarity on 15 patients (nine females) with various types of chronic pain (four low back pain, four migraine, three visceral pain, two neck pain, and two multiple joints pain). The time needed to complete it was around 4 min. Those patients found the inventory easy to understand and no issues were raised at this point; therefore, the final CSI-Ar was approved (Supplementary Appendix 1).

Participants
Native Arabic patients, from Jordan, with self-reported confirmed diagnosis of chronic pain, which lasted for three months and over, were asked to complete a web-based questionnaire. Patients who were under 18 years of age were excluded. No exclusions were made based on the nature or stage of treatment. Recruitment of participants occurred through social media during the period of Covid-19 pandemic. The aim of the study and inclusion criteria were explained to potential participants before they offered their consent. They were then asked to confirm their eligibility first before moving to the main questionnaire. Data collection lasted from 9 October 2020 to 15 March 2021. The collected participants' demographics included their age, sex, and chronic pain complaints as per chronic pain classification in ICD-11 [4].

Questionnaires

Pain Catastrophizing Scale (PCS).
A valid and reliable Arabic version of the PCS was used [25]. The PCS consists of 13 items that explore the patient's thoughts, feelings, and behaviors that are indicative of catastrophizing on a five-point Likert scale from 0 (never) to 4 (always). It shares some features of the CSI. A weak [26] to moderate [27] level of correlation between CSI and PCS was expected.
EQ-5D. EQ-5D is a tool developed by the EuroQol Group to assess quality of life. It includes the EQ Visual Analogue Scale (EQ-VAS) and the EQ-5D-3L descriptive system. EQ-VAS asks patients to rate their overall health on a scale from 0 (worst imaginable health) to 100 (best imaginable health). It was expected that individuals with poor quality of life would score higher on the CSI [28]. EQ-5D-3L is a generic measure of quality of life that asks respondents to grade their health status on a three-level scale (1: no problems, 2: some problems, and 3: extreme problems) across five dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. Its outputs can be converted into either a single index value for health status or a descriptive profile. A valid and reliable Arabic version of EQ-5D was used in the study [29]. Due to the absence of a Jordanian value set that allows generating a single index value, the Arabic value set of Tunisia [30] was used based on the EQ-5D developers' recommendations [31]. The value set ranged between 1 for complete health and −0.79 for being dead. The correlation between EQ-5D and CSI was expected to be moderate [21,28,32,33].

Verbal Rating Scale (VRS).
A four-point VRS of pain was used. VRS has strong correlation with multiple pain assessment tools [34,35]. It was found to measure catastrophizing thoughts, in addition to measuring pain intensity [36]. Participants were asked to report their bodily pain at the time of data collection.

Data collection and statistical analyses
Reliability study. Internal consistency and test-retest reliability methods were used to estimate the reliability of CSI-Ar. Internal consistency was measured using Cronbach's alpha. Cronbach's alpha ranges from 0, which means the items do not correlate with each other, to 1, which means all items correlate perfectly with each other. A value in the range 0.80-1.0 was considered to indicate high internal consistency [37]. Cronbach's alpha if the item was deleted and the item-total correlation were also calculated to understand the coherence of each single item with the total scale. A value above 0.3 was considered an acceptable item-total correlation [38]. Test-retest reliability (i.e., repeatability and agreement) of CSI-Ar was assessed with a gap of 8 ± 1 days between two rounds of measurements. This gap was considered long enough for the participants to forget their initial responses, but not long enough for a real change in total score to occur [39]. With the aim of recruiting 50 participants to the second round [40], and expecting a 40% response rate, the survey was sent again to 125 randomly selected participants. The intraclass correlation coefficient (ICC), two-way random effects model, was used to assess test-retest reliability. Reliability was considered moderate, good, and excellent if ICC values were 0.5-0.75, 0.75-0.9, and greater than 0.90, respectively [41]. The standard error of measurement (SEM) and minimal detectable change (MDC) were calculated. SEM indicates how accurate the test scores are. The formula $\mathrm{SEM} = SD_{\mathrm{pooled}} \times \sqrt{1 - \mathrm{ICC}}$ was used. The MDC is the smallest amount of change in scores that is not attributable to measurement error. The MDC 95% was estimated using the formula $\mathrm{MDC}_{95\%} = \mathrm{SEM} \times 1.96 \times \sqrt{2}$ [42].
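As an illustration of the reliability statistics described above, the following Python sketch computes Cronbach's alpha from an item-by-respondent table and applies the SEM and MDC 95% formulas. It is a minimal sketch, not the SPSS procedure used in the study; the function names and the toy data are hypothetical, and alpha is computed with the standard variance-based formula.

```python
import math
from statistics import variance
from typing import Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; `scores` holds one row of item ratings per respondent."""
    k = len(scores[0])  # number of items
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def sem(sd_pooled: float, icc: float) -> float:
    """Standard error of measurement: SD_pooled * sqrt(1 - ICC)."""
    return sd_pooled * math.sqrt(1 - icc)


def mdc95(sem_value: float) -> float:
    """Minimal detectable change at 95% confidence: SEM * 1.96 * sqrt(2)."""
    return sem_value * 1.96 * math.sqrt(2)


if __name__ == "__main__":
    # Hypothetical toy data: three respondents, two perfectly correlated items.
    toy = [[1, 1], [2, 2], [3, 3]]
    print(cronbach_alpha(toy))
    # Applying the MDC formula to the SEM value reported in the abstract.
    print(round(mdc95(3.45), 2))
```

Feeding the rounded SEM of 3.45 into the MDC formula gives roughly 9.56, slightly below the reported 9.57; the difference is presumably due to the SEM having been rounded before reporting.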
Validity study. Convergent, discriminant, and structural validity analyses were conducted. Convergent validity identifies the association between variables in a new scale and those in another scale of similar construct [43]. Since psychosocial risk factors such as catastrophizing and anxiety are present in CS, and CS is known to influence quality of life, convergent validity was examined by using VRS, PCS, EQ-VAS, and EQ-5D-3L. The correlation between the number of chronic pain complaints and CSI-Ar total scores was also calculated and expected to be moderate [27]. Spearman's rho correlation coefficient was used to identify correlations. The strength of the correlation was interpreted as weak if r_s was between 0 and 0.3, moderate if r_s was between 0.4 and 0.6, and strong if r_s was between 0.7 and 1.0 [44]. Discriminant validity identifies dissimilarities in measurements of unrelated constructs [43], that is, it examines differences between subgroups of participants. In this study, discriminant validity was examined by testing the following hypotheses.
(1) Participants with more than one chronic pain complaint would have higher scores than participants with only one complaint.
(2) Participants with confirmed diagnosis of CS or CSS would have higher scores than participants with no confirmed diagnosis.
(3) Participants with a clinically relevant degree of catastrophizing (i.e., PCS total score ≥30 [45]) would have higher CSI-Ar scores.
(4) Participants categorized based on EQ-5D-3L scale responses would have differences in their CSI-Ar scores.
To test these hypotheses, the independent-samples t-test and one-way ANOVA were used. Sixty-three participants in each group were required to detect a mean difference of five points, based on a standard deviation of 10 points, 80% power, and an alpha of 0.05. To check the normality assumption of both tests, Kolmogorov-Smirnov and Shapiro-Wilk tests were conducted. When significant, ANOVA was followed by a Bonferroni test. Differences in sex and age between the compared groups were examined if the tested hypothesis was accepted.
Structural validity: Factor analysis with principal components extraction was performed to determine whether the items were loading on different factors. Bartlett's test of sphericity and the Kaiser-Meyer-Olkin test were used to analyze sample adequacy for factor analysis. To determine the number of factors extracted, a parallel analysis was used to compare the eigenvalues calculated from the data with randomly generated eigenvalues from a web-based parallel analysis engine [46]. To determine the pattern of rotation of factors, the factor correlation matrix was examined. If the factor correlations were less than 0.5, the factors were considered orthogonally related and Varimax rotation was chosen. However, if the factor correlations were more than 0.5, the factors were considered obliquely related and Oblimin rotation was chosen.
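The sample-size statement above (63 participants per group for a five-point difference, a 10-point standard deviation, 80% power, and an alpha of 0.05) can be reproduced with the usual normal-approximation formula for comparing two means. The sketch below is an assumption about how such a figure can be derived, not a description of the tool the authors used; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(mean_diff: float, sd: float,
                power: float = 0.80, alpha: float = 0.05) -> int:
    """Participants per group for a two-sided, two-sample comparison of means.

    Uses the normal approximation n = 2 * ((z_(1-alpha/2) + z_power) * sd / diff)^2,
    rounded up to the next whole participant.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / mean_diff) ** 2)


if __name__ == "__main__":
    print(n_per_group(mean_diff=5, sd=10))  # prints 63
```

A calculation based on the t distribution would give a slightly larger number, so the match with 63 suggests, but does not confirm, that a normal approximation was used.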
Floor and ceiling effects. Floor (i.e., achieving the lowest possible scores) or ceiling (i.e., achieving the highest possible scores) effects were deemed present if ≥15% of the patients reported a CSI-Ar value of 0 or 100, respectively [39].
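The floor and ceiling criterion can be expressed as a short check. This is a minimal sketch with a hypothetical function name and hypothetical example scores; it simply applies the 15% rule to the lowest (0) and highest (100) possible totals.

```python
from typing import Dict, Sequence


def floor_ceiling_effects(totals: Sequence[int], min_score: int = 0,
                          max_score: int = 100,
                          threshold: float = 0.15) -> Dict[str, bool]:
    """Flag a floor or ceiling effect when at least `threshold` of the
    respondents score exactly the minimum or maximum possible total."""
    n = len(totals)
    floor_share = sum(t == min_score for t in totals) / n
    ceiling_share = sum(t == max_score for t in totals) / n
    return {"floor": floor_share >= threshold,
            "ceiling": ceiling_share >= threshold}


if __name__ == "__main__":
    # Hypothetical totals; none equals 0 or 100, so neither effect is flagged.
    print(floor_ceiling_effects([16, 42, 54, 61, 89]))
```

Any sample whose observed totals all lie strictly between 0 and 100 cannot show either effect under this rule.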
Data processing. All data were analyzed using IBM SPSS statistics 25.0 (Armonk, NY). The significance level was set to 0.05.

Results
One hundred seventy-one patients with chronic pain completed the first round of psychometric property analysis. Their demographic data and CSI-Ar scores are summarized in Table 1. There were no missing data except for one participant who did not complete the EQ-5D scales. Missing data were not included in the validity analysis. Ninety-two participants (53.8%) had two or more chronic pain complaints. Kolmogorov-Smirnov and Shapiro-Wilk tests of normality demonstrated that CSI-Ar scores were normally distributed: D(171) = 0.04, p = 0.20 and W(171) = 0.99, p = 0.89. The number and percentage of patients in each of the CSI-Ar subcategories were: subclinical, 6 (3.5%); mild, 13 (7.6%); moderate, 38 (22.2%); severe, 49 (28.7%); extreme, 65 (38.0%). Floor and ceiling effects were not detected (min: 16, max: 89). The data showed no significant variation in pain severity between the two rounds of the reliability study (Z = −1.069, p = 0.285).

Validity
The correlation between CSI-Ar and other scales is detailed in Table 3. Correlation was significant and moderate with most variables, and significant but weak for PCS-rumination. The results lend support to the four hypotheses related to discriminant validity.
(1) Patients with more than one chronic pain complaint had significantly higher scores than participants with only one complaint: t(169) = 6.17

Bartlett's test of sphericity was significant (p < 0.001), indicating that at least two items were highly correlated. The Kaiser-Meyer-Olkin measure of sampling adequacy was 0.829, suggesting that the sample size was appropriate for factor analysis. The number of factors extracted from the data was the number of eigenvalues greater than the corresponding random eigenvalues from the web-based parallel analysis. Four eigenvalues calculated from the factor analysis were larger than the randomly generated eigenvalues obtained from parallel analysis with 25 items included. Therefore, four factors were extracted from the factor analysis, accounting for 48.2% of the items' variance. The factor correlation matrix (Table 5) showed weak correlations between factors; therefore, Varimax rotation was used. Table 6 shows the loading of the items of the CSI and the contribution of each specific item to the factors. Seven items loaded on the first factor, all of them related to emotional distress, accounting for 27.3% of the variability. The second factor was loaded with six items related to physical symptoms, accounting for 7.8% of the variability. Six items related to headaches and jaw symptoms loaded on the third factor, accounting for 6.7% of the variability. The fourth factor was loaded with six items related to urological symptoms, accounting for 6.4% of the variability. Internal consistency analysis of the items constructing each factor demonstrated acceptable Cronbach's alpha values for all factors (Table 7). The deletion of any item within each factor did not increase Cronbach's alpha values.

Discussion
The study aimed to adapt the CSI to the Arabic language and examine the psychometric properties of the adapted version, CSI-Ar. The adaptation process followed the universally accepted guidelines of the American Association of Orthopedic Surgeons [24], which were cited in more than 1350 cross-cultural adaptation studies in 2020 alone. Although not part of the guideline, two patients suffering from chronic pain took part in the expert committee discussion of the various translated versions. This step was vital in terms of promoting patient engagement [47] and in successfully keeping feedback to a minimum during the pilot stage. While cultural adaptation of a tool can sometimes be bound to the context in which the tool was adapted (i.e., Jordan in this study), standard Arabic was used in this study, which should be understood by the majority of Arabic-speaking individuals, especially those residing in the Middle East and North Africa [48]. CSI-Ar showed excellent internal consistency and test-retest reliability. The significant moderate association with other scales and the acceptance of most hypotheses lend support for convergent and discriminant validity, respectively. The mean CSI-Ar score of 54.9 (±13.0) was the highest among adaptation studies. This might be attributed to cultural variation in the expression of pain-related signs and symptoms. For example,    35) in the Japanese [28] and Nepali [49] studies, respectively. In the European Portuguese study [27], which reported a CSI mean value of 23.30 (±14.42), all participants were adolescents with musculoskeletal pain complaints. In these three studies, the reported CSI mean scores were even lower than the scores of healthy participants in the English (28.90) [9] and Brazilian Portuguese (37.14) studies. These cultural variations, whether country- or age-related, might explain the experience of CS and its related symptoms [50,51], and warrant further research.
The internal consistency of CSI-Ar is comparable to that of other studies, which ranged between 0.88 and 0.91, except for the Greek study [52] where Cronbach's α was 0.99. The identified item-total correlation and Cronbach's alpha if an item is deleted gave no reason to exclude any item. Test-retest reliability was excellent and similar to that of the European Portuguese [27], French [53], and Serbian [54] versions of the CSI. The highest ICC values for test-retest reliability were 0.98 and 0.99, recorded in the Nepali [49] and Greek [52] studies, respectively. Researchers in the Nepali study suggested that the high and almost perfect reliability could be attributed to reminding participants of their scores in the first round. Although no explanation was offered by the Greek researchers, this could be attributed to the short 5-7 day gap between the two rounds of measurement.
As hypothesized, CSI-Ar has a moderate conceptual relationship to PCS, EQ-VAS, and EQ-5D-3L. With regard to PCS and its subscales, the correlation coefficients were almost identical to those reported for PCS and its helplessness and magnification subscales in the European Portuguese study [27]. This correlation was higher than the value (r_s = 0.27) reported in a Dutch study that examined the convergent validity of the CSI [26]. However, other cross-cultural adaptation studies reported higher, yet moderate, correlations that ranged between 0.50 and 0.68 [49,52,55]. This moderate correlation was also identified in patients with chronic nonspecific spinal pain [56,57]. Such variation in the amount of correlation might be attributed to the size and characteristics of the sample in each study as well as the variability of CSI and PCS scores reported in each study, among other explanations [58].
With regard to the correlation between CSI-Ar and pain intensity, the study's findings were not at odds with the findings of other CSI adaptation studies that used more adequate pain scales such as the Numerical Rating Scale or the Von Korff Pain Scale. In the Nepali study, a weak positive association with pain intensity (r = 0.25) was identified. Similarly, in the German study, associations with pain intensity were weak (s = 0.27) [59]. This weak correlation between pain intensity and CSI was also identified in a population with chronic whiplash-associated disorders (r_s = 0.187) [60] and in patients with hip osteoarthritis (r = 0.348) [61], but not in patients undergoing total knee arthroplasty (r = 0.496) [21] or in survivors of breast cancer (r = 0.60) [62]. Pain location [63] and patients' sex [33] have been found to contribute to this variability in the correlation between pain intensity and CSI scores. This weak to moderate correlation can be explained in light of recent evidence suggesting that pain catastrophizing drives emotional changes that potentially alter central pain processing [64][65][66] and directly affect pain intensity without the mediation of CS [67]. Additionally, it is widely known that the relationship between pain and the degree of tissue pathology is not proportional [68][69][70].
Discriminant validity was examined by identifying the differences between subgroups of patients. CSI-Ar was able to discriminate individuals who had pain in more than one location; had clinically relevant pain catastrophizing; had a confirmed CS- or CSS-related diagnosis; or had moderate to severe problems in anxiety/depression and functional activities. These findings are consistent with the literature. For example, multiple studies identified a difference between patients with, and without, confirmed CS- or CSS-related diagnoses [28,54,55], and a difference between patients with, and without, multiple complaints [53,71], with patients suffering from fibromyalgia frequently having the highest CSI scores [18,28,52,72]. The ability to differentiate between levels of the self-care dimension in EQ-5D was not evident though, probably due to the limited variability of patient responses to this dimension.
The factor analysis revealed a four-factor structure for CSI-Ar, which is similar to that of the original English study [18]. A similar structure has been identified in some other adaptation studies [54,55,73]. However, other studies identified one-factor [72] and five-factor [28] structures of the CSI. An international study that used a pooled sample from multiple countries demonstrated that the CSI consists of one general factor and four latent factors [74]. This was confirmed in the recent German adaptation study [59]. Given these discrepancies in the literature, it is recommended that the total CSI-Ar score be used and reported.
This study is not without limitations. First, the online format of the study allowed reaching participants with multiple chronic pain conditions. It is likely that, with this format, participants were honest in reporting their pain and related symptoms [75]. However, some of the limitations of this method of recruitment are the inability to determine the response rate and to validate the data provided by the participants. Second, the study is limited by drawing on few outcome measures. An extensive review of pain history was not done. The rationale was to avoid missing data and to minimize the time needed to fill in the online survey. This was successful to some extent, since the required sample size was achieved and data were missing from only one participant. The study recruited individuals aged above 18 and with chronic pain, and therefore CSI-Ar scores for younger and healthy participants were not established. Future research could establish CSI-Ar scores for these groups. In light of the moderate association between CSI-Ar and self-reported outcome measures, future research could explore the association between CSI-Ar and objective assessment of physiological changes in the central nervous system.

Table 7 (fragment). Cronbach's alpha of the items constructing each factor: [...] 18, 22): 0.77; Factor 3: headaches/jaw symptoms (items 4, 6, 7, 10, 19, 20): 0.70; Factor 4: urological symptoms (items 5, 11, 14, 21, 24, 25): 0.63.

Conclusions
Findings of the study support the use of CSI-Ar as an easy-to-understand, valid, and reliable questionnaire. CSI-Ar has psychometric characteristics that are comparable to those reported in similar studies. This has implications for the assessment and treatment of chronic pain mediated by CS, and for facilitating research on CS in Arabic-speaking populations.