Evidence of internal construct validity of SOC-13 total score, for use in hand therapy

Abstract
Purpose: The objective was to examine whether the 13-item Sense of coherence scale (SOC-13) can be reported as a unidimensional interval-scale metric, when new approaches based on the Rasch model to address local item dependency are applied, and to determine whether an interval-scale scoring can be made available.
Methods: Data were derived from two samples of patients with hand-related disorders (merged n = 915). Rasch analyses of the SOC data were conducted using item-level analysis and a testlet approach.
Results: Initial item-level analysis of the SOC-13 confirmed previous findings of misfit to the Rasch model. In resolving local dependency by constructing three testlets, which corresponded to the three components of the SOC construct, fit to the Rasch model (χ² (df) = 43.11 (27), p = 0.163) and unidimensionality of the SOC-13 could be established. A transformation table was successfully created to convert the SOC-13 raw ordinal score to corresponding Rasch interval-scaled values.
Conclusions: The results of this study indicate that data obtained by the SOC-13 can be regarded as essentially unidimensional, and an interval-scale transformation table of the SOC-13 total scores was developed, for use in clinical practice and research on coping resources in patients with hand-related disorders.

Implications for rehabilitation
• The 13-item Sense of coherence scale (SOC-13) comprises three complexly interrelated components.
• To assess coping resources in patients with hand-related disorders, an interval-scale transformation table of the SOC-13 total scores can be used.


Introduction
The number of people living with diseases or consequences of injuries is increasing. Current demographic and health shifts are contributing to a rapid increase in those experiencing disability or decline in functioning in their everyday lives. These changes might lead to health services having to prioritize and expand rehabilitation services, for a number of reasons [1]. The goal of effective rehabilitation may be seen as "to optimize a patient's self-rated quality of life and degree of social integration through optimizing independence in activities, minimizing pain and distress, and optimizing the ability to adapt and respond to changes in circumstances" [2]. Hence, rehabilitation should be individually targeted. A hand disorder and its impact on the patient's everyday life are often multifaceted. It may have a significant impact on health and can continue to affect domestic work, paid work, leisure activities, and health-related quality of life (HRQoL) years after the injury has taken place [3][4][5][6].
Within hand therapy, it has been found that a patient's sense of coherence (SOC) influences rehabilitation outcomes, such as functioning, HRQoL, and service satisfaction, after hand-related injuries [3,7,8]. SOC is defined by Antonovsky (1987) [9] as "a global orientation that expresses the extent to which one has a pervasive, enduring though dynamic feeling of confidence that (1) the stimuli deriving from one's internal and external environments in the course of living are structured, predictable, and explicable; (2) the resources are available to one to meet the demands posed by these stimuli; and (3) these demands are challenges, worthy of investment and engagement" (p. 19) [9]. SOC is considered a personality trait related to how an individual responds to and copes with adverse experiences, such as the event of an orthopedic injury [9]. SOC is associated with physical and psychological health. Individuals with a strong SOC experience life as meaningful and engagement as worthwhile. They experience their life situation as manageable and coherent, as opposed to chaotic or random. Accordingly, they have resources to meet challenging situations in everyday life and feel able to be actively involved and overcome them. In contrast, individuals with a weak SOC may need help to cope and to make use of their own strategies and resources, as well as the resources available to them in their environment, e.g., social relations [9,10].
Studies have reported that patients with a weak SOC feel more affected by their hand-related disorder in activities and participation in everyday life than those with a strong SOC, independent of the severity of the injury [3,7,8]. Therefore, the SOC construct could usefully be included as a key element in tailoring interventions and in prioritizing or planning health services, by identifying the patients in most need of support and supervised rehabilitation after hand-related disorders.
Based on the definition of SOC, Antonovsky developed a patient-reported outcome measure, with 29 items distributed according to the three SOC components: (1) comprehensibility (CO), represented by 11 items, (2) manageability (MA), represented by 10 items, and (3) meaningfulness (ME), represented by 8 items [9]. In addition, the items also reflect features of four facets included in the definition of SOC: the modality of the stimulus (instrumental, cognitive, or affective), its source (internal, external, or both), the nature of the demand (concrete, diffuse, or abstract), and a time reference (past, present, or future) [9]. This 29-item scale also exists in a short version, with 13 items (SOC-13) [9,11], which is often used in clinical research, given its more manageable length. The rating scale is an ordinal scale ranging from 1 to 7, where higher scores indicate a stronger SOC. For both scales, Antonovsky argues that they reflect a single-factor solution and that the three components should not be used as subscales, because they are dynamically interrelated. Thus, the SOC total scale expresses the global orientation of SOC [11].
When using the SOC-13, the scale must provide valid and meaningful measurements. Therefore, it has to be based on the arithmetical properties of interval scales and not ordinal-level scales [12,13]. The Rasch measurement model provides the means to transform ordinal counts into linear measures. Measurement by items that fit the Rasch model has two fundamental properties: specific objectivity and statistical sufficiency. This implies that comparisons between item parameters are independent of the person parameters, and vice versa [14,15]. For fit to the Rasch model, item responses must satisfy some basic assumptions: unidimensionality (i.e., the items of the scale assess one single underlying latent construct, here SOC, in the sense that a single latent variable accounts for the common variance among item responses), monotonicity (i.e., the scale items function hierarchically from easy to difficult, in the sense that increased item scores are expected with increasing degree of the measured construct), homogeneity (i.e., the ordering of the items from easy to difficult is the same for all respondents), local independence (i.e., a score on one item does not depend on another item's score), and absence of differential item functioning (DIF) (i.e., an item's score does not differ because of other factors, such as age or gender, for persons with an equal degree of the measured latent variable) [14,15].
As mentioned, the underlying structure of the SOC construct is highly complex, and it has been discussed in the literature whether it should be seen as a uni- or multidimensional scale. Some studies find support for Antonovsky's theory, that SOC measures one global factor with three lower-order factors, represented by the three components [16][17][18][19], and others find that the scale seems to be multidimensional, consisting of two or three factors [19][20][21]. Sakano et al., for example, found that a second-order factor model composed of two factors, comprehensibility-manageability and meaningfulness, had an acceptable goodness-of-fit [22]. Four studies that conducted analysis by the Rasch model found some degree of multidimensionality, which was interpreted as being due to the three underlying components [23][24][25][26]. The studies suggested various solutions for changes to the SOC-13 so that it would fit the Rasch model, such as eliminating misfitting items or changing the structure of the rating scales [23][24][25][26]. However, removing items from a scale solely because of statistical misfit is not consistent with Rasch measurement theory, because such procedures might distort the assessment and content validity of the scale [14]. One study by Holmefur et al. [25] found indication of local item dependence (LD) between two items, of which one was deleted. However, it is possible that smaller magnitudes of LD across more items were undetected in Holmefur et al. [25], because the number of items relative to the sample size was not considered in the interpretation of the results of the analysis of LD.
LD is a problem as it may lead to inflated reliability estimates and can arise due to either response dependency or multidimensionality [27,28]. Response dependency occurs when items are linked in some way, such that the response to one item governs the response to another, because of similarities in item content or response format [29]. Another form of LD could be caused by redundancy, where the degree of overlap within the content of items is substantial. LD through multidimensionality is typically seen for instruments composed of various dimensions of a broader latent construct [27,28,30], such as the three components in the SOC-13. It is argued that, for most measurements based on theoretical concepts, unidimensionality might be difficult to achieve, given that every test and every set of responses by real individuals is to some degree multidimensional [31]. However, it is useful to think of dimensionality as a continuum, which extends from theoretically unidimensional through essentially unidimensional to multidimensional item response data [31,32].
Scales that are essentially unidimensional can be added into total scores, whereas scales too multidimensional to fit the definition of essential unidimensionality have to use subscales to accurately reflect variance from minor dimensions.
Procedures have recently been suggested for modeling LD in order to achieve essential unidimensionality while maintaining the integrity and content validity of well-known health scales [33,34]. These procedures use a bifactor model/testlet approach, in which interrelated items can be summated together and treated as one item, to absorb the LD [35][36][37]. In a bifactor model, each item loads on one general latent dimension, as well as additional orthogonal secondary dimensions [31]. This approach seems to match Antonovsky's conceptualization of the SOC construct. However, no studies have reported on this within item response theory. The general dimension is usually the main focus of the scale and accounts for the commonality among all of the items. The secondary dimensions, which are specific to subsets of items, reflect item response covariation not explained by the general dimension. Typically, the general dimension is a broad construct (e.g., SOC) and the secondary dimensions are restricted in scope to specific concepts (e.g., ME, CO, MA). It is suggested that a bifactor model can be thought of as a helpful tool in measuring the dimensionality of scales [35].
Therefore, the objective of this study was to examine whether the SOC-13 can be reported as a unidimensional interval-scale metric within a population of patients with hand-related disorders, without deleting any items, and to determine whether an interval-scale scoring can be made available for the SOC-13 version, to be used in research and clinical practice.

Materials and methods
This study uses baseline data from two earlier studies. Study one was a randomized controlled trial investigating the effectiveness of an occupation-based intervention versus a physical exercise-based and occupation-focused intervention, for patients with hand-related disorders [7,38]. Study two was a cohort study with the overall aim of identifying psycho-social factors that can preoperatively predict functioning in patients with hand-related disorders three months after surgery (unpublished observations).

Participants and setting
In total, 915 patients with hand-related disorders participated. Sample 1 consisted of 503 patients referred to hand therapy at a Danish outpatient clinic between 2014 and 2016 [7,38]. They had a broad spectrum of hand-related disorders. Fifty-six percent were women (n = 282). Mean age was 47 (SD 16.1) years. Most participants had a vocational education or medium-length, third-level education; however, they were educated along a spectrum from elementary school to long-term, third-level education (see Table 1).
The second sample involved 412 patients who were due to undergo elective surgery in relation to any hand-related disorder between 2020 and 2021 (unpublished observations). Fifty-eight percent were women (n = 241), and the mean age was 50.5 (SD 13.5) years. Most participants had an elementary school education or medium-length, third-level education; however, here, too, they were educated along a spectrum from elementary school to long-term, third-level education (see Table 1).
All participants were informed about the projects verbally and in writing before enrolment in the main study [7,38] (unpublished observations) and they all gave their written consent. Study one was registered at ClinicalTrials.gov, no.: NCT02098564. The study was approved by the Regional Committee on Health Research Ethics for Southern Denmark, Project-ID 20120123, and by the Danish Data Protection Agency, no.: 14/1845 and 18/57355. Study two was registered by the Danish Data Protection Agency, no.: 19/26731. Due to the nature of the study, approval by the Research Ethics Committee was not required, in accordance with Danish legislation on research ethics. The Danish Data Protection Agency approved the current study, no.: 21/44186. Permission to use the SOC-13 was given by the copyright holder.

Data collection
The SOC-13 and demographic data about gender, age, and level of education were collected at baseline when referred to hand therapy in Sample 1, and at referral to surgery in Sample 2.

The SOC-13 scale
The SOC-13 consists of 13 items, of which four items (1, 4, 7, and 12) represent the meaningfulness component, four items (3, 5, 10, and 13) represent manageability, and five items (2, 6, 8, 9, and 11) concern comprehensibility. The items are rated by the patients on a seven-point Likert scale, where only the extremes are labelled. Different labels are used across items. Eight items share the same labels, ranging from "very often" to "very seldom or never." The remaining labels vary to suit the question. The total score ranges from 13 to 91 [9], where a higher score indicates a stronger SOC, understood as greater ability to handle stressful situations. Of the 13 items, five items (1, 2, 3, 7, and 10) have to be reversed before summing the total score. Due to copyright, the entire scale cannot be displayed; instead, a few examples are given below.
• The meaningfulness items refer to the extent to which demands in life are considered by the participants to be challenges, worthy of investment and engagement. For example, item 7 is formulated as "Doing the things you do every day is …:", which is rated from 1 (a source of deep pleasure and satisfaction) to 7 (a source of pain and boredom) [9].
• The manageability items refer to the degree to which one feels one has resources available to meet the demands posed by the stimuli to which one is exposed. For example, item 3 is formulated as "Has it happened that people whom you counted on disappointed you?", which is rated from 1 (never happened) to 7 (always happened) [9].
• The comprehensibility items refer to the extent to which stimuli deriving from one's internal and external environments in the course of living are structured, predictable, and explicable, as opposed to disordered, chaotic, and unexplained [9,19]. For example, item 6 is formulated as "Do you have the feeling that you are in an unfamiliar situation and don't know what to do?", which is rated from 1 (very often) to 7 (very seldom or never) [9].
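
The scoring rule described above (reverse items 1, 2, 3, 7, and 10, then sum to a total between 13 and 91) can be sketched as follows. This is a minimal illustration only; the function name and the dictionary input format are assumptions made for the example and are not part of the SOC-13 manual.

```python
# Items that are reverse-scored before summing, as listed in the text.
REVERSED_ITEMS = {1, 2, 3, 7, 10}


def soc13_total(responses):
    """Return the SOC-13 raw total score (13-91).

    `responses` maps each item number (1-13) to its raw rating (1-7).
    The input format is an assumption made for this sketch.
    """
    if set(responses) != set(range(1, 14)):
        raise ValueError("expected ratings for items 1-13")
    total = 0
    for item, rating in responses.items():
        if not 1 <= rating <= 7:
            raise ValueError(f"item {item}: rating must be between 1 and 7")
        # Reversal maps 1 -> 7, 2 -> 6, ..., 7 -> 1.
        total += (8 - rating) if item in REVERSED_ITEMS else rating
    return total
```

A respondent who marks the midpoint (4) on every item obtains 13 × 4 = 52, because reversal leaves the midpoint unchanged.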

Statistical analysis
The analysis by the Rasch model was undertaken with the RUMM2030 software [39], which uses a pairwise conditional maximum likelihood algorithm for the estimation of the item and person parameters, where the unit of measurement is logits (log-odds units) that approximate an equal-interval level of measurement. The labelling of the response options of the SOC-13 items varies across items, and a likelihood ratio test was found significant (χ² (df) = 494.14 (117), p < 0.001), indicating that the Partial Credit Model (PCM) should be adopted [40]. In the case of SOC-13, the Rasch model specifies that the probability of a response of 1 to 7 is a logistic function of the difference between the respondent's degree of the measured latent variable (SOC) and the degree of the latent variable represented by the item. Thus, persons (i.e., respondents) and items are located on the same measurement scale, with the mean item location set at zero logits [14,15]. Accordingly, the ordinal scores from the SOC-13 items are expressed as linear measures, where more negative values reflect lower levels of SOC and more positive values reflect higher levels of SOC.
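
To illustrate how the PCM links a person location to the probability of each response category, the standard PCM category-probability formula can be sketched as below. This is not the RUMM2030 implementation; the function name is an assumption, and any threshold values passed to it are hypothetical rather than estimates from this study. Categories 0 to 6 in the sketch correspond to ratings 1 to 7.

```python
import math


def pcm_probabilities(theta, thresholds):
    """Category probabilities for one item under the Partial Credit Model.

    theta: person location in logits.
    thresholds: the m threshold locations (logits) of the item; a
        seven-category item has m = 6 thresholds.
    Returns m + 1 probabilities, for categories 0..m.
    """
    # Cumulative sums of (theta - tau_k); category 0 has an empty sum of 0.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (theta - tau))
    # Subtract the maximum before exponentiating, for numerical stability.
    peak = max(exponents)
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]
```

When the person location equals a threshold, the two categories adjacent to that threshold are equally probable, which is the defining property of a threshold in this model.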
The analysis strategy included both baseline analyses and the testlet approach to improving fit to the Rasch model.

Baseline analysis
Fit to the assumptions of the Rasch model was examined statistically and graphically and followed recommended procedures [14,41,42]. A series of fit statistics are used to indicate satisfactory fit, and their ideal values are shown at the bottom of the overall summary fit table (Table 2) and the individual item fit table (Table 3).
The overall summary fit statistics include a summary item-trait interaction χ² statistic, which should be non-significant (p > 0.05), reflecting invariance of the items across different ability groups (class intervals). Also, summary fit residual statistics across items and persons are included. Both item and person fit statistics are transformed by RUMM2030 to approximate a Z-score representing a standardized normal distribution. When the data fit the model, the overall distribution statistics for item and person fit should approach a standardized mean value of zero and an SD of 1.0 [14,43], though an SD < 1.4 is sometimes considered acceptable [33,34]. As part of the overall summary model fit, reliability and unidimensionality of the scale are reported. Reliability was examined using Cronbach's alpha and the Person Separation Index (PSI), which is the Rasch equivalent of Cronbach's alpha, except that it is calculated from the logit scale person estimates [43]. The PSI indicates the power of the latent variable to discriminate amongst persons [44]. Values ≥ 0.7 for both indices are considered acceptable [45]. Unidimensionality of the SOC scale was assessed by the principal component analysis (PCA)/t-test protocol [41]. The fit residuals for each person, for each item, are entered into a PCA. The loadings of the items on the first residual factor were examined, and the pattern was used to define two subsets of items, with positive and negative loadings, respectively [41]. The difference between the person location estimates from these two subsets of items was investigated for each person using a series of paired t-tests [46]. If less than 5% of the sample shows a significant difference in person location estimates between the two subsets, or if the value of 5% falls within an exact binomial 95% confidence interval (CI) of proportions, it is reasonable to assume that the scale is essentially unidimensional [41].
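
The final decision step of the PCA/t-test protocol can be sketched as below: given the number of persons with a significant t-test, an exact (Clopper-Pearson) binomial interval is computed, and the scale is treated as essentially unidimensional when the lower bound does not exceed 5%. The function names are assumptions made for this sketch, the interval is found by simple bisection using only the standard library, and the counts in the usage note are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with terms computed in log space."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log(1.0 - p)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def exact_binomial_ci(k, n, level=0.95):
    """Clopper-Pearson interval for k 'successes' out of n, by bisection."""
    tail = (1.0 - level) / 2.0

    def root_of_decreasing(f):
        # f is decreasing in p on (0, 1); locate the p where f(p) = 0.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2.0
            if f(mid) > 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    # Lower bound solves P(X >= k | p) = tail; upper solves P(X <= k | p) = tail.
    lower = 0.0 if k == 0 else root_of_decreasing(
        lambda p: tail - (1.0 - binom_cdf(k - 1, n, p)))
    upper = 1.0 if k == n else root_of_decreasing(
        lambda p: binom_cdf(k, n, p) - tail)
    return lower, upper


def essentially_unidimensional(n_significant, n_persons, critical=0.05):
    """True if the lower CI bound for the proportion is at or below 5%."""
    lower, _ = exact_binomial_ci(n_significant, n_persons)
    return lower <= critical
```

For instance, with a hypothetical 40 significant tests among 915 persons (about 4.4%), `essentially_unidimensional(40, 915)` returns `True`.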
For model fit at item and person level, fit residual values between ± 2.5 and a non-significant χ² statistic are regarded as acceptable [43]. Item fit residuals that are too high suggest under-discrimination and might reflect multidimensionality, whereas fit residuals that are too low indicate over-discrimination and might reflect potential redundancy or dependency within the item set [14,15]. Also, the monotonic function of the seven response categories of the SOC-13 items was assessed using threshold ordering. Thresholds are the boundaries between adjacent categories, which should increase monotonically (i.e., ordered thresholds) [14,15,40]. The ordering of the six thresholds for each of the SOC-13 items was examined using a threshold map and category probability curves. The presence of disordered thresholds might be due to too many response options, unclear item and category descriptions, LD, or multidimensionality [14,47].
LD was assessed using a residual correlation matrix of the items. A value of 0.2 above the average correlation is suggested as a guiding cut-off value for detecting LD, though the indices might be lower when there are relatively few items and many respondents within a data set [29], as in the current study. At times, the presence of LD will also be reflected in the fit of data to the model. In general, over-discriminating items indicate response dependence or redundancy, and under-discriminating items indicate multidimensionality [14]. Although it might be difficult to distinguish between trait and response dependence, the magnitude of the dependence will be reflected by a decrease in the reliability when modeling LD by using testlets to achieve fit to the Rasch model [14,15].
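
The guiding cut-off described above (0.2 above the average residual correlation) can be applied to a residual correlation matrix as in the following sketch. The function name and the list-of-lists input are assumptions made for the example; the matrix itself would come from the Rasch software.

```python
from itertools import combinations


def flag_local_dependence(residual_corr, margin=0.2):
    """Return the cut-off and the item pairs whose residual correlation exceeds it.

    residual_corr: square symmetric matrix (list of lists) of item residual
        correlations; the diagonal is ignored.
    margin: distance above the average off-diagonal correlation (0.2 in the text).
    """
    n_items = len(residual_corr)
    pairs = list(combinations(range(n_items), 2))
    mean_r = sum(residual_corr[i][j] for i, j in pairs) / len(pairs)
    cutoff = mean_r + margin
    flagged = [(i, j) for i, j in pairs if residual_corr[i][j] > cutoff]
    return cutoff, flagged
```

Because residual correlations average slightly below zero, the resulting cut-off is typically somewhat lower than 0.2 itself.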
DIF was examined by gender, age (four groups according to the interquartile range), injury type (acute injury/elective surgery), and educational level (two groups according to length of higher education (<3/≥3 years)). These exogenous variables were chosen because it is hypothesised that they might influence SOC [10]. For the merged sample, DIF by sample (Sample 1 versus Sample 2) was also addressed. In RUMM2030, the analysis of DIF uses a two-way ANOVA on the residuals for each item across the subgroups and across the class intervals. DIF can occur as either uniform DIF, where item responses differ uniformly across the measured variable (i.e., a main effect), or as non-uniform DIF, where differences in item responses between subgroups vary across the measured variable (i.e., an interaction effect) [14,48].
The baseline analysis was carried out on data from the total SOC-13 scale in three samples: Sample 1, Sample 2, and the merged sample. Additional baseline analyses were also undertaken for the three separate SOC components, which are presented in Supplemental File 1. Before continuing to the testlet approach, different solutions for achieving fit to the model were assessed at item level, which are also presented in Supplemental File 1.
For all analyses, the Bonferroni correction was used to adjust for multiple testing, keeping the Type I error to 5%.
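
As a small illustration of the Bonferroni adjustment, the per-test significance level is the family-wise level divided by the number of tests. The function name is an assumption, and the number of tests used in the usage note (13, one per item) is only an example; the number of tests behind each analysis in this study is not stated here.

```python
def bonferroni_alpha(n_tests, family_alpha=0.05):
    """Per-test significance level that keeps the family-wise Type I error at family_alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests
```

With 13 item-level tests, `bonferroni_alpha(13)` gives a per-test level of roughly 0.0038.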

Testlet approaches to achieving fit
The testlet approach involves combining a set of associated items into a new item and is often undertaken for the purpose of absorbing the presence of LD [27,28,30]. The creation of testlets should preferably be based on a combination of the evaluation of the indices of LD and theoretical considerations [14,30]. As such, it was expected, a priori, that the items within each of the three components of the SOC-13 would be interrelated and could be combined into three testlets.
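
Combining items into component testlets amounts to summing the item scores within each component, as in the sketch below. The item allocation follows the description of the SOC-13 given earlier and the testlet labels follow those used in the results; the function name and input format are assumptions, and the internal scoring used by the Rasch software may differ (e.g., categories starting at 0).

```python
# Item allocation per component, following the scale description in the text.
TESTLETS = {
    "T1_ME": (1, 4, 7, 12),      # meaningfulness
    "T2_CO": (2, 6, 8, 9, 11),   # comprehensibility
    "T3_MA": (3, 5, 10, 13),     # manageability
}


def to_testlet_scores(item_scores):
    """Sum item scores within each component.

    item_scores: dict mapping item number (1-13) to the rating after
        reverse-scoring (an assumed input format for this sketch).
    """
    return {name: sum(item_scores[i] for i in items)
            for name, items in TESTLETS.items()}
```

The three testlet scores add up to the same total as the 13 item scores, so the total raw score is unchanged by this regrouping.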
When creating testlets, the data are re-analyzed using the testlets instead of individual items. In addition, supplementary indicators are available in RUMM2030 and provide information on the latent correlation among the testlets and on the explained common variance (ECV), defined as the ratio of variance explained by the general factor divided by the variance explained by the general plus the testlet (dimensional) factors [14,35]. The higher the ECV, the "stronger" the general factor relative to the dimension factors and, thus, the more confidence there is in applying a unidimensional measurement model to multidimensional data. An ECV value of 1 indicates that all non-error variance is contained within the latent estimate, and a value > 0.90 is considered sufficient to indicate that the general factor is unidimensional [14,35]. An ECV value < 0.70 indicates substantial multidimensionality [31].
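
The ECV definition and the two interpretive cut-offs given above can be expressed as in the following sketch. The function names and the label for values between 0.70 and 0.90 are assumptions made for the example, and the variance components in the usage note are hypothetical.

```python
def explained_common_variance(general_var, specific_vars):
    """ECV = general-factor variance / (general + testlet-specific variance)."""
    total = general_var + sum(specific_vars)
    if total <= 0:
        raise ValueError("total common variance must be positive")
    return general_var / total


def interpret_ecv(ecv):
    """Apply the cut-offs quoted in the text (> 0.90 and < 0.70)."""
    if ecv > 0.90:
        return "sufficiently unidimensional"
    if ecv < 0.70:
        return "substantially multidimensional"
    return "intermediate"  # label assumed; the text gives no name for this range
```

For example, hypothetical variances of 9.6 for the general factor and 0.2, 0.1, and 0.1 for three testlet factors give an ECV of 0.96.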
DIF was also analysed during the testlet approach. In the presence of DIF, the testlets showing DIF for the included group factors were split, starting with the strongest DIF and continuing until no further DIF was present [42]. Hereafter, the person location estimates from the split and unsplit solutions were compared, anchored to each other with a testlet that is free of DIF [49]. To determine the effects of DIF, an effect size calculation, based on the mean of the person estimates and their standard deviations, and on the correlation of the split and unsplit version [50], was applied. If the effect size was below 0.2, DIF was considered small [51], and no action was taken to adjust for DIF in the transformation table.
Finally, a transformation table from the SOC-13 raw ordinal total scores to the corresponding interval-scaled values was developed, based on the respective person location estimates according to the Rasch model. The person location estimates from the testlet approach that achieved overall fit to the Rasch model were taken as the basis for this transformation. In addition, the scale-to-sample targeting of the final SOC-13 was evaluated using the person-item thresholds distribution map, which visually depicts person locations against item locations. A well-targeted rating scale will have both item and person mean locations of around zero, and there will be enough item thresholds of varied difficulty (i.e., measuring varying degrees of SOC) to match the spread of scores among respondents [43].
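
The principle behind a raw-score-to-logit transformation can be sketched as below: under the Rasch model with complete data, the maximum-likelihood person location for a given raw score is the location at which the expected total score equals that raw score. This is a plain maximum-likelihood sketch under the PCM, not the estimator or output of RUMM2030; the function names are assumptions, the thresholds in the usage note are hypothetical, and no values from the published transformation table are reproduced. Scores are counted from 0 here, and extreme scores (minimum and maximum) are excluded because they have no finite maximum-likelihood estimate.

```python
import math


def expected_item_score(theta, thresholds):
    """Expected score (0..m) on one PCM item or testlet at location theta."""
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (theta - tau))
    peak = max(exponents)
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return sum(score * w for score, w in enumerate(weights)) / total


def theta_for_raw_score(raw, items, lo=-10.0, hi=10.0):
    """Location (logits) at which the expected total score equals `raw`.

    items: list of threshold lists, one per item or testlet.
    raw: total score counted from 0; must lie strictly between 0 and the maximum.
    """
    max_score = sum(len(thresholds) for thresholds in items)
    if not 0 < raw < max_score:
        raise ValueError("extreme scores have no finite ML estimate")
    # The expected total score increases with theta, so bisection applies.
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if sum(expected_item_score(mid, t) for t in items) < raw:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

With two hypothetical items whose thresholds are symmetric around zero, `[[-1.0, 1.0], [-1.0, 1.0]]`, the middle raw score of 2 maps to a location of 0 logits, and lower or higher raw scores map to negative or positive locations, respectively.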

Sample size
For a well-targeted rating scale, a sample size of around 250-500 usually provides accurate and stable person and item estimates and a good balance for statistical interpretation of the fit statistics [52]. However, the number of item thresholds also influences the required sample size, and a ratio of 10-20 respondents for each threshold is suggested [14]. This rule of thumb is satisfied for the current study, which has 915 respondents. To ensure the robustness of the analysis, the baseline analysis and the testlet approach were performed for the two samples separately and for the merged sample, which was the final validation sample and constituted the basis for the transformation table.
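
The rule of thumb can be checked with simple arithmetic at item level: 13 items with seven response categories have 13 × 6 = 78 thresholds, so 915 respondents correspond to roughly 11.7 respondents per threshold. The sketch below only reproduces this calculation; the function name is an assumption.

```python
def respondents_per_threshold(n_respondents, n_items, n_categories):
    """Return the number of thresholds and the respondents-per-threshold ratio."""
    n_thresholds = n_items * (n_categories - 1)
    return n_thresholds, n_respondents / n_thresholds


# Merged sample: 915 respondents, 13 items, 7 response categories.
n_thresholds, ratio = respondents_per_threshold(915, 13, 7)
```

The resulting ratio lies within the suggested range of 10-20 respondents per threshold.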

Results

Baseline analysis
Table 2 (Analyses No. 1 to 3) shows that no overall fit to the Rasch model was achieved during the baseline analysis of the SOC-13 across the three samples. In all analyses, the item-trait interaction was significant (p < 0.001), indicating absence of homogeneity, and the fit residual SD for items exceeded 1.40, indicating some misfitting items. The fit residual mean (SD) values for persons ranged from −0.32 (1.39) to −0.42 (1.46), which reflects no serious misfit among the respondents in the sample. Very few of the responses were at the ceiling of the scale (1% for Sample 1, 2.7% for Sample 2, and 1.75% for the merged sample). The t-test indicated that the person estimates from the two subsets of items with positive and negative loadings were significantly different for approximately 6-9% of the persons across the three samples. The lower confidence interval bound overlapped or just exceeded the critical value of 5%, which is considered a minor violation of unidimensionality. The PSI was 0.87-0.88 and Cronbach's alpha was 0.90, indicating good to excellent reliability, which implies that the SOC-13 discriminates well between persons with different degrees of SOC (i.e., three distinct groups). In the analysis of the individual SOC components, significant item-trait interactions were found for all three components, although the results of the t-tests indicated no violation of unidimensionality (see Supplemental File 1).
Table 3 displays the item-level fit statistics. All three sample analyses presented misfitting items, of which the majority were under-discriminating, which indicates multidimensionality. Disordered thresholds were found in 11 items for Sample 1, in 4 items for Sample 2, and in 8 items for the merged sample, which indicates absence of monotonicity. Uniform DIF by gender and by injury type was displayed for a few items across the three sample analyses (see Table 3). For the merged sample, one item (3MA) displayed uniform DIF by sample. Using the guiding cut-off values, LD was found across some item pairs (Figure 1). In addition, positive residual correlations were also detected for several item pairs within each of the three components, albeit at low levels (Figure 1).
Several strategies to achieve model fit for the SOC-13 were undertaken before proceeding to the testlet approach (see Supplemental File 1). However, the final scale resulting from the different solutions did not achieve overall fit to the model, and the reliability became too low (PSI = 0.64).

Testlet approach
Although only low levels of LD were identified, the SOC items were combined into three testlets, according to the three hypothesized components (T1_ME = meaningfulness, T2_CO = comprehensibility, and T3_MA = manageability). Table 2 (Analyses No. 4-6) shows that, in all sample analyses, the item-trait interaction became non-significant (p > 0.01) and the PCA/t-test protocol showed that less than 5% of the person location estimates differed significantly on the two most divergent testlets. The magnitude of PSI decreased slightly but remained > 0.80, which indicates that it is still possible to discriminate among three distinct groups of SOC. Across the three sample analyses, the average latent correlation between the testlets ranged from 0.90 to 0.93, and when adding the three components together to make a total score, 96-97% of the total non-error variance was found to be common. These results indicate that it is possible to summarize the responses to the items within each component into a single total score.
Despite overall fit to the Rasch model for the 3-testlet solution when merging Samples 1 and 2, DIF by gender and educational level was present (Table 4). The most significant DIF was found for gender in T1_ME, which had to be split into T1_ME_female and T1_ME_male. This resolved DIF by gender for T3_MA. However, DIF by education remained for T2_CO, which had to be split into T2_CO_shorter education and T2_CO_longer education. This procedure maintained the overall fit to the Rasch model (Table 2, Analysis No. 7). Hereafter, T3_MA served as the anchor for the comparison of the measures of the unsplit 3-testlet solution (person location mean (SD): 0.858 (0.820)) and the DIF-free split solution (person location mean (SD): 0.888 (0.847)). With a sample size of n = 915, this resulted in an effect size of 0.036 and a correlation of r = 0.997, indicating that there was no need to split the final interval-scale transformation into different subgroups.
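
As a plausibility check of the reported effect size, a standardized mean difference using the pooled SD of the two solutions can be computed from the reported means and SDs. The exact formula of the cited method [50] is not reproduced in the text, so the equal-weight pooled-SD form below is an assumption; the function name is likewise an assumption.

```python
import math


def pooled_sd_effect_size(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference using an equal-weight pooled SD (assumed form)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return abs(mean_b - mean_a) / pooled_sd


# Reported person location mean (SD): unsplit 0.858 (0.820), split 0.888 (0.847).
d = pooled_sd_effect_size(0.858, 0.820, 0.888, 0.847)
```

With the reported values, this calculation gives approximately 0.036, which is well below the 0.2 limit for a small effect.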
Based on the 3-testlet solution, an interval-based transformation table was created for the SOC-13 total raw scores (13-91) (Table 5), which can be used to transfer the ordinal SOC-13 scores into interval-scaled SOC-13 scores.
Figure 2 shows that the distributions of item thresholds and person location estimates along the common logit scale are reasonably aligned, although the sample was, on average, at a higher level than the average of the scale, and there was clustering of item thresholds and a few gaps at the highest end of the continuum. The peak of the test information function was at 0 logits of the continuum, indicating the best point of measurement of the SOC-13 (Figure 2). This reflects a pattern of better targeting for the respondents with lower SOC than for those with higher SOC.

Discussion
This study presented a solution to make data obtained by the SOC-13 unidimensional enough to be characterized by a total score. Using the testlet approach, it was possible to retain all 13 items of the scale and to determine that the data were essentially unidimensional. The baseline analyses and traditional approaches to improving an assessment scale did not result in fit to the unidimensional Rasch model. The results from our baseline analysis of the SOC-13 align with earlier findings in terms of a lack of unidimensionality, misfit of items, lack of consistent scaling properties, and signs of LD and DIF by gender [23][24][25][26]. In the current study, items 2CO, 3MA, 8CO, and 12ME had significant misfit in the merged sample. LD was found between item 2CO (Has it happened in the past that you were surprised by the behavior of people whom you thought you knew well?) and item 3MA (Has it happened that people whom you counted on disappointed you?). The combinations of the four facets are identical for those two items, and they are the only items with this combination of facets (instrumental, external, diffuse, and past) [9]. This specific combination of facets may have led to complex constructs and may be the cause of their LD, even though the items belong to different components of SOC. Item 8CO has a unique facet combination (cognitive, internal, diffuse, and present) [9], which may explain its misfit. The same applies to item 12ME (instrumental, external, concrete, and present) [9]. Moreover, the Danish translation may have adversely affected the participants' understanding of the items, causing misfit. In addition to LD for the item pair 2CO and 3MA, we also detected LD of different magnitudes between item pairs within and across the components. Given that model fit was achieved and the reliability indices decreased when the three testlets, corresponding to the three SOC components, were created, the effect of LD appears to have been modeled adequately. Holmefur et al. [25] also found LD for the item pair 2CO and 3MA, as the only pair, but might have failed to fully address the effect of LD. They combined the two items into a super item, but still found misfit to the model.
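The construction of testlets can be sketched in a few lines of Python: the item scores belonging to one component are summed into a single super item, so that dependency within a component is absorbed into that super item. The item-to-component mapping below is an assumption; only the assignments of items 2, 3, 8, and 12 can be read from the item labels used in this paper, and the remaining assignments should be checked against the scale documentation. The function and variable names are hypothetical.

```python
# Assumed SOC-13 item-to-component mapping. Only items 2 (CO), 3 (MA), 8 (CO)
# and 12 (ME) are confirmed by the item labels in the text; the rest is an
# assumption that must be verified before use.
ASSUMED_TESTLETS = {
    "comprehensibility": (2, 6, 8, 9, 11),
    "manageability": (3, 5, 10, 13),
    "meaningfulness": (1, 4, 7, 12),
}

def build_testlets(responses, testlets=ASSUMED_TESTLETS):
    """Sum item scores within each component to form one super item per component.

    `responses` maps item number (1..13) to an item score that has already
    been reverse-scored where required.
    """
    needed = {item for items in testlets.values() for item in items}
    missing = sorted(needed - set(responses))
    if missing:
        raise ValueError(f"missing item responses: {missing}")
    return {name: sum(responses[i] for i in items) for name, items in testlets.items()}

# Hypothetical respondent who scores 4 on every item.
example = {item: 4 for item in range(1, 14)}
testlet_scores = build_testlets(example)
print(testlet_scores)
```

The three testlet scores add up to the same raw total as the 13 item scores, which is why the testlet solution leaves the total score unchanged.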
LD has been shown to inflate reliability and to distort interpretations of analyses based on the Rasch model [14,53]. Local dependence can also cause disordered thresholds [14]. In the current analysis of the merged sample and Sample 1, thresholds were disordered for several items, which were also locally dependent on one or more other items. More generally, LD might have caused misinterpretation of the construct validity of well-known scales [54]. The testlet approach used in our study to absorb the effect of LD has led to positive results in earlier Rasch analyses of well-known health scales, such as the Functional Independence Measure (FIM™) [33,54], the extended Barthel Index [34], and the Disabilities of the Arm, Shoulder, and Hand (DASH) questionnaire [53].
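The inflation of reliability by LD can be illustrated with a small Python sketch. It assumes Cronbach's alpha as the reliability index (the person separation index reported by RUMM2030 is not reproduced here) and uses a deliberately exaggerated, hypothetical data set in which the two items within each pair are answered identically. The data and names are illustrative only and are not drawn from the study samples.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of part scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two parts and two respondents")
    part_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(part_variances) / total_variance)

# Hypothetical scores of four respondents on four items. Items 1-2 and items
# 3-4 are answered identically, i.e. an extreme case of local dependency.
ITEM_ROWS = [
    [1, 1, 3, 3],
    [3, 3, 1, 1],
    [5, 5, 7, 7],
    [7, 7, 5, 5],
]

# Absorb the dependency by summing each dependent pair into one testlet.
TESTLET_ROWS = [[row[0] + row[1], row[2] + row[3]] for row in ITEM_ROWS]

alpha_items = cronbach_alpha(ITEM_ROWS)
alpha_testlets = cronbach_alpha(TESTLET_ROWS)
print(round(alpha_items, 3), round(alpha_testlets, 3))
```

In this constructed example, alpha drops from about 0.92 at item level to 0.75 at testlet level, because the duplicated information within each pair no longer counts as independent evidence.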
The testlet approach with three testlets corresponding to the three SOC components revealed fit of the SOC data to the unidimensional Rasch model without the need to re-score the items with disordered thresholds or remove misfitting items from the scale. By using the testlet approach in this study, the original, intended variable of assessment as developed by Antonovsky [9] was not altered, whereas eliminating items would decrease the content validity of the scale. This indicates that a SOC-13 total score could be considered sufficiently valid and reliable when used in research and clinical practice for patients with hand-related disorders. It is worth noting that the testlet approach is best suited to assessment instruments in which item responses are primarily expected to reflect a strong common latent trait, and in which there is concurrent multidimensionality caused by well-defined clusters of items from different subdimensions [30].
Measures that have been shown to fit confirmatory correlated factors and higher-order models, as is the case for the SOC-13 [16][17][18][19], are good candidates for testlet modeling [30]. A drawback of the testlet approach is the reduced level of detail at item level, although this is consistent with how the scale is used in practice and research when relying on a summated score.
In the current study, Sample 2 had a better fit to the Rasch model than Sample 1 in the baseline analysis. One reason might be that Sample 2 was homogeneous with respect to injury type, given that injury type created DIF in Sample 1 and in the merged sample. All patients in Sample 2 were due to undergo elective surgery for their hand-related disorder; for this reason, DIF by injury type was not present. Our final analyses found no need to split items in the transformation table because of DIF. Based on the results of the current study, it was possible to provide an interval-scale transformation table of the SOC-13 total score for use in future research to support a client-centred practice with an individual approach.
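Applying such a transformation table amounts to a simple lookup from the raw total score to its interval-scaled value, which is only defined when all 13 items have been completed. The Python sketch below illustrates this under stated assumptions: a 7-point response format per item, hypothetical function names, and a placeholder table excerpt whose values are made up and are not taken from Table 5.

```python
N_ITEMS = 13                  # number of items in the SOC-13
MIN_SCORE, MAX_SCORE = 1, 7   # assumed 7-point response format

def raw_total(responses):
    """Return the raw SOC-13 total, or raise if the response set is incomplete."""
    if len(responses) != N_ITEMS or any(r is None for r in responses):
        raise ValueError("transformation requires all 13 items to be completed")
    if any(not MIN_SCORE <= r <= MAX_SCORE for r in responses):
        raise ValueError("item score outside the assumed 1..7 range")
    return sum(responses)

def to_interval(raw, table):
    """Look up the interval-scaled value for a raw total in a conversion table."""
    try:
        return table[raw]
    except KeyError:
        raise ValueError(f"raw score {raw} not in transformation table") from None

# Placeholder excerpt with made-up values; NOT taken from Table 5.
PLACEHOLDER_TABLE = {51: -0.1, 52: 0.0, 53: 0.1}

complete = [4] * 13  # hypothetical respondent with complete data
print(to_interval(raw_total(complete), PLACEHOLDER_TABLE))
```

Rejecting incomplete response sets, rather than imputing or prorating, is a design choice of this sketch that mirrors the restriction of the table to complete data.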
It has been argued that a task for future research is to develop a scale to measure SOC in which the three components are measured relatively independently of each other [55]. However, the complexity of the SOC and the inter-item relationships across components might make this task challenging. Moreover, the factorial structure is affected by the four additional facets in the SOC-13 (the modality of the stimulus, its source, the nature of the demand, and a time reference) [9]. On the other hand, a SOC scale in which the three components are measured independently may help health professionals to address specific elements related to the three components, but might leave out details related to the interaction between components. Using the full SOC construct in clinical practice gives individualized and in-depth knowledge of the patient and may contribute to client-centred rehabilitation planning and thereby support the individual person's resilience to various stressors [9,55]. The successful summation of the 13 SOC items in the current study might reflect the complex higher-order construct of SOC, incorporating the comprehensibility, manageability, and meaningfulness components.

Limitations of the study
A limitation of our study is that the results are based on secondary data analyses, which limited the choice of person factors for the DIF analyses. However, the most essential factors, such as age, gender, and education, were present. Given that we cannot determine how the 13 items in the SOC-13 were selected, it might have been better to use the SOC-29. Nevertheless, the SOC-13 has been found to be as valid as the SOC-29 [56] and is widely used in clinical studies because of its more accessible length [3,8]. For this reason, we considered the SOC-13 appropriate to our study objectives.
Another limitation lies in the three-testlet approach, which provides the basis for the transformation table. Although model fit was achieved by combining the items into three testlets corresponding to the three components, the approach does not allow a description of the hierarchical distribution and difficulty of the items, because the overall location of each testlet is an average across all the items contained within it. However, although the purpose of the SOC-13 scale is to provide a total score for sense of coherence, data collection is conducted at item level, allowing researchers and clinicians to obtain a more qualitative description of the profile of a single respondent on a certain item or group of items. In addition, it could be argued that, to obtain further insight into the inter-relatedness of the three separate components, pairwise analyses of the three components could have been undertaken (e.g. the meaningfulness testlet vs. the manageability testlet). However, this approach was not carried out in the present analysis. If the three-testlet approach had resulted in model misfit and an indication of multidimensionality in the PCA/t-test protocol, then alternative combinations of items and testlets could have been tested, as in Maritz et al. [33,34]. It is also worth noting that, although disordered thresholds were found at item level, it is not possible to assess threshold ordering with the testlet approach.
We used the RUMM2030 software; however, other software could have provided other appropriate analyses of dimensionality and LD in terms of a multidimensional Rasch model, a testlet Rasch model, or a log-linear Rasch model [15]. However, this would require a larger sample [14]. This could advantageously be done in a future study with a large sample from different populations using the SOC-29. With this method, a person estimate could be given that would be more accurate than that in our transformation table.

Conclusion
The results of this study indicate that data obtained by the SOC-13 can be regarded as essentially unidimensional, and that it is appropriate to use a SOC total score. Based on the current results, an interval-scale transformation table of the SOC-13 total score could be provided for use in both clinical practice and research on coping resources in patients with hand-related disorders, to support a client-centred practice with an individual approach. Further validation is still needed.

Figure 2. Person-item threshold distribution of the SOC-13 data.

Table 1. Demographics of respondents.
* Not included in DIF analysis by education.

Table 3. Item-level fit statistics for the SOC-13 scale for the two subsamples and the merged sample.

Table 4. Differential item functioning (DIF) analysis and strategy for the testlet approach (merged sample).

Table 5. Sense of coherence (SOC-13) total score transformation table: original scores to interval scores.
Note: The transformation table is valid for complete data, where all 13 items have been completed.