Measurement properties of a patient-reported outcome measure assessing psoriasis severity: The psoriasis symptoms and signs diary.

Abstract Background: Collecting reliable and valid symptom information from patients is critical for assessing psoriasis severity in clinical research. Objective: To evaluate measurement properties of a new patient-reported outcome (PRO), the Psoriasis Symptoms and Signs Diary (PSSD). Methods: One hundred six US patients with moderate-to-severe plaque psoriasis completed two versions of the PSSD [a 24-hour recall (PSSD-24h) and 7-day recall (PSSD-7d)] using a 0–10 numerical rating scale. Reliability (test-retest and internal consistency), validity (convergent, divergent and known-groups), responsiveness, and version equivalence were evaluated. Minimally important difference was estimated. Results: Based on exploratory factor analysis and clinical input, symptom, sign, and total severity scores were established. Internal consistency (Cronbach’s alpha ≥ 0.944) and test-retest reliability (intraclass correlation coefficients ≥ 0.824) were acceptable. Correlations with Dermatology Life Quality Index (DLQI) (0.489 to 0.644) indicated convergent validity, while low correlations (< 0.30) with several Short Form (SF)-36 scales indicated divergent validity. PSSD scores differed when patients were categorized by Body Surface Area, DLQI, and Psoriasis Area Severity Index scores. PSSD-24h and PSSD-7d versions were equivalent (Pearson correlations ≥ 0.953). Limitations: PSSD responsiveness should be evaluated in patients receiving treatment. Conclusion: The PSSD is reliable and valid in measuring symptoms/signs of patients with moderate-to-severe plaque psoriasis.


Introduction
Psoriasis is a symptomatic disease that can have a significant impact on health-related quality of life (HRQoL) (1-7). Collecting information directly from patients through the use of patient-reported outcomes (PROs) has become more common in psoriasis research in recent years (8) and can provide valuable information when assessing psoriasis severity and the change in severity resulting from effective therapies (9,10). Currently available psoriasis-specific PROs are often limited in scope (11-14) or were not developed based on best practices included in the US Food and Drug Administration (FDA) guidance document for PRO development (10).
Recently, an 11-item PRO measure, the psoriasis symptoms and signs diary (PSSD), was developed for use in assessing patients with moderate-to-severe plaque psoriasis (15,16) per the FDA's PRO Guidance (10). The PSSD assesses symptoms (itch, pain, stinging, burning, skin tightness) and patient-observable signs (skin dryness, cracking, scaling, shedding/flaking, redness, bleeding). Two versions are available: one with a 24-h recall (PSSD-24h) and one with a 7-day recall (PSSD-7d). The aim of the current study was to evaluate the psychometric properties of the PSSD, including the scale structure and its item and scale variability, reliability, validity and responsiveness to change. Additionally, we sought to estimate the instrument's minimally important difference (MID) and evaluate the equivalence of the two recall versions.

Study patients and procedures
Patients signed informed consent forms, and human subjects' research approval for this project was provided by Copernicus Group (Cary, NC). Patients aged ≥ 18 years with moderate-to-severe plaque psoriasis for ≥ 6 months not currently enrolled in a psoriasis clinical trial were recruited from seven US dermatology practices. Eligible patients were required to have a body surface area (BSA) rating ≥ 10%, a psoriasis area severity index (PASI) ≥ 12 and a physician's global assessment (PGA) ≥ 3. Patients were excluded if they had a dermatologic condition other than psoriasis or a medical condition or treatment that could interfere with participation. Patients completed a background demographic questionnaire. Participating sites completed clinical case report forms for each patient at baseline and day 14, which included disease and treatment histories, and BSA, PGA and PASI scores.
Patients were randomly assigned to one of two groups ("A" and "B") in an alternating fashion to assess the equivalence of the two versions of the PSSD. Patients in Group A completed the PSSD-24h version daily for the 2-week period (days 1 through 14), while patients in Group B completed it daily during only the second week (days 8 through 14). All patients completed the PSSD-7d, dermatology life quality index (DLQI) (17), short form-36 (SF-36) (18) and patient global impression (PGI) on a weekly basis on days 7 and 14; patients in Group A also completed these assessments on day 1. Patients in both groups were provided with a date and time stamper to track compliance for timely completion of the assessment.

Study measures
The PSSD-7d and PSSD-24h both contain the same 11 items assessing the presence and severity of symptoms and observable signs. Response options use an 11-point numerical rating scale ranging from 0 (absent) to 10 (worst imaginable). In addition, the PSSD-7d asks about frequency of each symptom on a 5-point scale: "None of the days" (0 days), "A small number of days" (1-2 days), "Some days" (3-4 days), "Most days" (5-6 days), "All of the days" (7 days). The DLQI contains 10 items to assess impact of skin disease on QoL. The SF-36 is a 36-item general health measure that produces eight domain scores and two summary scores: a physical component summary (PCS) and mental component summary (MCS). The PGI assesses baseline severity (PGI-B), change in symptoms from baseline (PGI-Sx) and change in HRQoL since baseline (PGI-QoL).

Descriptive analyses and equivalence of PSSD-24h and PSSD-7d
Comparisons were made between Groups A and B on demographic and clinical characteristics using Fisher's exact test or Chi-square test for categorical measures and independent samples t-tests or Mann-Whitney U-tests for continuous measures. The equivalence of the PSSD-24h and PSSD-7d versions was evaluated using Pearson correlations and scatterplots.
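The equivalence check described above can be illustrated with a short Python sketch. It is not the authors' analysis code: the function name pearson_r and all scores are hypothetical, and pairing each patient's mean daily PSSD-24h score for a week with the PSSD-7d score for the same week is an assumption about how the two versions were matched.

```python
from math import sqrt
from statistics import mean
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two paired score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical 0-100 scores for five patients (not study data).
# Assumption: the seven daily PSSD-24h scores are averaged per patient
# and paired with that patient's PSSD-7d score for the same week.
daily_24h = [
    [30, 35, 32, 28, 31, 33, 35],
    [50, 55, 52, 48, 51, 53, 55],
    [70, 72, 68, 71, 69, 70, 70],
    [20, 22, 18, 21, 19, 20, 20],
    [85, 88, 90, 87, 86, 84, 89],
]
weekly_24h = [mean(days) for days in daily_24h]
score_7d = [35, 50, 72, 22, 85]

r = pearson_r(weekly_24h, score_7d)
print(round(r, 3))
```

A correlation close to 1 would suggest that the two recall versions rank patients almost identically; a scatterplot of the same pairs would show whether agreement holds across the whole score range.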

Item-level and exploratory factor analyses
Descriptive measures, including means, standard deviations (SDs), skew and floor and ceiling effects, were calculated for individual items. Exploratory factor analysis (EFA) was performed to determine scale structure and the feasibility of creating separate PSSD scales for symptoms and signs. The final factor structure was based on the number of eigenvalues greater than 1.0, examination of the scree plot, factor loadings ≥ 0.40 and cross-factor loadings < 0.30. EFA analyses were performed separately for severity and frequency (for PSSD-7d only).
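The floor and ceiling effects mentioned above can be illustrated with a short Python sketch. It assumes, as a working definition not stated in the source, that a floor effect is the percentage of responses at the lowest rating (0) and a ceiling effect the percentage at the highest rating (10); the function name and the example ratings are hypothetical.

```python
from typing import Sequence, Tuple


def floor_ceiling(
    ratings: Sequence[int], lowest: int = 0, highest: int = 10
) -> Tuple[float, float]:
    """Return (floor %, ceiling %) for one item's ratings.

    Assumed definition: share of responses at the scale minimum and maximum.
    """
    n = len(ratings)
    floor_pct = 100.0 * sum(r == lowest for r in ratings) / n
    ceiling_pct = 100.0 * sum(r == highest for r in ratings) / n
    return floor_pct, ceiling_pct


# Hypothetical ratings from ten patients on one severity item (not study data).
bleeding = [0, 0, 3, 5, 10, 7, 0, 2, 10, 4]
print(floor_ceiling(bleeding))  # (30.0, 20.0)
```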

Scale-level analysis
Severity scale scores (total severity, symptom severity, sign severity) for both the PSSD-24h and PSSD-7d were created by averaging items when > 50% of the items in that scale were answered. Frequency scale scores were created in a similar fashion for the PSSD-7d. For the PSSD-24h, frequency scales counted the number of severity items with a rating > 0, since the PSSD-24h does not contain items for frequency. Scale scores were converted into 0-100 scoring, with 0 representing the least severe or least frequent and 100 the most severe or most frequent. Descriptive information on PSSD-24h scales (week 1) and PSSD-7d scales (day 7) included means, standard deviations (SDs), skew, frequency and percentages of floor and ceiling responses.
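The severity scoring rule described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' scoring program: the function name score_scale, the use of None for unanswered items, and the linear rescaling of the 0-10 item mean to 0-100 (multiplying by 10) are assumptions; the source states only that scores were converted into 0-100 scoring.

```python
from typing import Optional, Sequence


def score_scale(items: Sequence[Optional[float]]) -> Optional[float]:
    """Return a 0-100 severity score, or None if too few items were answered.

    Assumptions (not taken from the source): unanswered items are passed as
    None, and the 0-10 item mean is rescaled linearly to 0-100.
    """
    answered = [x for x in items if x is not None]
    # The scale is scored only when more than 50% of its items were answered.
    if not items or len(answered) / len(items) <= 0.5:
        return None
    mean_rating = sum(answered) / len(answered)  # still on the 0-10 item metric
    return mean_rating * 10.0  # assumed linear conversion to 0-100


# Hypothetical responses to a five-item scale (not study data).
print(score_scale([4, 6, None, 5, 5]))        # 4 of 5 answered -> 50.0
print(score_scale([4, None, None, None, 5]))  # 2 of 5 answered -> None
```

Averaging only the answered items, rather than summing them, keeps scores comparable between patients who skipped different numbers of items.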

Reliability and validity
To assess test-retest reliability, the intraclass correlation coefficients (ICCs) of each scale at week 1 and week 2 (for the PSSD-24h) and from day 7 and day 14 (for the PSSD-7d) were calculated. Analyses were first performed using only respondents who indicated on the PGI that they had not changed since baseline and then repeated using only patients who indicated no change from baseline based on BSA. An ICC ≥ 0.70 was considered to be acceptable (19). To evaluate internal consistency reliability of the PSSD-24h, Cronbach's alpha coefficient was calculated for each scale, with ≥ 0.70 considered acceptable (20,21). Interscale correlations of the PSSD scales with two collateral measures, the SF-36 and DLQI, were calculated to assess divergent and convergent validity. A correlation ≥ 0.30 (22) was required for the evidence of convergent validity, while a correlation < 0.30 represented evidence of divergent validity. To assess known-groups validity, patients were categorized based on scores for collateral measures. Week-1 PSSD-24h scores were compared using one-way analysis of variance (ANOVA) with Tukey's b post hoc comparisons. Similar analyses were performed using day-7 PSSD-7d scores.
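As an illustration of the internal consistency calculation described above, the following Python sketch computes Cronbach's alpha from its conventional formula, alpha = k/(k-1) × (1 − sum of item variances / variance of the summed score), where k is the number of items. The function name, the complete-case data layout (one row per patient, one column per item) and the example ratings are assumptions for illustration, not the authors' code or data.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for complete data: rows are patients, columns are items."""
    k = len(rows[0])                 # number of items in the scale
    items = list(zip(*rows))         # transpose to one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])  # variance of summed scores
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical 0-10 ratings from four patients on two items (not study data).
ratings = [[1, 2], [2, 1], [3, 4], [4, 3]]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))   # 0.75
print(alpha >= 0.70)     # compared with the 0.70 acceptability threshold
```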

Responsiveness and minimally important difference estimation
For responsiveness, the standardized effect size (SES) (22), standardized response mean (SRM) (23) and responsiveness statistic (or Guyatt's statistic) (24) were calculated for week-1 and week-2 PSSD-24h scores, where "stable" patients were defined as those who rated themselves as unchanged on the PGI at day 14.
Responsiveness for the PSSD-7d was assessed similarly, using day-7 and day-14 scores. A combination of anchor- and distribution-based methods was used to estimate MID (see Supplementary Information).
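The three responsiveness indices named above can be sketched in Python using their conventional definitions: SES is the mean change divided by the SD of baseline scores, SRM is the mean change divided by the SD of the change scores, and Guyatt's statistic is the mean change divided by the SD of change among stable patients. The source does not spell out these formulas, so the definitions, the function name and the example scores below are assumptions for illustration only.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def responsiveness(
    baseline: Sequence[float],
    followup: Sequence[float],
    stable_baseline: Sequence[float],
    stable_followup: Sequence[float],
) -> Dict[str, float]:
    """SES, SRM and Guyatt's statistic for one group of changed patients.

    The stable_* arguments hold scores of patients who rated themselves
    as unchanged; they are used only for Guyatt's denominator.
    """
    change = [f - b for b, f in zip(baseline, followup)]
    stable_change = [f - b for b, f in zip(stable_baseline, stable_followup)]
    mean_change = mean(change)
    return {
        "SES": mean_change / stdev(baseline),
        "SRM": mean_change / stdev(change),
        "Guyatt": mean_change / stdev(stable_change),
    }


# Hypothetical 0-100 severity scores (not study data).
result = responsiveness(
    baseline=[50, 60, 70, 80],        # week 1, patients reporting improvement
    followup=[40, 45, 65, 70],        # week 2, same patients
    stable_baseline=[50, 60, 70],     # week 1, patients reporting no change
    stable_followup=[52, 58, 70],     # week 2, same patients
)
for name, value in result.items():
    print(name, round(value, 2))
```

Negative values indicate a decrease in severity scores, which corresponds to improvement on the PSSD.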

Baseline demographic and clinical characteristics
One hundred six patients were enrolled (Table 1), which provided adequate power to perform the necessary analyses. The mean (SD) age was 50.1 (12.1) years, 61% were male, and 72% were Caucasian. The mean BSA was 21.2%, and mean scores for PASI and PGA were 16.4 and 3.3, respectively, indicating moderate-to-severe psoriasis. There were no significant differences between Groups A and B for demographic or clinical characteristics. Completion rates for the PSSD exceeded 98% for Groups A and B.

Item- and scale-level analyses
Descriptive analyses of individual item responses revealed that severity scores were relatively symmetrically distributed. In the PSSD-24h, floor effects (Table 2) were present for bleeding (42%), burning (31%), stinging (26%) and pain (26%). Scale-level analysis of PSSD-24h severity scores did not exhibit substantial ceiling or floor effects. The PSSD-7d version had similar results for severity scores (data not shown). However, for the analyses on symptom and sign frequency in the PSSD-7d, substantial ceiling effects were found for eight frequency items (itch, dryness, cracking, skin tightness, scaling, shedding/flaking, redness, pain). Among these frequency items, the percent of responses listed as "all of the days" ranged from 30% to 61%. Given the high ceiling effects for the frequency items and limited usefulness of the frequency scores, particularly for the PSSD-24h, further results for these items on frequency are not presented.

Exploratory factor analysis (EFA)
EFA of the week-1 severity items resulted in single-factor solutions with eigenvalues > 1.0 for both the PSSD-24h and PSSD-7d, with no indication that any item should be dropped. Multilevel EFA results provided strong empirical evidence for combining symptom and sign items into a single unitary scale. However, when split into sign and symptom scales based on clinical judgment, the subsequent EFA analyses were strong with eigenvalues > 1.0 for each scale. Therefore, two separate scales for sign and symptom severity were retained, and a total severity score was also preserved. The characteristics of severity scores are presented in Table 2.

Reliability
Cronbach's alpha coefficients (Table 3) were ≥ 0.954 for the PSSD-24h and ≥ 0.944 for the PSSD-7d, suggesting excellent internal consistency. Among patients with no PGI change, ICCs for all PSSD-24h scales were ≥ 0.886, and among patients with no change in BSA, ICCs for all PSSD-24h scales were ≥ 0.824, suggesting excellent test-retest reliability. Results for PSSD-7d scales were similar: ICCs were ≥ 0.894 among patients with no change in PGI and ≥ 0.886 among those with no change in BSA.

Validity
The PSSD-24h was moderately-to-strongly correlated with several collateral measures (Table 4). Correlations with the DLQI ranged from 0.489 (symptom severity) to 0.521 (sign severity) and those with SF-36 bodily pain ranged from −0.624 (sign severity) to −0.682 (symptom severity). Correlations of the PSSD-24h scales were higher with the symptom item of the DLQI than with its nonsymptom items. For example, the correlations with item #1 of the DLQI ("How itchy, sore, painful, or stinging has your skin been?") were 0.516 (total severity), 0.489 (symptom severity), and 0.521 (sign severity). Comparable results were obtained between day-7 PSSD-7d scales and collateral measures (data not shown). Divergent validity for the PSSD was supported by weak correlations (< 0.30) with several SF-36 scales.
Known-groups validity was explored using several comparisons. Patients were grouped according to baseline PASI score (< 13, 13-16.9, ≥ 17) (Figure 1A), baseline PGI rating (Figure 1B), and day-7 DLQI score (≤ 6, 7-15, ≥ 16) (Figure 1C). In each case, patients in the most severe disease group produced the highest PSSD scores. Data from the PSSD-7d were similar (data not shown).

Responsiveness and MID estimates
Among those rating themselves as improved from week 1 to week 2 on the PGI, there was a decrease (improvement) in PSSD-24h severity scores (Table 5).
Small changes in scores were seen in those rating themselves as unchanged, while those rating themselves as worse demonstrated increases in severity scores, indicating worsening. Similar results were found for the PSSD-7d (data not shown). MID estimates for PSSD-24h and PSSD-7d severity scores ranged from 10-12 points (additional details are provided in the Supplementary Information).

Discussion
In clinical research of psoriasis, severity is primarily evaluated using physicians' assessments (e.g. BSA, PGA and PASI), based on the extent of the involved area and presence of erythema, induration and scaling. Symptoms reported by patients are not typically part of the assessment. However, PROs, such as the DLQI, can capture additional benefits beyond clinicians' assessments, and may be more sensitive in detecting early treatment responses than clinical measures (such as the PASI and PGA) alone (25).
The PSSD differs from existing PROs used in psoriasis research (such as the DLQI), since it only contains items specific to symptoms and patient-observable signs of psoriasis. Because symptoms are more relevant to the patient experience than signs, the PSSD symptom scale can be used to supplement a physician assessment of severity, such as the PASI and BSA, and be considered as a primary endpoint in clinical trials. Also, the PSSD sign severity scale score can be used as a secondary endpoint for assessing psoriasis-associated signs from patients' perspectives based on their daily experiences.
The results of the current study indicated excellent internal consistency and test-retest reliability of the PSSD. Moderate-to-strong correlations with collateral measures affirmed the tool's convergent validity, while weaker correlations (< 0.300) with several SF-36 scales indicated adequate divergent validity. When patients were split into groups known to differ, PSSD scores differed significantly. Adequate responsiveness was demonstrated by moderate changes in scores for patients who rated themselves as improved or worsening. Given the lack of sufficient change in both the anchors and PSSD scores, MID estimates should be interpreted with caution. These estimates need to be further explored using data from clinical trials of efficacious therapy. The results from the frequency analysis showed high ceiling effects on individual items as well as scale scores. Therefore, the use of the scores based on frequency should be interpreted with caution.
There are additional limitations to this study. This is a US-based study, and the majority of patients were Caucasian. As such, a multinational study is needed for the validation of the PSSD in culturally diverse populations. In addition, the study was conducted over a 2-week time period among patients who were not required to be receiving active treatment, which may not have allowed for measurable change in the PSSD scores for an informative analysis of responsiveness or in the selected anchors for MID estimation.

Conclusions
As the efficacy of medications for the treatment of patients with psoriasis has improved over the last several years (26), a reliable and valid assessment of psoriasis-related symptom severity has become even more important. The PSSD is a brief but comprehensive measure created in accordance with the FDA's PRO Guidance, using patient and clinician input, and is well suited for use with electronic data capture. The PSSD allows for the assessment of symptoms on either a daily or weekly basis, and both versions have been found to be reliable and valid.