The utility of the Edinburgh Depression Scale as a screening tool for depression in Parkinson's disease

This study aimed to evaluate the Edinburgh Depression Scale (EDS) as a screening tool for use in a Parkinson's disease (PD) population. Many commonly used depression scales include items relating to somatic symptoms that also occur in PD, which could potentially result in inaccurate reporting of depressive symptoms. The EDS is a scale that incorporates no somatic items.


Introduction
Depression is commonly associated with Parkinson's disease (PD): it is estimated that up to 50% of people who have PD are affected by depression (Reijnders et al., 2008), although these estimates vary according to the sample studied, the assessment methods used and the diagnostic criteria applied. Importantly, depression has been shown to be associated with greater motor and cognitive decline (Cubo et al., 2000), deterioration of functional ability (Weintraub et al., 2004) and reduced quality of life (The Global Parkinson's Disease Survey Steering Committee, 2002).
The diagnosis of depression in PD is not straightforward and is complicated by the fact that many of the somatic symptoms of depression are also symptoms that occur in PD. This means that it is difficult for clinicians to identify when a depressive illness is present; for example, complaints of fatigue, impaired concentration and insomnia could be attributed to PD, and consequently, depression may be under-recognised and undertreated in PD (Weintraub et al., 2003). In the UK, current clinical practice guidelines for depression produced by the National Institute for Health and Clinical Excellence recommend screening in patient groups at high risk of depressive illness (National Institute for Health and Clinical Excellence, 2004), and it is possible that routine screening in patients with PD would improve the identification of patients with depression.
The choice of an appropriate screening measure in PD is not simple. Because many of the somatic symptoms of depression overlap with the symptoms of PD, the usefulness of commonly used depression scales in PD may be compromised. A depression scale that has few or no items relating to somatic symptoms of depression should therefore have greater utility in this patient group. The Edinburgh Depression Scale (EDS) was originally developed and validated to identify depression in women during the postnatal period (Cox et al., 1987). The scale has no items relating to somatic symptoms of depression and instead relies upon identifying cognitive-affective symptoms. It is a 10-item scale designed for self-completion by the patient (each item scores 0-3, with a higher score indicating more depressive symptoms). The scale has been validated in studies of non-postnatal women (Gibson et al., 2009) and in palliative care (Lloyd-Williams et al., 2000; Lloyd-Williams et al., 2004) and would seem well suited for use in a PD patient population. A shorter, six-item version of the EDS, the Brief EDS (BEDS), has also been validated in palliative care (Lloyd-Williams et al., 2007). The aim of this study was therefore to explore the utility and validity of the EDS and the BEDS in an outpatient PD sample.

Methods
This study was part of a project already described (Baillon et al., 2013) and was approved by the local research ethics committee. Participants were either newly diagnosed or ongoing patients attending the PD clinics at Leicester General Hospital, UK. All participants had a diagnosis of PD according to diagnostic criteria defined by the UK Parkinson's Disease Society Brain Bank (de Rijk et al., 1997), were English speaking and were able to give informed consent to take part in the research. Participants received two visits at their home for the purposes of the study. At the first visit, they completed Section 21 of the Present State Examination-Schedules for Clinical Assessment in Neuropsychiatry (SCAN) (World Health Organisation, 1999), which incorporates the Mini-Mental State Examination (MMSE) (Folstein et al., 1975). Those who scored less than 24 on the MMSE were excluded from the remainder of the study; this cut-off was chosen because it is the most widely used in research to exclude patients with dementia, with sensitivity (Se) of 87% and positive predictive value (PPV) of 79% (Tombaugh and McIntyre, 1992; Folstein et al., 2001; Lopez et al., 2005). Next, the sections of the SCAN interview pertaining to depression (Sections 6, 7 and 8) were completed with the participant. A DSM-IV diagnosis (American Psychiatric Association, 1994) of major, minor or no depression was derived from this standardised interview and established the gold standard in this study. The diagnosis of major depression was obtained by running the diagnostic algorithm of the SCAN software. The diagnosis of minor depression was derived by applying the DSM-IV criteria for minor depression to the responses to the items of the SCAN interview.
The participant was visited by another researcher within 5 days of the initial SCAN assessment and completed the EDS. This second researcher was blind to the outcome of the SCAN interview. Participants were given the choice of completing the questionnaire themselves, reading the items and indicating their response verbally or having the items and responses read to them verbatim. The length of time since diagnosis of PD and the severity of PD at the time of the study, as indicated by the Hoehn and Yahr stage (Hoehn and Yahr, 1967), were retrieved from the participant's clinic notes.
The Se, specificity (Sp), PPV and negative predictive value (NPV) were calculated at a range of cut-off points for the EDS to identify the optimal threshold scores for patients with PD. A receiver operating characteristic (ROC) curve was generated in order to calculate the area under the curve (AUC), and the positive and negative likelihood ratios and diagnostic odds ratio were calculated. These test characteristics were calculated for the identification of those participants with either major or minor depression and for the identification of those with major depression. Data were analysed using the Statistical Package for the Social Sciences version 18.0. Non-parametric methods were used throughout as the data were not normally distributed.
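As an illustration only (the analysis in this study was run in the Statistical Package for the Social Sciences, not with the code below), the test characteristics named above can all be derived from a 2x2 table of screening result against the SCAN diagnosis. The function name and the Python implementation are assumptions made for this sketch. The example counts are those reported in the Results for the EDS 10/11 cut-off for any depression: 14 of 19 depressed participants and 8 of 101 non-depressed participants screened positive.

```python
def screening_metrics(tp, fn, fp, tn):
    """Return standard test characteristics from 2x2 counts.

    tp/fn: depressed participants screening positive/negative.
    fp/tn: non-depressed participants screening positive/negative.
    Hypothetical helper for illustration; it assumes no cell that is
    used as a divisor is zero.
    """
    se = tp / (tp + fn)           # sensitivity
    sp = tn / (tn + fp)           # specificity
    ppv = tp / (tp + fp)          # positive predictive value
    npv = tn / (tn + fn)          # negative predictive value
    lr_pos = se / (1 - sp)        # positive likelihood ratio
    lr_neg = (1 - se) / sp        # negative likelihood ratio
    dor = (tp * tn) / (fp * fn)   # diagnostic odds ratio
    return {"se": se, "sp": sp, "ppv": ppv, "npv": npv,
            "lr_pos": lr_pos, "lr_neg": lr_neg, "dor": dor}


# Counts taken from the Results (EDS, 10/11 cut-off, any depression):
# 14 true positives, 5 false negatives, 8 false positives, 93 true negatives.
m = screening_metrics(tp=14, fn=5, fp=8, tn=93)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

Rounded to whole percentages, the Se, Sp and PPV from these counts match the 74%, 92% and 64% quoted in the Discussion.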

Results
A total of 136 patients participated in the study, but 15 were excluded because of likely cognitive impairment (i.e. MMSE score less than 24) and one declined to continue in the study following the SCAN interview. The characteristics of the remaining 120 participants are shown in Table 1. Hoehn and Yahr ratings were only available for 59 (50%) participants, of whom 55 (93%) were Hoehn and Yahr stages 1 and 2 (minimal disability). Nineteen (15.8%) participants met DSM-IV criteria for a depressive disorder: 14 (11.7%) had a diagnosis of major depression and five (4.2%) of minor depression; none received a diagnosis of dysthymia. Participants who were identified as depressed were not significantly different in terms of gender (Pearson χ² (two-tailed) = 1.7, df = 1, p = 0.19), age (Mann-Whitney U-test, not significant (NS)), length of time since diagnosis of PD (Mann-Whitney U-test, NS) or MMSE score (Mann-Whitney U-test, NS) but had significantly higher EDS (Mann-Whitney U-test, p < 0.001) and BEDS (Mann-Whitney U-test, p < 0.001) scores.
Table 2 summarises the performance characteristics of the EDS when different threshold scores are used to identify cases with minor or major depression according to DSM-IV criteria. The optimal cut-off for the scale in this sample was taken to be that which maximises the Se and Sp of the scale. In this instance, the optimal cut-off is 10/11, which correctly identified 14/19 (74%) participants who were diagnosed with depression (11/14 of those with major depression) and incorrectly identified 8/101 (8%) non-depressed participants. Often, when using a scale to screen for a condition, a different cut-off score is used in order to optimise the Se and the NPV. The threshold score of 8/9 gave a better combination of these characteristics. At this cut-off, 15/19 (79%) of depressed patients were identified. Figure 1 shows the ROC curve for the EDS identification of DSM-IV minor or major depression. The AUC was 0.904 (p < 0.001, 95% confidence interval 0.834-0.974).
Table 3 summarises the performance of the BEDS across a range of thresholds. The optimal cut-off for the scale in this sample is 4/5, which correctly identified 17/19 (89%) participants who were diagnosed with depression (13/14 of those with major depression) but incorrectly identified 32/101 (32%) non-depressed participants. The AUC for the BEDS, calculated from the ROC curve analysis, was 0.882 (p < 0.001, 95% confidence interval 0.808-0.956) (Figure 1).
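The selection of a cut-off and the AUC can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study. The notation "c/c+1" is read here as "scores above c screen positive". "Maximises the Se and Sp" is operationalised as the largest value of Se + Sp - 1 (Youden's index), which is one possible reading. The function names and the ten scores are hypothetical and are not study data. The AUC is computed through its equivalence with the Mann-Whitney U statistic.

```python
def sweep_cutoffs(scores, depressed, max_score=30):
    """Return (label, Se, Sp, Youden J) for every cut-off c/c+1."""
    n_pos = sum(depressed)
    n_neg = len(depressed) - n_pos
    rows = []
    for c in range(max_score):
        # A score above c counts as a positive screen (assumed convention).
        tp = sum(1 for s, d in zip(scores, depressed) if d and s > c)
        tn = sum(1 for s, d in zip(scores, depressed) if not d and s <= c)
        se = tp / n_pos
        sp = tn / n_neg
        rows.append((f"{c}/{c + 1}", se, sp, se + sp - 1))
    return rows


def auc_rank(scores, depressed):
    """AUC as the proportion of (depressed, non-depressed) pairs in which
    the depressed participant scores higher; ties count as half."""
    pos = [s for s, d in zip(scores, depressed) if d]
    neg = [s for s, d in zip(scores, depressed) if not d]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: total scale scores and gold-standard diagnosis (1 = depressed).
scores = [2, 4, 5, 7, 9, 12, 8, 11, 13, 15]
depressed = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

rows = sweep_cutoffs(scores, depressed)
best = max(rows, key=lambda r: r[3])  # cut-off with the largest Youden index
auc = auc_rank(scores, depressed)
print("best cut-off:", best[0], "AUC:", round(auc, 3))
```

A screening-oriented choice, such as the 8/9 threshold described above, would instead favour Se and NPV over this balanced criterion.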

Identification of major depression
As for the identification of 'any depression', the optimal cut-off on the EDS for the identification of DSM-IV major depression was found to be 10/11 (Table 4). Although a lower cut-off gives higher Se, it is at the cost of poor Sp; the 10/11 cut-off appears to give a better balance. At this cut-off, 11/14 (79%) cases of major depression were correctly identified, and 11/106 (10%) of non-depressed participants were incorrectly identified as cases. The threshold that gave the optimal combination of Se and NPV (while maintaining a reasonable Sp) was 8/9, which identified 12/14 (86%) of patients diagnosed with major depression. The AUC for the EDS identification of DSM-IV major depression was 0.888 (p < 0.001, 95% confidence interval 0.805-0.972).
Table 5 shows the test characteristics of the BEDS for the identification of those participants diagnosed with major depression. The optimal cut-off for the identification of major depression was 4/5. At this cut-off, 13/14 (93%) cases of major depression were correctly identified, and 36/106 (34%) of non-depressed participants were incorrectly identified as cases. The AUC for the BEDS identification of DSM-IV major depression was 0.857 (p < 0.001, 95% confidence interval 0.768-0.945).

Discussion
This study was designed to investigate the validity and utility of the EDS as a depression screening measure in patients with PD. The hypothesis was that the EDS, as a scale with no items that relate to the somatic symptoms of depression, would have superior performance in a PD sample to other commonly used depression screening scales. In this study of PD outpatients, the cut-off that represented the optimal combination of Se and Sp on the EDS was 10/11 and had Se of 74%, Sp of 92% and PPV of 64% for the identification of any depression (minor and major depression) and Se of 79%, Sp of 90% and PPV of 50% for the identification of those with major depression. The shorter version of the scale, the BEDS, showed Se of 89%, Sp of 68% and much lower PPV of 35% (and, therefore, a higher rate of misclassification). In practice, a higher or lower threshold may be preferable, depending upon the relative importance of minimising false positives or false negatives.
The performance characteristics of the EDS are comparable with but not significantly better than those of other depression scales that have been validated in PD. A review of depression scales was carried out by a group of experts in PD, and the authors suggested that self-report scales are more practical for routine clinical screening than clinician-rated scales, despite slightly poorer performance characteristics (Schrag et al., 2007). Of the self-report scales considered in the review, the Beck Depression Inventory, the Hospital Anxiety and Depression Scale and the Geriatric Depression Scale (GDS) (both the 30-item and 15-item versions) were felt to be the most useful for screening for depression. Since that review, a study has evaluated and compared the performance of several different depression scales in PD (Williams et al., 2012). The study demonstrated comparable results with previous studies of the scales, and the authors felt the GDS-30 to be the most efficient for screening (in terms of psychometric properties and use of clinician time) with an Se of 72% and Sp of 82% (PPV 73%, NPV 81% and AUC 0.83). Although not reported in this paper, the participants in the current study completed the GDS-15 in addition to the EDS (Baillon et al., 2013). The GDS-15 was shown to have similarly strong performance characteristics (Se 84%, Sp 89%, PPV 59%, NPV 97% and AUC 0.92) to those demonstrated in the study by Williams et al. (2012). The performance of the EDS in terms of identification of minor or major depression appears to be as good as many other self-report measures (Williams et al., 2012) provided that the PD-specific cut-off score is used, but the fact that the scale has no items relating to the somatic symptoms of depression does not seem to give it superior performance.
The EDS is a short self-report questionnaire and as such would be more suitable for routine clinical screening than longer scales (such as the Beck Depression Inventory or GDS-30) or those that require completion by a trained member of staff, and it may have the advantage of greater face validity (with both patients and clinicians) because of the absence of items relating to symptoms that also occur in PD. Additionally, there is an item relating to self-harm that should alert the clinician to enquire further and seek appropriate advice about management if necessary; older people with physical ill health, disability and depression are at particularly high risk of suicide (Dennis, 2009).
The optimal cut-off on the EDS identified in this study for screening for any depressive disorder is higher than the recommended cut-off of 9/10 reported in studies of different patient groups (postnatal women (Cox et al., 1996) and palliative care patients (Lloyd-Williams et al., 2004)), which illustrates the importance of validating a scale in a new patient group rather than applying the cut-off identified in another sample. In this study, the participants were given the option to respond to the items verbally rather than completing the questionnaire, which many chose to do because of problems with tremors and handwriting. It is not known how this affected the results. Research into oral versus written presentation of a depression scale has suggested that respondents report fewer depressive responses when answering orally (Cannon et al., 2002), so it is possible that it may have resulted in lower scores on the EDS and thereby a higher optimum cut-off for identifying clinically important depression.
Some might question the use of DSM-IV criteria for depression as the gold standard in this study because of the problems associated with the attribution of somatic symptoms to PD. However, there is no better standardised criterion available (other standardised criteria also include somatic symptoms), and if the results of this study are to be comparable with studies reporting the performance of other screening measures, it is best to use the most widely used gold standard. The SCAN diagnostic algorithm for DSM-IV major depression enables the clinician to indicate when they feel that there should be a clear attribution of a somatic symptom to PD; otherwise, an inclusive approach is taken (as recommended by Marsh et al., 2006).
The 16% prevalence of depression was lower than reported in many similar studies of depression scales in PD and is significantly lower than the mean prevalence in outpatient samples reported in the review of prevalence studies carried out by Reijnders et al. (2008). In our study, 22 (18%) participants were prescribed antidepressant medication, suggesting a high prevalence of treated depression, although eight still met depression criteria. Of those identified as depressed, 58% were not prescribed an antidepressant, illustrating the potential importance of screening.
It is important to note that the sample studied here had predominantly mild to moderate PD, and therefore, results cannot be generalised to patients with more severe PD. However, it is important to validate a screening measure in the type of population in which the screening would take place, and this sample was typical of a PD outpatient clinic where the majority of patients are in the mild to moderate stages of the disease. The study also excluded patients who performed less well on cognitive testing, and so, the results cannot be generalised to patients who have dementia in PD. As a significant proportion (up to 80%) of people with PD will experience cognitive impairment to some degree (Aarsland et al., 2003), it is important that consideration is given to whether a depression screening measure is effective in such patients. The Cornell Scale for Depression in Dementia (Alexopoulos et al., 1988a, 1988b) has been shown to be effective in PD for patients with and without dementia, but as a clinician-rated scale, it is not as practical for routine clinical screening.
This study shows that the EDS could be used in clinical practice to screen for depression in an outpatient PD clinic, provided that an appropriate cut-off score is used. While no substitute for thorough clinical diagnostic assessment, use of such a measure can highlight those patients who may warrant further investigation regarding their mood. It is important that screening for depression should take place in a clinical environment that can provide adequate collaborative care management of those patients identified as having significant depressive symptoms in order to ensure improved clinical outcomes; without this, screening for the condition is of limited value.
Although there is significant support for the use of antidepressant medications from both open-label trials and clinical practice, until recently, the benefit remained to be established in rigorous randomised controlled trials for each of the categories of antidepressant medication available (Ghazi-Noori et al., 2003; National Institute for Health and Clinical Excellence, 2006). However, a recent review of treatments for depression in PD carried out by the Movement Disorder Society concluded that some treatments are efficacious and others likely to be, which suggests that good quality evidence of treatment benefit for PD patients is mounting (Seppi et al., 2011).
Although the EDS performed well, the published literature suggests that other measures, in particular the GDS (either the 30-item or 15-item version), may have equally good Se and Sp but better PPV (which would mean fewer false positives) (Schrag et al., 2007; Williams et al., 2012). However, the EDS is a short self-report questionnaire that does not require trained staff to complete and does not have potentially confounding items relating to somatic symptoms that also occur in PD. Further research validating the measure in a more clinically diverse sample of PD patients would enable further assessment of the utility of the scale in such patients. The EDS also has the benefit of being validated across the age spectrum in different general hospital patient groups (postnatal women (Cox et al., 1996), menopausal women (Becht et al., 2001) and palliative care (Lloyd-Williams et al., 2000; Lloyd-Williams et al., 2007)). If one instrument is valid and used routinely in multiple health settings within the general hospital, clinicians will gain familiarity and confidence in its use, which should improve the identification of depression in patients throughout the hospital setting.
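The prevalence in a sample matters when PPV values are compared across studies, because for fixed Se and Sp the PPV follows from the prevalence by Bayes' theorem. The sketch below illustrates this general relationship and was not part of the analysis reported here. The function name is hypothetical, and the 35% prevalence is an arbitrary illustrative value, not a figure taken from this study or from the studies cited. The Se, Sp and 19/120 prevalence are the values reported for the EDS at the 10/11 cut-off for any depression.

```python
def ppv_from_prevalence(se, sp, prevalence):
    """Positive predictive value from Se, Sp and prevalence (Bayes' theorem)."""
    true_pos = se * prevalence                # depressed and screening positive
    false_pos = (1 - sp) * (1 - prevalence)   # not depressed but screening positive
    return true_pos / (true_pos + false_pos)


se, sp = 14 / 19, 93 / 101                    # EDS at 10/11, any depression (from the Results)

ppv_study = ppv_from_prevalence(se, sp, 19 / 120)   # prevalence observed in this sample
ppv_hypothetical = ppv_from_prevalence(se, sp, 0.35)  # hypothetical higher prevalence

print(f"PPV at the study prevalence: {ppv_study:.2f}")
print(f"PPV at a hypothetical 35% prevalence: {ppv_hypothetical:.2f}")
```

With the study values, the formula reproduces the PPV of 64% obtained directly from the counts (14 true positives out of 22 positive screens). The same Se and Sp give a higher PPV when the prevalence is higher.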