Utility of a Spanish version of Three Words-Three Shapes Test to detect memory impairment in primary progressive aphasia

Abstract
Introduction: The Three Words-Three Shapes (3W3S) test is a bedside test that assesses verbal and non-verbal memory and has proven useful in staging memory decline in amnestic disorders and primary progressive aphasia (PPA). Given its simple structure, the 3W3S can be easily adapted to other languages by keeping the original shapes and modifying only the words. We aimed to validate a Spanish version of the 3W3S test and to establish whether it correctly characterizes the memory loss patterns of amnestic disorders associated with Alzheimer's etiology and of PPA.
Method: The translation and adaptation of the 3W3S were performed according to standardized guidelines, and the test was applied to a cohort of patients with dementia of Alzheimer's type (DAT; n = 20), amnestic mild cognitive impairment (aMCI; n = 20), and primary progressive aphasia (PPA; n = 20), and to healthy controls (HC; n = 20).
Results: On verbal memory, PPA patients scored lower than MCI patients and HC, and similarly to DAT patients, in effortless encoding (p < 0.001), delayed recall (p < 0.001), and recognition (p < 0.012). On non-verbal memory, PPA patients performed better than DAT patients and similarly to HC and MCI subjects (p < 0.001).
Conclusions: Results show good applicability of the 3W3S to determine memory function in PPA patients, independently of language ability. Visual and verbal components of memory are dissociated in PPA.


Background
Since it was originally described, primary progressive aphasia (PPA) (Mesulam, 1982) has been defined as a clinical dementia syndrome with isolated language impairment at onset and preserved function of other cognitive domains. In everyday practice, PPA diagnosis may be challenging because the salient language disorder may mask preservation of function in non-language domains when tests require language use. Since the most common cause of dementia, namely Alzheimer's disease, causes significant memory loss, memory evaluation is considered critical for mental status examination. Therefore, clinical neuropsychologists must have a simple tool to assess memory in the context of severe language impairment to tease out the contribution of aphasia and focus on the primary ability of learning and retaining information.
The current diagnostic consensus (Gorno-Tempini et al., 2011) considers memory disorders an exclusion criterion. However, subsequent research indicates episodic and working memory disorders are frequent in patients with PPA (Butts et al., 2015; Flanagan et al., 2014; Ramanan et al., 2016).
Reports on PPA progression (Ferrari et al., 2019; Van Langenhove et al., 2016) have shown an amnesic decline in the logopenic variant and behavioral decline in the semantic variant. Reviews, such as Rogalski's (Mesulam et al., 2014), have suggested these deficits occur very late in the syndrome's natural course. Advanced cases with extensive language impairment also often suffer functional disability, raising the question of impairment in other cognitive domains.
The development and validation of appropriate tests are relevant to differentiate whether memory impairments are independent or secondary to language impairment in PPA patients. Because aphasia can affect performance on many mental status tests, it is especially important to have an instrument that can circumvent language deficits to establish preservation or loss of other cognitive domains (Eikelboom et al., 2018).
In 1985, Mesulam and Weintraub developed a simple test to overcome this problem: the Three Words-Three Shapes (3W3S) test (Weintraub & Mesulam, 1985), a bedside test assessing verbal and non-verbal memory in the visual modality. Typically, verbal memory is assessed auditorily, through presentation and repetition of word lists and stories, and non-verbal memory by copying and reproducing shapes (i.e., in the visual modality). The 3W3S test keeps all stimuli in the visual modality, since the task requires writing and drawing for all its components (words and shapes). The 3W3S has proven useful in staging memory decline in dementia of the Alzheimer's type (DAT). However, there is still no evidence of its utility for the characterization of memory deficits in MCI. The test has also been used to characterize memory impairment in primary progressive aphasia (PPA) (Weintraub et al., 2013; Kielb et al., 2016) and to assess memory in Korsakoff's amnesia (Weintraub et al., 2000). The test evaluates different stages of episodic memory (i.e., effortless and effortful encoding, recall/retrieval, and recognition) in both verbal and non-verbal modalities. Given its simple structure, the 3W3S test can be easily adapted to other languages using the original shapes and translating the words. This test has already been validated in a Turkish translation (Kudiaki & Aslan, 2007), but not in Spanish.

Goal
The present study's main purpose was to validate a Spanish version of the Three Words-Three Shapes test (Weintraub et al., 2000; Weintraub & Mesulam, 1985). We also aimed to confirm whether memory loss patterns present in DAT, MCI, and PPA were correctly characterized, compared to results observed in cognitively healthy controls.

Description of the 3W3S test
Since its original description, 3W3S has been subject to modifications. We selected the current version for this adaptation and followed instructions as provided by the authors (https://www.brain.northwestern.edu/scientists-students/testrequest.html). The test includes six items: three abstract figures (called shapes) and three words, printed on a paper sheet. The first task requires the patient to copy the items. After copying them, the sheet is removed, and the subject is immediately asked to reproduce all six stimuli (Effortless Encoding or Incidental Recall). After this, the subject is re-exposed to the stimuli and asked to study them for 30 seconds in order to be able to reproduce them immediately from memory. Following memorization, the subject is asked to write down all the stimuli he/she remembers; three attempts are allowed (Effortful Encoding Trials). Ten to fifteen minutes after the three attempts, the subject is required to write down the words and draw the shapes (6 items) (Delayed Recall). After this, a multiple-choice sheet is presented in which the six stimuli are displayed together with distractors, and the subject is required to single out the previously memorized stimuli (Recognition).

Translation of the 3W3S test
Instructions translation
Forward translation. Translation and validation were performed following the International Test Commission guidelines (International Test Commission, 2017). The first stage in the adaptation process of the 3W3S was translating the instructions. A bilingual experimental psychologist from Argentina, familiar with both cultures, translated the instructions into Spanish.
Blind back translation. A second independent translator, a clinician and native speaker of both English and Spanish, blind to the original version, translated the instructions back into English.

Stimuli translation and adaptation
The translation and adaptation of the stimuli (target words and distractors in both versions) required words to be equivalent in Spanish in terms of meaning, length, and frequency within the language. To study word frequency in English, we used SUBTLEXus (Brysbaert & New, 2009), and for Spanish, the Sebastián, Carreiras & Cuetos Spanish frequency dictionary (Sebastián-Gallés et al., 2000).
Two versions of the original 3W3S test are available in English. Each includes three target words, namely, two abstract nouns of low and medium frequency and one concrete noun of high frequency. In the multiple-choice word recognition section, each target word has two distractors: one semantic and the other phonological. Each group of three is composed of one high-, one medium-, and one low-frequency word. A tenth item, a distractor not related to any of the other words, is listed at the end.
Equivalent words in Spanish were selected using these same guidelines. The Spanish version shapes were identical to those of the original English version (see Figure 1).
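The layout of the word recognition sheet described above can be written down as a small data structure, which makes the selection constraints easy to check when choosing equivalent words. The following is only a sketch: the `Item` class, the `build_recognition_sheet` function, and the placeholder labels (`T1`, `S1`, `P1`, ...) are hypothetical and do not correspond to the actual test words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    word: str       # placeholder label, NOT a real 3W3S word
    role: str       # "target", "semantic", "phonological" or "unrelated"
    frequency: str  # "high", "medium" or "low" ("n/a" for the unrelated item)


def build_recognition_sheet(triplets, unrelated):
    """Check three (target, semantic, phonological) groups and append the unrelated distractor."""
    sheet = []
    for triplet in triplets:
        roles = sorted(item.role for item in triplet)
        if roles != ["phonological", "semantic", "target"]:
            raise ValueError("each group needs one target and two distractors")
        freqs = sorted(item.frequency for item in triplet)
        if freqs != ["high", "low", "medium"]:
            raise ValueError("each group needs one high-, one medium- and one low-frequency word")
        sheet.extend(triplet)
    sheet.append(unrelated)
    return sheet


# Hypothetical placeholders: one high-, one medium- and one low-frequency target.
triplets = [
    [Item("T1", "target", "high"), Item("S1", "semantic", "medium"), Item("P1", "phonological", "low")],
    [Item("T2", "target", "medium"), Item("S2", "semantic", "low"), Item("P2", "phonological", "high")],
    [Item("T3", "target", "low"), Item("S3", "semantic", "high"), Item("P3", "phonological", "medium")],
]
sheet = build_recognition_sheet(triplets, Item("U1", "unrelated", "n/a"))
```

Under these assumptions the sheet holds ten words: three targets, six related distractors, and the final unrelated one.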

Sample and design
Sixty outpatients from our Ageing and Memory Clinic in Buenos Aires, Argentina, meeting the criteria for one of three clinical diagnoses, namely: amnestic mild cognitive impairment (aMCI; n = 20), dementia of Alzheimer's type (DAT; n = 20), and primary progressive aphasia (PPA; n = 20) were recruited for this study. The sample size was calculated for a one-way ANOVA test assuming unequal variances based on previously published results (Weintraub, 2013) comparing PPA with DAT and HC. After calculations, the required sample size was eight subjects per group. The aMCI diagnosis was based on Petersen's criteria (Petersen, 2004). Patients needed to report memory complaints, confirmed both by an informant and by neuropsychological testing, and a score of 0.5 on the Clinical Dementia Rating Scale (CDR) while maintaining the ability to function in daily living activities. Clinical diagnosis of DAT required patients to meet criteria established by McKhann (McKhann et al., 2011) and a CDR score of 1, to exclude severely impaired patients. For PPA diagnosis, patients had to fulfill clinical criteria as defined by Mesulam (Mesulam, 2001) and Gorno-Tempini (Gorno-Tempini et al., 2011). In order to only include patients with low impairment, we selected PPA subjects with less than two years of disease duration. Patients showing clinical or neuropsychological deficits when copying figures were excluded in order to avoid visuospatial deficit interference during memory assessment. Patients with normal clinical assessment and normal scores on the MMSE drawing task were nevertheless excluded if their score on the 3W3S test for shapes was less than 12 points (80% total copy score).
Twenty healthy control (HC) subjects with an MMSE score above 28 and matched for age and education level were recruited from a local volunteer group. The local ethics committee approved the protocol, and all subjects signed an informed consent form before assessment.
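The copy-score exclusion rule can be expressed as a one-line check. This is a minimal sketch: the maximum of 15 points is an assumption inferred from the statement that 12 points corresponds to 80% of the total copy score, and the function name is hypothetical.

```python
MIN_COPY_SCORE = 12  # threshold stated in the text
MAX_COPY_SCORE = 15  # assumption: inferred from 12 points being 80% of the total


def eligible_on_copy(shape_copy_score: int) -> bool:
    """Return True if the shape copy score does not trigger exclusion."""
    if not 0 <= shape_copy_score <= MAX_COPY_SCORE:
        raise ValueError("copy score outside the assumed 0-15 range")
    return shape_copy_score >= MIN_COPY_SCORE
```

A subject scoring exactly 12 is retained, since the criterion excludes only scores below 12.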

Cognitive assessment
Subjects were administered the Mini-Mental State Examination (MMSE) and the Boston Naming Test (BNT, 30-item version).

3W3S test administration
All 5 phases of the 3W3S test, namely: Copying, Effortless Encoding (Incidental Recall), Effortful Encoding (3 Learning Trials), Delayed Recall (after a 10-15 minute delay), and Recognition were completed by each of the three patient groups as well as the HC. Subjects were administered both versions of the test (A and B) using stratified randomization by clinical group.
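One way to implement stratified randomization of test versions by clinical group is sketched below. It assumes that each subject receives a single version and that versions are balanced within each group; the text does not spell out either detail, and the function name, subject identifiers, and seed are hypothetical.

```python
import random
from collections import Counter


def assign_versions(subjects_by_group, seed=0):
    """Shuffle subjects within each group, then alternate versions A and B."""
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    assignment = {}
    for group, subjects in subjects_by_group.items():
        shuffled = list(subjects)
        rng.shuffle(shuffled)
        for index, subject in enumerate(shuffled):
            assignment[subject] = "A" if index % 2 == 0 else "B"
    return assignment


# Hypothetical identifiers for four groups of 20 subjects each.
groups = {
    name: [f"{name}-{i:02d}" for i in range(1, 21)]
    for name in ("HC", "aMCI", "DAT", "PPA")
}
assignment = assign_versions(groups)
counts = Counter(
    (subject.split("-")[0], version) for subject, version in assignment.items()
)
```

Because the shuffle happens inside each stratum, every group ends up with ten subjects per version regardless of the seed.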

Data analysis
Data were analyzed using SPSS software. We hypothesized different performance outcomes between groups for visual and verbal memory and expected aMCI and DAT patients to perform worse than healthy controls on both types of memory. In PPA, we anticipated verbal learning loss with preservation of visual learning. Based on this hypothesis, groups were compared and effect sizes were measured to establish the impact in each clinical group.
The normality assumption was evaluated using graphs and Kolmogorov-Smirnov and Shapiro-Wilk tests, with Lilliefors correction, when appropriate. For parametric variables, a group means comparison was performed with one-way ANOVA, and Bonferroni's method applied for post hoc comparison. A non-parametric approach and a Kruskal-Wallis test were applied when indicated. The effect size was calculated with Cohen's d. Chi-squared test was used to compare proportions for categorical variables.
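For readers who want to reproduce the two simplest quantities named above, the sketch below computes Cohen's d and Bonferroni-adjusted p-values. It assumes the pooled-standard-deviation form of Cohen's d and the common convention of multiplying each p-value by the number of comparisons and capping it at 1; the text does not state which variants were used, and the sample values are illustrative only.

```python
from math import comb, sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Four groups give comb(4, 2) = 6 pairwise post hoc comparisons.
n_comparisons = comb(4, 2)

# Illustrative numbers only, not study data.
d = cohens_d([2, 4, 6], [1, 3, 5])
adjusted = bonferroni([0.001, 0.2, 0.5])
```

Under this convention, an adjusted value can equal exactly 1 when the unadjusted p-value is large.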
As in prior differential memory performance analyses, shape and word tasks were examined separately. Sensitivity and specificity for differentiation of PPA from DAT were studied with a ROC curve.
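The ROC quantities can be illustrated with a short, dependency-free sketch. It treats the higher-scoring group as the "positive" class and classifies a score at or above the cutoff as positive; both choices are assumptions, and the scores below are invented for illustration rather than taken from the study.

```python
def auc(positives, negatives):
    """Area under the ROC curve as the probability that a positive outscores a negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(positives, negatives, cutoff):
    """Sensitivity and specificity when scores >= cutoff are classified as positive."""
    sensitivity = sum(score >= cutoff for score in positives) / len(positives)
    specificity = sum(score < cutoff for score in negatives) / len(negatives)
    return sensitivity, specificity


# Invented scores: "positive" = hypothetical PPA-like group, "negative" = DAT-like group.
positive_scores = [15, 12, 9, 6]
negative_scores = [3, 6, 0, 9]

area = auc(positive_scores, negative_scores)
sens, spec = sensitivity_specificity(positive_scores, negative_scores, cutoff=6)
```

Sweeping the cutoff over all observed scores and plotting sensitivity against 1 minus specificity would trace the full ROC curve.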

Demographic results
Demographic data of study participants are presented in Table 1. Groups did not differ significantly in age, sex, educational level, or handedness. Seventeen PPA subjects (85%) fulfilled criteria for the logopenic variant; the semantic and agrammatic variants represented 10% and 5% of the sample, respectively. MMSE scores differed significantly between groups: HC performed significantly better than PPA and DAT patients, and MCI patients performed better than the other patient groups. On the BNT, PPA patients performed significantly worse than HC, MCI, and DAT subjects.

3W3S performance
The results of each group's performance on each stage of the test are included in Table 2 and Figure 2.

Effortless encoding
On the effortless encoding of words, PPA patients performed worse than HC (p = 0.001, Cohen's d = 2.24), but similar to DAT patients (p = 1). In contrast, for shapes, scores were not significantly different from HC (p = 0.286) and were better than DAT patient group scores (p = 0.014).

Effortful encoding (learning trials)
The scores in the effortful encoding of words of aMCI and DAT patients were similar to those of HC. However, PPA patient group performance was significantly poorer than that of HC (p = 0.003; Cohen's d = 1.28).
The DAT group showed significantly lower performance than HC (p < 0.001), aMCI (p < 0.001), and PPA (p = 0.001) patients on the effortful condition for shapes. Effect size (Cohen's d) of DAT diagnosis on learning was 1.52. No significant differences were present between PPA, MCI, or HC groups on effortful encoding for shapes.

Delayed recall
DAT patients showed deficient performance for delayed recall of words compared to HCs (p = 0.001), and PPA patients showed significantly worse performance than HCs (p < 0.001) and aMCI (p = 0.016). There were no significant differences between DAT and PPA (p = 1). Effect magnitude (Cohen's d) between DAT and controls was 2.59, and between PPA patients and controls, 2.699. As a screening test, delayed recall of shapes is able to differentiate PPA from DAT (AUC = 0.90, p = 0.004) with a sensitivity of 90% and a specificity of 63% using a cutoff of 6. In the same analysis, BNT reports an AUC = 0.87 and MMSE an AUC = 0.50. Figure 3 in the Supplementary Appendix shows a more detailed ROC analysis.
On delayed recall for shapes, PPA patients' performance did not differ significantly from patients with aMCI or HCs. However, the DAT group showed worse performance than controls (p < 0.001), aMCI (p < 0.001) or PPA (p < 0.001) patients. Effect size (Cohen's d) of DAT diagnosis over HC was 3.27.

Recognition
In shape recognition, DAT subjects scored significantly lower than other groups. However, in word recognition, DAT and PPA had equivalent performance but performed significantly lower than HC (DAT vs HC, p = 0.03; PPA vs HC, p = 0.04) and MCI (DAT vs MCI, p = 0.04, d = 1.08; PPA vs MCI, p = 0.045, d = 0.87).

Discussion
This study reports results from the validation of a Spanish language version of the 3W3S test to evaluate verbal and non-verbal episodic memory. This Spanish version of the test proved useful for assessing non-verbal memory and its dissociation from verbal memory in PPA patients (Weintraub et al., 2013; Kielb et al., 2016).
The test can be easily used for routine, outpatient clinical assessments and was designed to demonstrate how information type (verbal and non-verbal), can influence retrieval and retention of explicit memory in patients with dementia and prominent language disorders. In the effortless encoding condition, patients had to reproduce six items immediately after copying them, with no forewarning. Patients with DAT markedly failed this task compared to controls and aMCI patients. PPA patients had a contrasting performance, performing as well as normal controls for shapes, but more like DAT patients for words; these results are similar to those previously reported by Weintraub (Weintraub et al., 2013). Effortless encoding is primarily a working memory task. This would explain its preservation in purely amnestic MCI patients and its impairment in DAT patients, in whom the number of affected domains is greater (Kirova et al., 2015). On the other hand, this finding also implies verbal working memory is selectively impaired in PPA.
To overcome this suspected impairment of working memory, the initial design proposed by Mesulam incorporated an effortful learning condition (3 learning trials). The learning curves reported in the results section show how evoking the six items improves throughout the trials, overcoming the difficulty that immediate evocation represents. Nevertheless, total learning in patients with DAT remains significantly impaired compared to that of the other groups, consistent with the known difficulty in learning, despite repeated exposures. In contrast, PPA patients showed a discrepancy between difficulty learning words due to their aphasia and ability to reproduce shapes, which remained intact.
Delayed recall of words is impaired in DAT patients, a characteristic feature of dementia associated with Alzheimer's disease. However, patients with PPA syndrome also show deficient performance on delayed recall for words, with scores equivalent to those of DAT patients. Cohen's d was calculated to look for differences between groups and calculate the impact of the diagnosis. Our results were similar to those obtained when the original version of the test was applied in previous studies (Weintraub et al., 2013), showing a similar magnitude of impoverishment in delayed recall for words in DAT and PPA patients, entirely different to what happens with delayed recall for shapes. In delayed recall for shapes, PPA patients performed better than DAT patients and did not differ significantly from aMCI patients or HCs, showing that deficits in delayed recall of words reflect a language impairment and not an authentic memory problem.
In the test's recognition stage, DAT performance was significantly lower than the other groups for the recognition of shapes. However, when looking at the word recognition, PPA had a similar performance to the DAT group showing lower scores than the HC and the aMCI group. This indicates language disorders in PPA also interfere with word recognition.
Our findings agree with those of other research (Kielb et al., 2016; Nilakantan et al., 2017; Weintraub et al., 2013). Selective alteration of memory at the verbal level (words) with conservation of the non-verbal (shapes) implies that memory difficulties are a secondary manifestation of language impairment.
Episodic memory assessment in PPA is of significant clinical relevance and has critical diagnostic implications for many reasons. On the one hand, there is evidence that progression to dementia in the logopenic variant of PPA co-occurs with patients' difficulties in episodic memory (Funayama, 2019). On the other hand, there is also evidence that general cognition can predict the progression to dementia at 5 years in early PPA of semantic and agrammatic variants (O'Connor et al., 2016). Memory assessment is therefore critical when studying both functionality and prognosis in a PPA patient. Even though there are sophisticated tools to evaluate memory in subjects with language disorders, we must not lose sight of the fact that the clinician needs to evaluate it and make decisions within the limited time of a medical consultation. In this scenario, the 3W3S is presented as a practical solution that meets three relevant characteristics: simplicity, speed, and good diagnostic performance.
Currently, there is no tool similar to the 3W3S test available in Spanish. Most of the tasks validated or developed in Spanish involve verbal instructions, verbal stimuli, or verbal responses. Therefore, the Spanish version of the 3W3S helps extend the spectrum of useful tasks to be applied in patients with language disorders.
In summary, we can confirm that PPA patients have similar performance to DAT patients regarding memory for words. This result is entirely different for non-verbal memory in which PPA patients perform similarly to cognitively healthy individuals. Current screening tools available in Spanish (such as MMSE and MoCA) do not include visual memory assessment and are highly influenced by aphasia's severity. Our findings highlight the importance of including visual and verbal memory assessment during initial clinical evaluation of PPA patients as well as during follow-up visits, using a procedure that is relatively free of language processing requirements.