Addenbrooke’s Cognitive Examination III (ACE-III) and mini-ACE for the detection of dementia and mild cognitive impairment

Background
The number of new cases of dementia is projected to rise significantly over the next decade. Thus, there is a pressing need for accurate tools to detect cognitive impairment in routine clinical practice. The Addenbrooke's Cognitive Examination III (ACE-III) and the mini-ACE are brief, bedside cognitive screens for which good sensitivity and specificity have previously been reported. The quality and quantity of this evidence has not, however, been robustly investigated.

Objectives
To assess the diagnostic test accuracy of the ACE-III and mini-ACE for the detection of dementia, dementia sub-types, and mild cognitive impairment (MCI) at published thresholds in primary, secondary, and community care settings in patients presenting with, or at high risk of, cognitive decline.

Search methods
We performed the search for this review on 13 February 2019. We searched MEDLINE (OvidSP), Embase (OvidSP), BIOSIS Previews (ISI Web of Knowledge), Web of Science Core Collection (ISI Web of Knowledge), PsycINFO (OvidSP), and LILACS (BIREME). We applied no language or date restrictions to the electronic searches; and to maximise sensitivity we did not use methodological filters. The search yielded 5655 records, of which 2937 remained after we removed duplicates. We identified a further four articles through PubMed 'related articles'. We found no additional records through reference list citation searching, or grey literature.

Selection criteria
Cross-sectional studies investigating the accuracy of the ACE-III or mini-ACE in patients presenting with, or at high risk of, cognitive decline were suitable for inclusion. We excluded case-control, delayed verification and longitudinal studies, and studies which investigated a secondary cause of dementia. We did not restrict studies by language; and we included those with pre-specified thresholds (88 and 82 for the ACE-III, and 21 or 25 for the mini-ACE).
Data collection and analysis
We extracted information on study and participant characteristics and used information on dementia and MCI prevalence, sensitivity, specificity, and sample size to generate 2×2 tables in Review Manager 5. We assessed methodological quality of included studies using the QUADAS-2 tool; and we assessed the quality of study reporting with the STARDdem tool. Due to significant heterogeneity in the included studies and an insufficient number of studies, we did not perform meta-analyses.

Addenbrooke’s Cognitive Examination III (ACE-III) and mini-ACE for the detection of dementia and mild cognitive impairment (Review). Copyright © 2019 The Cochrane Collaboration. Published by John Wiley & Sons, Ltd. Cochrane Library. Trusted evidence. Informed decisions. Better health. Cochrane Database of Systematic Reviews.

Main results
This review identified seven studies (1711 participants in total) of cross-sectional design, four examining the accuracy of the ACE-III, and three of the mini-ACE. Overall, the majority of studies were at low or unclear risk of bias and applicability on quality assessment. Studies were at high risk of bias for the index test (n = 4) and reference standard (n = 2). Study reporting was variable across the included studies. No studies investigated dementia sub-types. The ACE-III had variable sensitivity across thresholds and patient populations (range for dementia at 82 and 88: 82% to 97%, n = 2; range for MCI at 88: 75% to 77%, n = 2), but with more variability in specificity (range for dementia: 4% to 77%, n = 2; range for MCI: 89% to 92%, n = 2). Similarly, sensitivity of the mini-ACE was variable (range for dementia at 21 and 25: 70% to 99%, n = 3; range for MCI at 21 and 25: 64% to 95%, n = 3) but with more variability in specificity (range for dementia: 32% to 100%, n = 3; range for MCI: 46% to 79%, n = 3).
We identified no studies in primary care populations: four studies were conducted in outpatient clinics, one study in an in-patient setting, and in two studies the settings were unclear.

Authors' conclusions
There is insufficient information in terms of both quality and quantity to recommend the use of either the ACE-III or mini-ACE for the screening of dementia or MCI in patients presenting with, or at high risk of, cognitive decline. No studies were conducted in a primary care setting so the accuracy of the ACE-III and mini-ACE in this setting is not known. Lower thresholds (82 for the ACE-III, and 21 for the mini-ACE) provide better specificity with acceptable sensitivity and may provide better clinical utility. The ACE-III and mini-ACE should only be used to support the diagnosis as an adjunct to a full clinical assessment. Further research is needed to determine the utility of the ACE-III and mini-ACE for the detection of dementia, dementia sub-types, and MCI. Specifically, the optimal thresholds for detection need to be determined in a variety of settings (primary care, secondary care (inpatient and outpatient), and community services), prevalences, and languages.

P L A I N   L A N G U A G E   S U M M A R Y
How accurate are the Addenbrooke's Cognitive Examination III (ACE-III) and mini-ACE for the screening of dementia and mild cognitive impairment (MCI)?

Why is recognising dementia important?
The number of people being diagnosed with dementia is expected to increase significantly over the next 10 years. There is therefore an increasing need for tools that can assess memory and learning to aid the diagnosis of dementia and MCI. The ACE-III and mini-ACE are currently used in clinical practice, but the evidence for their accuracy to identify dementia has not been fully established.

What was the aim of this review?
The aim of this review was to find out how accurate the ACE-III and mini-ACE are in identifying dementia and MCI across a range of healthcare settings.
The test is performed on a patient who is suspected to have dementia.

What was studied in this review?
The ACE-III has 21 questions, with a total score of 100. The test is performed with the patient who presented with, or is suspected to have, dementia. The questions cover five different areas of brain function, and a higher score indicates better function. The mini-ACE is shorter, with only five questions, and a total score of 30. The thresholds describe the score at which a diagnosis of dementia should be considered and these are usually 82 or 88/100 for the ACE-III and 21 or 25/30 for the mini-ACE. The ACE-III and mini-ACE are not used on their own to make a diagnosis of dementia, but help clinicians when used in addition to other clinical information and investigations.

What are the main results of the review?
This review included seven studies with a total of 1711 patients; four studies examined the ACE-III, and three examined the mini-ACE. We did not combine the study information statistically due to significant differences between the studies. The ability of both the ACE-III and the mini-ACE to identify patients with either dementia or MCI was variable (between 70% and 99% of people were correctly identified as having dementia and between 64% and 95% for MCI). However, there was more variability between the studies in the number of false positives identified by the tests (between 0% and 96% of people were incorrectly identified as having dementia and between 8% and 54% of people were incorrectly identified as having MCI). At the lower test thresholds, there were fewer false positive diagnoses of dementia (between 64% and 100% of people correctly identified as not having dementia or MCI).

How reliable are the results of this review?
There were some issues with the methods used by studies: the way in which patients were identified and enrolled into the studies, and the way in which the ACE-III and mini-ACE were carried out were not well described.
The studies were small and did not study enough people to be confident about the results. These issues mean that the accuracy of the ACE-III and mini-ACE may have appeared better than it actually was.

Who do the results of this review apply to?
The average age in all the studies was over 60 years. The proportion of people with dementia was different between studies (range: 15% to 55.6%). All of the studies were conducted in a specialist setting, so we do not know if the ACE-III or mini-ACE could be used in general practice or the community. Four studies were in the UK, two were in China, and one in Japan.

What are the implications of this review?
Overall, the quality, size, and number of included studies have not allowed a definitive conclusion on whether the ACE-III or the mini-ACE should be used to identify dementia or MCI. These findings can only be used in a hospital setting, as none of the studies investigated community or general populations. The ACE-III or mini-ACE should only be used as part of a clinical assessment when making a diagnosis of dementia, and should not be relied upon alone. More research is needed to investigate the ACE-III and mini-ACE in different healthcare settings, languages, and cultures.

How up to date is this review?
The review authors searched for and included studies up to April 2019.
S U M M A R Y   O F   F I N D I N G S
Summary of findings 1. Summary of Test Accuracy Findings

Patient population
Patients presenting with cognitive decline but no known diagnosis of dementia.

Index test
The ACE-III and mini-ACE, including different languages.

Reference standard
• Undifferentiated dementia: DSM-IV and DSM-5, ICD-10 and ICD-11
• Alzheimer’s disease: NINCDS/ADRDA, ICD-10 and ICD-11, DSM-IV and DSM-5, NIA/AA
• Vascular dementia: NINDS-AIREN, DSM-IV and DSM-5, ICD-10 and ICD-11
• Frontotemporal dementia: Lund-Manchester criteria, NINDS
• Lewy body dementia: International consensus criteria
• MCI: NIA/AA, DSM-IV and DSM-5, Mayo, Petersen
• Post-stroke dementia: DSM-IV and DSM-5, ICD-10 and ICD-11

Target condition
Dementia (all-cause and sub-types), MCI.

Included studies
7 studies (1711 patients)

Quality concerns
The majority of studies were identified to be at low or unclear risk of bias on the QUADAS-2 assessment. More studies were labelled at high risk of bias for the index test (n = 4) and reference standard (n = 2) due to lack of information on the conduct of the index test or reference standard. All studies were low or unclear risk of applicability. Studies were at unclear risk mainly due to inadequate reporting.

Heterogeneity
There was significant heterogeneity between studies in terms of patient population, study setting, language and culture, and reference standard.

Table columns: Study ID; Comparison; Test threshold; Sensitivity (%); Specificity (%); Positive predictive value (%); Negative predictive value (%)



Conclusions
This review identified 7 studies of cross-sectional design, 4 examining the screening accuracy of the ACE-III, and 3 of the mini-ACE. We identified no studies in primary care populations: 4 studies were conducted in outpatient clinics, 1 study in an in-patient setting, and in 2 studies the setting was unclear.
We did not perform meta-analysis due to significant heterogeneity. The majority of studies investigated published thresholds, but 3 studies determined optimal cut-offs.
Sensitivity of the mini-ACE for the detection of dementia and MCI across thresholds and patient populations was generally high (range: 64% to 99%) but with more variable specificity (range: 32% to 100%). The ACE-III also had good sensitivity across thresholds and patient populations (range: 75% to 97%), but specificity varied between populations, being significantly poorer in the post-stroke rehabilitation setting (range: 5% to 11%) compared to an outpatient memory clinic (range: 50% to 77%).


B A C K G R O U N D
Dementia is an emerging public health concern; 46 million people currently live with dementia worldwide (Alzheimer's Society 2016). [...] of consistent international guidance on the assessment and management of dementia, which has the potential to introduce further geographical disparities in care (Ngo 2015). Concerns have been raised regarding the widespread use of common assessment tools, particularly for the assessment of mild cognitive impairment, where the sensitivity is low (Nasreddine 2005). Clarity is therefore urgently required on the most appropriate and valid cognitive assessment tool for the early identification and monitoring of cognitive disorders.
Cognitive impairment is frequently not identified in routine assessments in primary care; cognitive decline is not recognised in up to 76% of patients (

Rationale
A diagnosis of dementia still carries much stigma and fear in modern society (Aminzadeh 2007; de Vugt 2013). Despite increasing research, accurate diagnostic tests and curative treatments remain elusive. Given the absence of an available cure, the consequences of a dementia diagnosis are profound and have an enormous impact on the patient, their family, and support network (Aminzadeh 2007; Davis 2015; de Vugt 2013). A high specificity will minimise the number of false positive diagnoses. A false positive diagnosis of dementia could cause serious psychological harm, and lead to unnecessary further investigations and treatments for a patient and their carers (de Vugt 2013). Sensitivity is also important to minimise the rate of false negative diagnoses, which can prevent or delay access to available treatments and support services, and potentially worsen the dementia state and carer strain, and evoke loss of confidence in care services (de Vugt 2013). Given the lack of current therapeutic options available in dementia, high specificity and minimising false positive diagnoses take precedence over sensitivity. If clinical practitioners had access to a screening test with high sensitivity and specificity, it would reduce the negative consequences outlined above, and facilitate the timely delivery of support and available treatments (de Vugt 2013).
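The trade-off described above can be made concrete with a small calculation. The following is a minimal sketch, not part of the review's methods: it converts sensitivity, specificity, and prevalence into positive and negative predictive values, and counts expected false positives per 1000 people tested. The function name and all input values are illustrative assumptions and are not results from any included study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence.

    All arguments are proportions between 0 and 1. This is the standard
    Bayes calculation; it assumes at least some positive and some
    negative test results are expected (no zero denominators).
    """
    tp = sensitivity * prevalence                # true positives
    fn = (1 - sensitivity) * prevalence          # false negatives
    tn = specificity * (1 - prevalence)          # true negatives
    fp = (1 - specificity) * (1 - prevalence)    # false positives
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Illustrative inputs only (hypothetical, not taken from the review).
prevalence = 0.20
for specificity in (0.50, 0.90):
    ppv, npv = predictive_values(0.90, specificity, prevalence)
    false_pos_per_1000 = round((1 - specificity) * (1 - prevalence) * 1000)
    print(f"specificity={specificity:.2f}: PPV={ppv:.2f}, NPV={npv:.2f}, "
          f"false positives per 1000 tested={false_pos_per_1000}")
```

Under these assumed inputs, raising specificity while holding sensitivity fixed sharply reduces the number of people wrongly flagged, which is the concern the rationale raises about false positive diagnoses.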
In [...] mini-ACE) has been carried out to date. Therefore, a Cochrane Review is required to assess the validity of the ACE-III and mini-ACE across all the available evidence, cut-off scores, settings in which the tools have been validated, and the quality of the evidence to date. In particular, the ACE-III and mini-ACE have shown promising results in a number of studies, and so may prove more sensitive and specific tests for the early detection of cognitive disorders, with the ability to distinguish between dementia subtypes (Hsieh 2013; Hsieh 2015). Correct and early identification and stratification of patients with dementia can result in better clinical outcomes, through the early initiation of available therapeutics and support services for patients and carers (Creavin 2016; Davis 2015; de Vugt 2013).

O B J E C T I V E S
To assess the diagnostic test accuracy of the Addenbrooke's Cognitive Examination-III (ACE-III) and the mini-ACE, for the screening of all-cause dementia, dementia subtypes (Alzheimer's disease, vascular dementia, frontotemporal dementia, Lewy body dementia), and mild cognitive impairment, across all healthcare settings at all pre-specified thresholds.

Secondary objectives
• To identify the quality and quantity of the research evidence on the diagnostic test accuracy of the ACE-III and mini-ACE for the assessment of all-cause dementia, dementia subtypes (Alzheimer's disease, vascular dementia, frontotemporal dementia, Lewy body dementia), and mild cognitive impairment, across all healthcare settings at all reported thresholds.
• To identify sources of heterogeneity (age, sex, education, severity or stage of the target condition, operator characteristic of the index test and reference standard) in the included studies.
• To identify gaps in the evidence where further research is required.

Types of studies
We considered cross-sectional studies for inclusion in this review, where the index test was administered alongside expert confirmation for reference. We considered comparative studies between dementia subtypes (e.g. Alzheimer's disease and frontotemporal dementia), or comparing the index tests with an alternative (e.g. the Mini Mental State Examination (MMSE), the Montreal Cognitive Assessment (MoCA)) for inclusion if an appropriate reference standard was present, but we only included data on the ACE-III and mini-ACE.
We excluded case-control studies in this review due to the high risk of bias in these studies. We did not consider delayed verification or longitudinal studies for inclusion.
We considered nested case-control studies for inclusion, where cases and controls are selected from the cohort population; this design has a lower risk of bias than a traditional case-control study.
We did not include studies with a small number of cases (fewer than 10), due to their associated high risk of bias.

Participants
We included patients presenting with cognitive decline, undergoing cognitive testing in primary or secondary care. In the secondary care setting we included participants recruited in both outpatient (clinic) and in-patient (ward) settings. We also included studies conducted in patient populations with a high risk of cognitive decline, but not necessarily presenting with cognitive symptoms. We excluded studies which included participants with a comorbidity associated with cognitive impairment (motor neurone disease (MND), multiple sclerosis (MS), Parkinson's disease, brain injury/tumour/infection), where these participants comprised more than 20% of the study population. In addition, we excluded studies which included participants with known substance abuse or medication use known to affect cognition where these participants comprised more than 20% of the study population.

Index tests
We [...], have been reported consistently in the literature, and are currently used conventionally in clinical practice. We therefore investigated the summary sensitivity and specificity values at these predefined thresholds. The ACE-III and mini-ACE have been translated into several languages and we considered all versions for inclusion. The ACE-III and mini-ACE tools are available at dementia.ie/images/uploads/site-images/ACE-III_Administration_(UK).pdf and s3-eu-west-1.amazonaws.com/pstorage-karger-594308543098/6990263/450784_sm1.pdf, respectively.

Target conditions
The target conditions to be detected by the ACE-III or mini-ACE were as follows: all-cause dementia (undifferentiated); specific dementia subtypes (Alzheimer's disease, vascular dementia, frontotemporal dementia, Lewy body dementia); and mild cognitive impairment (MCI). We included all-cause dementia as a target condition, as it was anticipated that some studies would not have differentiated between dementia subtypes. In addition, the ACE-III and mini-ACE were being evaluated as screening tests; understanding the ability of the test to identify undifferentiated cognitive impairment for onward specialist referral for subtype and classification would therefore be of relevance to primary care practitioners.

Reference standards
At present, there is no 'gold standard' test for the confirmation of MCI, dementia, or subtype. In current practice, dementia and MCI are confirmed by an appropriately qualified clinical specialist or expert (e.g. neurologist or psychiatrist), using internationally developed and validated criteria. The reference standard for this review was a clinical confirmation of dementia or MCI using disease-specific reference standards developed by a consensus group or accredited body. The presence of the disease had to be confirmed using one of these recognised criteria by an appropriately qualified specialist, expert, or consensus group in order for us to consider a study eligible for inclusion in this review. Imaging and biochemical investigations are often used alongside clinical assessment to confirm dementia or MCI but we excluded studies which relied on imaging and biochemical investigations alone (without clinical assessment) from this review.
Studies using a histopathological diagnosis of dementia as a reference standard were not suitable for inclusion as this is a postmortem diagnosis.

Search methods for identification of studies
We devised search methods in accordance with the guidance given in Chapter 7 of the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy Version 1; and we developed the search strategy in conjunction with the Information Specialist at the Cochrane Dementia and Cognitive Improvement Group (CDCIG).

Electronic searches
We searched MEDLINE (OvidSP), Embase (OvidSP), BIOSIS (Ovid), Web of Science Core Collection (ISI Web of Knowledge), PsycINFO (Ovid) and LILACS (Bireme), using a structured search strategy appropriate for each database. We used controlled vocabulary, such as MeSH terms and Emtree, where appropriate. We did not restrict the search by date, sampling frame, setting, or language. The search strategies used can be seen in Appendix 1.

Searching other resources
We reviewed the reference lists of all included studies. We also searched the following databases. We used the 'related articles' feature of PubMed to search for additional studies. We searched citation databases, such as Science Citation Index and Scopus, using key studies to identify any additional relevant studies. We searched grey literature, including conference proceedings, theses, and PhD abstracts. We did not perform handsearching, in accordance with the generic protocol (Davis 2013). We contacted research groups involved in previously published or ongoing research on the ACE-III or mini-ACE to identify any relevant, unpublished data.

Selection of studies
The eligibility criteria are as follows.

Inclusion criteria
• Primary, secondary, and community care services
• Patients presenting with cognitive decline or screening in a high-risk population
• Cross-sectional, comparative, or nested case-control studies
• Studies utilising the ACE-III or mini-ACE as the index test
• Presence of a reference standard as specified above

Exclusion criteria
• Patients with a diagnosis of dementia at presentation
• Patients with comorbidity associated with cognitive impairment: motor neurone disease (MND), multiple sclerosis (MS), Parkinson's disease, brain injury, tumour, infection
• Patients with presence of substance abuse, or medication use known to affect cognition
• Case-control studies, longitudinal or delayed-verification studies
• Small sample size (fewer than 10 participants)
• Studies utilising older versions of the tool (ACE, ACE-R)
• Absence of a reference standard as specified above

Two review authors (LCB, APB) independently screened eligible articles based on title and abstract. After this, two authors (LCB, APB) independently reviewed full texts for inclusion in the review. We resolved disagreements by discussion; and if they remained unresolved, we referred them to an arbitrator within the study team (TJQ). Where disagreements were resolved, our default position was to include the study in the review. The study selection process is detailed in a PRISMA flow diagram (Figure 1).
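The eligibility criteria above can be sketched as a simple screening check. This is an illustrative sketch only, not the procedure the review authors used (screening was done by two authors reading titles, abstracts, and full texts). The `StudyRecord` fields, the design labels, and the function name are all hypothetical; the 20% comorbidity limit and the minimum of 10 cases are taken from the Participants and Types of studies sections.

```python
from dataclasses import dataclass

# Hypothetical labels for study design and index test.
ELIGIBLE_DESIGNS = {"cross-sectional", "comparative", "nested case-control"}
ELIGIBLE_TESTS = {"ACE-III", "mini-ACE"}


@dataclass
class StudyRecord:
    """Hypothetical summary of one candidate study."""
    design: str                        # e.g. "cross-sectional", "case-control"
    index_test: str                    # e.g. "ACE-III", "mini-ACE", "ACE-R"
    n_cases: int                       # number of cases with the target condition
    has_reference_standard: bool       # recognised clinical criteria were applied
    comorbidity_fraction: float        # share with MND, MS, Parkinson's disease, etc.
    substance_or_medication_fraction: float  # share with substance/medication effects


def exclusion_reasons(study: StudyRecord) -> list:
    """Return the list of criteria a study fails (empty list = eligible)."""
    reasons = []
    if study.design not in ELIGIBLE_DESIGNS:
        reasons.append("ineligible study design")
    if study.index_test not in ELIGIBLE_TESTS:
        reasons.append("index test is not the ACE-III or mini-ACE")
    if study.n_cases < 10:
        reasons.append("fewer than 10 cases")
    if not study.has_reference_standard:
        reasons.append("no recognised reference standard")
    if study.comorbidity_fraction > 0.20:
        reasons.append("more than 20% with excluded comorbidity")
    if study.substance_or_medication_fraction > 0.20:
        reasons.append("more than 20% with substance abuse or relevant medication")
    return reasons


candidate = StudyRecord("cross-sectional", "mini-ACE", 45, True, 0.05, 0.0)
print(exclusion_reasons(candidate))  # an empty list means no criterion failed
```

Returning the reasons rather than a single yes/no makes it easy to record why a study was excluded, which is the information a PRISMA flow diagram summarises.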


Data extraction and management
We developed a study-specific proforma, and extracted data on the following: study characteristics (setting, type, number of participants, diagnostic criteria, language, index test); demographics of the participants (age, gender, diagnosis, comorbidities); study quality assessment; and heterogeneity. The data that we collected with the study proforma are detailed in Appendix 2.
Two review authors (LCB, APB) independently extracted data. Test accuracy data were cross-tabulated in two-by-two tables of index test results (positive or negative) against the target condition (positive or negative). We resolved disagreements between authors on data extraction by discussion. We extracted the results directly into tables in Review Manager 5 software (Review Manager 2014).
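Where a study reports only sample size, prevalence, sensitivity, and specificity, the cells of the two-by-two table can be back-calculated. The sketch below shows one way to do this; it is an assumption about the arithmetic, not a description of Review Manager's internals, and the function name and example inputs are hypothetical. Because published figures are rounded, reconstructed counts may differ by one or two from the true cell counts.

```python
def two_by_two(n, prevalence, sensitivity, specificity):
    """Back-calculate 2x2 cell counts from summary statistics.

    n           total number of participants
    prevalence  proportion with the target condition (0 to 1)
    sensitivity proportion of those with the condition who test positive
    specificity proportion of those without the condition who test negative
    """
    with_condition = round(n * prevalence)
    without_condition = n - with_condition
    tp = round(with_condition * sensitivity)     # index test positive, condition present
    fn = with_condition - tp                     # index test negative, condition present
    tn = round(without_condition * specificity)  # index test negative, condition absent
    fp = without_condition - tn                  # index test positive, condition absent
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Hypothetical study: 200 participants, 25% prevalence,
# sensitivity 90%, specificity 80%.
table = two_by_two(200, 0.25, 0.90, 0.80)
print(table)
```

Deriving FN and FP by subtraction, rather than rounding each cell separately, guarantees that the four cells always sum to the reported sample size.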

Assessment of methodological quality
Two authors (LCB, APB) independently assessed methodological quality, using the Quality Assessment Tool for Diagnostic Accuracy Studies (QUADAS-2) (Whiting 2011). The tool consists of four domains: patient selection; index tests; reference standard; and patient flow. We assessed each domain in terms of risk of bias, and the first three domains were considered in terms of applicability. We piloted the QUADAS-2 tool on the first five studies included in the review. Where there was poor agreement between the two review authors, we revised and re-piloted the tool. We resolved disagreements between authors on study quality by discussion.

We graded studies as being at high, medium or low risk of bias, and presented a narrative summary for each study (Characteristics of included studies). The QUADAS-2 tool is available in Appendix 3, and the anchoring statements in Appendix 4. The use of the reference standard and index tests are not completely independent of one another, and this introduces a risk of incorporation bias; we assessed included studies for the presence of incorporation bias.
The STARDdem tool has been recently developed to report the quality of study reporting in dementia (Table 2) (Noel-Storr 2014). In addition to reporting methodological quality, this review also reported on the quality of study reporting using this checklist (www.ncbi.nlm.nih.gov/pmc/articles/PMC4115600/table/T3/?report=objectonly).

Statistical analysis and data synthesis
The target condition comprised three categories: 1) undifferentiated (all-cause) dementia; 2) specific dementia subtypes (Alzheimer's disease, vascular dementia, frontotemporal dementia, Lewy body dementia); and 3) MCI. The index test comprised two categories: ACE-III; or the mini-ACE. The setting also comprised three categories: primary; secondary; and community care. Due to insufficient studies at each of these levels we were unable to perform meta-analysis, and we have provided a descriptive summary of the numerical results.
For all included studies (cross-sectional), we extracted data in binary two-by-two tables (binary test results cross-classified with the binary reference standard) and we used this to calculate sensitivities and specificities, with 95% confidence intervals. We have presented individual study results graphically by plotting estimates of sensitivities and specificities in a forest plot. All analyses were performed with Review Manager 5 software (Review Manager 2014). As outlined above, data are presented at predefined thresholds of 82 and 88 for the ACE-III (Velayudhan 2014), and 21 and 25 for the mini-ACE (Hsieh 2015). Each study included in this review can contribute to one or more thresholds, and we excluded from this review studies which do not report any of these thresholds. We undertook graphical presentations for all predefined thresholds reported in the included studies.
We did not undertake summary and univariate analyses due to insufficient studies for each of the test thresholds and settings, and significant heterogeneity between the included studies. We present results for each individual study in tables and forest plots (Summary of findings 1, Figures 4 to 11).
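As an illustration of how sensitivity and specificity with 95% confidence intervals follow from a two-by-two table, the sketch below uses the Wilson score interval. This choice is an assumption made for the example: the review states only that Review Manager 5 was used, and the interval method in that software may differ. The function names and the example counts are hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half_width, centre + half_width


def accuracy_with_ci(tp, fp, fn, tn):
    """Sensitivity and specificity, each as (estimate, (lower, upper))."""
    sensitivity = tp / (tp + fn)   # among those with the condition
    specificity = tn / (tn + fp)   # among those without the condition
    return {
        "sensitivity": (sensitivity, wilson_interval(tp, tp + fn)),
        "specificity": (specificity, wilson_interval(tn, tn + fp)),
    }


# Hypothetical counts, not taken from any included study.
for name, (estimate, (lower, upper)) in accuracy_with_ci(45, 30, 5, 120).items():
    print(f"{name}: {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Each row of a forest plot of this kind is one such estimate with its interval, plotted separately for sensitivity and specificity.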

Investigations of heterogeneity
As anticipated in the protocol, there were insufficient studies for heterogeneity analysis. In line with previous Cochrane DTA reviews of neuropsychological tests, we anticipated there would be a number of sources of heterogeneity in the studies identified for review (Creavin 2016; Davis 2013; Davis 2015; Harrison 2016). We explored the key factors, as outlined below, in a pre-specified heterogeneity analysis.

Case mix
The case mix of the populations included in the studies could introduce significant heterogeneity in terms of age, dementia diagnosis, specific versus unselected populations, and the severity or stage of the dementia diagnosis. The test properties are likely to differ in younger compared to older populations: studies where less than 20% of the population is under 65 years of age are not likely to be representative of this population. The majority of studies enrolled adults from an unselected population; some studies, however, enrolled a specific or limited population. There were insufficient studies to conduct sensitivity analyses; data were therefore collected on the type of study population enrolled and summarised in the Characteristics of included studies and Summary of findings 1.

Reference standard criteria
An important source of heterogeneity, and a key component of methodological quality, is the process by which the cases of dementia or MCI are confirmed and sub-classified. We collected data on this process, including which reference standard or criteria were used; whether it was by consensus meeting, individual assessment, or algorithm; and whether imaging or biochemical investigations were included. We assessed the quality of this process at study level using the QUADAS-2 tool.

Technical features of the index tests
Several thresholds have been reported in the literature for both the ACE-III and mini-ACE; we have, however, selected for analysis the two most consistent levels which are currently used in clinical practice. Data were collected for all of the predefined thresholds for each test.
We investigated heterogeneity informally through visual examination of forest plots of sensitivities and specificities. There were insufficient data present for formal investigation of the sources of heterogeneity through subgroup or regression analyses.

Sensitivity analyses
We did not undertake sensitivity analyses due to insufficient studies for analysis.

Assessment of reporting bias
We did not examine reporting bias in this review, as current quantitative methods for exploring reporting bias are not well established for studies of DTA. Specifically, we did not consider funnel plots of the diagnostic odds ratio versus the standard error of this estimate.
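For readers unfamiliar with the quantities such a funnel plot would display, the sketch below computes the log diagnostic odds ratio and its standard error from a two-by-two table. It is included for illustration only, since this review did not carry out that analysis. The function name and example counts are hypothetical, and the 0.5 continuity correction applied when any cell is zero is a common convention assumed here rather than something stated in the review.

```python
import math


def log_dor_and_se(tp, fp, fn, tn, correction=0.5):
    """Return (log diagnostic odds ratio, standard error of the log DOR).

    DOR = (TP * TN) / (FP * FN); SE(log DOR) = sqrt(1/TP + 1/FP + 1/FN + 1/TN).
    If any cell is zero, `correction` is added to every cell (assumed convention).
    """
    cells = [tp, fp, fn, tn]
    if any(c == 0 for c in cells):
        tp, fp, fn, tn = (c + correction for c in cells)
    log_dor = math.log((tp * tn) / (fp * fn))
    se = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    return log_dor, se


# Hypothetical counts, not taken from any included study.
log_dor, se = log_dor_and_se(45, 30, 5, 120)
print(f"DOR = {math.exp(log_dor):.1f}, SE(log DOR) = {se:.3f}")
```

In a funnel plot, each study would contribute one point with the log DOR on one axis and its standard error on the other.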

Results of the search
In total, the search identified 5659 records. After de-duplication we were left with 2937 references to assess, of which we obtained 62 full-text articles to further screen against the inclusion and exclusion criteria for the review.
This review includes seven studies with a total of 1711 patients included in analyses. The inclusion and exclusion of studies is summarised in the PRISMA flow diagram (Figure 1).

Figure 3. Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study
We classified the majority of domains as unclear or low risk of bias for all of the studies included; we rated no study as low risk of bias across all four of the categories. Where there was insufficient information to deem a study at low or high risk of bias, we contacted study authors for more information or clarification. We contacted all seven study authors to provide further information; of these, three authors responded to queries (Hobson 2016; Lees 2017; Takenoshita 2019).

Patient selection/sampling
We assessed four studies to be at unclear risk of bias. In terms of applicability, we found three studies to be low risk of bias, recruiting from out-patient cognitive disorder clinics where patients were presenting with cognitive decline (Jubb 2015; Larner 2019; Takenoshita 2019). We found the remaining four studies to be at unclear risk of bias as they did not explicitly state they recruited patients presenting with cognitive decline (Li 2019; Yang 2019), or recruited from populations at high risk of cognitive impairment (patients with chronic kidney disease and type two diabetes (Hobson 2016), and post stroke (Lees 2017)). For three studies there was low concern in terms of applicability in the conduct of the index test (Hobson 2016; Jubb 2015; Lees 2017); we felt, however, that the remaining four studies provided insufficient information for this to be assessed and have therefore assessed them as unclear applicability (Larner 2019; Li 2019; Takenoshita 2019; Yang 2019).

Reference standard application
In this domain, we classified five studies at low risk of bias. We classified all seven studies at low risk of applicability concerns given that the appropriate reference standards were used to diagnose dementia.

Flow and timing
In this domain, we classified four studies at low risk of bias.

Reporting quality
We used the STARDdem tool to assess reporting quality (Appendix 5). A summary of the reporting quality can be found in Table 2. Areas found to have consistently low reporting across included studies were: the participant sampling procedure; the training and expertise of the persons delivering the index test; methods and estimates of test reproducibility; the number of participants who did not undergo the index test or reference standard and reasons; the time interval between the index test and the reference standard; a cross-tabulation of the results of the index test and the reference standard; adverse events; estimates of statistical uncertainty; and how missing data, outliers or indeterminate data were handled.
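To illustrate why a cross-tabulation of index test and reference standard results matters for reporting, the sketch below shows how sensitivity, specificity, and prevalence all follow directly from the four cells of a 2x2 table. The class name and the counts are hypothetical and are not drawn from any included study.

```python
from dataclasses import dataclass


@dataclass
class CrossTab:
    """2x2 cross-tabulation of index test result against reference standard."""
    tp: int  # index test positive, condition present
    fp: int  # index test positive, condition absent
    fn: int  # index test negative, condition present
    tn: int  # index test negative, condition absent

    @property
    def sensitivity(self):
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self):
        return self.tn / (self.tn + self.fp)

    @property
    def prevalence(self):
        return (self.tp + self.fn) / (self.tp + self.fp + self.fn + self.tn)


# Hypothetical counts for illustration only.
table = CrossTab(tp=45, fp=20, fn=5, tn=30)
print(f"sensitivity={table.sensitivity:.2f}, specificity={table.specificity:.2f}, "
      f"prevalence={table.prevalence:.2f}")
```

When a study reports only the summary percentages, the cells have to be reconstructed, which introduces rounding uncertainty that a published cross-tabulation would avoid.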

Findings
We have summarised the study characteristics for included studies in Characteristics of included studies, and the findings in the Summary of findings 1. We did not perform meta-analysis of included studies due to too few studies at pre-specified test thresholds for each of the index tests (fewer than three), and significant differences in patient populations limiting the interpretation of results. The sensitivity and specificity findings from each study at published thresholds are summarised in Figures 4 to 10.

Target condition
All-cause dementia was the target condition in three studies, and post-stroke cognitive impairment in one study. In addition, two studies also investigated diagnostic test accuracy in MCI. None of the studies investigated specific dementia sub-types.

Setting
All four studies were conducted in secondary care settings; we identified no studies in primary or community care settings. Of these, three studies were conducted in a memory clinic or in a Neurology department (Jubb 2015; Li 2019; Takenoshita 2019), and one in the stroke rehabilitation setting (Lees 2017). All four studies had a relatively high prevalence of dementia (range: 32.4% to 55.9%).

Setting
All three studies were conducted in secondary care settings; we identified no studies in primary or community care settings. Of these, two studies were conducted in a memory clinic or in a Neurology department (Larner 2019; Yang 2019), and one in a clinic for chronic kidney disease (Hobson 2016). The prevalence of dementia was lower in these studies than for the ACE-III (range: 15% to 32%).

Target condition
All-cause dementia and MCI were the target conditions in all three studies of the mini-ACE. No study investigated diagnostic test accuracy of dementia sub-types.

Threshold
In two studies, at a threshold of 25 to detect dementia, sensitivity was 96% to 99% and specificity was 32% to 85%. In three studies, at a threshold of 25 to detect MCI, sensitivity was 88% to 95%, and specificity was 46% to 72%.

Cochrane Database of Systematic Reviews
In three studies, at a threshold of 21 to detect dementia, sensitivity was 70% to 96%, and specificity was 64% to 100% (Figure 10).

Figure 10. Forest plot of Mini-ACE for the detection of dementia at a threshold of 21.
Only one study investigated the diagnostic test accuracy for the detection of MCI at a threshold of 21 (sensitivity: 64%; specificity: 79%) (Figure 11) (Larner 2019).

Summary of main results
This review identified seven studies, four examining the diagnostic test accuracy of the ACE-III, and three of the mini-ACE. There was significant heterogeneity between studies in terms of the study populations, which precluded meta-analysis. Of the included studies, five had relatively small sample sizes, with two studies enrolling larger samples of more than 300 participants. Risk of bias was generally unclear to low across the majority of the domains; and the quality of study reporting was variable, particularly with reference to the conduct of the index test and reference standard, and the dropout or flow of participants. Three studies determined optimal thresholds from their own study data, and we classified them at high risk of bias. The sensitivity of the ACE-III varied across thresholds and patient populations (range: 75% to 97%), but specificity was more variable between populations, being significantly poorer in the post-stroke rehabilitation setting (range: 5% to 11%) compared to an outpatient memory clinic (range: 50% to 77%). Similarly, sensitivity of the mini-ACE for the detection of dementia and MCI varied across thresholds and patient populations (range: 64% to 99%) but with more variability in specificity (range: 32% to 100%).

Strengths and weaknesses of the review
The strengths of this review are the use of a robust and pre-specified protocol in accordance with guidance published on undertaking a diagnostic test accuracy review of cognitive assessment tools (Davis 2013). The review was conducted in accordance with this protocol. An extensive search was undertaken by Information Specialists at Cochrane across a range of databases. Despite this, only seven identified studies were suitable for inclusion. This was less likely to be as a result of a restricted search or extensive exclusion criteria, and more likely due to the lack of cross-sectional studies examining the diagnostic test accuracy properties of the ACE-III and mini-ACE. Furthermore, the number of studies was reduced significantly as a result of the recent publication of data from several studies in one manuscript (Larner 2019). The small number of studies identified is in keeping with previous Cochrane Reviews of the IQCODE (Harrison 2016), and the MoCA (Davis 2015). This review is also strengthened by the independent article screening, quality assessment, and data extraction by two study authors (LB and APB). The quality assessment tool (QUADAS-2) and study reporting criteria (STARDdem) are specific to diagnostic test accuracy studies and those reporting research in dementia. Furthermore, where domains in the risk assessment were found to be unclear, we contacted the study authors to provide additional information on this.
Weaknesses of this review include the small number of studies identified, which precluded meta-analysis of the individual study findings to generate pooled estimates. In addition, there was significant heterogeneity between the study populations in which the accuracy of the tools was investigated, which limits the generalisability of the findings. No studies were conducted in primary or community settings, and all of the studies investigated populations either at high risk of cognitive impairment, or where the prevalence of dementia or MCI is likely to be higher. Three of the studies in this review calculated optimal thresholds using their own study data, limiting the interpretation of these studies due to a higher risk of bias.

Applicability of findings to the review question
The results of the studies included in this review have limited generalisability given that they were all conducted in secondary care settings and in limited geographical locations (UK, China, Japan). The sensitivity of the ACE-III and mini-ACE was generally high across these settings at both thresholds for the detection of MCI or dementia, but specificity was more variable. Specificity could be improved by using lower thresholds of detection, but many of the studies used their own study data to calculate these thresholds, leading to a high risk of bias. A lack of specificity could result in a higher number of false positive diagnoses, with a risk of significant psychological harm to patients from misdiagnosis. Given there are currently few treatment options available for people living with dementia, the priority for sensitivity may be lower than for a specific test which is able to exclude a diagnosis of dementia.
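The link between specificity, prevalence, and false positive diagnoses can be illustrated with a short Python sketch that applies Bayes' theorem to obtain positive and negative predictive values. It assumes that sensitivity and specificity stay fixed as prevalence changes, which this review cannot confirm because no included study was conducted in a low-prevalence setting. The inputs are illustrative values within the ranges reported above for the mini-ACE at a threshold of 25, plus one hypothetical lower prevalence.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from Bayes' theorem for a fixed sensitivity and specificity."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Illustrative inputs: sensitivity fixed at 0.96, with the two ends of the
# specificity range (0.32 and 0.85) reported for the mini-ACE at a threshold of 25.
# Prevalences of 0.15 and 0.32 span the mini-ACE studies; 0.05 is a hypothetical
# lower-prevalence setting that no included study examined.
for specificity in (0.32, 0.85):
    for prevalence in (0.05, 0.15, 0.32):
        ppv, npv = predictive_values(0.96, specificity, prevalence)
        print(f"spec={specificity:.2f} prev={prevalence:.2f} PPV={ppv:.2f} NPV={npv:.2f}")
```

Under these assumptions, the positive predictive value falls as either specificity or prevalence falls, meaning a larger share of positive screens would be false positives.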

Implications for practice
Overall, there is insufficient information in terms of both quality and quantity to recommend the use of either the ACE-III or mini-ACE for the detection of dementia in patients presenting with cognitive decline or in high-risk groups. As there are no studies in a community or primary care setting, the test properties of either the ACE-III or mini-ACE in a low-prevalence setting remain unknown. In secondary care where the prevalence of dementia or MCI is likely to be higher, particularly in high-risk groups or those presenting with symptoms of cognitive decline, the ACE-III and mini-ACE have good sensitivity for the detection of cognitive impairment, but specificity remains highly variable at different thresholds and in different patient populations. Thus, the ACE-III or mini-ACE should only be used by clinicians for the screening of cognitive impairment as an adjunct to clinical history, neuroimaging, and laboratory testing. It is also important to note that the published thresholds of 82 and 88 for the ACE-III, and 21 and 25 for the mini-ACE were originally generated from case-control studies to detect cognitive impairment, and thus have been developed in studies with a high risk of bias. Clinicians may want to consider the need for further or additional neuropsychological testing where there remains diagnostic uncertainty, given the lack of specificity of these tools for excluding other causes of cognitive decline. Of the thresholds published in the index study, the lower thresholds (21 for the mini-ACE, and 82 for the ACE-III) provide better specificity with acceptable sensitivity and may provide better utility in a secondary care setting.

Implications for research
Further research is needed to determine the clinical utility of the ACE-III and mini-ACE in the detection of dementia, dementia subtypes, and MCI. Specifically, the optimal thresholds for detection need to be determined in a variety of settings (primary care, secondary care (inpatient and outpatient), community services), prevalences, cultures, and languages. Five of the studies included in this review certainly highlighted that the previously published thresholds may not be applicable to all populations, settings, and languages, and may require adjustment depending on patient characteristics, and disease prevalence. Studies should follow the STARDdem reporting guidelines for diagnostic test accuracy studies in dementia. Ideally, studies should be cohort in design with the ACE-III or mini-ACE conducted on the same day as, but independent of, the reference standard to reflect clinical practice. Studies could also take a delayed verification approach, with prospective application of the reference standard with or without histopathological confirmation, which provides more accurate estimates of test properties. Practically, however, delayed verification studies are problematic, with significant losses to follow-up as identified in previous Cochrane Reviews (Harrison 2016).

Acknowledgements
LCB is a Research Fellow supported by the Dunhill Medical Trust.
TGR is a Senior Investigator for the National Institute for Health Research.
We would like to acknowledge peer reviewers Dimity Pond and Susan Shenkin for their comments and feedback.

Study characteristics
Patient sampling
118 patients attending outpatient clinic appointments who were aged over 60 years and had a diagnosis of chronic kidney disease (CKD) (eGFR < 60 ml/min/1.73 m²), and a diagnosis of diabetes. Participants who had a pre-existing diagnosis of stroke, cognitive impairment, or dementia were excluded from the study. The sampling procedure was not well described and it is not clear if this was a consecutive or random sample of patients.
The following additional information was provided by the study author: a consecutive sample of patients attending a renal diabetic clinic were enrolled. Patients were excluded if they had had a stroke or pre-existing neurocognitive disorder. All patients were screened for cognitive impairment as part of routine clinical management.

Patient characteristics and setting
This study included 118 patients over the age of 60 with diagnoses of CKD and diabetes. Participants were a community-based sample attending an outpatient clinic appointment. The type of clinic and geographical location were not specified. All patients were screened with ACE-III and MMSE, and the mini-ACE scores were derived from the ACE-III assessment. The diagnosis of dementia and MCI was based upon patient, informant, clinical case review, neuropsychological assessment, and application of the DSM-V and Petersen criteria respectively. In addition, MCI was diagnosed on the basis of patients', caregivers', informants', or clinicians' observed or reported symptoms of cognitive impairment, ability to perform activities of daily living, in the absence of delirium or dementia.

Further information provided by the author on request: the diagnosis of dementia was reached by consensus by all of the authors in this study, who also clinically managed all of the patients participating in the study.
27 participants had a diagnosis of dementia, 33 had a diagnosis of MCI, and 52 had no diagnosis of cognitive impairment. The prevalence of dementia in this sample was 24%.
There were no significant differences in baseline characteristics between participant groups.
Age in the non-cognitively impaired group was 76.4 ± 7.4 years, in the MCI group 78.1 ± 10.1 years, and in the dementia group 79.8 ± 5.4 years.
Education for the non-cognitively impaired group was 10.9 ± 1.9 years, for the MCI group 10.7 ± 1.8 years, and for the dementia group 10.5 ± 2.5 years.
Sources of the referrals were not specified.

Index tests
The mini-ACE was the index test, but scores were derived from the ACE-III. Basic details of the mini-ACE were provided, but no details on the administration or training of those conducting and interpreting the test. The test thresholds were pre-specified at 21 and 25.
Additional information provided by the author on the conduct of the index test: the assessments were completed by the physician in clinic due to a lack of clinical staff with suitable training and expertise in delivering cognitive assessment in the real clinic.

Flow and timing
Of the original sample of 118 patients, 112 were included in the final analysis. 6 patients were unable to complete the ACE-III (4 due to visual impairment, 1 due to learning difficulties, and 1 patient declined participation).
The time interval between the reference standard and the index test was unclear.
Information on the true positive and negative values was not provided in the original publication; these values were calculated from sensitivity and specificity data reported in the publication.
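The kind of back-calculation described here can be sketched in Python as follows. This is an illustration under stated assumptions and not the review authors' actual procedure: the group sizes follow this study's analysed sample (27 participants with dementia and 85 without, from the 112 analysed), whereas the sensitivity and specificity values are hypothetical placeholders, and the function name is invented for the example.

```python
def back_calculate_2x2(sensitivity, specificity, n_with_condition, n_without_condition):
    """Recover whole-number 2x2 counts from reported sensitivity and specificity.

    Reported percentages are rounded, so the recovered counts are the
    nearest integers and may differ by one from the true table.
    """
    tp = round(sensitivity * n_with_condition)
    fn = n_with_condition - tp
    tn = round(specificity * n_without_condition)
    fp = n_without_condition - tn
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Group sizes follow this study's analysed sample (27 with dementia, 85 without);
# the sensitivity and specificity values are hypothetical.
cells = back_calculate_2x2(0.85, 0.80, n_with_condition=27, n_without_condition=85)
print(cells)
```

Because published percentages are rounded, more than one set of counts can be compatible with the same reported values, which is one reason a published cross-tabulation is preferable.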
Further information provided by the author: the diagnosis was not reached on the same day as the mini-ACE/ACE-III because of the time needed to assemble patient and significant other reports, review records and tests, and convene the multidisciplinary team.

Were the index test results interpreted without knowledge of the results of the reference standard? No
If a threshold was used, was it pre-specified? Yes
Were sufficient data on ACE-III or mini-ACE application given for the test to be repeated in an independent study?

DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? No
Was sufficient information on the method of dementia/MCI assessment given for the assessment to be repeated in an independent study? Yes
Risk of bias: High. Applicability concerns: Low.

DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear

Study characteristics
Patient sampling
69 patients presenting to an outpatient memory service for the investigation of a memory or cognitive complaint, over a period of 16 months. Participants were aged between 75 and 85 years, not established on anti-dementia medication, had capacity to consent to the study, were not distressed by the assessment process, and had not completed the ACE-III as part of their clinical assessment. Exclusion criteria for the study were: inability to complete the ACE-III, or significant evidence of an alternative cause for cognitive impairment that was not degenerative or vascular in pathology (i.e. substance misuse, head injury, epilepsy, severe mood disorder).
Participants were identified through initial clinical assessment appointments but it was not clear if this was a random or consecutive sample.

Patient characteristics and setting
69 patients were recruited from a memory clinic at the Leeds and Yorkshire NHS Foundation Trust Memory Service.
Sources of referrals were not reported.
A diagnosis of dementia was made by an old age psychiatrist or specialist registrar using clinical history or informant report, neuroimaging, brief cognitive assessment, mood assessment, and dementia screening blood tests. If the diagnosis remained unclear, participants underwent a comprehensive neuropsychological assessment with a clinical psychologist.
33 participants had a diagnosis of dementia, 26 had no diagnosis of dementia. The prevalence of dementia in the sample was 55.9%.
There was a significant difference in the proportion of male and female participants between the groups, but no other differences were significant in the participant demographics.
Age of the participants with no dementia was 79.5 ± 2.8 years and for those with dementia was 80.4 ± 2.7 years.
In the group without dementia 73.1% were male; and in the group with dementia 51.5% were male.

ACE-III scores for the participants without dementia were 87.3 ± 5.9, and for those with dementia were 70.4 ± 12.5.

Index tests
The index test was the ACE-III, which was administered by either an experienced clinical psychologist or 1 of 3 postgraduate researchers who were trained in the administration of the ACE-III. The ACE-III scores were checked for consistency to improve the test reliability. Test thresholds of 82 and 88 were pre-specified, but optimal cut-offs were also calculated using study data.

Target condition and reference standard(s)
Target conditions: dementia, Alzheimer's disease, vascular dementia, Alzheimer's disease with cerebrovascular disease, and MCI.

Flow and timing
69 patients agreed to the initial contact, but of these 6 declined to participate, 1 lacked capacity to consent, 1 participant withdrew from the process, 1 participant had an unclear diagnosis, and 1 participant put suboptimal effort into completing the ACE-III. Therefore, the total sample was 59 patients.
The time interval between the reference standard and the index test was unclear.
Information on the true positive and negative values was not provided in the original publication; these values were calculated from sensitivity and specificity data reported in the publication.
Comparative Notes

Item and authors' judgement, with risk of bias and applicability concerns by domain

DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear
Was a case-control design avoided? Yes
Did the study avoid inappropriate exclusions? Yes
Risk of bias: Unclear. Applicability concerns: Low.

DOMAIN 2: Index Test (all tests)
Were the index test results interpreted without knowledge of the results of the reference standard? Yes
If a threshold was used, was it pre-specified? Yes
Were sufficient data on ACE-III or mini-ACE application given for the test to be repeated in an independent study?

Study characteristics
Patient sampling
755 consecutive new outpatient referrals to a dedicated cognitive function clinic based at a regional neuroscience centre, located in the northwest of the UK. Patients were seen between June 2014 and December 2018.
There were no specific exclusion criteria, excepting patients with an established diagnosis of dementia.

Patient characteristics and setting
755 new outpatient referrals were recruited from a dedicated cognitive function clinic in the northwest of the United Kingdom.
The diagnosis was made by an experienced clinician using diagnostic criteria. The results of the mini-ACE were not used in the final diagnosis. No further information on the diagnostic process was provided in this publication; however the process was detailed as follows in a previous report (Williamson 2018): the diagnosis was made by an experienced clinician, based upon patient interview, collateral history (if available), neuroimaging, and neuropsychological assessment.
114 patients were diagnosed with dementia, 222 with MCI, and the remaining 419 patients with subjective memory complaints.
The median age of the whole sample was 60 years, and 47% of the sample were female. No further information on participant characteristics was provided.
The prevalence of dementia in the sample was 15%, and 29% for MCI.
The sources of the referrals were not specified.

Index tests
The index test was the mini-ACE. There was no information on the training or expertise of the person administering the mini-ACE. There were no details provided on the administration of the mini-ACE. Test thresholds of 21 and 25 were pre-specified but optimal cut-offs were also calculated using study data.

Target condition and reference standard(s)

Flow and timing
755 patients were recruited. Dropout rates were not reported.
No information was provided on the time interval between the test and the reference standard; however, in a previous report (Williamson 2018) the tests were completed on the same day.
Information on the true positive and negative values was not provided in the original publication; these values were calculated from sensitivity and specificity data reported in the publication.

Study characteristics
Patient sampling
86 patients who were admitted to 1 of 2 University Hospital stroke rehabilitation units. Participants were recruited who had a confirmed diagnosis of stroke at a minimum of 2 weeks post-event. Patients were excluded if the clinical team felt that cognitive assessment was inappropriate. Patients with depression or delirium were not excluded from the study.
Additional information from author: study sampling was sequential, but assessors did not go to the ward every day, and not everyone who was eligible agreed to participate, but everyone was asked if they would like to participate. The authors were not able to provide information on numbers excluded as a result of advice from the clinical team.
Participants underwent a multidisciplinary team assessment of cognition, and the final diagnosis was made by an experienced consultant in Geriatric Medicine based upon clinical psychology and occupational therapy assessments.
27 patients were diagnosed with post-stroke cognitive impairment. The prevalence of dementia in the sample was 53%.
The median age of the total sample was 74 years (interquartile range (IQR): 67 to 84 years), and 55% of the total sample were female. The median National Institutes of Health Stroke Scale was 9 (IQR: 6 to 13). 76% of the participants had an ischaemic stroke, and 35% had a total anterior circulation stroke. The median time since stroke was 36 days (IQR: 20 to 55). 16% of patients were diagnosed with delirium, 8% had a pre-existing diagnosis of dementia, and 12% of participants had pre-stroke depression.

Index tests
The index test was the ACE-III, in addition to the MMSE, and MoCA. The index test was performed by 1 of 2 psychology graduates who were trained in the use of the scales. The tests were administered as paper and pencil using verbal instructions in the first instance, and then further assistance if required. In total, 4 approaches were taken to completing the cognitive assessments, and test accuracy data are provided for each of the 4 approaches. The first approach excluded patients whose testing was incomplete and assigned a score of zero to partially completed items. At a threshold of 82, test sensitivity for approach 1 was 87% (95% CI 66% to 97%), and specificity was 5% (95% CI 1% to 25%). The second approach excluded patients with incomplete testing, and adapted partially completed assessments by excluding non-completed items from the total score. At a test threshold of 82, the sensitivity for approach 2 to detect cognitive impairment was 81% (95% CI 59% to 95%), and specificity was 10% (95% CI 1% to 31%). The third approach excluded all patients with either incomplete or partially incomplete tests. At a cut-off of 82, the sensitivity of approach 3 to detect cognitive impairment was 93% (95% CI 66% to 100%), and specificity was 11% (95% CI 1% to 35%). The final approach was the most inclusive, which included all patients, assigning a score of zero to any incomplete items. The sensitivity to detect cognitive impairment at a threshold of 82 for approach 4 was 90% (95% CI 73% to 98%), and specificity was 5% (95% CI 1% to 22%). Participants were reapproached 1 week later if they were unable to complete the test at the first trial. The thresholds for the ACE-III were pre-specified at 82 and 88.
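The wide confidence intervals quoted above reflect the small number of participants in each group. As a hedged illustration of how such an interval can be obtained, the Python sketch below computes an exact (Clopper-Pearson) binomial interval using only the standard library. The text does not state which interval method the study used, so the choice of Clopper-Pearson is an assumption, and the counts in the usage line are hypothetical because the underlying counts are not reported here.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo=0.0, hi=1.0, iters=100):
    """Find p in [lo, hi] where the decreasing function f crosses zero."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) two-sided confidence interval for k successes in n."""
    # Lower limit: p at which P(X >= k) = alpha/2, i.e. P(X <= k-1) = 1 - alpha/2.
    lower = 0.0 if k == 0 else _bisect(lambda p: binom_cdf(k - 1, n, p) - (1 - alpha / 2))
    # Upper limit: p at which P(X <= k) = alpha/2.
    upper = 1.0 if k == n else _bisect(lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


# Hypothetical example: 20 of 23 participants with the target condition screen positive.
lower, upper = clopper_pearson(20, 23)
print(f"sensitivity = {20 / 23:.0%}, 95% CI {lower:.0%} to {upper:.0%}")
```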

Target condition and reference standard(s)
Target condition: post-stroke cognitive impairment.
Reference standard: guidelines on the diagnosis of post-stroke cognitive impairment (Brainin 2014).
Additional information from the author: no requests were made from the clinical team conducting the reference standard for results of the index tests.

Flow and timing
86 patients were admitted to the stroke rehabilitation units. 75 patients were eligible for the study, of whom 24 were excluded due to: lack of capacity/no representative for consent (13), refused assessment (6), inappropriate to approach (2), discharged before test complete (2), language barrier (1). 51 patients were included in the final analysis.
No information was provided on the time interval between the test and the reference standard.
Information on the true positive and negative values was not provided in the original publication; these values were calculated from sensitivity and specificity data reported in the publication. The sources of participants were not specified.

Index tests
The index test was the ACE-III, which was translated and adapted culturally for a Chinese-speaking population using forward and backward translation methods. The translation and adaptation procedures were well described but there was no information on the training or expertise of the assessor. Test thresholds were not pre-specified, and the authors calculated optimal thresholds based on their study data. The authors also investigated the accuracy of the Chinese versions of the MoCA and MMSE.

Target condition and reference standard(s)

Study characteristics
Patient sampling
169 patients were recruited from the Department of Neurology, Sichuan Provincial People's Hospital, Chengdu, China. Inclusion criteria were: Chinese speaking, aged over 60 years, reasonable vision, hearing, and ability to communicate. Exclusion criteria were: major depression, schizophrenia, epilepsy, significant head injury, substance abuse, alcoholism, or other disorders which might influence task performances.
The sampling procedure was not well described and it was unclear if this was a consecutive or random sample.

Patient characteristics and setting
This study included 169 Chinese-speaking participants over the age of 60, who were recruited from the Department of Neurology in Chengdu, China.
The diagnosis of dementia was based upon demographic information, history or informant report, presentation at interview, general and neurological examination, neuropsychological examination, neuroimaging, screening blood tests. Daily and social function was evaluated using the Clinical Dementia Rating Scale. The Common Objects Memory Test was used to assess cognitive deficits. The diagnoses were based on the DSM-V criteria for dementia, and Petersen criteria for MCI. The healthy group had no memory complaints, and normal activities of daily living.
Diagnoses were made by 1 of 2 neurologists who checked each other's decisions, and disputes were resolved by consensus.
All participants with dementia were classified as mild severity, defined as Clinical Dementia Rating Scale of 1.
54 patients had a diagnosis of dementia, 64 had a diagnosis of MCI, and 51 were healthy. The prevalence of dementia in the sample was 32%, and 37.8% for MCI.

Are there concerns that the index test, its conduct, or interpretation differ from the review question?
Are there concerns that the target condition as defined by the reference standard does not match the review question?
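The prevalence figures quoted for the Chengdu sample above follow directly from the reported group sizes, as the short Python check below shows. It uses only the counts given in the text (54 with dementia, 64 with MCI, 51 healthy); the variable and function names are hypothetical.

```python
# Diagnostic group sizes reported for this study.
counts = {"dementia": 54, "MCI": 64, "healthy": 51}
total = sum(counts.values())  # 169 participants


def prevalence(condition):
    """Proportion of the whole sample with the named diagnosis."""
    return counts[condition] / total


for condition in ("dementia", "MCI"):
    print(f"{condition}: {prevalence(condition):.1%}")
```

The computed proportions agree with the reported 32% and 37.8% to within 0.1 percentage points.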

Appendix 4. QUADAS-2 anchoring statements
We have adapted the core anchoring statements provided for use with the QUADAS-2 tool. The original anchoring statements were determined from a two-day multidisciplinary group meeting, designed for use with the QUADAS-2 tool to support decisions concerning methodological quality for studies included in systematic reviews. Some of the original anchoring statements are less applicable to DTA reviews of neuropsychological assessments (ref MMSE review, etc.). Thus, two authors (LCB, APB) adapted the original anchoring statements specifically for this review, and these revised statements were reviewed by the co-authors. The tool and anchoring statements will be piloted against the first five studies included in this review and if there is poor inter-rater agreement of study methodological quality, the statements will be revised and re-piloted until good agreement between raters is achieved.

Domain 1: participant selection
Was a consecutive or random sample of participants enrolled?
The method of sampling should be stated or described. Non-random sampling, sampling based on volunteers, or selecting participants from a clinic or research population is more likely to introduce bias and should be classified as high risk, whereas consecutive or random sampling is least likely to introduce bias and should be classified as low risk.
Weighting: high risk

Was a case-control design avoided?
Case-control designs are associated with a high risk of bias and should be excluded from this review. However, nested case-control studies (where the study population is drawn from a larger pool of patients from an interventional or cohort study) are associated with a lower risk of bias and are considered for inclusion in this review. Nested case-control studies should still be classified as high risk of bias, as should any study which increases or decreases the proportion of patients with the target condition (e.g. enrichment from secondary care settings).
Weighting: high risk

Did the study avoid inappropriate exclusions?
Studies which do not explicitly detail exclusion criteria will be classified as unclear risk of bias, but study authors will be contacted for this information. Studies which clearly detail all exclusions, where these are felt to be appropriate by the review authors, will be classified as low risk of bias. Exclusion criteria must be justified for studies which exclude difficult-to-diagnose groups. It is anticipated that there will be common exclusion criteria (e.g. substance misuse, other degenerative disease) for included studies, which are listed in the protocol. Community studies with extensive exclusion criteria should be classified at high risk of bias. Post-hoc exclusions will be classified as high risk of bias.
Weighting: high risk

Domain 2: index test
Could the conduct or interpretation of the ACE-III/mini-ACE have introduced bias?
Studies will be considered at low risk of bias where the investigators conducting the ACE-III/mini-ACE were blinded to the participant's diagnosis, or were independent from the study and without knowledge of the reference standard. Studies which explicitly state this do not require further information on the blinding or independence of the process and will be classified as low risk of bias. Studies will also be classified as low risk of bias if the ACE-III or mini-ACE was conducted prior to the reference standard.
Weighting: high risk

Were the ACE-III/mini-ACE thresholds pre-specified?
A study will be classified as high risk of bias where the authors set the optimal cut-off point post-hoc using their own study data. Studies that do not use defined thresholds, and instead use alternative methods of analysis, will be classified as not applicable.
Weighting: high risk
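
The threshold rule above can be sketched as a small decision function. This is an illustration only: the function and parameter names are hypothetical, and the "low" outcome for a pre-specified threshold is an assumption inferred from the signalling question, since the statement itself only defines the "high" and "not applicable" outcomes.

```python
def threshold_risk_of_bias(uses_defined_threshold, threshold_set_post_hoc):
    """Classify the threshold signalling question for one study.

    Both arguments are booleans that a reviewer would fill in after
    reading the study report (hypothetical inputs, not review data).
    """
    if not uses_defined_threshold:
        # Study used an alternative method of analysis instead of a cut-off.
        return "not applicable"
    if threshold_set_post_hoc:
        # Optimal cut-off chosen from the study's own data.
        return "high"
    # Assumption: a threshold fixed in advance is treated as low risk.
    return "low"


# Example: a study that derived its optimal cut-off from its own data.
judgement = threshold_risk_of_bias(
    uses_defined_threshold=True, threshold_set_post_hoc=True
)
```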

Were sufficient data on ACE-III or mini-ACE application given for the test to be repeated in an independent study?
For studies to be classified at low risk of bias, information on the method of administration (i.e. appropriately qualified/trained) and the language of assessment should be provided. If a translated version of the ACE-III or mini-ACE is used, details of the scale and of its validation process will be needed for the study to be classified at low risk of bias.
Weighting: low risk

Domain 3: reference standard
Is the reference standard likely to correctly classify the target condition?
Studies using reference standards listed in the protocol or a recognised/validated reference standard will be considered at low risk of bias. Studies using a reference standard not recognised by the authors or the Cochrane Dementia and Cognitive Improvement Group will be classified at high risk of bias.
Weighting: high risk

Were the reference standard results interpreted without knowledge of the results of the ACE-III/mini-ACE?
For a study to be classified as low risk of bias, the investigators would need to have interpreted the reference standard results independently of those of the ACE-III or mini-ACE. Studies which explicitly state this do not require further information on the blinding or independence of the process and will be classified as low risk of bias. If the ACE-III or mini-ACE was used as part of the clinical dementia/MCI assessment serving as the reference standard, the study will be considered to be at high risk of bias.
Weighting: high risk

Was sufficient information on the method of dementia/MCI assessment given for the assessment to be repeated in an independent study?
The method of dementia assessment will need to be described to be considered at low risk of bias. Information should be provided on: the training and expertise of the assessor, whether it was by individual, consensus, or algorithm, and the use of neuropsychological, laboratory and neuroimaging assessments.
Weighting: high risk if not described

Domain 4: patient flow and timing
Was there an appropriate interval between the ACE-III or mini-ACE and the reference standard?
Ideally, the reference standard and ACE-III or mini-ACE would be completed on the same day or visit, to minimise changes or fluctuations in cognition over time. Dementia is a slowly progressive and irreversible condition, so a delay is unlikely to introduce significant bias. However, patients with MCI can revert to normal cognition, progress, or remain stable over time. A time delay could therefore affect the measured cognitive status of these individuals, although the duration over which this might occur is not known. We have therefore set an arbitrary cut-off of one month for studies assessing MCI. Longitudinal and delayed verification studies are excluded from this review.
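
As a rough illustration of the interval rule, the sketch below checks the gap between the index test and the reference standard. It assumes that "one month" means 31 days and that the cut-off applies only to studies assessing MCI; neither detail is specified beyond the wording above, and the names used are hypothetical.

```python
from datetime import date

# Assumption: "one month" is interpreted as 31 days.
MCI_CUTOFF_DAYS = 31


def interval_is_appropriate(index_test_date, reference_date, assesses_mci):
    """Return True if the test-to-reference interval meets the rule above."""
    gap_days = abs((reference_date - index_test_date).days)
    if not assesses_mci:
        # Dementia only: no cut-off is set, as delay is considered
        # unlikely to introduce significant bias.
        return True
    return gap_days <= MCI_CUTOFF_DAYS


# Hypothetical study assessing MCI, with a 19-day gap between assessments.
ok = interval_is_appropriate(date(2019, 1, 1), date(2019, 1, 20), assesses_mci=True)
```

How a study exceeding the cut-off should then be rated is a reviewer judgement that this sketch does not encode.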

CONTRIBUTIONS OF AUTHORS
LCB developed the draft and final versions of the manuscript. TGR, VJH, AB, RBP, TJQ, and CPN all reviewed and contributed to the draft and final versions of the manuscript.
LCB and APB independently screened all studies on title and abstract, and at full text.
LCB and APB independently quality assessed all included studies using QUADAS-2 and the STARDdem criteria.
LCB and APB independently extracted data from the publications.
TJQ mediated disagreements between LCB and APB in quality assessment and inclusion of relevant studies.

DECLARATIONS OF INTEREST
Lucy C Beishon: none known. Angus P Batterham: none known. Terry J Quinn: none known. Ronney B Panerai: none known. Christopher P Nelson: none known. Thompson Robinson: none known. Victoria J Haunton: none known.

Internal sources
• No sources of support supplied

External sources
• Dunhill Medical Trust, Other.