Systematic Review of Evidence-Based Guidelines for Prehospital Care

Abstract Introduction: Multiple national organizations have identified a need to incorporate more evidence-based medicine in emergency medical services (EMS) through the creation of evidence-based guidelines (EBGs). Tools like the Appraisal of Guidelines for Research and Evaluation (AGREE) II and criteria outlined by the National Academy of Medicine (NAM) have established concrete recommendations for the development of high-quality guidelines. While many guidelines have been created that address topics within EMS medicine, neither the quantity nor quality of prehospital EBGs has been previously reported. Objectives: To perform a systematic review to identify existing EBGs related to prehospital care and evaluate the quality of these guidelines using the AGREE II tool and criteria for clinical guidelines described by the NAM. Methods: We performed a systematic search of the literature in MEDLINE, EMBASE, PubMed, Trip, and guidelines.gov, through September 2018. Guideline topics were categorized based on the 2019 Core Content of EMS Medicine. Two independent reviewers screened titles for relevance and then abstracts for essential guideline features. Included guidelines were appraised with the AGREE II tool across 6 domains by 3 independent reviewers and scores averaged. Two additional reviewers determined by consensus if each guideline reported the key elements of clinical practice guidelines recommended by the NAM. Results: We identified 71 guidelines, of which 89% addressed clinical aspects of EMS medicine. Only 9 guidelines scored >75% across AGREE II domains and most (63%) scored between 50 and 75%. Domain 4 (Clarity of Presentation) had the highest (79.7%) and domain 5 (Applicability) had the lowest average score across EMS guidelines. Only 38% of EMS guidelines included a reporting of all criteria identified by the NAM for clinical practice guidelines, with elements of a systematic review of the literature most commonly missing.
Conclusions: EBGs exist addressing a variety of topics in EMS medicine. This systematic review and appraisal of EMS guidelines identified a wide range in the quality of these guidelines and variable reporting of key elements of clinical guidelines. Future guideline developers should consider established methodological and reporting recommendations to improve the quality of EMS guidelines.


INTRODUCTION
Over the past two decades, multiple national organizations and individuals have identified a need to incorporate more evidence-based medicine (EBM) into emergency medical services (EMS). This has been demonstrated in the creation of evidence-based guidelines (EBGs) written directly for care in the out-of-hospital setting, as well as guidelines focused on in-hospital care that recognize the importance of an evidence-based approach beginning when prehospital personnel initiate patient care. Yet the incorporation of robust evidence evaluation to systematically guide prehospital care has been a slow-moving process. In its 2007 landmark publication "Emergency Medical Services: At the Crossroads," the National Academy of Medicine (NAM, formerly the Institute of Medicine) highlighted that most prehospital interventions were based upon weak evidence or expert opinion alone (1). Since that time, new standards for research and guideline development have continued to shine a spotlight on this issue.
Based on a recognized need to improve the quality and evidence assessment for prehospital guidelines, considerable efforts have been undertaken over the past decade to increase the number and quality of prehospital EBGs. In 2012, the National EMS Advisory Council (NEMSAC) and the Federal Interagency Committee on EMS (FICEMS) convened an expert panel that created an eight-step model for the development of prehospital EBGs (2). Following this guidance, the National Highway Traffic Safety Administration (NHTSA) funded several individual prehospital EBGs (3-8). Additional funded efforts have aimed to better understand challenges to guideline implementation (9,10), as well as the development of performance measures to evaluate the quality of prehospital care (11,12). Further recognizing a need to establish a sustainable process to develop, implement, and evaluate prehospital EBGs, in 2013 NHTSA entered into a cooperative agreement with the National Association of EMS Physicians (NAEMSP) to create the National Prehospital EBG Strategy (13). The Strategy identified seven key action items aimed at promoting the development, implementation, and evaluation of prehospital EBGs, and led to the formation of the Prehospital Guidelines Consortium (PGC), comprising 35 member organizations collaborating to see this Strategy fulfilled (14). Concurrently, EMS guidelines have continued to be developed, published, and implemented by multiple entities, all of which support these collaborative efforts to improve the availability and use of scientific evidence in prehospital care.
Despite these efforts, neither the quantity nor quality of prehospital EBGs has been previously evaluated. In 2011, the NAM laid out specific criteria with which to assess the quality of clinical practice guidelines (CPGs) (15), and tools like the Appraisal of Guidelines for Research and Evaluation (AGREE) II have provided an objective and systematic means of assessing that quality (16). While these tools have been adopted in the development or assessment of EBGs across other medical disciplines (17-19), they have not been used consistently for the prehospital setting. Such an initiative has the potential to inform future work, to fill gaps between existing guidelines, to facilitate improvements in guideline quality, and to facilitate implementation of guidelines by end users. We therefore aimed to perform a systematic review of the medical literature to identify existing EBGs related to prehospital care. We further aimed to evaluate the quality of these guidelines with the AGREE II tool, and to determine which guidelines meet the quality standards for CPGs set by the NAM. Finally, to promote improvements in the quality of future guidelines, we aimed to determine if there are any components of guideline development that are consistently missing or of lower quality across existing prehospital EBGs.

Study Design
We performed a systematic review and evaluation of published guidelines related to prehospital care. This project was developed and overseen by members of the PGC Development Committee and PGC leadership. Two investigators (ST, EL) developed the methodology and systematic protocol outlined below, with input and support from a working group of the committee. This strategy was then repeated in September 2018, updating the data to include articles between May 2016 and September 2018. At the time of this second search, MEDLINE and PubMed had become fully compatible, and guidelines.gov had ceased operation. Further, we noted that the Trip database provided no novel articles during the first search, so we excluded it as well. Finally, we manually searched bibliographies, updated articles where relevant, and removed duplicates that were not found in the initial screening process. The most recent available version of any guideline was evaluated.

Guideline Selection and Categorization
Two investigators (ST, CL) independently screened titles for relevance to prehospital care, and then reviewed abstracts to retain publications containing two key features of guidelines: recommendations for practice based on a literature review. We used the kappa statistic to estimate inter-rater reliability, and all disagreements were mediated by consensus of two additional investigators (EL, KB). Again, we repeated this process for the second search. However, the newer articles were then screened by ST and JF, and disagreements were mediated by CM.
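For readers unfamiliar with the kappa statistic, the following is a minimal sketch of Cohen's kappa for two reviewers making include/exclude screening decisions. It assumes the two-rater, unweighted form of the statistic; the function name `cohens_kappa` and the example decisions are hypothetical illustrations and are not study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical title-screening decisions (illustration only, not study data).
reviewer_1 = ["include", "include", "exclude", "exclude", "include", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "include", "exclude"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

In this illustration the reviewers agree on 5 of 6 titles, chance agreement is 0.5, and kappa is therefore (5/6 - 1/2) / (1 - 1/2), or about 0.67.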
Each guideline was categorized based on the American Board of Emergency Medicine (ABEM) 2019 Core Content of EMS Medicine (20). Guidelines were categorized into multiple topic areas if appropriate based on the primary content (e.g. primary questions or recommendations) addressed by each guideline. For the content area of "special considerations for evaluation, treatment, transport, and destinations," a guideline was only categorized in this area if primarily described as addressing special considerations for time-life critical conditions or special patient populations, consistent with the limited ABEM description of this category.

Guideline Appraisal
We used AGREE II for the guideline appraisal process. AGREE II is a validated tool that looks at six key domains, encompassing 23 different items, of guideline quality (Table 1) (16). Each item is scored on a seven-point scale, with an online training program and user-guide providing direction on how to score. In addition, appraisers are asked to rate the overall quality of the guideline, and to provide a recommendation for its use. However, these values are meant to be subjective in nature, so we did not include them in this study.
Each guideline was appraised by 3 of 7 investigators (ST, CL, EB, JF, LR, MW, CJ), and we combined these appraisals with the calculation of cumulative domain totals (Table 1). We then averaged the cumulative domain totals across all domains, which provided an overall score for each guideline. Next, we grouped these scores into quartiles, which provided an aggregate assessment of guideline quality. Finally, we averaged the cumulative domain totals across all guidelines, which revealed specific areas in which prehospital guidelines tended to score well or poorly.

Table 1. AGREE II domains and items
1. Scope and Purpose
1 The overall objective(s) of the guideline is (are) specifically described.
2 The health question(s) covered by the guideline is (are) specifically described.
3 The population (patients, public, etc.) to whom the guideline is meant to apply is specifically described.
2. Stakeholder Involvement
4 The guideline development group includes individuals from all relevant professional groups.
5 The views and preferences of the target population (patients, public, etc.) have been sought.
6 The target users of the guideline are clearly defined.
3. Rigor of Development
7 Systematic methods were used to search for evidence.
8 The criteria for selecting the evidence are clearly described.
9 The strengths and limitations of the body of evidence are clearly described.
10 The methods for formulating the recommendations are clearly described.
11 The health benefits, side effects, and risks have been considered in formulating the recommendations.
12 There is an explicit link between the recommendations and the supporting evidence.
13 The guideline has been externally reviewed by experts prior to its publication.
14 A procedure for updating the guideline is provided.
4. Clarity of Presentation
15 The recommendations are specific and unambiguous.
16 The different options for management of the condition or health issue are clearly presented.
17 Key recommendations are easily identifiable.
5. Applicability
18 The guideline describes facilitators and barriers to its application.
19 The guideline provides advice and/or tools on how the recommendations can be put into practice.
20 The potential resource implications of applying the recommendations have been considered.
21 The guideline presents monitoring and/or auditing criteria.
6. Editorial Independence
22 The views of the funding body have not influenced the content of the guideline.
23 Competing interests of guideline development group members have been recorded and addressed.

Seth Turner et al. PREHOSPITAL GUIDELINES

Guidelines were also evaluated based on criteria for clinical practice guidelines established for the National Guidelines Clearinghouse by the U.S. Department of Health and Human Services, Agency for Healthcare Research and Quality (AHRQ) (21). These criteria were adopted from and based on the NAM publication "Clinical Practice Guidelines We Can Trust" (22). Guidelines were assessed across 6 core criteria with subcomponents summarized in Table 2. Appraisals were determined by consensus of CM and EC through full-text review of each guideline. Supplementary content linked to or referred to by the published guideline was reviewed and considered toward meeting any criterion. Based on the guideline search strategy, all publications were in English, available to the public (for free or for a fee), and the most recent version of the guideline was reviewed; these criteria are not reported further.
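The AGREE II scoring steps described in this section can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis code: it assumes the "cumulative domain totals" correspond to the standard AGREE II scaled domain score, (obtained - minimum possible) / (maximum possible - minimum possible), and it assumes the reporting bands are <50%, 50-75% (inclusive), and >75%. The function names and the example ratings are hypothetical.

```python
def scaled_domain_score(ratings):
    """Scaled AGREE II score (0-100) for one domain.

    ratings: one list of item scores (1-7) per appraiser, all for the same domain.
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    if any(len(r) != n_items for r in ratings):
        raise ValueError("every appraiser must score every item in the domain")
    if any(not 1 <= s <= 7 for r in ratings for s in r):
        raise ValueError("AGREE II items are scored on a 1-7 scale")
    obtained = sum(sum(r) for r in ratings)
    max_possible = 7 * n_items * n_appraisers
    min_possible = 1 * n_items * n_appraisers
    return 100 * (obtained - min_possible) / (max_possible - min_possible)


def overall_score(domain_scores):
    """Unweighted mean of the six scaled domain scores for one guideline."""
    return sum(domain_scores) / len(domain_scores)


def score_band(score):
    """Assumed reporting bands: <50%, 50-75% (inclusive), >75%."""
    if score > 75:
        return ">75%"
    if score >= 50:
        return "50-75%"
    return "<50%"


# Hypothetical ratings for Domain 1 (3 items) from 3 appraisers; not study data.
domain_1 = scaled_domain_score([[5, 6, 6], [6, 6, 7], [5, 5, 6]])
```

With these hypothetical ratings the obtained total is 52, the minimum possible is 9, and the maximum possible is 63, so the scaled score is 43/54, or about 79.6%.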

Guideline Search
The initial search strategy yielded N = 2188 citations

Table 2. Criteria for clinical practice guidelines
A summary of the evidence synthesis (see 3d above) included in the guideline that relates the evidence to the recommendations, e.g., a descriptive summary or summary tables.
Assessment of Benefits/Harms and Alternative Care Options: The clinical practice guideline or its supporting documents contain an assessment of the benefits and harms of recommended care and alternative care options.
English and to the Public: The full text guideline is available in English to the public upon request (for free, or for a fee).
Current: The guideline is current and the most recent version.
* If an explicit statement that the clinical practice guideline was based on a systematic review was not provided but all other criteria and subcriteria describing a systematic review were determined to be present, that subcriterion was marked as complete. Guideline developers should be advised to contain such an explicit statement in future guidelines.

(3, 26, 27, 29, 30, 33, 37, 39-41, 66-74, 78, 79, 82). There were N = 7 (10%) guidelines that provided content specific to pediatric patients (3, 42, 76-80). A minority of guidelines addressed non-clinical aspects of EMS

Guideline Appraisal
Results of the guideline appraisal process can be found in Table 3. When overall guideline scores are grouped into quartiles, most guidelines (N = 45, 63%) scored between 50-75%, with 9 (13%) guidelines scoring >75%, and 17 (24%) guidelines scoring <50%. When cumulative domain totals were averaged across guidelines, Domain 4 (Clarity of Presentation) scored the highest at 79.7% (±14.0%), whereas Domain 5 (Applicability) scored the lowest at 35.1% (±19.1%) (Figure 2). Key items of Domain 4 include specific, unambiguous, and identifiable recommendations, and presentation of alternative options for management of the health condition. Key items of Domain 5 include facilitators and barriers to guideline application, tools for guideline implementation, resource implications, and monitoring/auditing criteria. Evaluation of guidelines based on the NAM criteria for clinical guidelines revealed that only 38% (N = 27) of guidelines contained all recommended reportable elements (Table 4). Most guidelines contained a description of benefits, harms, and alternate care options (N = 69, 97%) and a summary of evidence synthesis (N = 63, 89%). The most commonly missing elements (N, % containing) were study selection (N = 35, 49%), synthesis of the evidence (N = 38, 54%), and a description of the search strategy (N = 39, 55%). Only N = 41 (58%) of guidelines contained a statement reporting the performance of a systematic review of the literature, or included all elements of a systematic review of the literature, which would similarly qualify as meeting this criterion (21).
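As an arithmetic check, the proportions reported in this section can be reproduced from the counts, using the 71 included guidelines as the denominator. The sketch below assumes percentages were rounded to the nearest whole number; the dictionary labels are shorthand introduced here for illustration.

```python
TOTAL_GUIDELINES = 71

# (count, reported percentage) pairs taken from the appraisal results above.
reported = {
    "overall score 50-75%": (45, 63),
    "overall score >75%": (9, 13),
    "overall score <50%": (17, 24),
    "all NAM elements reported": (27, 38),
    "benefits, harms, and alternatives": (69, 97),
    "summary of evidence synthesis": (63, 89),
    "study selection": (35, 49),
    "synthesis of the evidence": (38, 54),
    "search strategy": (39, 55),
    "systematic review statement or elements": (41, 58),
}


def percent(count, total=TOTAL_GUIDELINES):
    """Percentage of guidelines, rounded to the nearest whole number."""
    return round(100 * count / total)


# Any entry whose recomputed percentage differs from the reported one.
mismatches = {
    label: (count, pct)
    for label, (count, pct) in reported.items()
    if percent(count) != pct
}

# The three score bands should account for every included guideline.
band_total = sum(reported[k][0] for k in reported if k.startswith("overall score"))
```

Under that rounding assumption, every reported percentage is consistent with its count, and the three score bands sum to 71.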

DISCUSSION
We identified a limited number of published guidelines in the field of prehospital care when compared to the scope of EMS medicine. This is consistent with prior NAM findings of a limited amount of scientific evidence guiding EMS (1). However, we noted that 60 of 71 (85%) guidelines found in this review were published after 2007 when the NAM report promoting more evidence-based medicine in EMS was published. Of these guidelines, only 9 (13%) obtained a composite score of >75% in the appraisal process using AGREE II, demonstrating room for improvement across most guidelines available for EMS systems. Further, only 2 of 6 domain averages were >60%. Almost two thirds of EMS guidelines lacked detailed reporting recommended by the NAM and AHRQ, identifying important gaps in reporting within clinical guidelines for EMS medicine.
The areas in which guidelines scored highest were Domains 1 and 4, with average scores of 76.6% and 79.7%, respectively. Domain 1 (Scope and Purpose) involves a description of the objectives, health questions, and target population of the guideline, whereas Domain 4 (Clarity of Presentation) involves specific and identifiable recommendations that are presented with their alternatives. The intuitive nature of these items is what makes them commonplace; they require little to no prior experience in guideline development, external input or instruction, additional research or work, or specific knowledge beyond the subject at hand. However, they represent only 6 of the 23 AGREE II items.
The remaining domains, 2, 3, 5, and 6, scored significantly lower, with averages of 51.3%, 53.6%, 35.1%, and 56.2%, respectively. Although each domain contains multiple, unique items, one common theme that can be seen across these domains is the impact of funding and methodology, which are often closely related. First, Domain 2 (Stakeholder Involvement) generally requires the coordination and associated costs of large, national meetings, spanning multiple professions and disciplines.  Assessment, Development and Evaluation (GRADE) (89, 90) is used. Further, a procedure for updating the guideline implies an ongoing effort with sustainable funding rather than a stand-alone project, as do the monitoring/auditing criteria of Domain 5 (Applicability). Domain 5 is also limited by the consideration of resource implications, which can involve performance of a cost analysis and evaluation of the impact on existing resources, personnel, and local protocols. The impact of funding and methodology is evident in the appraisals of the two highest scoring guidelines. The Canadian Stroke Best Practice Recommendations for Acute Stroke Management by Boulanger et al. (66) had the highest average domain score of 89.2%. The Canadian Stroke Best Practice Recommendations, and their updates, are funded in their entirety by the Heart and Stroke Foundation, Canada. This is a well-funded, longstanding, national organization that advocates for stroke awareness and management across all levels of care. This organization possesses the resources to gather and support a large group of interdisciplinary experts, and the guideline development group used a rigorous framework adapted from the Practice Guidelines Evaluation and Adaptation Cycle (91). The second highest scoring guideline, with an average domain score of 88.8%, is the Evidence-Based Guidelines for Fatigue Risk Management in Emergency Medical Services by Patterson et al. (7). Similarly, this guideline had robust funding from the U.S. Department of Transportation, National Highway Traffic Safety Administration and used GRADE methodology for evidence evaluation.
Yet funding from a national organization and use of structured methodology are not enough to achieve high scores across all domains. A variety of other guidelines appraised in this review reported either internal or external funding mechanisms. Appraisal across all domains suggests that future EMS guideline developers should be mindful of each individual domain identified by the AGREE II tool as a component of high guideline quality. Particular attention should be paid to Domains 2, 3, 5, and 6, which scored the lowest across EMS guidelines. Domain 6 (Editorial Independence) may specifically interact with the benefit of strong funding. It identifies the need for explicit statements regarding funding and potential competing interests. Specifically, it requires that the views of the funding body not influence the content of the guideline, and that any competing interests of the author group be recorded and addressed. The issue of bias is by no means a recent development in the world of research and guideline development, so it was surprising that many of the articles appraised failed to comment upon either issue. Smaller projects, specifically those that lack external funding, would be unlikely to have bias. Moreover, the individual requirements of this domain, and its items, call for a brief statement and attached document that addresses these concerns. This content can usually be found at the beginning or end of a guideline and involves minimal effort. Therefore, Domain 6 appears to be an area   (2). These include guidelines on pediatric prehospital seizure management (3), prehospital analgesia in trauma (8), air medical transportation of prehospital trauma patients (5), external hemorrhage control (6), and fatigue risk management (7). The Model Process stems from criteria for quality clinical guidelines put forth by the NAM (15), and strongly promotes the use of GRADE methodology. The systematic and exhaustive approach utilized by GRADE covers a large majority of the items in AGREE II. Each of these guidelines scored in the top third of all guidelines identified for EMS medicine, providing support for continued use of the Model Process, which has been recommended by NEMSAC and FICEMS (2).
Assessment of the NAM/AHRQ criteria for clinical practice guidelines revealed a variety of deficiencies in the methodology or reporting of EMS guidelines. Only 38% of guidelines reported all elements comprising these criteria. The most commonly missing elements were the reportable elements of a systematic review, including description of search strategy, study selection, and synthesis of evidence. Providing a description of the search strategy ensures readers know the databases searched, search terms used, and the time period covered by the literature search. This facilitates the identification of potential gaps in the literature search and is critical for updating guidelines using consistent methodology. Another key element is a description of study selection, including the number of studies identified, included, and excluded, along with the criteria used. These items further aid the reproducibility of the results and identify why certain literature may or may not have been considered when framing the recommendations. A detailed description of the selected studies, such as the use of evidence tables, was also missing in many guidelines, though most included a descriptive summary of the literature. Detailed reporting of the key elements of studies is important to best understand the basis for recommendations contained in a guideline.
Limitations of this systematic review include the possibility that recommendations for prehospital care embedded in other clinical guidelines were missed. Many clinical guidelines that address the spectrum of emergency care across disciplines by disease process were included, but our literature search may not have been exhaustive of such recommendations. Appraisals of guidelines by the AGREE II tool and for the NAM criteria may have variability based on subjective assessments. We used multiple reviewers and averaged scores across AGREE II domain reviews, as well as a consensus approach in assessing the NAM/AHRQ criteria, to mitigate subjective assessments.

CONCLUSIONS
This systematic review of evidence-based guidelines identifies existing recommendations for a variety of topics within EMS medicine. We identified a wide range of quality and important gaps in guideline methodology and reporting based on the AGREE II tool and the NAM/AHRQ criteria for clinical