Evidence for curricular and instructional design approaches in undergraduate medical education: An umbrella review.

Abstract

Introduction: An umbrella review compiles evidence from multiple reviews into a single accessible document. This umbrella review synthesizes evidence from systematic reviews on curricular and instructional design approaches in undergraduate medical education, focusing on learning outcomes.

Methods: We conducted bibliographic database searches in Medline, EMBASE and ERIC from database inception to May 2013 inclusive, and digital keyword searches of leading medical education journals. We identified 18,470 abstracts; 467 underwent duplicate full-text scrutiny.

Results: Thirty-six articles met all eligibility criteria. Articles were abstracted independently by three authors, using a modified Kirkpatrick model for evaluating learning outcomes. Evidence for the effectiveness of diverse educational approaches is reported.

Discussion: This review maps out empirical knowledge on the efficacy of a broad range of educational approaches in medical education. Critical knowledge gaps, and lapses in methodological rigour, are discussed, providing valuable insight for future research. The findings call attention to the need for adopting evaluative strategies that explore how contextual variabilities and individual (teacher/learner) differences influence the efficacy of educational interventions. Additionally, the results underscore that extant empirical evidence does not always provide unequivocal answers about which approaches are most effective. Educators should incorporate the best available empirical knowledge with experiential and contextual knowledge.


Introduction
Medical education is intended to promote learning and ensure that medical students develop proficiency in the broad competencies essential for serving the various health care needs of their communities. Many contend that medical education practice, planning, and reform should be theoretically grounded and evidence-based (Levinson 2010; Mayer 2010; Fernandez 2014). Indeed, in recent decades, medical education has grown significantly as an area of research, drawing on diverse disciplines to produce a large empirical knowledge base. Critically, however, leaders and faculty in medical education may not access or utilize this empirical knowledge to inform their curricular development and instructional design practices (Levinson 2010; Onyura et al. 2014). This echoes concerns across the broader field of education, where there are notable research-practice gaps (Spencer et al. 2012). These gaps highlight the need for knowledge mobilization strategies and resources that promote empirical knowledge sharing and utilization. One identified barrier to empirical knowledge use by medical education faculty is the absence of accessible knowledge tools that provide high-level overviews of the effectiveness of various educational approaches (Onyura et al. 2014). Indeed, evidence synthesis has been identified as an important strategy for promoting the accessibility of empirical knowledge to practitioners and decision makers (Frank et al. 2007). Accordingly, through this umbrella review we provide high-level syntheses of the learning outcomes of educational approaches commonly employed in undergraduate medical education.
Umbrella reviews (or meta-systematic reviews) are an established method of locating, appraising, and synthesizing systematic review-level evidence (Becker & Oxman 2008; Grant & Booth 2009). Given growing volumes of primary research articles and associated systematic reviews, umbrella review methodology is emerging as an increasingly common technique in various areas of health research (e.g. Bambra et al. 2009; Gartlehner et al. 2010; Safron et al. 2011; Theodoratou et al. 2014), as it facilitates the rapid review of broad bases of evidence. In applying umbrella review strategies to medical education, this article aims to provide an evaluative summary of the learning outcomes of curricular design (overarching course/program structure) and instructional design (specific tools/strategies employed by teachers) approaches commonly employed in undergraduate medical education. Not only does this provide an overarching scan of existing literature on educational approaches in medical education, it also allows for much-needed critical discussion of what the extant body of empirical knowledge portends for research, practice, and knowledge translation in medical education.

Practice points
- Accumulated evidence shows educational interventions have mixed effects on learning outcomes.
- Evaluative approaches that explore how contextual variabilities and individual (teacher/learner) adaptations influence the effectiveness of educational interventions should be adopted.
- Diverse, long-term outcomes of educational interventions warrant exploration.
- Greater rigour in qualitative and quantitative research is needed.
- Educators' toolkits should incorporate the best available empirical knowledge, together with experiential and contextual knowledge.

Methods
We sought to identify both qualitative and quantitative systematic reviews that examined primary studies for the learning outcomes of curricular and instructional design methods in medical education.

Eligibility criteria
Reviews were considered systematic if: (1) they had a clearly defined search strategy, (2) included two or more databases in the search strategy, (3) defined inclusion criteria, and (4) described original studies (Grant & Booth 2009).
We limited included reviews to those that either focused uniquely on undergraduate medical students, or included populations of undergraduate medical students along with learners of other levels of education and health professions.
Reviews that addressed more than one educational approach or method were excluded for being too broad in scope. Although we did not limit our search by language or country, we limited our review to articles published in English. Only peer reviewed articles published from database inception to May 2013 were included.

Search strategies and selection methods
Two approaches were used to locate studies for inclusion in this review. First, we searched for relevant systematic reviews using the electronic literature databases Medline, EMBASE and ERIC, chosen to span the health professions and education. No language or date restrictions were imposed. Search terms consisted of a combination of medical subject headings (MeSH) and text words related to curricular approaches and instructional methods in undergraduate medical education. Appropriate wildcards were used to account for plurals and variations in spelling. A validated search strategy for retrieving systematic reviews was applied (Montori et al. 2005; Wilczynski & Haynes 2007). Databases were searched sequentially (Medline, EMBASE, ERIC), and where possible, duplicates were removed prior to review of identified abstracts. The initial search was conducted in January 2012 and was updated in May 2013.
Second, we conducted digital keyword searches of the leading medical education journals (Medical Education, Teaching and Learning in Medicine, Academic Medicine, Medical Teacher, and Advances in Health Science Education) for the same period (inception to May 2013).
Standard systematic review procedures were applied for sifting abstracts, scrutinizing full papers and abstracting data. In total, these searches produced 18,470 references related to curricular approaches and instructional methods employed in undergraduate medical education. After filtering for eligibility criteria, 467 abstracts were identified and full papers were obtained and scrutinized. This process identified 36 papers that met all of the eligibility criteria and are included in our umbrella review (Figure 1).

Data abstraction, analysis and synthesis
Each full article was read and abstracted by at least two members of the team and agreement sought. A third member was consulted if a difference of opinion arose. Once this process was complete, the third member of the team independently abstracted the pertinent data from the included articles.
We developed a data extraction sheet through iterative testing and revision. To facilitate comparisons and summaries of data, all data from the abstraction sheets were entered into an Excel file to produce a synthesized descriptive account of the articles. Variables coded included details about the review (e.g. topic area, number of databases searched, number of articles included in review, whether and how quality was assessed) and details about the included studies (e.g. level(s) of education and profession(s) targeted).

Assessing the quality of included systematic reviews
For each review, the quality assessment was conducted by two researchers using a modified version of the AMSTAR (Assessing the Methodological Quality of Systematic Reviews) instrument (Shea et al. 2007a,b). The AMSTAR instrument is the only tool we are aware of that has been validated to assess the methodological quality of systematic reviews. It addresses key domains including: establishing the research question and inclusion criteria before conducting the review; data extraction by at least two independent researchers; comprehensive literature review with searching of at least two databases; a detailed list of included/excluded studies; and quality assessments in analysis and conclusions.
The AMSTAR was initially developed to assess systematic reviews of randomized controlled trials of public health interventions. Because our review is situated in education and includes studies that employ qualitative approaches to data synthesis (only 17% are meta-analyses), we modified the AMSTAR for our needs by excluding three items more appropriate for clinical (Item 11) and meta-analytical (Items 9 and 10) research. Our modified AMSTAR checklist (Appendix 1, available as supplementary material online) includes eight items. Inter-rater reliability was evaluated using the intra-class correlation coefficient.
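The text does not specify which form of the intra-class correlation coefficient was used. As a minimal sketch only, a two-way random-effects, absolute-agreement ICC(2,1) for two raters could be computed as below; the function name and example scores are hypothetical illustrations, not data from this review.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of [subject][rater] scores, e.g. one modified-AMSTAR
    total per included review per rater (hypothetical example data).
    """
    n = len(ratings)          # number of subjects (reviews rated)
    k = len(ratings[0])       # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between-subject
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between-rater
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical modified-AMSTAR totals (0-8) from two raters across five reviews
scores = [[7, 7], [5, 6], [8, 8], [4, 4], [6, 5]]
print(round(icc_2_1(scores), 2))  # → 0.92
```

In practice, a validated statistics package would typically be used rather than a hand-rolled function; the sketch is only meant to show what a high coefficient such as the 0.93 reported below reflects: raters whose scores track each other closely in absolute terms.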

Results
In this section, we synthesize and explicate the findings on learning outcomes from the included reviews. Please see Table 2 (available as supplementary material online) for a condensed summary. The average quality rating for each review is also tabulated, rounded up to the nearest whole number. Excellent inter-rater reliability was observed between the two modified-AMSTAR raters (r = 0.93). The most commonly reported outcomes focused on knowledge and skill acquisition (Kirkpatrick level 2b). Learner reactions (level 1) and attitudes (level 2a) were also frequently reported, though reactions were not always clearly defined and attitude measures typically focused solely on changes in confidence. Fewer than 40% of included reviews presented data on behavioural changes (level 3), patient effects (level 4a), and broader organizational (level 4b) and community outcomes (level 4c). Overall, short-term effects were captured, with few reviews reporting long-term retention of learning outcomes. We present the findings below in two broad categories: curricular design and instructional design.

Curricular design approaches
We grouped reviews focusing on curricular design into the following eight broad categories: cross-cultural exchanges, early clinical and community experience, e-learning, interprofessional education (IPE), portfolios, problem-based learning (PBL), scholarly concentrations (SCs) and self-directed learning (SDL).

Cross-cultural exchanges
Cross-cultural exchanges refer to (often elective) educational experiences completed outside a student's home country during the course of their program, or in a different cultural context within their own country (e.g. experiences in communities populated by continental natives) (Mutchnick et al. 2003). We identified two reviews about cross-cultural exchanges (Mutchnick et al. 2003; Jeffrey et al. 2011). Mutchnick et al. (2003) found consistent evidence that cross-cultural experiences enhanced professional development by promoting cultural competence and compassion for patients. In addition, learners developed an increased awareness of resource use, professional practice and public health issues. Overall, learners valued the learning experience: it enabled a broadening of perspectives and facilitated increased confidence, independence, and realistic goal-setting ability. Having cross-cultural exchanges as a curricular option increased the attractiveness of medical schools to incoming students, with many students reporting the exchange option as their primary reason for choosing a particular school. Although the reported outcomes of cross-cultural exchanges are uniformly positive, the review authors question the quality of data across several studies (Mutchnick et al. 2003). Jeffrey et al. (2011) specifically explored international health electives (IHE) and found that students gained more confidence in certain clinical skills (e.g. history taking, physical examination) than peers who did not complete an IHE. The students also had greater appreciation for the importance of cultural competency, prevention, environmental and public health, as well as the need to provide care to underserviced populations. IHEs promoted knowledge acquisition on tropical disease and immigrant health. They also provided opportunities to strengthen existing clinical reasoning, history taking, and diagnostic skills, with less emphasis on the use of high-tech instruments or interventions.
There was a reported tendency for students who completed IHE to choose primary care specialties (e.g. family medicine, internal medicine), seek employment in low-income clinics or pursue graduate education in public health (Jeffrey et al. 2011). It is worth noting that IHEs are highly variable and there is no consensus on how best to design and implement IHE curricula. More research is needed on IHE, with additional learning outcomes measured (Jeffrey et al. 2011).

Early clinical and community experience
Early experience involves the orientation of medical students to the social context of clinical practice before they enter into their formal clinical training (clerkship) years (Dornan et al. 2006). We identified two reviews on early experience (Dornan et al. 2006; Yardley et al. 2010). Dornan et al. (2006) found some comparative research indicating that first-year medical students who had early experience were more satisfied with their medical education than their peers with less experience. Students with early experience developed enhanced learning motivation (Dornan et al. 2006; Yardley et al. 2010) and reported feeling increased responsibility to learn (Yardley et al. 2010).
Across reviews, there were positive effects of early experience on students' professional development, including adaptation to professional roles, teacher-rated maturity (Dornan et al. 2006), the identification of role models, understanding of the doctor-patient relationship, and role clarity (Yardley et al. 2010). Review findings also illustrated improvements in students' self-awareness, confidence (Dornan et al. 2006) and empathy for patients (Dornan et al. 2006; Yardley et al. 2010). Some research demonstrated that these gains in empathy persisted several years beyond graduation (Dornan et al. 2006).
Knowledge gains were reported in several areas, including understanding the practical relevance of students' work, the link between living conditions and health, and the ethical dimensions of patient care (Dornan et al. 2006). Skill acquisition was reported with regard to communication, reflection, history taking and physical examination skills (Dornan et al. 2006). Early experience also appeared to reinforce medical students' vocational choice to practice medicine and to influence career choices (e.g. working with underserved populations) (Dornan et al. 2006; Yardley et al. 2010).

Critically, the effects of early experience extended beyond the students themselves. Patients were satisfied with the students' skills and enthusiastic about their role in the learning process (Dornan et al. 2006). One institution was reported to have modified its curriculum to include extra electives in year 1 (Dornan et al. 2006). There were also potential community benefits among populations where students were available to measure blood pressure, deliver oral health care, treat trachoma and help manage malnutrition (Dornan et al. 2006).
Overall, early experience appears to strengthen and contextualize learning. However, more research is needed on how early authentic experience leads to particular learning outcomes and what factors contribute to its effectiveness (Dornan et al. 2006; Yardley et al. 2010).

E-learning
With the advent of broader access to advanced information technologies, medical educators are increasingly turning to online platforms to deliver curricula. We identified three reviews that examined the effects of e-learning, which included internet- or web-based instruction (IBL) (Chumley-Jones et al. 2002; Cook et al. 2010a) and blended learning (which incorporates e-learning with traditional forms of instruction) (Rowe et al. 2012). In general, IBL resulted in enhanced student course satisfaction (Cook et al. 2010a), even when compared to printed learning materials (Chumley-Jones et al. 2002). There were, however, no significant differences in comparisons of learner confidence between IBL and other educational interventions. Across studies, IBL promoted knowledge acquisition (Cook et al. 2010a), with knowledge gains being either superior (Cook et al. 2010a) or equivalent to gains realized using other educational interventions (Chumley-Jones et al. 2002; Cook et al. 2010a). However, learning time may have influenced findings, as there were some studies where students spent longer studying or participating in IBL interventions (Chumley-Jones et al. 2002; Cook et al. 2010a).
Blended learning resulted in small improvements in self-efficacy, reflective thinking, clinical reasoning, and clinical skills (history taking, reporting, documenting, and patient management) (Rowe et al. 2012). It is, however, impossible to isolate the effects of different modalities in blended learning (Rowe et al. 2012).
Overall, findings suggest that whereas e-learning does support knowledge and skill acquisition, more research is needed to capture new and emerging e-learning approaches. Future research may be better directed toward examining the strengths and weaknesses of teaching modalities rather than trying to establish the superiority of one educational intervention over another (Chumley-Jones et al. 2002; Cook et al. 2010a).

Inter-professional education
In recent years there has been a push for educators to incorporate models of IPE into their curricula. IPE refers to "when students from two or more professions learn about, from and with each other to enable effective collaboration and improve health outcomes" (WHO 2010, p. 7). We identified two reviews on IPE within undergraduate populations (Hammick et al. 2007; Lapkin et al. 2013). Both reviews found some evidence for learner enjoyment. Findings on general learner attitudes toward IPE were mixed. Although some research reported that IPE was perceived as positive and relevant (Hammick et al. 2007), other research found that improved attitudes toward IPE returned to baseline over time (Lapkin et al. 2013). Although some evidence of knowledge gains as a function of IPE was reported (Lapkin et al. 2013), medical students in uniprofessional groups outperformed those in inter-professional groups on knowledge tests (Lapkin et al. 2013). While IPE interventions resulted in improvements in inter-professional communication in some research, these gains were comparable to those of control interventions (Lapkin et al. 2013). Finally, there was some evidence that IPE improved patient care: some IPE interventions, designed as quality improvement initiatives, were effective at improving screening and illness-prevention services and resulted in more comprehensive patient care in undergraduate student community clerkship teams (Hammick et al. 2007).

Portfolios
Portfolios are collections of information about a learner's achievements, assembled with the intention of promoting personal and professional development, in part through critical analysis and reflection on portfolio contents (McMullan et al. 2003). We identified two reviews that examined the use of portfolios in undergraduate medical education (Driessen et al. 2007; Buckley et al. 2009). In general, students were satisfied and experienced positive changes; enhanced confidence and self-awareness were reported across several studies (Buckley et al. 2009). Although portfolios were often perceived to stimulate reflective learning (Driessen et al. 2007; Buckley et al. 2009) and enhance the ability to learn independently (Buckley et al. 2009), a few studies reported no significant changes in self-awareness, reflection or ability to learn independently (Buckley et al. 2009). Buckley et al.'s (2009) review found evidence that portfolio use resulted in gains in communication skills, decision-making, critical thinking and the ability to integrate theory with practice. Buckley et al. (2009) also found evidence of the positive impact of portfolios on students' ability to meet course objectives and on learner professionalism. Some research demonstrated that portfolio use had a positive impact on examination performance, particularly for academically weaker students (Buckley et al. 2009). Correspondingly, several studies showed that by reviewing and providing feedback on students' portfolios, tutors gained a better understanding of their learners' needs and adapted their teaching accordingly. Overall, portfolios appear to support various aspects of learning, including personal and professional development (Driessen et al. 2007). It is worth noting, however, that many studies report that students often find completing portfolios time-consuming and stressful (Buckley et al. 2009). Studies also reported that learners were poorly prepared for portfolio work, and that introducing portfolios elicited poor commitment from learners (Driessen et al. 2007).

Problem-based learning
PBL is a pedagogical approach that aims to facilitate student learning by immersing students in authentic professional practice problems (Barrows 1986; Hung et al. 2008). PBL is a student-centred approach that incorporates SDL, reflective inquiry (Hung et al. 2008) and active learning (Colliver 2000) as learners work in small groups (Srinivasan et al. 2007; Hung et al. 2008). PBL facilitators play a guiding, as opposed to directive, role as learners strive to define problems, explore related issues and grapple with problem resolution on their own (Srinivasan et al. 2007).
We identified four reviews of the use of PBL among undergraduates (Vernon & Blake 1993; Hartling et al. 2010; Polyzois et al. 2010; Mahmud & Hyder 2012). Across two reviews, a few studies examined student reactions and reported that students found PBL to be an enjoyable way to learn (Polyzois et al. 2010; Mahmud & Hyder 2012). In comparisons with lecture-based learning, PBL was found to be preferable as well as more interesting and stimulating (Mahmud & Hyder 2012). Students also demonstrated improved confidence, feelings of independence, and healthier attitudes toward research as a function of PBL instruction (Mahmud & Hyder 2012). There were mixed findings with regard to knowledge gains. Several studies found no differences between PBL and non-PBL students in exam performance (Hartling et al. 2010). Other studies found significant differences in academic achievement that favoured non-PBL groups (Vernon & Blake 1993; Hartling et al. 2010; Polyzois et al. 2010), and some students felt that certain concepts were not well covered or explained in PBL (Mahmud & Hyder 2012). Conversely, other studies reported that PBL increased the depth of learners' knowledge and helped them retain facts for longer periods of time (Mahmud & Hyder 2012). The review by Vernon and Blake (1993) reported that PBL students and non-PBL students may acquire knowledge differently, with PBL students placing more emphasis on the "meaning" in their learning material as opposed to reproducing knowledge. PBL had a positive impact on the acquisition of inter-personal relationship skills and humanistic attitudes (Hartling et al. 2010; Polyzois et al. 2010). This included gains in empathy, patient-centred orientation, comfort with emotions, communication, and data collection skills (Vernon & Blake 1993). In comparison with other methods, PBL students demonstrated better critical reasoning, problem-solving, and creativity (Polyzois et al. 2010; Mahmud & Hyder 2012).
Across two reviews, studies reported that PBL students demonstrated superior clinical performance compared to non-PBL students (Vernon & Blake 1993; Hartling et al. 2010), including better diagnostic accuracy (Hartling et al. 2010). Improvements in SDL were also reported (Hartling et al. 2010; Mahmud & Hyder 2012). One review found that student distress (including depression, anxiety, hostility, and somatic complaints) was lower among PBL students than among their counterparts receiving education through traditional teaching methods (Vernon & Blake 1993).
Overall, the evidence on PBL is not definitive (Hartling et al. 2010). There is evidence to support the efficacy of PBL in improving competence in some domains (e.g. relational skills, clinical performance), and conflicting evidence in domains relating to academic achievement, where several studies found knowledge acquisition to be lower than or equivalent to that achieved with traditional teaching methods. Review authors suggest that there may be a need to adopt different evaluative strategies to assess the outcomes of PBL (Vernon & Blake 1993). For example, randomized controlled trials (RCTs) may not be a reliable methodology for PBL curricular design research, given the lack of standardization and the inability to blind students to an intervention (Polyzois et al. 2010).

Scholarly concentrations
SCs are elective or required programs of study designed to promote in-depth study on specified topic areas, going beyond what the conventional curriculum would provide (Bierer & Chen 2010). We identified one review (Bierer & Chen 2010) that examined the impact SCs have on medical students. Overall, students reacted favourably to SCs and would undertake SCs again if given the opportunity. Students believed that SCs gave them a broader perspective of patient care, improved understanding of research principles and enhanced their knowledge. Several studies showed that SCs improved student ability to critically evaluate literature, write scientific studies, and conduct ethically responsible research.
There were, however, mixed reactions as to whether SCs should be made mandatory. Student criticisms of SCs included unwelcome stress, high effort, inadequate structure, and undue focus on lab research. Students also expressed concern that SCs detracted from clinical opportunities. Finally, participation in SCs was found to influence career choices, including choice of clinical specialty and decisions to pursue academic careers. Interestingly, though some research pointed to SCs enhancing research interest, other research found the opposite effect. The diversity of articles and variable results prevent definitive conclusions about the value of SCs, and the review authors advocate for increased rigour in evaluation designs to demonstrate SCs' true impact (Bierer & Chen 2010).

Self-directed learning
SDL is learning in which the conceptualization and conduct of a learning project are directed by the learner, who makes decisions about what to learn and how to learn it (Brookfield 2009). Murad et al. (2010) examined the effectiveness of SDL in improving learning outcomes in comparison with other approaches and found that attitude and skill gains were equivalent across comparison groups. However, SDL was associated with moderate increases in knowledge gains over other approaches; the greatest gains were found when learners chose their own resources. Some evidence suggested that more advanced learners may benefit more from SDL than novice learners. In general, SDL was found to be at least as effective as traditional learning, and perhaps more effective in certain instances (Murad et al. 2010).

Instructional design approaches
We grouped reviews focusing on instructional design approaches into the following seven broad categories: case-based learning (CBL), concept maps, dissection and prosection, educational games, patient involvement, technology-enabled simulation and technology-enabled teaching.

Case-based learning
CBL is an instructional approach that involves students' analytic review and discussion of real-life scenarios. CBL learners prepare in advance and work in small groups focused on problem-solving as a facilitator engages them using guided inquiry (Slavin et al. 1995; Srinivasan et al. 2007). We identified one review that examined the effects of CBL on student outcomes (Thistlethwaite et al. 2012). Thistlethwaite et al. (2012) found that most students' reactions to CBL were very positive. CBL stimulated learner interest and was perceived to offer a good link between academic content and real-life practice. Students, however, had concerns about workloads and whether CBL prepared them adequately for summative assessments. Some studies showed that CBL resulted in knowledge gains in several areas, including physiology and treatment methods, as well as improvements in clinical reasoning and communication skills. However, the majority of studies found equivalent knowledge and skill outcomes between CBL learners and control groups. Overall, the review illustrates that CBL can be effective in enhancing learning, though it may not necessarily be more effective than other teaching approaches (Thistlethwaite et al. 2012).

Concept maps
Concept mapping is an instructional strategy that involves having learners integrate knowledge through graphical representation of their understanding of conceptual linkages (Torre et al. 2013). Daley and Torre (2010) reviewed the effectiveness of concept maps in promoting learning and found variability in learners' perceptions about whether concept maps were a meaningful learning device. There was evidence that students who utilized concept maps performed better on problem-solving examinations than students given lecture-based instruction. However, there were no differences between these groups in multiple-choice exam performance. One study found that concept maps had the greatest impact on students who entered the study with the lowest cognitive competence (Daley & Torre 2010). Concept maps appear to have a role to play in helping learners link new knowledge to previous knowledge and may be a valuable resource that instructors can integrate with other approaches. Daley and Torre (2010) contend that stronger data are needed given the methodological limitations of cited studies.

Dissection and prosection
Dissection is an active learning process that involves students cutting a cadaver to examine internal structures (Hasan et al. 2010). Prosection involves cadaveric dissection by an experienced instructor, who concurrently demonstrates anatomical structures to students. Winkelmann (2007) examined the comparative effectiveness of dissection and prosection in teaching anatomy. Findings were conflicting, with some favouring knowledge acquisition through dissection and others through prosection. Although the findings reflect a slight advantage in favour of dissection, a more conclusive interpretation is precluded by methodological limitations, including the non-standardized assessment of anatomical knowledge (Winkelmann 2007). Winkelmann (2007) argues that more sophisticated research designs may be necessary to adequately assess the efficacy of dissection compared to prosection.

Educational games
Educational games are a form of experiential learning that involve the participation of learners in structured situations that enable them to experience and reflect on what they are learning (Silberman 2007). Our umbrella review identified three reviews on educational games (Bhoopathi et al. 2007;Blakely et al. 2009;Alfarah et al. 2010). Alfarah et al. (2010) examined the effectiveness of role-playing games for geriatric education. Whereas high rates of learner satisfaction were reported, the majority of studies found no changes in attitudes toward the elderly as a function of role-playing games. Additionally, no significant knowledge gains were reported in the study that compared knowledge acquisition in a role-playing group with a control group. The findings suggest that role-playing games are not effective interventions for geriatric education. However, the authors note methodological limitations in many of the reviewed studies that may have had an impact on research outcomes. Bhoopathi et al. (2007) compared the effects of educational games to standard teaching approaches for mental health education. Due to stringent inclusion criteria, only one study was included in their review. The results of that study showed that educational games resulted in improvements in knowledge, as measured by tests taken shortly after the intervention. The authors contend that educational games may prevent forgetting of salient facts during test-taking, and may be a valuable tool for gaining knowledge in addition to standard learning packages. Blakely et al. (2009) examined whether educational games support classroom learning. Their review illustrates that learners generally respond favourably to game interventions, highlighting motivation and competition as positive features of educational games. For most studies, short-term knowledge acquisition was found to be either superior or equivalent to didactic lectures.
Findings with regard to longer-term knowledge acquisition are mixed. Whereas most studies found superior or equivalent effects when comparing games to didactic learning, one study favoured didactic learning. These findings illustrate the varying situational effectiveness of educational games (Blakely et al. 2009). Current research is inadequate to support decisions about various sub-strategies or contextual variations that may influence the effectiveness of educational games as an instructional approach.

Patient involvement
There is a persuasive rationale for the active involvement of patients in health professional education, including the fact that patients' experiential knowledge of illness can be incorporated into student instruction (Towle et al. 2010). We identified three reviews that focused on two different approaches to patient involvement in medical education (Wykurz & Kelly 2002;Jha et al. 2009;May et al. 2009). Jha et al. (2009) and Wykurz and Kelly (2002) reviewed the effects of real patients as teachers, and May et al. (2009) examined the effects of standardized patients in medical education. Wykurz and Kelly (2002) highlighted positive learner reactions to patient involvement. Many learners commented on gaining new insights when patients gave them constructive feedback, and some even preferred the training they received from experienced patients over the training from doctors (Wykurz & Kelly 2002). This contrasts with the findings of Jha et al. (2009), where students in some studies rated faculty teaching as being of better quality and more relevant than teaching by patients. Jha et al. (2009) also highlighted other potentially negative outcomes such as student skepticism about the benefit of patient involvement, or student perceptions of imposing on patients. One study reported lower examination scores for students taught through patient involvement in comparison with traditional teaching by faculty (Jha et al. 2009). Conversely, real patient involvement had myriad positive effects on learners' perceptions of personal growth and of the quality of the doctor-patient relationship (Jha et al. 2009), along with a deepening respect for patients (Wykurz & Kelly 2002). In addition, several studies reported improvements in clinical skill proficiency and knowledge acquisition, including knowledge about the social aspects of disease (Jha et al. 2009) and about patients' experience of the disease (Wykurz & Kelly 2002).
Importantly, patient involvement had a positive effect on patients themselves. Patients reported enjoying the process and appreciated the opportunity to share their knowledge and facilitate learning (Wykurz & Kelly 2002). They also reported enhanced disease knowledge, improved perceptions of the doctor-patient relationship (Jha et al. 2009) and feelings of empowerment (Wykurz & Kelly 2002;Jha et al. 2009). Overall, involving real patients who have first-hand experience of disease in student instruction appears to benefit learners in unique ways, in addition to being cost-effective (Wykurz & Kelly 2002). There were, however, some concerns about patients' emotional well-being and stamina in some studies, where patients might experience stress when sharing potentially painful issues or undergoing repeated examinations (Wykurz & Kelly 2002). It is important that researchers studying the effects of real patient involvement on learners examine the ethical and psychological impact of patient involvement on patients as well as learners. May et al. (2009) examined the effects of standardized patients on learners. Learners consistently rated standardized patients as beneficial, reporting increased confidence and comfort levels, particularly in the areas of communication skills and performance of sensitive examinations. Standardized patients also had positive effects on learners' knowledge and skill acquisition (e.g. in communication, teamwork, and physical examination skills) across studies. As with real patient involvement, the reviews reveal significant educational value through the use of standardized patients as an instructional tool.

Technology-enabled simulation
Simulation has been defined as a person, device, or set of conditions that attempt to present practice scenarios authentically and require the learner to respond as he/she would under natural circumstances (McGaghie 1999). Technological advances have opened up new frontiers for simulation technology, allowing medical students to learn and practice clinical skills in controlled environments where patient safety is not at risk (Elley et al. 2012). Our umbrella review identified eight reviews on technology-enabled simulation; they encompassed various forms of simulation technology (virtual patients, virtual reality, computerized mannequins), allowing for varying levels of clinical engagement (Cook et al. 2010b;Harder 2010;Al-Kadi et al. 2012;Consorti et al. 2012;Ikonen et al. 2012;Larsen et al. 2012;Yuan et al. 2012;McKinney et al. 2013).
Two reviews focused on virtual patients - that is, computerized clinical case simulations (Cook et al. 2010b;Consorti et al. 2012). Consorti et al. (2012) found that use of virtual patients resulted in improvements in clinical reasoning skills (including history taking, imaging, and lab data interpretation) as well as improvements in professionalism, communication skills and ethical reasoning. Cook et al. (2010b) reviewed the effectiveness of virtual patients in comparison with both ''no intervention'' and alternate instructional methods. Comparisons with no intervention showed significant improvements in knowledge and clinical skill acquisition. In contrast, comparisons with non-computer interventions (e.g. traditional instruction, standardized patients, etc.) showed negligible gains in student satisfaction and knowledge and clinical skill acquisition. Qualitative findings, however, indicated that students perceived virtual patients as promoting student independence and accommodating student schedules. It is worth noting that certain features of virtual patient formats, such as enhanced feedback and group work, may enhance learning outcomes (Cook et al. 2010b).
Three reviews examined the use of virtual reality simulators within laparoscopic surgery training. Virtual reality simulator use was significantly associated with improved knowledge acquisition (Al-Kadi et al. 2012), lower error rates and improved operative performance (Ikonen et al. 2012;Larsen et al. 2012). In addition, it was associated with greater tissue respect and improved accuracy in handling laparoscopic instruments (Al-Kadi et al. 2012). These reviews provided convincing evidence for the efficacy of virtual reality simulators in basic laparoscopic training. Simulator training appears to reduce the number of technical mistakes, thus enabling the surgical trainee to concentrate more on other aspects of surgery such as decision making (Ikonen et al. 2012).
Three reviews examined evidence on the use of computerized mannequins to support knowledge and skill acquisition (Harder 2010;Yuan et al. 2012;McKinney et al. 2013).
In the review by Harder (2010), over 90% of the studies showed an increase in student confidence and perceived competence in comparison to students who did not participate. Across reviews, there were demonstrated improvements in knowledge and clinical skill acquisition among students who used computerized mannequins in comparison to no intervention (Yuan et al. 2012;McKinney et al. 2013) or other teaching methods (e.g. standardized patients, skills laboratory sessions, lectures, etc.) (Harder 2010). In addition, two reviews demonstrated that computerized mannequins improved learners' ability to both assess and perform clinical skills in comparison to control groups (Harder 2010;Yuan et al. 2012). However, across all reviews, the benefits of mannequin use were inconsistent in comparison to other instructional modalities (Harder 2010;Yuan et al. 2012;McKinney et al. 2013). Whereas several studies showed positive effects in favour of computerized mannequins, others found equivalent attitude (Harder 2010) and knowledge and skill acquisition scores (Harder 2010;Yuan et al. 2012;McKinney et al. 2013) across comparison groups. One study found that students who participated in simulation training had lower multiple-choice test scores than those who did not (Yuan et al. 2012). Overall, there appears to be good evidence to support the use of the high-fidelity simulation afforded by computerized mannequins in health professional education; however, evidence as to whether it is superior to other modalities is inconsistent. Whereas some of these inconsistencies may be attributed to contextual variations inherent in different educational settings, some can be attributed to methodological limitations. There is an absence of appropriate evaluation tools for assessing the effects of simulation training, so better-structured assessment tools are needed (Harder 2010).

Technology-enabled teaching
Educational technology is increasingly permeating instructional design practices in medical education (Han et al. 2013). We identified two reviews covering distinct forms of technological integration in instructional design (Chipps et al. 2012;Nelson et al. 2012). Chipps et al. (2012) examined the use of videoconferencing in teaching. Videoconferencing involves the use of technology to allow individuals at two or more locations to communicate via simultaneous two-way audiovisual transmission. The review illustrated that although learners preferred face-to-face teaching in a number of studies, videoconference-based teaching was at least as effective as face-to-face teaching in terms of knowledge acquisition. The authors highlight the need for more rigorously designed studies in this area, as well as the need to understand how technical difficulties may affect learning (Chipps et al. 2012). Nelson et al. (2012) reviewed the effects of audience response systems (ARSs) on learners. ARSs create interactivity between a presenter and a large audience by providing a platform for audience members to communicate with the presenter. Overall, the review found evidence that students generally enjoy using ARSs, and that the experience enhanced confidence. Interestingly, findings indicate that student enjoyment of ARSs can be teacher-dependent - with ARSs favoured with one teacher and traditional lecture format favoured with a different teacher. Across studies, use of ARSs in lectures resulted in either enhanced or equivalent immediate and long-term knowledge acquisition outcomes in comparison to traditional lectures (Nelson et al. 2012). Overall, it appears that there are neutral to modest beneficial effects of ARS use that may be teacher-dependent. ARSs may provide a convenient way for educators to create an interactive teaching environment, particularly in situations where lecturers struggle with student engagement. Methodological limitations preclude firmer conclusions (Nelson et al. 2012).

Discussion
There is an increasing proliferation of systematic reviews on the efficacy of discrete instructional and curricular design methods in medical education. This umbrella review provides researchers and practitioners with a broad-scoped synthesis of existing evidence across topic areas, thus presenting valuable empirical knowledge on how educational design approaches influence learning outcomes in undergraduate medical education. Drawing on our findings, we offer a critical discussion of what the existing evidence base implies about research, practice, and knowledge translation in medical education. Across reviews, there is a resounding call for greater methodological rigour and theoretically informed research design. Indeed, our own assessment of the included systematic reviews highlights wide variability in quality (see AMSTAR ratings in Table 2, available as supplementary material online). Both qualitative and quantitative researchers must ensure clarity of objectives, proper definition and consistent use of terms, inclusion of critical details about the context of educational interventions, explicit description of methodological details, and use of appropriate conceptual or theoretical frameworks for organizing research and evaluation work. Moreover, greater focus must be given to examining long-term retention of learning outcomes if we are to fully understand the impact of different curricular and instructional design approaches.
Our findings highlight that extant research is largely directed at examining a limited set of learning outcomes, focusing predominantly on shifts in knowledge, skills, and attitudes (Kirkpatrick level 2a and level 2b). Often, the singular focus in assessing knowledge acquisition is test performance, while the singular focus in assessing attitudes is on shifts in confidence. While important, these narrow parameters do not reflect the multiple ways in which knowledge is manifested or the huge variability in relevant attitudes experienced by learners. Other learning outcomes are reported infrequently. This is likely a function of the inherent difficulty in capturing the impact that educational interventions have on behavioural change, and patient and community outcomes (Kirkpatrick level 3 and level 4). These outcomes are affected by multiple, complex variables and variable interactions. Conventional methodologies, designed for use in clinical research (e.g. RCTs), may not be best suited for understanding the full diversity of medical education interventions and outcomes.
Moving forward, medical educators should complement traditional methodologies with alternative research and evaluation approaches to facilitate greater understanding of how curricular and instructional design influence learning. Critically, we need to broaden the scope of outcomes examined to explore the diverse ways in which learning is developed and manifested.
The review findings highlight that many educational interventions result in mixed effects on learning. Whereas some of this variability is likely attributable to the methodological limitations outlined above, some of it must be attributed to contextual variabilities and individual differences. Indeed, the situational variance in the efficacy of educational approaches is worthy of evaluation, and can generate knowledge about how contextual disparities and individual (teacher/learner) differences influence the effectiveness of curricular and instructional design approaches. We contend that the central realist evaluation approach of evaluating ''what works, for whom, under what circumstances, and why'' (Pawson & Tilley 1997;Ogrinc & Batalden 2009) is highly relevant for evaluating learning outcomes in the complex health professional education environment. This is because it extends beyond merely reporting intervention outcomes, to exploring contextual interactions and individual adaptations in response to given initiatives.
Notwithstanding the growing desire to engage in more evidence-informed educational practice, this umbrella review's findings demonstrate that empirical evidence does not always provide unequivocal answers about what curricular and instructional design approaches are most effective. It behooves medical educators to recognize empirical evidence as but one part of their decision-making toolkit. Educators need to consult other knowledge resources including an understanding of themselves (e.g. skill, philosophy, experience), their learners (e.g. motivation, ability), and an awareness of the facilitating or constraining affordances of their programs and institutions (e.g. objectives, assessment systems, resources, organizational culture, and physical space). It is worth noting that some curricular and instructional design approaches (e.g. PBL, IPE, simulation) receive more empirical attention than others. This likely reflects trends and priorities in research funding and health care policy. These trends are reflected in the relatively few topic areas covered in our review. However, it should not be interpreted to mean that other educational practice approaches are ineffective or unworthy of empirical attention. Rather, it may underscore the need for funding attention to be directed towards lesser studied educational approaches in medical education.

Limitations
Although evidence synthesis is an important knowledge mobilization strategy, there are some shortcomings to conducting high-level syntheses. We rely on systematic review authors' accuracy in interpreting and reporting findings. As we do not examine primary studies for specific details, subtleties in meaning may be lost in translation. Additionally, we used the AMSTAR - a tool developed to assess the quality of RCT-type systematic reviews - even though many of our included reviews employed a qualitative approach to synthesizing data.
To address this limitation, we modified the AMSTAR to be more appropriate for our research. As with any review, we used bibliographic databases to identify potential articles for inclusion. Although doing so provided an efficient source of material, it also limited us to the articles included in databases we deemed relevant. Similarly, our data set is bounded by the dates of our search. Finally, in order to provide a more focused review, we made decisions about inclusion and exclusion criteria to limit scope. To address this limitation, a list of excluded articles that underwent full-text review is available from the authors upon request.

Conclusion
This umbrella review maps what is known about the efficacy of diverse curricular and instructional design approaches in medical education. It highlights critical gaps in extant knowledge, providing insight for future research. Medical education researchers must attend to enhancing methodological rigour, engaging in theoretically informed research design, and broadening the scope of learning outcomes examined. The review findings call attention to the need for alternative evaluative approaches that explore the contextual variabilities and individual adaptations that often influence the efficacy of educational interventions. Additionally, they underscore that empirical evidence in medical education does not always provide explicit answers about what approaches are most effective. Medical educators should incorporate the best available empirical knowledge with experiential and contextual knowledge.

Glossary
Umbrella Reviews: Also known as meta-systematic reviews, umbrella reviews provide high-level syntheses of systematic review level evidence.