How Are Our MPAs Doing? Challenges in Assessing Global Patterns in Marine Protected Area Performance

Without effective management, protected areas are unlikely to achieve the high expectations the conservation and development sectors have for them: conserving biodiversity and alleviating poverty. Numerous marine protected area (MPA) assessment initiatives have been developed at various spatial and temporal scales, including the guidebook How is your MPA doing? These management assessments have been useful to sites to clarify and evaluate their objectives, yet efforts to examine broader regional or global patterns in MPA performance are only beginning. The authors conducted exploratory trend analyses on How is your MPA doing? indicator data collected by 24 MPAs worldwide to identify challenges and areas for future work. Wide variability across sites with regard to the indicators examined and the constructs used to measure them prevented a true meta-analysis. Managers assessed biophysical indicators more often than socioeconomic and governance constructs. Investment by the conservation community to support collecting and reporting high-quality data at the site level would enable a better understanding of the variation in MPA performance, clarify the contribution of MPAs to both biodiversity conservation and poverty alleviation, and help drive better MPA performance. The absence of rigorous and consistent monitoring protocols and instruments and a platform to turn raw MPA monitoring data into actionable information is a critical but under-recognized obstacle to cross-project learning, comparative analyses, and adaptive resource management.

Introduction
Marine protected areas (MPAs) have been established with diverse goals, including protecting marine biodiversity and habitats from degradation, replenishing depleted fish populations, regulating tourism and recreation, accommodating conflicting resource uses, and enhancing the welfare of local communities (Pomeroy, Parks, and Watson 2004;Thorpe, Failler, and Bavinck 2011). Advocates highlight MPAs as a win-win strategy for biodiversity conservation and poverty alleviation (Russ et al. 2004), while others have found that MPA planning can alienate impoverished fishing communities (Agardy, Notarbartolo di Sciara, and Christie 2011) or produce negative outcomes (Christie 2004;Gjertsen 2005;Silva 2006). MPA management varies widely, highly active in some cases while little more than "paper parks" in others (McClanahan 1999;McClanahan et al. 2006;Samoilys et al. 2007), with many MPAs not achieving their management objectives (Ban et al. 2012;Jones 2001;Kingsland 2002;Mora et al. 2006).
Objective evaluation of MPA performance promotes responsible management and guides both the planning of future MPAs and the efficient distribution of human and financial resources (Gaines et al. 2010; Pomeroy et al. 2005; Thorpe, Failler, and Bavinck 2011). Some attempts have been made to evaluate MPA performance (Thorpe, Failler, and Bavinck 2011; Gaines et al. 2010, and other papers in these Special Features), including regionally (van't Hof 1988), by ecosystem (e.g., for coral reefs, Hargreaves-Allen, Mourato, and Milner-Gulland 2011; Selig and Bruno 2010), or globally (e.g., Kelleher, Bleakley, and Wells 1995; Pomeroy and Campson 2008; Halpern 2003). Recent worldwide assessments have been conducted of ecological indicators of coral cover (Selig and Bruno 2010) and of density, biomass, and species richness of temperate (Stewart et al. 2009) and global marine reserves. A recent initiative examined MPA governance around the world (De Santo et al. 2013), building upon a long history of case study research (e.g., Fiske 1992; Christie, White, and Buhat 1994) and comparative analyses (e.g., Pollnac et al. 2010b; Pollnac, Crawford, and Gorospe 2001). Fisheries and social impacts of MPAs are increasingly being examined as well (McClanahan 2010; Mascia, Claus, and Naidoo 2010; Cinner et al. 2012, 2013; Gjertsen 2005). Comparative assessments of ecological and socioeconomic success have been conducted in some regions (e.g., Cinner and McClanahan 2006), and globally comprehensive, field-based assessments of MPAs are increasingly attempted (Pollnac et al. 2010a). In general, these assessments are undertaken by integrated research groups, rather than by consolidating site-level information collected by different managers (but see De Santo et al. 2013).
Because most MPAs lack sufficient information to measure performance (Jones 2001), numerous assessment initiatives have been developed at various spatial and temporal scales (Wells and Dahl-Tacconi 2006). These efforts range from management assessment protocols like the World Bank Score Card (Staub and Hatziolous 2004) to more involved frameworks for field monitoring like How is your MPA doing? (HIYMPAD), a guidebook with a comprehensive, data-driven methodology to monitor biophysical, social, and governance indicators (Pomeroy, Parks, and Watson 2004). Region-specific adaptations were created as well (Wells and Mangubhai 2004;White et al. 2006), all intended to guide MPA managers in conducting local assessments and evaluation (Pomeroy, Parks, and Watson 2004;Pomeroy et al. 2005). As intended, these assessments have been useful to sites for both clarifying and evaluating their management objectives (Pomeroy, Parks, and Watson 2004), but no standardized set of measures or global coordination mechanism for sharing and analyzing comparable data exists. The developers of HIYMPAD, IUCN's Marine Section of the World Commission on Protected Areas (WCPA), originally envisioned and preliminarily drafted such a mechanism, but it was not pursued due to concerns that standardization would be difficult and managers might be reluctant to participate (Pomeroy et al. 2005;R. Pomeroy and J. Parks, personal communication).
Nonetheless, given the 200+ sites that have employed HIYMPAD (J. Parks, personal communication) and the scarcity of globally representative MPA performance datasets, it is imperative that researchers and policy-makers be able to derive the maximum possible empirical and methodological insights from past HIYMPAD assessments and other monitoring efforts with such wide distribution and great expenditure of manager-hours. The more recent emergence of several analogous research initiatives (for example, the University College London Marine Protected Area Governance program [De Santo et al. 2013], Conservation International's Marine Management Area Science project [Pomeroy and Campson 2008], and WWF's Solving the Mystery of MPA Performance initiative [Fox et al. 2012a]) only makes more salient the need for investigation into methods of synthesizing site-level assessments toward larger analyses.
To explore the potential for a global comparison along these lines, management assessments of MPA impacts using HIYMPAD biophysical, socioeconomic, and governance indicator data were examined in the context of Elinor Ostrom's comprehensive conceptual framework for analyzing sustainability of Social-Ecological Systems (Ostrom 2009). Inspired by theories of common-pool resource governance (Ostrom 1990, 2009), the framework posits a complex flow of causal relationships among domains of MPA governance, human resource use patterns, ecological integrity, and social well-being (each with subdomains), influenced by social and biophysical context (Figure 1). This ontology provides a heuristic model for understanding the diverse factors that affect, or are affected by, marine resource management. Though data were insufficient to test this model statistically, our classification of HIYMPAD indicators into this system provided the dual benefits of focusing a fairly broad inquiry along relevant conceptual lines and strengthening its comparative power. We were able to statistically test whether domain, subdomain, MPA size, and other factors are related to relative performance score.
The objective was not to conduct analyses at the site level or to single out the performance of particular MPAs, but rather to examine protected area goals and objectives and explore the possibility of merging site-level HIYMPAD data into a single, meaningful body of knowledge: to incorporate local context into emerging analytic insights in order to understand "how MPAs might be ever more effectively established/managed" (Thorpe, Failler, and Bavinck 2011). These analyses highlight the current major challenges facing a meta-analysis of MPA performance and identify avenues for future work necessary to answer the question "How are our MPAs doing?"

Study Sites and Characteristics
The sources for the work described below were reports from MPAs that had participated in pilot and subsequent phases of development of HIYMPAD, most of which had been funded by the National Oceanic and Atmospheric Administration (NOAA)'s International Coral Program, which from 2002 to 2008 made small grants ($5,000-30,000) to support the use of HIYMPAD methodology (Parks 2009). Protected area managers, staff, and consultants originally collected the source MPA data using the detailed procedures described in HIYMPAD (Pomeroy, Parks, and Watson 2004; Pomeroy et al. 2005). Methods varied from reviewing existing information to collecting new field data (household questionnaires, key informant interviews, and biological sampling). The resulting 24 MPAs' reports, obtained from the NOAA International Program, represent a "convenience sample" (Patton 1990) with geographic, ecological, social, and political diversity (see Figure 2 and Table S1 in Supplementary Materials). The sites are from 13 nation-states and one U.S. territory, with the majority (18 of 24) in the tropics. The length of time between an MPA's establishment and its HIYMPAD assessment ranged from under 2 to over 20 years. The MPAs also vary greatly in size, from 0.006 km² to ∼135,000 km². For this work, MPAs were classified into small (<100 km², n = 9), medium (100-2,500 km², n = 8), and large (>2,500 km², n = 3) groups to explore the effect of MPA size on performance (natural breaks method).
Figure 1. Conceptual framework of marine protected area management, theorizing a flow of causal relationships among governance, ecological, and social systems, within sociocultural, political, economic, and ecological contexts. Broad domains (e.g., MPA governance) house topical subdomains (e.g., Resource use rights), which are associated with relevant indicators (e.g., G4) from the How is your MPA doing? management effectiveness assessment guidebook.
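The size grouping described in this section can be expressed as a short Python function. This is only an illustrative sketch: the function name `size_class` is hypothetical, and the treatment of areas falling exactly on a boundary (100 or 2,500 km²) follows the ranges quoted in the text, where both endpoints belong to the medium class.

```python
def size_class(area_km2):
    """Assign an MPA to the size groups quoted in the text.

    small:  < 100 km2
    medium: 100 to 2,500 km2 (both endpoints included, per the text)
    large:  > 2,500 km2
    """
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    if area_km2 < 100:
        return "small"
    if area_km2 <= 2500:
        return "medium"
    return "large"


# The smallest and largest areas reported for the sample.
smallest = size_class(0.006)
largest = size_class(135000)
```

The thresholds themselves came from a natural breaks method applied to the sample, so they would not necessarily carry over to a different set of MPAs.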
Because the objective was not to conduct analyses at the site level or to single out the performance of particular MPAs, data from individual MPAs are not reported.

HIYMPAD Data Coding
Data were coded for each indicator onto a nine-point ordinal scale (1 = highly negative, 5 = neutral, 9 = highly positive) (Bernard 2006) using an original schema (Table S2 in Supplementary Materials). We did not score baseline data, since they represented a snapshot of current MPA conditions that could not contribute to trend analysis. A dummy score was used for cases in which no data was presented in the reports despite the apparent assessment of the indicator.

MPA Goals and Objectives
HIYMPAD associates each of its indicators (10 biophysical, 16 socioeconomic, and 16 governance indicators) with evaluation of particular management objectives; these 68 objectives fall under 16 broad MPA goals (see Table S3 in Supplementary Materials for numbers of indicators that can be used to assess each goal and objective). The MPAs in this study each reported on a different subset of indicators, which examine differing goals and objectives. To explore which goals and objectives received the most attention from the monitoring and evaluation efforts, we calculated indices of both representation and effort. (For more information on these measures, see Supplementary Materials.) For the examination of MPA goals and objectives, all indicator scores were used, including baseline and dummy scores, since these scores show intent to measure a particular indicator. The resulting final set for analysis included 228 measurements across 22 MPAs.

MPA Performance
Data processing. For each indicator measurement, six variables were noted: the specific target of assessment (e.g., focal species, type of stakeholder participation), measures used (e.g., mean shell length, number of enforcement patrols), sampling methodology, data collected, percent change observed over time, and any statistical results presented. This dataset, though broad-based and collected using a single assessment tool, did not provide the level of statistical detail needed in order to conduct a conventional meta-analysis, so we pursued an alternate method and transformed all data into a consistent numerical format (see again Table S2 in Supplementary Materials).
For the examination of management trends, dummy scores and the baseline data were excluded, as were records that did not directly address a particular HIYMPAD indicator or that did not follow the guidebook's assessment procedures. In total, these exclusions removed 38.3% of the data (37.0% of biophysical data, 30.2% of governance data and 43.8% of socioeconomic data) from analysis and eliminated all data from four of the original 24 MPAs.
The final dataset included 462 measurements of 33 different indicators, each assessed in at least one of 20 MPAs. Multiple measurements of the same indicator at a given site were averaged, resulting in one score per site for each indicator assessed there. All further MPA performance analysis used these 157 site scores, which we treated as independent, given the many sources of variability even within a particular MPA likely to introduce error. One score for each domain and subdomain was generated by determining the mean site-specific indicator scores for each category.
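The two aggregation steps described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the record layout, the function names `site_scores` and `category_means`, the indicator-to-domain mapping, and the sample values are all hypothetical. The second step reflects one reading of the text, in which site-level indicator scores are pooled across sites within each category.

```python
from collections import defaultdict
from statistics import mean


def site_scores(measurements):
    """Average repeated measurements into one score per (site, indicator).

    `measurements` is an iterable of (site, indicator, score) tuples with
    scores on the nine-point ordinal scale. The layout is hypothetical.
    """
    grouped = defaultdict(list)
    for site, indicator, score in measurements:
        grouped[(site, indicator)].append(score)
    return {key: mean(values) for key, values in grouped.items()}


def category_means(scores, indicator_to_category):
    """Mean of the site-level indicator scores within each category."""
    by_category = defaultdict(list)
    for (_site, indicator), score in scores.items():
        by_category[indicator_to_category[indicator]].append(score)
    return {cat: mean(values) for cat, values in by_category.items()}


# Hypothetical records: site_a measured indicator B1 twice.
records = [
    ("site_a", "B1", 7), ("site_a", "B1", 5),
    ("site_a", "G4", 8),
    ("site_b", "B1", 6),
]
scores = site_scores(records)
domains = category_means(scores, {"B1": "ecological", "G4": "governance"})
```

Averaging first within each site keeps a site that measured an indicator many times from outweighing a site that measured it once.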
Data analysis. Indicators were characterized based on the categorical domains (Figure 1), each domain comprising multiple subdomains (e.g., monitoring and enforcement, trophic structure, education). Most subdomains include one or more attributes represented by the HIYMPAD indicators (e.g., enforcement coverage, species abundance) and measured using specific metrics suggested in the guidebook (e.g., area patrolled, number of individuals per km²).
The grand mean and other summary statistics of indicator scores were calculated, and tests were done to examine the effects of MPA size and domain and subdomain grouping on performance score (Kruskal-Wallis tests followed by pairwise comparisons using the Wilcoxon method with Bonferroni correction). To reduce the possibility of spurious comparisons associated with small sample sizes, categories with n < 8 were excluded from the statistical analysis. To test whether the age of an MPA or national differences shaped observed patterns in MPA performance, correlations were examined between indicator scores and years between MPA establishment and assessment, as well as national-level geographic, economic, fisheries, development, and governance metrics (Spearman's rank tests). All statistical tests were run using JMP 9. (See Supplementary Materials for further details.)
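The group comparison described above was run in JMP 9. For readers who want to see the mechanics, the following pure-Python sketch computes the tie-corrected Kruskal-Wallis H statistic, applies the n < 8 exclusion rule, and derives a Bonferroni-adjusted threshold for pairwise follow-up tests. All function names are hypothetical, the default alpha of 0.05 is an assumption (the text does not state the level used), and the p-value calculation (a chi-square lookup with k - 1 degrees of freedom) is left out.

```python
from itertools import combinations


def drop_small_groups(groups, min_n=8):
    """Exclude categories with fewer than min_n scores (n < 8 in the text)."""
    return {name: g for name, g in groups.items() if len(g) >= min_n}


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a dict of name -> list of scores."""
    pooled = [x for g in groups.values() for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups.values():
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all scores are identical; H is undefined")
    return h / correction


def bonferroni_alpha(groups, alpha=0.05):
    """Per-comparison threshold when every pair of groups is compared."""
    n_pairs = len(list(combinations(groups, 2)))
    return alpha / n_pairs
```

With three groups there are three pairwise comparisons, so under the assumed alpha of 0.05 each pairwise test would be judged against 0.05 / 3, or about 0.017.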

MPA Goals and Objectives
The number of HIYMPAD goals and objectives addressed by each indicator varied (Table 1). The analysis of representation reveals that all 20 of the top MPA objectives and all five top MPA goals most commonly assessed by managers were biophysical. In contrast, three of the six socioeconomic goals in the guidebook were among the five least assessed (see Tables 2A and 2B for goal and objective scores, respectively). When objectives and goals were ranked based on effort, however, the results were more varied. Using the effort index, the top goal was governance-related, with three biophysical and another governance goal in the top five. The most thoroughly assessed goal, G1 ("Effective management structures and strategies maintained"), had an effort index of 0.33; and the least assessed, G5 ("Resource use conflicts managed and reduced"), an effort index of 0.04. Of the top 21 most thoroughly assessed objectives, 10 were biophysical, 8 were governance, and only 3 were socioeconomic. The most thoroughly assessed objective, G3A ("Representativeness, equity, and efficacy of collaborative management systems ensured"), had an effort index of 0.50, meaning that 50% of all possible assessments were made (either all sites assessing half of the indicators, or half of the sites assessing all indicators, or some combination thereof). A single objective, G1F ("Periodic monitoring, evaluation, and effective adaptation of management plan ensured"), was not assessed at all, and so had an effort index of 0.
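The interpretation of the effort index given above (the share of all possible site-by-indicator assessments that were actually made) can be sketched in Python. The formal definition is in the Supplementary Materials, which are not reproduced here, so this follows only the verbal description; the function name `effort_index`, the site labels, and the indicator codes X1 and X2 are hypothetical.

```python
def effort_index(assessed, n_sites, indicators):
    """Share of possible (site, indicator) assessments that were made.

    `assessed` is a set of (site, indicator) pairs; `indicators` lists the
    indicators linked to a single goal or objective.
    """
    possible = n_sites * len(indicators)
    if possible == 0:
        raise ValueError("need at least one site and one indicator")
    linked = set(indicators)
    made = sum(1 for _site, indicator in assessed if indicator in linked)
    return made / possible


# Two hypothetical routes to an index of 0.50 with 4 sites and 2 indicators.
all_sites_half = {("s1", "X1"), ("s2", "X1"), ("s3", "X1"), ("s4", "X1")}
half_sites_all = {("s1", "X1"), ("s1", "X2"), ("s2", "X1"), ("s2", "X2")}
index_a = effort_index(all_sites_half, 4, ["X1", "X2"])
index_b = effort_index(half_sites_all, 4, ["X1", "X2"])
```

Both routes give the same value, which is why the index alone cannot show whether effort was spread thinly across sites or concentrated in a few.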

MPA Performance
Many challenges emerged through this exercise, revealing limitations of data quality and analysis and highlighting the need for improvements in MPA monitoring and evaluation to gain a better understanding of the variation in MPA performance (see Discussion). Bearing in mind that our global dataset, while geographically, ecologically, socially, and politically diverse (Figure 2), emerged from a "convenience sample," the MPAs in this study did average marginally positive scores on HIYMPAD indicators. The mean MPA performance score for our final 157 measures was 6.24 ± 1.78 SD, with a median of 6.5 and a mode of 6 (see Figure 3 for score distribution). The mean of the indicator dataset differed significantly from a hypothetical mean value of 5 (Wilcoxon signed-rank T₁₅₁ = 3482, p < .0001), representing a mild positivity in overall MPA conditions above the "average" (i.e., trendless; remaining at baseline) state. MPA size was significantly related to performance scores (χ²₂ = 7.38, p = .025). Post-hoc tests indicated that small MPAs (<100 km²) received higher performance scores than medium-sized reserves (p = .007), although less than one point higher on average. Our assessment sample in large reserves was considerably less substantial, and variability was too high to draw conclusions about their performance as a group.
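The comparison against a hypothetical value of 5 relies on a one-sample signed-rank test. The sketch below shows how the rank sums behind such a test are built; it is illustrative only, the function name `signed_rank_sums` and the sample scores are hypothetical, and scores equal to the hypothesized value are dropped, which is a common convention rather than something the text confirms. Statistical packages report different forms of the signed-rank statistic, and the text does not say which form produced the value 3482, so these rank sums should not be expected to match it directly.

```python
def signed_rank_sums(scores, hypothesized=5.0):
    """Return (n_nonzero, t_plus, t_minus) for a one-sample signed-rank test.

    Differences of zero are discarded; tied absolute differences share
    the average of the ranks they occupy.
    """
    diffs = [s - hypothesized for s in scores if s != hypothesized]
    ordered = sorted(abs(d) for d in diffs)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + j) / 2 + 1  # average 1-based rank
        i = j + 1
    t_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    t_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    return len(diffs), t_plus, t_minus


# Hypothetical scores on the nine-point scale; the score of 5 is dropped.
n, t_plus, t_minus = signed_rank_sums([6, 7, 4, 5, 8])
```

A large positive rank sum relative to the negative one indicates that scores sit mostly above the neutral midpoint, which is the pattern the text reports.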

Domains and Subdomains
MPAs performed fairly equivalently across governance, ecological, and social domains (χ²₄ = 6.54, p = .16), though performance scores for "social condition" were significantly lower than the grand mean (z₂₂ = -2.25, p = .025, Figure 4A). Looking at a finer scale, subdomains differed significantly from one another, suggesting heterogeneity in performance between different thematic constructs (χ²₅ = 10.96, p = .052, Figure 4B). Performance scores for "education" (including resource user knowledge as well as stakeholder and community training by management) were lower on average than performance at large (z₁₄ = -2.88, p = .0039, Figure 4B) and were also less positive than scores for species population size (Z = 2.51, p = .012) or "decision-making arrangements" (Z = -3.00, p = .0027). MPAs also performed higher on assessments of "decision-making arrangements" than performance at large (z₃₉ = 2.33, p = .020).

Non-Management Factors
No relationship emerged between MPA performance and age of MPA, or between MPA performance and measures of country area, length of coastline, population, or amount of fisheries exports (Spearman's rank tests). MPA performance was significantly correlated with development (HDI) (Spearman's R = 0.25, p = .002), economic vitality (PPP) (R = 0.22, p = .0067), integrity of governance (WGI) (R = 0.27, p = .0013), and number of fishers (R = -0.17, p = .038). The positive correlations of MPA performance with common measures of socioeconomic development (HDI, PPP, and WGI) suggest that resource protection is related to the welfare of human populations and reduction of corruption, so investments in these areas may improve conservation outcomes secondarily. However, the strongest correlation explained less than 7% of variability in indicator scores, and HDI, PPP, and mean WGI were themselves very highly correlated (R² = 0.603-0.915), suggesting that much of their effect on performance stemmed from the same source construct. Other contextual factors beyond those we tested may impact MPA performance as well, including source and level of funding, level of community support, and management structure, and others as discussed in De Santo et al. (2013). While some national-level variables (such as coastline length, HDI, and spatial overlap with designated conservation priority areas) have been shown to correlate positively with MPA establishment (Fox et al. 2012b), the large-scale contextual variables we examined explained little variation in our sample of MPA performance. These results suggest that local factors are likely more important drivers of MPA performance.
Figure 4. MPA performance scores by domain (A, top) and subdomain (B, bottom). Horizontal line denotes the grand mean of all indicator assessments (6.24). The domain "social condition" performance scores were significantly lower than the grand mean (A). Subdomains differed significantly from one another (p = .052), with decision-making higher than average and education lower than average and lower than species population size or decision-making (B).

Challenges to Meta-Analysis
This analysis highlights challenges to documenting and explaining MPA performance; some are specific to HIYMPAD, while others are characteristic of MPA monitoring and evaluation in general. HIYMPAD is not a management effectiveness assessment tool per se, instead functioning as an assessment "toolbox" or framework for monitoring efforts. The guidebook provides a comprehensive set of indicators, with multiple constructs and flexible methods for each, and aids managers in translating institutional goals and objectives into priority foci for investigation. To be clear, HIYMPAD was never intended to provide standardized management effectiveness data for the global conservation community, and it is not surprising that the resulting data varies widely because of the nature of how the guide was designed to be used. Nonetheless, the great interest in the potential of gaining global insights from such extensive local datasets motivates attempts such as those described here in order to capitalize on current and past investments in monitoring and evaluation.
Despite the shared methodology of HIYMPAD, operational challenges stemmed primarily from the format and consistency of the monitoring data. While a conventional meta-analytic technique is well-suited to the problem of combining site-level results to examine macro-scale patterns, less than 3% of the data presented in 24 internal MPA reports presented the basic statistical detail needed to use the method (i.e., sample sizes, means, and standard deviations). In addition, many socioeconomic indicators in HIYMPAD, and almost all its governance indicators, produced narrative reports or descriptions as their prescribed output. This information is useful for the longitudinal, qualitative, and site-level assessments for which HIYMPAD developers designed the guidebook. Meta-analysis and statistical cross-site comparison, however, would require researchers to re-parameterize quantitative indicator measurements and use consistent methods and measures for each indicator. Furthermore, HIYMPAD often provides more than one means of measuring an indicator (e.g., in the case of species abundance, variables of interest include density, number, size, catch per unit effort, weight of catch, and biomass), and methodologies for measuring each of these varied between sites. While this degree of flexibility is helpful at the site level, it results in inconsistent data acquisition at larger scales and creates significant analytical problems when statistical power is desired. Recent efforts to examine MPA governance based on exploratory conceptual frameworks (e.g., De Santo et al. 2013) are likely to experience similar challenges as sample sizes and heterogeneity increase.
To make comparisons across sites, HIYMPAD data was transformed to an ordinal scale. This was complicated to manage and introduced interpretive problems into the analysis of global MPA performance patterns. Limited information provided about the original data (e.g., year[s] it was collected, how it was collected) prevents knowing whether the trends were measured over similar time periods and whether performance variables are measured in years after the causal variables are measured, all important for assessing the validity of results. Scoring all indicators according to a common scale gives equal validity to all data, which is a serious drawback. Furthermore, the magnitude of observed differences is difficult to translate into on-the-ground impacts because the ordinal scores do not reflect actual continuous data. In addition, though there is a need to average indicator values in order to conduct statistical testing, doing so obscures the heterogeneity within and among outcomes. This system of accounting does not keep track of "winners" and "losers," ignoring the distributional effects that may result from resource management (Mascia and Claus 2009;Mascia, Claus, and Naidoo 2010). If MPAs are generating positive outcomes for some and negative outcomes for others, the use of continuous data is necessary for a more complex and nuanced interpretation of results.
The issues highlighted herein are by no means restricted to the marine environment. Performance of terrestrial protected areas remains unclear as well (Chape et al. 2005), despite recent efforts to evaluate global protected area management (e.g., Leverington, Hockings, and Costa 2008), mandatory use of the Management Effectiveness Tracking Tool (METT) (Stolton et al. 2007) in all recent World Bank- and Global Environment Facility-funded projects, and the widespread application of the Rapid Assessment and Prioritization of Protected Area Management (RAPPAM) methodology (Ervin 2003). Regardless of the particular system of assessment used, logistical, financial, and jurisdictional realities often demand that managers themselves conduct the evaluations of their protected areas and collect the relevant data on outcomes, potentially reducing credibility or skewing the overall picture of MPA performance through self-reporting bias (Mascia 1999; Heck et al. 2011). Models to reduce the potential conflict of interest inherent in both managing resources and then assessing that management include external assessment and participatory assessment, both of which can be conducted in combination with internal assessment.

MPA Goals and Objectives
These findings reveal a mismatch between the balanced monitoring and evaluation portfolio proposed by HIYMPAD and actual management practice (at least in the pilot stage); the monitoring is skewed toward biophysical goals and objectives. The discrepancy is partly due to the fact that biophysical indicators cover more goals and objectives (Table 1). By assessing just a few indicators, managers can address many more biophysical goals and objectives than governance or socioeconomic ones. Nonetheless, over half of the MPAs sampled for this study explicitly list improvements in social conditions as major project goals. Such improvements have been significant drivers underlying the formation of many reserves, yet only three of the top 21 objectives and none of the top five goals assessed were socioeconomic.
By HIYMPAD's own system of accounting, biophysical assessments are also more difficult to conduct than other types of assessments, with an average difficulty rating of 3.8 out of 5, compared with 2.8 for socioeconomic indicators and 2.4 for governance indicators. Yet the imbalance we observe is consistent with the broader pattern of greater focus on ecological rather than sociopolitical MPA studies (Ojeda-Martinez et al. 2007; Mascia et al. 2010) and may be a function of managers' background and/or training circumscribing their perception and definition of their professional role. Concerted efforts to build awareness and capacity around the importance and methods of social science research would ensure that managers have the skill set necessary to assess MPA impacts on resource users effectively (Teh and Teh 2011; Bunce et al. 2000; Mascia 2003; Mascia et al. 2003, 2010). Amidst calls for better ecological and social balance (e.g., Christie 2011), several recent assessment efforts strive to make progress in this area, including the UNEP-WCMC's Protected Areas Management Effectiveness (PAME) module, the Global Study of Management Effectiveness (e.g., Leverington, Hockings, and Costa 2008), Conservation International's Marine Management Area Science (MMAS) program (Pomeroy and Campson 2008), and WWF's Solving the Mystery of MPA Performance research initiative (Mascia et al. 2010; Fox et al. 2012a).

MPA Performance
Given the challenges presented above, this analysis of biophysical, socioeconomic, and governance indicator data from a global but opportunistic sample of MPAs can only be considered an exploratory exercise. In addition, these results are likely a "best-case scenario," since the sites included in this analysis were among the first to pilot management effectiveness testing using HIYMPAD and likely had closer connections to conservation agencies, greater administrative capacity, or other characteristics that may bias indicator scores and cause this subset of MPAs to differ from the average MPA. That said, our results suggested modest positive trends and, as in other studies of protected areas (Mascia et al. 2010), smaller MPAs were correlated with better performance. Despite the potential significance of MPA age or contextual economic, fisheries, development, and governance factors in driving governance and management trends (Fox et al. 2012b), these explain little of the variability in indicator data, emphasizing the importance of site-specific factors in driving MPA performance.
MPAs appeared to experience particular success in the area of "decision-making arrangements," which comprises components such as management plans, decision-making bodies, and enabling legislation. It is important to note, however, that HIYMPAD indicators survey the existence of a decision-making body, adoption of a management plan, and adequacy of legislation, and do not generally assess their degree of implementation or effectiveness. Thus, a "paper park" with a well-written management plan and clear legislation that are not put into practice will still receive high marks on many governance indicators. Other frameworks do examine how governance systems are practiced, in addition to their existence (De Santo et al. 2013; Hockings et al. 2006; Wells and Mangubhai 2004). Although data are insufficiently powerful to test the relationships in Figure 1 statistically, effective governance is likely the preeminent means to achieving the beneficial social and ecological outcomes that the conservation and development sectors are seeking (Mascia 2004). Without effective governance, ecological benefits of management are not guaranteed to be fairly and sustainably distributed as socioeconomic benefits to resource users.

Implications for Science and Policy
Having effective MPAs requires knowing what works, what does not, and why, yet it is currently nearly impossible to roll up existing approaches for site-level monitoring to get valid higher-level comparisons. HIYMPAD continues to be utilized by sites, as well as adapted to local contexts (Wells and Mangubhai 2004; White et al. 2006), but is generally not used to influence policy actions on national or global scales. Exploratory analyses suggested mildly positive outcomes for the sample of MPAs we examined, with measures of stakeholder education outcomes lower, on average, than assessments of species population outcomes and performance. However, challenges at the project level (e.g., disconnect in monitoring of goals and objectives) and the program level (e.g., difficulties in measuring across sites) impact policy and make it hard to identify specific forms of MPA governance that foster both biological and social benefits. Although hundreds of HIYMPAD MPA assessments now exist, a system to compare these results and yield global datasets is not yet available, despite many requests from MPA managers themselves (J. Parks and R. Pomeroy, unpublished data).
Moving forward, any global system for MPA evaluation that would allow cross-MPA comparisons should also ensure site-level benefits to individual MPA managers, such as secure data storage and assistance with analysis and reporting. We recommend that a "next-generation" HIYMPAD contain indicators to assess network effects, given the growing understanding that MPAs may function better as part of a network than on their own, and that it place increased emphasis on measuring the social impacts of MPAs for a more balanced picture of interdisciplinary MPA performance (Mascia et al. 2010; McClanahan, Maina, and Davies 2005; Pomeroy, Parks, and Watson 2004). Using HIYMPAD as part of a quasi-experimental before-after control-impact (BACI) sample design could also help disentangle direct effects of management action from factors beyond management jurisdiction or prevailing environmental conditions (Underwood 1994; Lincoln-Smith et al. 2006; Ferraro and Pattanayak 2006). This finer understanding could be indispensable for directing future efforts, as other studies have found that trends in biological and social indicators may be unrelated to local management efforts (Jameson, Tupper, and Ridley 2002; Tobey and Torell 2006).
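To illustrate the logic of a BACI comparison, the following minimal sketch computes the simplest BACI contrast: the change in an indicator at a managed site minus the change at an unmanaged control site over the same period. It is an illustration only, not the analysis used in this study. The function name `baci_effect` and all indicator values are hypothetical, and a real application would need replicated sites and sampling times and a formal statistical test.

```python
from statistics import mean


def baci_effect(control_before, control_after, impact_before, impact_after):
    """Return the BACI contrast: change at the managed (impact) site
    minus change at the control site over the same period."""
    impact_change = mean(impact_after) - mean(impact_before)
    control_change = mean(control_after) - mean(control_before)
    return impact_change - control_change


# Hypothetical indicator values (e.g., fish biomass per survey transect).
control_before = [10.0, 12.0, 11.0]
control_after = [12.0, 14.0, 13.0]
mpa_before = [10.0, 11.0, 12.0]
mpa_after = [15.0, 16.0, 17.0]

effect = baci_effect(control_before, control_after, mpa_before, mpa_after)
print(effect)  # 3.0
```

In this made-up example the indicator rises by 5 units inside the MPA but also by 2 units at the control site, so only 3 units of the increase would be attributed to management rather than to prevailing conditions shared by both sites.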

Conclusion
Without effective management, protected areas are unlikely to achieve the twin goals of conserving biodiversity and alleviating poverty. Yet efforts to examine broader regional or global patterns in MPA performance are only beginning. Strengthened global MPA datasets would provide tremendous potential for gleaning lessons about what governance structures and resource use patterns may help drive stronger MPA performance. By reforming assessment tools and monitoring systems to collect standardized, quantitative data, and by drawing upon these data collectively, a true meta-analysis would become possible. The absence of a platform to turn raw MPA monitoring data into actionable information is a critical but under-recognized obstacle to cross-project learning, comparative analyses, and adaptive management. An accurate assessment of global impacts to explain more robustly the variation in MPA performance would provide critical insights to policymakers and practitioners in the conservation and development community. Monitoring a system or group of MPAs as a whole would not only enable MPA managers to evaluate success and adaptively manage at the site level, but also provide globally transferable insights into replicating success, reforming failure, and avoiding potential mishaps in the design and management of MPAs.
More thorough and credible assessments are a key step toward investigating the notion, often contested within conservation and fishing communities, that management and protection of marine resources through MPAs is a policy choice with ultimately beneficial outcomes for both ecological and social systems (Gell and Roberts 2003; Balmford et al. 2004). Although this article examined MPA performance, the recommendations and conclusions for improving the power and scope of future work in this area should be broadly applicable to all protected areas.
F. Leverington, A. Lombana, M. McField, A. Paterson, and L. Watson. We are grateful to E. Alicia (NOAA) for her assistance with the reports and to L. Glew and S. Palminteri for statistical advice. This article is contribution #6 of the WWF research initiative Solving the Mystery of MPA Performance.

Supplemental Material
Supplemental data for this article can be accessed on the publisher's website.