Effect of land cover and use on dry season river runoff, runoff efficiency, and peak storm runoff in the seasonal tropics of Central Panama

A paired catchment methodology was used with more than 3 years of data to test whether forests increase base flow in the dry season, despite reduced annual runoff caused by evapotranspiration (the “sponge‐effect hypothesis”), and whether forests reduce maximum runoff rates and totals during storms. The three study catchments were: a 142.3 ha old secondary forest, a 175.6 ha mosaic of mixed age forest, pasture, and subsistence agriculture, and a 35.9 ha actively grazed pasture subcatchment of the mosaic catchment. The two larger catchments are adjacent, with similar morphology, soils, underlying geology, and rainfall. Annual water balances, peak runoff rates, runoff efficiencies, and dry season recessions show significant differences. Dry season runoff from the forested catchment receded more slowly than from the mosaic and pasture catchments. The runoff rate from the forest catchment was 1–50% greater than that from the similarly sized mosaic catchment at the end of the dry season. This observation supports the sponge‐effect hypothesis. The pasture and mosaic catchment median runoff efficiencies were 2.7 and 1.8 times that of the forest catchment, respectively, and increased with total storm rainfall. Peak runoff rates from the pasture and mosaic catchments were 1.7 and 1.4 times those of the forest catchment, respectively. The forest catchment produced 35% less total runoff and smaller peak runoff rates during the flood of record in the Panama Canal Watershed. Flood peak reduction and increased streamflows through dry periods are important benefits relevant to watershed management, payment for ecosystem services, water‐quality management, reservoir sedimentation, and fresh water security in the Panama Canal watershed and similar tropical landscapes.


Introduction
[2] Tropical forests play a major role in global water and carbon dynamics. The humid tropics presently occupy about 25% of the Earth's land surface, with tropical forests covering about half this area. Existing tropical forests contain 45-52% of terrestrial biomass carbon and 11-17% of soil carbon, and account for 23-35% of global net primary productivity [Prentice et al., 2001]. Mean annual evapotranspiration from tropical forests is about 1550 mm per year, exceeding that of any other land cover [Calder, 1999; Zhang et al., 2001].
[3] In this study, we examine the effect of land use and land cover on catchment-scale hydrologic function in landscapes that are typical of the Panama Canal watershed (PCW) and much of the seasonal tropics having pronounced wet and dry seasons. About half of the PCW has been deforested, and the official policy is to foster sustainable management, including reforestation, in anticipation of regaining ecosystem services and improving the livelihoods of rural farmers [Autoridad del Canal de Panamá (ACP), 2006, 2010; Cerezo, 2011]. Desired ecosystem services include improved water quality, increased dry-season base flow, reduction of hydrograph peaks, fewer wildfires, reduced erosion, increased carbon storage, increased biodiversity and environmental resilience, preservation of undiscovered pharmaceuticals, and timber production.
[5] In this paper, we use the paired catchment methodology to examine two desirable hydrological ecosystem services attributed to forests: increased dry-season flow associated with greater wet-season infiltration (the so-called “sponge-effect”) and the reduction of peak runoff rates and volumes. The term “sponge-effect” has been used to lump together a broad suite of beneficial characteristics of soil hydrology attributed to the presence of forests as compared to other land covers. Bruijnzeel [2004] provided a thorough review of the concept, and argued that while the factors that contribute to the sponge will also reduce flood peaks, reduced flood peaks alone do not provide proof of the sponge-effect hypothesis. Other factors, such as canopy interception and changes in flow path, can also reduce flood peaks without necessarily enhancing groundwater recharge. We agree with Bruijnzeel [2004], and restrict the “sponge effect” to a condition whereby well-developed forest cover promotes high infiltrability and groundwater recharge during the wet season, leading to increased streamflow during dry periods despite the reduction of total annual runoff [Malmer et al., 2010]. We separately examine whether land use and land cover reduce peak runoff rates, total runoff, and the rapidity of the rise and fall of the peak (flashiness) during storms [Van Dijk and Keenan, 2007].
[6] Reviews of the sponge-effect hypothesis reveal considerable controversy. For example, Center for International Forestry Research (CIFOR) [2005] states: “When it comes to prevention of major floods, the ‘sponge theory’ is a historical erratum – a fiction often inappropriately used to justify soil and water conservation measures, appropriate forest management and logging bans. Unfortunately, the ‘sponge theory’ has also been used inappropriately to secure funds for various development and governmental projects.” The Foreword of CIFOR [2005] includes the statement that the document “...does not pretend to be an exhaustive overview of the subject...”. However, CIFOR [2005] has had a major influence on policy since its publication, and deserves to be evaluated in the context of field observations.
[7] The value of reforestation/afforestation is also disputed. Farley et al. [2005] examined the effects of afforestation on water yield, but did not include data sets from steep catchments with saprolitic soils in the humid seasonal tropics. FAO [2002] cautions against the application of general statements, and encourages site-specific observations and models in evaluating the effects of afforestation on hydrological behavior. Calder [1998] argues that the local interplay between infiltration, groundwater recharge, and increased evapotranspiration due to afforestation will ultimately determine the impacts of afforestation on dry season river flows. He also notes that this interplay is likely to be highly site specific. Calder [1998] asserts that planting deep-rooting trees can reduce groundwater recharge. We note, however, that once dead, deep roots can also serve as preferential flow paths in tight soils and may increase recharge of groundwater. To this we add another factor that seems to be overlooked: the effect of groundwater withdrawals by tree roots on the slope of the water table, which will affect groundwater discharge to streams in the late dry season. Some assert that as the amount of precipitation increases, the effects of soil and plant cover on storm flow diminish [Brooks et al., 1989; Bruijnzeel, 1990]. However, this generalized statement has not been thoroughly tested, and might depend on site-specific factors to a degree that invalidates its generality.
[8] The spatial scale over which hydrological effects of land cover contrasts might be observable is debated. Van Dijk and Keenan [2007] state “A recent science digest reiterates that there are no strong empirical or theoretical arguments to expect a reduction of flooding in large basins [CIFOR, 2005]; such synoptic events are directly associated with prolonged, intensive and large scale rainfall events. This digest ultimately comes from Kiersch [2000] who provides a table, without citations, of the spatial dimensions of land use impacts on hydrological variables such as average flow, peak flow, base flow, and groundwater recharge. The core assertion of this table is that land cover and land use affects only small watersheds, no larger than hundreds of square kilometers. This table has been used to justify general hydrological policy [e.g., FAO, 2002; CIFOR, 2005].”
[9] In this context, the sponge-effect hypothesis has been central to the controversy of forest versus water supply in the PCW. One study, funded by the US Agency for International Development [Heckadon-Moreno et al., 1999; Proyecto de Monitoreo de la Cuenca del Canal de Panamá (PMCC), 1999], used 9 months of data from two of the research catchments discussed in the present paper and presented data that supported the sponge-effect hypothesis. Another report commissioned by the World Bank [Calder et al., 2001] argued against the sponge-effect hypothesis and against reforestation of the PCW in general. A third report, again commissioned by the World Bank [Aylward, 2002], evaluated the other two reports and described the results as equivocal. Aylward [2002] acknowledged that although the PMCC [1999] data seem to demonstrate a sponge-effect, the study was performed using data collected during 1997, an unusually dry year coincident with a major El Niño event, and was too short term to be conclusive. Calder [2007] noted that there is uncertainty in the PCW regarding the effects of afforestation on low flows, and cites Aylward [2002] in recommending further study of the issue.
[10] A major problem in demonstrating a sponge-effect is that it is difficult to define and measure. Bruijnzeel [2004], in summarizing research globally, pointed out that delivery of dry season streamflows involves a complex interplay of both the soil hydrological properties and those of the geologic substrate, asserting that “this does not so much impair the usefulness of the ‘sponge’ concept but rather illustrates the range of conditions under which it can be usefully applied.” Bruijnzeel [2004] noted that resolution of this problem requires extensive research into the physics of water movement in forested and deforested landscapes, and that one of the fundamental questions is whether forest restoration can restore dry season storage. The influence of heterogeneous geology and soils on the degree of the sponge-effect is especially important when considering larger scales, such as the entire Panama Canal watershed [Bruijnzeel, 2004; CIFOR, 2005; Van Dijk and Keenan, 2007; Malmer et al., 2010].
[11] Prior studies focus on soil matrix properties as they might affect hydrologic response. However, the influence of macropores, roots, and other preferential flow paths, and how their function is affected by land use (e.g., grazing), is largely not considered. The presence of macropores, soil pipes, and other biologically created preferential flow paths makes it difficult to upscale from localized measurements of soil matrix infiltration using small scale cores or infiltrometers, or from soil classifications based on these measurements [Chappell et al., 1998, 2005; Ghimire et al., 2013]. A major problem is that deep infiltration, lateral subsurface flow, and overland flow are affected by both microscale and macroscale permeability, and that small differences in permeability can translate into large effects in water movement through deep or shallow flow paths.
[12] In Panama, preferential flow paths have been shown to affect infiltration at the hillslope scale. Niedzialek [2007] and Hendrickx et al. [2005] measured infiltration rates at a depth of 10 cm ranging between 80 and 600 mm h⁻¹ in Chagres National Park, approximately 40 km east of the sites reported in this paper, on similar Oxisols. These high infiltration rates are attributed to vertical and lateral preferential flow downslope from live and decayed tree roots and animal burrows [Bonell, 1993; Noguchi et al., 1999; Niedzialek, 2007]. Dye tracer tests confirm the presence of numerous preferential flow paths [Niedzialek, 2007].
[13] A recent study in a rainforest in eastern Puerto Rico [Larsen et al., 2012] employed an exclusion strategy to examine the effect of earthworm burrowing on overland flow and sheet wash erosion on small test plots. Results showed that exclusion of earthworms approximately doubled overland flow, while plot-scale sheet wash erosion increased fourfold. Presumably this increase in overland flow, if generalized to an entire hillslope, would translate into larger hydrograph peaks.
[14] In summary, this study provides data that can be used to test hypotheses related to the hydrological function of small catchments in the Panama Canal watershed as a function of land use and land cover. We provide hard data to inform a debate that is currently occupied by generalities. While our study catchments are small (<2 km²), a large catchment is ultimately composed of a large number of small catchments. For instance, our topographic analysis of the Panama Canal watershed shows that 96% of the 2900 km² nonlake portion of the PCW is covered by watersheds that are 150 ha in size.

General Research Setting and Motivation
[15] Similar to much of the tropics, land cover and land use in central Panama and the PCW have changed dramatically since European settlement [Robinson, 1985; Condit et al., 2001; Heckadon-Moreno et al., 1999; Ibáñez et al., 2002; ACP, 2006, 2010]. Within the PCW, land use has a long history of differing styles of agricultural development, primarily small-scale subsistence farming and cattle grazing. Between 1986 and 2003, the PCW lost 8% of its mature forest. However, the Panama Canal Authority (ACP) subsequently implemented what amounts to a Payment for Ecosystem Services scheme to restore forests in the PCW through reforestation, silvipastoral practices, and agroforestry [Cerezo, 2011; Autoridad del Canal de Panamá, 2012]. A comparison of Landsat data from 2003 to 2008 in the PCW shows that the forested area has increased by approximately 4% as the result of forest conservation, reforestation, agroforestry, and the abandonment of pasture and crop lands [ACP, 2010].
[16] In 2008, the Smithsonian Tropical Research Institute (STRI) began a landscape-scale study, the Agua Salud Project, focused on understanding the ecosystem services provided by forests and how these are affected by land cover and climate changes. The project is located in central Panama in a low-altitude, coastal, seasonal-tropical setting. The physiography of the study catchments is hilly with steep slopes (>20%). The experimental design includes different catchments and study plots established on a range of land uses and covers that are typical within the Panama Canal watershed and portions of the humid tropics [Stallard et al., 2010; Weber and Hall, 2009].
[17] This paper presents analyses of three and one-half years of data from the three catchments that serve as experimental controls for comparison with other Agua Salud Project study catchments, many of which are undergoing different reforestation treatments over the multidecadal duration of the project (Figure 1; Table 1). These three catchments have distinct land covers: old-secondary forest (FOR), a dynamic mosaic of young forest of various ages, pasture, and subsistence agriculture (MOS), and actively grazed pasture (PAS). The FOR and MOS catchments were first instrumented from 1980 to 1983 by the Panama Canal Commission (PCC) with the objective of determining the effects of deforestation on the water balance and peak flows [Panama Canal Commission (PCC), 1984]. However, instrument failures and vandalism prevented reliable long-term observations [PCC, 1984]. The FOR and MOS catchments were reinstrumented as part of the Panama Canal Watershed Monitoring Project from 1997 to 1999, when the sponge-effect was tentatively identified as occurring in this landscape [Kinner and Stallard, 1999; PMCC, 1999; Heckadon-Moreno et al., 1999]. Comparison of land cover maps from 1979 and 2012 (Figure 1) shows that significant changes have occurred in land use and land cover in the MOS catchment.

Methods
[18] To address the contradictory conclusions regarding the sponge-effect and the influence of land cover on runoff peaks, we designed an experiment, collected hydrological and hydrometeorological data, and compared the hydrologic behavior of the three control catchments with their different land covers and uses.

Paired Catchments
[19] Paired catchment studies are widely used to determine the magnitude of water-yield changes resulting from shifts in vegetation [Brown et al., 2005]. To understand the effects of vegetation and land use on hydrologic behavior through a paired-catchment analysis, it is important that the catchments chosen for comparison have similar area, slope, aspect, geology, meteorology, soil composition, and catchment morphology.

Description of Study Area

Study Catchments
[20] The 142.3 ha forest (FOR) catchment (Figure 1; Table 1) is a tributary to the Agua Salud River contained entirely within the Soberania National Park (SNP). Before becoming a national park, the land was largely protected from development as part of the former U.S. Canal Zone. Land cover in this catchment is dominated by old secondary and/or mature forest, hereafter referred to as old secondary forest. According to the 1979 land use/land cover map (Figure 1), about 20% of the forest in the catchment could be as young as 34 years in age, but most of the forest is likely at least 80 years old. There is very little human activity in the catchment and no grazing.
[21] The mosaic catchment (MOS) (Figure 1; Table 1) is adjacent to the FOR catchment, has a drainage area of 175.6 ha, and is also tributary to the Agua Salud River. This catchment is covered with a mix of young secondary forest (less than 15 years in age), older secondary forest (greater than 15 years in age), actively grazed cattle pasture, and small areas managed as subsistence farms. Much of the catchment is undergoing slow, natural reforestation. Although young secondary forest was establishing in large parts of the catchment at the time of acquisition (2008), cattle were still present in these areas at very low densities (<0.1 head per ha). Such grazing during reforestation, particularly if intense and unmanaged, can limit recovery of forest-related hydrologic benefits through continued soil compaction and ground cover removal [Ghimire et al., 2013]. We have visited this catchment on numerous occasions starting in 1996 and have always found the secondary forest to be virtually impenetrable by people or cattle, except on established trails. Regrowth on abandoned agricultural land has allowed the total area of secondary forest cover to increase since 1998. Analysis of aerial photography indicates that as of 2012, 51% of the MOS catchment was older (>15 year) secondary and mature forest, 30% was young (<15 year) secondary forest, and 19% was active cattle pasture.

[22] The active-cattle-pasture subcatchment (PAS) is nested within the MOS catchment, at its most upstream or most eastward end, and covers 35.9 ha (Figure 1 and Table 1). For several decades, the catchment has been maintained as a pasture with a one-month-on, one-month-off rotation with cattle at a grazing density of about 1.3 head of cattle per ha, when present. This relatively low intensity of grazing is typical for this region. Active grazing is ongoing at this site through an arrangement with the former land owner, and successional tree and shrub growth are manually cleared. A portion of this catchment is young secondary forest (15.5 ha) with sparse gallery forest near streams, fruit, and other trees [Weber and Hall, 2009].

Climate
[23] Panama lies between 7° and 10° north of the equator, and experiences strong wet and dry seasons [Espinosa, 1999; Callaghan and Bonell, 2005]. During the wet season, low pressure associated with the Intertropical Convergence Zone (ITCZ) forces convective thunderstorms that produce the high intensity rainfall that is characteristic of the humid tropics. On average, the wet season ends 20 December, with a standard deviation of 15 days. December is the month that shows the most interannual variation in rainfall. Murphy et al. [2013] found that both the tropical North Atlantic and equatorial Pacific oceans have significant effects on occasional extreme rainfall events that can impact the PCW in December. In central Panama, the El Niño and La Niña phases of the El Niño-Southern Oscillation (ENSO), which recur roughly every 5 years, often cause noticeably drier and wetter years, respectively. Rainfall data from Barro Colorado Island (BCI), which is approximately 11 km west of the Agua Salud study catchments, were used to estimate long-term rainfall statistics over the study area [STRI, 2013]. The average annual rainfall at BCI between 1960 and 2010 was 2700 mm y⁻¹.

Geology and Soils
[24] The elevation of the Agua Salud Project catchments (Figure 1) ranges from 52 to 302 m above mean sea level on a strongly dissected basalt plateau developed on the remnants of a Cretaceous island arc [Harmon, 2005a, 2005b; Stewart et al., 1980; Wörner et al., 2005; U.S. Geological Survey (USGS), 1997; Hassler et al., 2010]. The catchments are underlain by deeply weathered basaltic and andesitic parent rocks. Chemical analyses of waters from the Rio Agua Salud found alkalinity:silica ratios between 1.5 and 1.8 equivalents per mole (R.F. Stallard, 2007, unpublished data), which is consistent with igneous weathering [Stallard, 1995; Stallard and Murphy, 2012], and indicates that there are no significant carbonate rocks in the study catchments. Soils in the study catchments are Oxisols derived from in situ weathering of bedrock [PMCC, 1999; Turner and Engelbrecht, 2010]. Soils are up to 20 m in depth according to seismic surveys. Seismic data and our personal observations of road cuts, canal excavations, and quarries throughout the region show that soil thickness typically decreases from ridge to stream or floodplain.
[25] Oxisols characteristically have increasing clay content with depth that leads to a decrease of saturated hydraulic conductivity ($K_s$). Hassler et al. [2010] used 7 cm diameter cores and constant-head permeameters to measure $K_s$ of the Oxisols in the Agua Salud catchments. They report median $K_s$ in samples taken from 0 to 6 cm depth of 23 and 38 mm h⁻¹ in pasture and 5 year old secondary succession, respectively. Soil cracks observed in the dry season suggest some soil swelling when saturated [Hassler et al., 2010].
[26] On several occasions, we used irrigation equipment to test wet-season ponded infiltration in two former pastures that had cattle on them as recently as 2007 and were Based on the rate of water application and wetted area, we measured steady-state area-averaged infiltration into the soils at the scale of several m² between 600 and 1000 mm h⁻¹.

Catchment Morphology
[27] The stream networks in all the catchments were surveyed to establish general characteristics and provide information for morphologic analysis. In most first-order streams, flow originates at an approximately 1.5 m tall erosional head cut, with streamflow originating as seepage from the toe of the head cut during the wet season. Based on watershed characterization using 1 m LiDAR, the contributing area for these stream origins is about 1 ha. This area was used to characterize the drainage network using Rivertools [2003] (Table 2). Channels are hydraulically steep in the headwaters of study catchments, becoming much less steep downstream. The soil-bedrock interface is visible along many channel reaches. Groundwater seepage and significant pipe flow have been observed along channels immediately above this interface during rainfall. In the FOR catchment, channels are deeply incised with narrow interfluves and frequent exposed weathered bedrock. In the MOS catchment, which includes the PAS subcatchment, channel incisions are largely sediment filled with occasional weathered bedrock outcroppings.
[28] The FOR and MOS catchments are morphologically quite similar. The channel-network properties are closely matched, as are average hillslopes (Table 2). For an added level of comparison, we use the topographic index, defined as the natural logarithm of the ratio of the potential runoff-contributing area upslope of a unit length of slope contour divided by the tangent of the slope at that contour [Beven and Kirkby, 1979]. Catchments with the same topographic index distributions should be hydrologically similar if saturation-excess runoff generation is dominant in both [Beven et al., 1995; Ambroise et al., 1996]. The topographic-index distributions for the three catchments are shown in Figure 2, and they are indeed very similar.
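As a minimal sketch of the topographic index described above, the function below evaluates ln(a / tan β) for a single hillslope cell. The function name, argument names, and the two example cells are hypothetical illustrations and are not values from the study catchments.

```python
import math


def topographic_index(specific_area_m: float, slope_tangent: float) -> float:
    """Return ln(a / tan(beta)), the topographic index of Beven and Kirkby [1979].

    specific_area_m: upslope contributing area per unit contour length (m^2 per m).
    slope_tangent:   tan(beta), the local slope expressed as rise over run.
    """
    if specific_area_m <= 0 or slope_tangent <= 0:
        raise ValueError("area and slope must both be positive")
    return math.log(specific_area_m / slope_tangent)


# Hypothetical hillslope cells, not data from the study catchments.
steep_ridge_cell = topographic_index(specific_area_m=5.0, slope_tangent=0.40)
gentle_valley_cell = topographic_index(specific_area_m=500.0, slope_tangent=0.05)
```

Cells with large contributing area and gentle slope receive high index values and are the most likely to saturate, which is why matching index distributions suggest similar saturation-excess behavior.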
[29] The closely matched morphologic properties, geology, and soil composition, particularly of the FOR and MOS catchments (Table 2; Figure 2), imply that differences in stream response to rainfall must be attributed to soil structure and land cover rather than to channel morphology and hillslope form. All Agua Salud study catchments larger than 20 ha are perennial, and our larger study catchments are 150-170 ha. The channel profiles decrease in slope going downstream, and the mainstream and largest tributaries are perennial. Accordingly, this 150 ha scale would represent a suitable catchment size (a hydrologic-response unit) for upscaling to the larger Canal watershed, after appropriate linking and routing through the channel network.
[30] The PAS catchment is distinctly smaller and is less steep in terms of both overall hillslopes and channels. Upstream of the pasture weir, the valley is broader and not deeply incised, and unlike any of the remaining research catchments, there is little riparian forest (Figure 1).

Measurements and Derived Quantities

Rainfall
[31] Rainfall data were collected using a cluster of tipping-bucket (TB) rain gauges installed on the north edge of the MOS catchment, at a point located 0.68, 1.2, and 2.6 km from the PAS, MOS, and FOR weirs, respectively (Figure 1). Three rain gauges were used in a cluster with a mean separation distance of 1 m for quality control purposes [Ciach, 2003]. The study catchments were not bounded by rain measurement sites over the entire period of study, which prevented rainfall interpolation.
[32] Specifications of the rain gauge (Davis Instruments Inc., 0.01 inch bucket Rain Collector II with HOBO Pendant data loggers) state a random error of ±5%. Based on static calibration for every rain gauge in the field and a dynamic calibration using 12 simulated rain intensities between 30 and 230 mm/h in the lab, we estimated that the random error as installed is ±8%. Agreement between the three TB gauges in each cluster is typically very good, with differences smaller than the ±8% random error, when binned into 15 min periods. Disagreement among the three rain gauges is typically an under-measurement of rain rate in one, and occasionally two, gauges, associated with obvious plugging of the funnel orifice by bird droppings, leaves, and insects.
[33] Between 2009 and 2011, five additional TB rain gauge clusters were installed across the Agua Salud Project study sites. Intercluster distances varied from 1.7 to 4.6 km. Correlograms constructed using 15 min and daily rainfall totals were fit using an exponential function [Niedzialek, 2007]. This analysis yielded correlation lengths of 6.6 km and 30 km for 15 min and daily rainfall accumulations, respectively. The distances between our catchments are considerably less than the 15 min correlation length of 6.6 km, indicating that the use of rainfall data from a single rain gauge cluster with 15 min temporal aggregation is appropriate.
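To illustrate how a correlation length can be extracted from a correlogram, the sketch below fits ρ(d) = exp(−d / L) by least squares in log space. This single-parameter form is an assumption; the exact functional form used by Niedzialek [2007] is not given here. The correlation values are synthetic, generated from the reported 6.6 km length, and the intermediate distances are hypothetical.

```python
import math


def fit_correlation_length(distances_km, correlations):
    """Fit rho(d) = exp(-d / L) through the origin in log space and return L (km)."""
    numerator = sum(d * math.log(r) for d, r in zip(distances_km, correlations))
    denominator = sum(d * d for d in distances_km)
    slope = numerator / denominator  # estimate of -1 / L
    return -1.0 / slope


# Synthetic correlogram: distances span the reported 1.7-4.6 km range,
# correlations are generated from L = 6.6 km rather than measured.
distances = [1.7, 2.5, 3.3, 4.6]
synthetic_rho = [math.exp(-d / 6.6) for d in distances]
length_km = fit_correlation_length(distances, synthetic_rho)
```

With real gauge-pair correlations the points scatter around the curve, and a nugget term may be needed if correlation at near-zero separation is below one.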

Streamflow
[34] V-notch weirs were used to measure streamflow at the outlet of each catchment. The FOR and MOS weirs were constructed in 1979. The PAS weir was constructed in December 2008, and instrumented in January 2009. All weirs have a two-stage design as shown in Figure 3. Upstream is a concrete short-crested 120° or 140° V-notch weir for measuring high flows. Outflow from this weir spills into a concrete box that has an offset, 90°, sharp-crested V-notch weir for accurate low-flow measurements. Weirs were constructed on bedrock outcrops to minimize underflow. The maximum capacity of the sharp-crested weir was approximately 0.03 m³ s⁻¹. Whenever this discharge was exceeded, such as during much of the wet season in the FOR and MOS catchments, the short-crested concrete weirs were used to measure discharges.
[35] Both the high-flow and low-flow weirs were instrumented with nonvented pressure transducers (In-Situ Level TROLL, model 300) that measure the combined pressure variations due to changes in the water depth and barometric pressure at 5 min intervals. These pressure transducers have a published measurement error of ±0.1%. Each time the transducers were downloaded in the field, the water depth behind the weir was noted for use in data processing. Floating debris catchers were installed to prevent clogging of the weirs at low flows.
[36] Calculation of flow rates required barometric-pressure correction and conversion of pressure to stage above the weir invert. A time-series editor was used to make measured water levels consistent with field observations, and to correct for transients caused by debris and sensor drift.
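The barometric correction can be sketched as a hydrostatic conversion: a nonvented transducer reads absolute pressure, so subtracting barometric pressure and dividing by ρg gives water depth above the sensor. The water density, the sensor offset below the weir invert, and the example pressures are all assumptions for illustration, not values from the study.

```python
RHO_WATER = 1000.0  # kg m^-3, assumed fresh water density
GRAVITY = 9.81      # m s^-2


def stage_above_invert_m(p_abs_kpa, p_baro_kpa, sensor_below_invert_m):
    """Convert absolute and barometric pressure (kPa) to stage above the weir invert (m)."""
    water_pressure_pa = (p_abs_kpa - p_baro_kpa) * 1000.0
    depth_above_sensor_m = water_pressure_pa / (RHO_WATER * GRAVITY)
    return depth_above_sensor_m - sensor_below_invert_m


# Hypothetical reading: sensor mounted 0.20 m below the notch invert.
stage_m = stage_above_invert_m(p_abs_kpa=104.262, p_baro_kpa=101.325,
                               sensor_below_invert_m=0.20)
```

In practice the manual depth readings noted at each download would be used to check and adjust the sensor offset.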
[42] A constant $C_d = 0.59$ was used for the 90° sharp-crested V-notch weirs. For the 120° and 140° V-notch concrete weirs, discharge coefficients determined using 1:2 and 3:16 scale physical hydraulic models were used. Errors in $C_d$ values, determined in the laboratory, were less than 2% [Creel, 2013]. Combining the standard error of the pressure-transducer measurements with the standard error in $C_d$ yielded a total standard error for discharge measurements of approximately ±2%. Because the three catchments have different areas, runoff rate (also referred to as unit discharge), $R$, is calculated from $R = Q/A$, where $Q$ is discharge and $A$ is catchment area.
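The conversion from stage to discharge to runoff rate can be sketched as follows. The rating uses the textbook sharp-crested V-notch relation Q = (8/15) C_d √(2g) tan(θ/2) H^(5/2), which is assumed here because the paper's own weir equations are not reproduced in this text. The 0.20 m head is a hypothetical value; C_d = 0.59 and the 142.3 ha FOR area come from the text.

```python
import math

GRAVITY = 9.81  # m s^-2


def v_notch_discharge(head_m, notch_angle_deg, c_d):
    """Discharge (m^3 s^-1) from Q = (8/15) C_d sqrt(2 g) tan(theta / 2) H^(5/2)."""
    half_angle = math.radians(notch_angle_deg) / 2.0
    return ((8.0 / 15.0) * c_d * math.sqrt(2.0 * GRAVITY)
            * math.tan(half_angle) * head_m ** 2.5)


def runoff_rate_mm_per_h(discharge_m3_s, area_ha):
    """Runoff rate R = Q / A, converted from m^3 s^-1 over hectares to mm h^-1."""
    area_m2 = area_ha * 1.0e4
    return discharge_m3_s / area_m2 * 1000.0 * 3600.0


# Hypothetical 0.20 m head on the 90 degree low-flow weir of the FOR catchment.
q_low = v_notch_discharge(head_m=0.20, notch_angle_deg=90.0, c_d=0.59)
r_for = runoff_rate_mm_per_h(q_low, area_ha=142.3)
```

Under these assumptions a 0.20 m head gives roughly 0.025 m³ s⁻¹, just below the stated 0.03 m³ s⁻¹ capacity of the sharp-crested weir, and dividing by area makes flows from catchments of different size directly comparable.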

Rainfall-Runoff Analysis
[43] Based on our experience during rain storms in the Agua Salud Project field sites and other studies in the humid tropics, significant events were assumed to be those with a total rainfall volume greater than 3.0 mm and/or a duration longer than 2 h [Waterloo et al., 2007]. During rainy periods, a rainfall event was assumed to be separated from other rainfall events by at least 3 nonrainy hours. These gaps were identified through analysis of 15 min rainfall data. If the gap was less than 3 h, rainy periods before and after were combined into one event, and the individual rainy periods are referred to as pulses. Accordingly, longer rainfall events often included short nonrainy periods. The start of significant rain events and the end of direct runoff were then used to demarcate the beginning and end of runoff events.
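The event rules above can be sketched as a small routine over a 15 min rainfall series: wet pulses separated by fewer than 3 dry hours are merged into one event, and an event counts as significant when it exceeds 3.0 mm in total or 2 h in duration. The function names and the example series are hypothetical, and measuring duration from the first to the last wet interval is an assumption about how the rule was applied.

```python
def split_events(rain_mm, step_h=0.25, min_gap_h=3.0):
    """Return (first_index, last_index) pairs, merging pulses closer than min_gap_h."""
    gap_steps = int(round(min_gap_h / step_h))
    events = []
    start = None
    last_wet = None
    for i, depth in enumerate(rain_mm):
        if depth <= 0:
            continue
        if start is None:
            start = i
        elif i - last_wet - 1 >= gap_steps:
            # Dry gap of at least min_gap_h: close the previous event.
            events.append((start, last_wet))
            start = i
        last_wet = i
    if start is not None:
        events.append((start, last_wet))
    return events


def is_significant(rain_mm, event, step_h=0.25, min_total_mm=3.0, min_duration_h=2.0):
    """Apply the total-rainfall and/or duration thresholds to one event."""
    first, last = event
    total_mm = sum(rain_mm[first:last + 1])
    duration_h = (last - first + 1) * step_h
    return total_mm > min_total_mm or duration_h > min_duration_h


# Hypothetical 15 min series: two pulses 1 h apart, then a small shower 3 h later.
rain = [0.0] * 4 + [2.0, 1.5] + [0.0] * 4 + [1.0] + [0.0] * 12 + [0.5] + [0.0] * 3
events = split_events(rain)
significant = [e for e in events if is_significant(rain, e)]
```

Here the first two pulses merge into a single 4.5 mm event that passes the threshold, while the isolated 0.5 mm shower forms its own event and is discarded.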

Recession Analysis
[44] Base flow recession analysis, based on linear-reservoir theory [Chapman, 1999], underpins our estimation of groundwater storage. Most of our recessions are of short duration, only a few days. Chapman [1999] notes that the simplest recession model, and the only model that is statistically valid for recessions of less than 10 days, is the log-linear recession of linear-reservoir theory,
$$R_t = R_0 k^t = R_0 e^{-t/\tau},$$
where
[45] $R_t$ = runoff rate (mm h⁻¹) at time $t$ after the start of recession,
[46] $R_0$ = runoff rate (mm h⁻¹) at the beginning of recession,
[47] $\tau$ = turnover time of groundwater storage in days, and
[48] $k$ = recession constant.
[49] Effective groundwater storage $G_t$ is given by [Chapman, 1999]:
$$G_t = \tau R_t = -\frac{R_t}{\ln k},$$
with $R_t$ expressed per day so that its time units match those of $\tau$.
[50] We used two regression methodologies described by Vogel and Kroll [1996]: integrated-moving-average (IMA) and simple autoregressive (AR(1)). The first three days of data from an identified recession period were excluded to avoid the influence of water moving through rapid flow paths. In addition, we only considered flow rates that account for less than 10% of total annual runoff, restricting our analysis to mostly dry-season recessions. Rivera-Ramírez et al. [2002] and Murphy and Stallard [2012] applied these methods in eastern Puerto Rico. This approach may introduce some bias to the recession characterization, as base flow recession can vary significantly across seasons depending on antecedent conditions [Tallaksen, 1995].
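As a rough illustration of the recession analysis, the sketch below fits the log-linear recession by ordinary least squares on ln R versus time. Ordinary least squares is a simplified stand-in, not the IMA or AR(1) estimators of Vogel and Kroll [1996]. Storage is then taken as the linear-reservoir product of turnover time and runoff rate. The recession series is synthetic, with an assumed τ of 40 days.

```python
import math


def fit_recession(t_days, runoff_mm_h):
    """Least-squares fit of ln R_t = ln R_0 - t / tau; returns (R_0, tau in days, k)."""
    n = len(t_days)
    log_r = [math.log(r) for r in runoff_mm_h]
    t_mean = sum(t_days) / n
    y_mean = sum(log_r) / n
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in zip(t_days, log_r))
    s_tt = sum((t - t_mean) ** 2 for t in t_days)
    slope = s_ty / s_tt                      # equals -1 / tau
    r_0 = math.exp(y_mean - slope * t_mean)  # intercept at t = 0
    return r_0, -1.0 / slope, math.exp(slope)


def storage_mm(runoff_mm_h, tau_days):
    """Linear-reservoir storage G = tau * R, converting tau from days to hours."""
    return tau_days * 24.0 * runoff_mm_h


# Synthetic daily recession (R_0 = 0.05 mm h^-1, tau = 40 d), not observed data.
days = list(range(15))
series = [0.05 * math.exp(-d / 40.0) for d in days]
kept_days, kept_series = days[3:], series[3:]  # drop the first three days
r_0, tau_days, k = fit_recession(kept_days, kept_series)
g_end_mm = storage_mm(kept_series[-1], tau_days)
```

On noise-free synthetic data the fit recovers the assumed parameters exactly; with field data the choice of estimator matters, which is the motivation for the IMA and AR(1) methods named in the text.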

Base Flow Separation
[51] Stream discharges (m$^3$ s$^{-1}$), measured at 5 min intervals, were converted to $R$ in units of mm per 15 min to align with the 15 min rainfall data. Hydrograph separation used a two-parameter digital base-flow filter for estimation of base flow and direct runoff [Boughton, 1993]:

$$R_b(i) = \frac{K}{1+C} R_b(i-1) + \frac{C}{1+C} R(i), \qquad R_b(i) \le R(i) \qquad (4)$$

where $R_b(i)$ is the base flow and $R(i)$ is the total runoff at time step $i$,
[55] $K$ = filter parameter given by the recession constant, and
[56] $C$ = subjectively determined filter parameter.
[57] After base flow separation, direct runoff $R_d$ is calculated as $R_d = R - R_b$, and runoff efficiencies for each storm, $E_R$, are calculated as the ratio of total direct runoff to rainfall: $E_R = R_d / P$. Peak runoff rates, $R_p$ (mm h$^{-1}$), were determined from the original 5 min runoff data.
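The separation and efficiency calculations can be sketched as follows. The recursion is the commonly cited form of the Boughton two-parameter filter, $R_b(i) = \frac{K}{1+C} R_b(i-1) + \frac{C}{1+C} R(i)$ with $R_b(i) \le R(i)$. Initializing base flow to the first runoff value is an assumption of this sketch, and the function names are hypothetical.

```python
def boughton_filter(runoff, k, c):
    """Two-parameter digital base-flow filter; base flow is capped at total runoff."""
    base = [runoff[0]]  # assumption: the series starts under base-flow conditions
    for r in runoff[1:]:
        rb = k / (1.0 + c) * base[-1] + c / (1.0 + c) * r
        base.append(min(rb, r))
    return base

def runoff_efficiency(runoff, base, storm_rain_mm):
    """E_R = total direct runoff (R - R_b) divided by storm rainfall P."""
    direct = sum(r - b for r, b in zip(runoff, base))
    return direct / storm_rain_mm
```

Runoff and rainfall must share the same depth units (here mm per time step, summed over the event) for the ratio to be dimensionless.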

Average Runoff Efficiencies
[58] Following approaches normally used to calculate rainfall loading [Stallard, 2012], average runoff efficiencies, $\bar{E}_R$, for multiple storms were calculated using a precipitation-weighted equation:

$$\bar{E}_R = \frac{\sum_i P_i E_{R,i}}{\sum_i P_i} = \frac{\sum_i R_{d,i}}{\sum_i P_i} \qquad (5)$$

where subscript $i$ refers to the $P$, $R_d$, and $E_R$ of each storm being averaged for which we have data from all three catchments.
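The effect of precipitation weighting can be seen in a small sketch. The storm values in the usage note are invented for illustration and the function names are hypothetical.

```python
def weighted_efficiency(rain_mm, direct_runoff_mm):
    """Precipitation-weighted mean efficiency: sum(P_i * E_i) / sum(P_i) = sum(R_d,i) / sum(P_i)."""
    return sum(direct_runoff_mm) / sum(rain_mm)

def simple_mean_efficiency(rain_mm, direct_runoff_mm):
    """Unweighted mean of the per-storm efficiencies, shown for comparison only."""
    ratios = [rd / p for p, rd in zip(rain_mm, direct_runoff_mm)]
    return sum(ratios) / len(ratios)
```

For two hypothetical storms of 10 mm and 100 mm producing 0.5 mm and 30 mm of direct runoff, the weighted value is 30.5/110, whereas the unweighted mean of the two ratios (0.05 and 0.30) is 0.175; the weighted form lets large storms dominate in proportion to their rainfall.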

Flashiness Indices
[59] A stream that rises and falls quickly is flashy [Richards, 1990]. Flashiness characterizes the rate of stream response to rainfall and indicates the role of quick flow paths in streamflow generation. The flashiness indices [Richards, 1990] were derived from runoff-duration curves. The probability of exceedance was calculated using the Weibull plotting position formula:

$$p = 100 \, \frac{r}{n+1} \qquad (6)$$

[60] In equation (6), $p$ is the percentage of time that a given daily runoff is equaled or exceeded, $r$ is the rank of the daily runoff values (1 for the largest), and $n$ is the total number of daily runoff observations. After Richards [1990], we computed flashiness indices $F_{10/90}$ ($F_{10/90} = p_{10}/p_{90}$), $F_{20/80}$, and $F_{25/75}$ using runoff rates corresponding to the 10, 20, 25, 75, 80, and 90% runoff-exceedance probabilities from the runoff-duration curves.
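A runoff-duration curve and a flashiness index of this kind can be sketched as follows, assuming the Weibull plotting position $p = 100\,r/(n+1)$ with rank 1 assigned to the largest daily runoff. Linear interpolation between plotting positions is an assumption of this sketch, and the function names are hypothetical.

```python
def exceedance_curve(daily_runoff):
    """Return (p, runoff) pairs, p = 100 * r / (n + 1), sorted from largest to smallest runoff."""
    ordered = sorted(daily_runoff, reverse=True)
    n = len(ordered)
    return [(100.0 * r / (n + 1), q) for r, q in enumerate(ordered, start=1)]

def runoff_at_exceedance(curve, p_target):
    """Linearly interpolate the runoff rate at exceedance percentage p_target."""
    for (p0, q0), (p1, q1) in zip(curve, curve[1:]):
        if p0 <= p_target <= p1:
            w = (p_target - p0) / (p1 - p0)
            return q0 + w * (q1 - q0)
    raise ValueError("p_target lies outside the plotted range")

def flashiness(daily_runoff, high=10, low=90):
    """Ratio of the runoff exceeded high% of the time to that exceeded low% of the time."""
    curve = exceedance_curve(daily_runoff)
    return runoff_at_exceedance(curve, high) / runoff_at_exceedance(curve, low)
```

Calling `flashiness(series, 20, 80)` or `flashiness(series, 25, 75)` gives the other two index variants.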

Meteorology and Micrometeorology
[61] A surface energy balance (SURBAL) station was installed in the PAS catchment (Figure 1). The instrumentation included a two-channel Kipp & Zonen model CNR2 net radiometer, an Apogee model SI-111 ground-pointing infrared radiometer, a Vaisala model HMP-50 air temperature and relative humidity sensor located at 2.0 m above the ground, and two Windsonic 2-D ultrasonic wind sensors located at 0.5 m and 2.0 m heights above ground. Data were recorded using a Campbell Scientific model CR-1000 data logger at 30 min intervals. This station was adjacent to the Agua Salud meteorological station operated by the ACP.
[62] The eddy-covariance (EC) system used by Niedzialek and Ogden [2012] at Cerro Pelado, about 10 km east-southeast of the FOR catchment, was used to measure ET over old secondary forest. Land cover at Cerro Pelado is old secondary forest, similar to much of the FOR catchment, with a canopy height of approximately 27 m. The eddy-covariance system, installed at a height of 33 m, consisted of a Campbell Scientific CSAT3 3-D sonic anemometer, a LI-COR LI-7500 open-path H$_2$O/CO$_2$ infrared gas analyzer, and a fine-wire thermocouple. Radiation was measured using a Kipp & Zonen CNR1 four-channel net radiometer. Data were recorded using a Campbell Scientific CR-5000 data logger. During rain, the EC system does not function because the infrared gas analyzer (IRGA) windows are obscured by raindrops. To help mitigate this effect, we installed the IRGA on the tower inverted and oriented 10° off vertical so that it would shed raindrops. Shedding was verified by comparing tower rainfall data with data flags showing a functional IRGA signal, tested every 30 min after the rain stopped. The 3-D sonic anemometer included steel-mesh wicks near the ultrasonic transducers to keep them dry so that measurements would recommence shortly after rain stopped.

Evapotranspiration Estimation

Old Secondary Forest
[63] The Priestley and Taylor [1972] ET equation compares well with eddy-covariance ET measurements after calibration [Niedzialek and Ogden, 2012]. Accordingly, the Priestley-Taylor (P-T) method was used as our primary method for estimating ET in areas covered by old secondary and mature forest, because the Cerro Pelado EC system was not operated throughout the study period due to solar-power limitations caused by wet-season cloudiness. Furthermore, the P-T method requires only measurements of air temperature and pressure, net radiation, and soil heat flux. Soil heat flux was assumed negligible in old secondary and mature forest because less than 5% of radiation reaches ground level [Bruijnzeel, 1990; Schellekens et al., 2000].
[72] A 21 day period of high-quality EC data collected during the wet season from 4 to 25 October 2011 was used to calibrate the Priestley-Taylor (P-T) $\alpha$ parameter. Calibration of $\alpha$ was achieved by comparing the PET, estimated using the P-T method, to quality-controlled, 30 min EC data. Nighttime EC data were not considered because of consistently low vapor pressure deficits and little wind at night. The remaining data were quality controlled using published friction velocity ($u_*$) thresholds [Detto et al., 2008] and instrument flags. Of the 1056 total data points collected during the 21 day calibration period, the 398 half-hourly values that remained after quality control were used to calibrate the $\alpha$ parameter. These values included periods after rainfall. Hence, the EC system measured both evaporation of intercepted water and transpiration. Therefore, the calibrated $\alpha$ parameter accounts for both of these components.
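For orientation, a generic Priestley-Taylor calculation is sketched below. It uses the standard form $ET = \alpha \frac{\Delta}{\Delta + \gamma} \frac{R_n - G}{\lambda}$ with FAO-56 style expressions for the slope of the saturation vapor pressure curve ($\Delta$), the psychrometric constant ($\gamma$), and a fixed latent heat of vaporization ($\lambda$ = 2.45 MJ kg$^{-1}$). These auxiliary expressions, the daily time step, and the function name are assumptions of this sketch and may differ from the formulation used in this study.

```python
import math

def priestley_taylor_et(rn_mj_m2_day, t_air_c, pressure_kpa, alpha=1.26, g_mj_m2_day=0.0):
    """Daily ET (mm/d) from net radiation, air temperature, and air pressure."""
    # Saturation vapor pressure (kPa) and its slope (kPa/degC), FAO-56 style.
    es = 0.6108 * math.exp(17.27 * t_air_c / (t_air_c + 237.3))
    delta = 4098.0 * es / (t_air_c + 237.3) ** 2
    # Psychrometric constant (kPa/degC) from air pressure.
    gamma = 0.000665 * pressure_kpa
    # Latent heat of vaporization (MJ/kg); 1 kg of water per m2 equals 1 mm depth.
    lam = 2.45
    return alpha * delta / (delta + gamma) * (rn_mj_m2_day - g_mj_m2_day) / lam
```

Because the expression is linear in $\alpha$, calibrating $\alpha$ against EC data amounts to rescaling the uncalibrated estimate; setting `g_mj_m2_day=0.0` mirrors the negligible soil heat flux assumed under closed forest.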

Young Secondary Forest and Grass
[73] The Penman-Monteith equation is widely used to estimate potential evapotranspiration (PET), but it requires detailed understanding of the canopy properties and structure as these affect canopy and aerodynamic resistances. The large variability of tree architecture, canopy structure, and land cover across the MOS catchment makes estimation of these resistances difficult. In areas covered by young secondary succession and grass, we used atmometer measurements of PET [Altenhofen, 1992; Fontaine and Todd, 1993]. The atmometers used in this study were manufactured by the ETGage Company, Models A and E, with grass reference ($ET_o$) covers. Multiyear comparisons of atmometers versus weighing lysimeters and evaporation pans indicate that they accurately measure PET [Wilcox, 1963; Broner, 1990]. However, one limitation of atmometers is that they do not operate when the atmometer membrane is wet, that is, during rainfall and afterward until the rain water on the membrane evaporates [Irmak et al., 2005].
Accordingly, we assume that the atmometers did not measure evaporation of intercepted rainfall in young secondary forest and pasture land covers ($I^*$) because the time to evaporate water ponded on the atmometer membrane was unknown. Instead, we assumed that this time is approximately the same as the time required to evaporate intercepted rainfall from vegetation. The implications of this assumption are discussed in the water-balance section.

Throughfall (TF) and Stemflow (SF) Estimation
[74] The land-cover data shown in Figure 1 were used to estimate catchment-average values of throughfall (TF) and stemflow (SF) based on average values derived from other studies in Panama. We assumed that all intercepted water was evaporated to satisfy PET demand, while stemflow was not. Total interception was calculated using:

$$I = P_g \, (1 - TF - SF)$$

where
[75] $I$ = canopy-intercepted rainfall (mm),
[76] $P_g$ = gross rainfall above the canopy (mm),
[77] $TF$ = fraction of gross rainfall (0-1) that passes to the soil surface as throughfall, and
[78] $SF$ = fraction of gross rainfall (0-1) that becomes stemflow.
[79] Approximately 20% of the land surface in the PAS catchment has been essentially denuded by cattle. These areas have very little vegetation and consist of cattle trails and places where cattle loiter beneath trees and near water. We refer to this as degraded pasture, and assumed that in these areas SF + TF = 100% [Lilienfein and Wilcke, 2004]. In young secondary forest and nondegraded pasture, TF and SF are highly variable because of the variance in species crown morphology and age. We used values from Park et al. [2009], collected in a plantation monoculture of 5 year old native tree species, who found TF + SF = 95% in that land use. For old secondary forest, we used SF + TF = 84%, as measured by Niedzialek and Ogden [2012] in an old secondary forest at Cerro Pelado, Panama, using two trough systems with a sampling area of 1.86 m$^2$ each. Stemflow in the mature forest was assumed to equal 2% of rainfall [Cavelier et al., 1997; Hölscher et al., 2004].
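The area-weighting of throughfall plus stemflow can be illustrated with a brief sketch. It assumes interception is the part of gross rainfall that is neither throughfall nor stemflow, $I = P_g (1 - TF - SF)$. The TF + SF values are those quoted above; the 80% nondegraded share used in the usage note is a simplifying assumption for illustration, and the function and key names are hypothetical.

```python
def catchment_tf_sf(cover_fractions, tf_sf_by_cover):
    """Area-weighted TF + SF fraction: sum of cover fraction * assigned value."""
    return sum(frac * tf_sf_by_cover[name] for name, frac in cover_fractions.items())

def interception_mm(gross_rain_mm, tf_plus_sf):
    """Interception depth: gross rainfall not reaching the ground as TF or SF."""
    return gross_rain_mm * (1.0 - tf_plus_sf)

# TF + SF fractions quoted in the text for each land cover.
TF_SF = {
    "degraded_pasture": 1.00,
    "young_secondary_or_pasture": 0.95,
    "old_secondary_forest": 0.84,
}
```

For a pasture with 20% degraded and 80% nondegraded cover, `catchment_tf_sf({"degraded_pasture": 0.2, "young_secondary_or_pasture": 0.8}, TF_SF)` evaluates to 0.96, so about 4% of gross rainfall would be treated as interception.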

Water Balance
[80] The water balance for each catchment used measurements of rainfall and runoff together with estimates of interception in young secondary succession, evapotranspiration, and changes in groundwater storage over a given time period using:
[88] Water-balance periods include a 21 day calibration period (4-25 October 2011) and the entire 2009-2010 water year.
[89] We hypothesized that the amount of rainfall during a wet season determines antecedent groundwater conditions for the following dry season and the potential for dry-season benefits associated with the sponge-effect. For this reason, we based our analysis on water years, which we defined as the start of a wet season to the end of the following dry season [Espinosa, 1999]. The Meteorological and Hydrological Branch of the ACP defines the start of the wet and dry seasons by tracking 11 variables and then making a subjective decision based on the performance of these variables, and their prior experience with weather patterns in the Panama Canal area [STRI, 2013]. Year-to-year changes in soil-moisture storage were assumed negligible because runoff rates at the ends of each water year are low.
[90] Evaporation of interception by young secondary succession and grasses ($I^*$) was not measured by the atmometers. For this reason, $I^*$ was added to the atmometer measurements to estimate total ET for these land covers under the assumption that all intercepted precipitation minus SF was evaporated. This explains why $I^*$ appears in equation (9) while the interception by mature and old secondary forest, $I$, does not: the latter is part of the calibrated P-T $\alpha$. Catchment-average values of SF + TF were calculated for young secondary forest and pasture by summing the products of fractional land cover and the assigned (TF + SF) parameter values.
[91] The water-balance-residual term $L$ from equation (9), normalized by total precipitation, allows expression of the final water-balance residual as a percentage. If the absolute value of the residual expressed as a percentage falls outside the standard error bounds of the measurements, then either catchment leakiness or presence of inter-basin groundwater flow is indicated [Schellekens et al., 2000; Waterloo et al., 2007; Muñoz-Villers et al., 2011]. Standard error bounds were calculated as the weighted summation of the estimated standard errors of each water balance component after Muñoz-Villers et al. [2011]. The change in groundwater storage, $\Delta G$, was estimated as the difference of the average effective groundwater storage of the ten successive days before the beginning and end of the water balance periods [Waterloo et al., 2007; Muñoz-Villers et al., 2011].
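The normalization of the residual can be sketched as below. The balance is assumed here to take the form $L = P - R - ET - I^* - \Delta G$; this sign convention and grouping of terms are assumptions of the sketch rather than a reproduction of equation (9), and the function names and example numbers are hypothetical.

```python
def residual_percent(p_mm, r_mm, et_mm, i_star_mm=0.0, delta_g_mm=0.0):
    """Water-balance residual L as a percentage of precipitation (assumed form)."""
    residual = p_mm - r_mm - et_mm - i_star_mm - delta_g_mm
    return 100.0 * residual / p_mm

def leakage_indicated(residual_pct, error_bound_pct):
    """True when |residual| exceeds the combined measurement error bound."""
    return abs(residual_pct) > error_bound_pct
```

For instance, hypothetical inputs of 3000 mm rainfall, 1200 mm runoff, 1400 mm ET, and a 10 mm storage gain give a residual of 13%, which would not indicate leakage against a 25% error bound.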

Rainfall
[92] The total water-year rainfall from 2009 through 2012 at Agua Salud is listed in Table 3 together with the number of significant events with total rainfall greater than 3 mm. Table 3 also presents a classification of each water year according to the average total rainfall given by the historical data at Barro Colorado Island. Analysis of rainfall events over the 3 year study period showed that 70% of events were less than 7 h in duration, with a median duration of approximately 3.3 h. The mean and median total rainfall per event for the three water years combined are 26.0 and 29.8 mm, respectively. Of the rainfall events recorded, 24% had more than 50 mm of total rainfall. All events that exceeded 7 h duration consisted of multiple rainfall pulses. The longest storm for the three water years lasted 35 h and 30 min between 7 and 9 December 2010 and is coincident with the flood of record in the PCW [Shamir et al., 2013]. In that event, there were seven pulses of rain. Only 14% (65 of the 461) of the significant events recorded over the three water years fell during the dry seasons.

Catchment Runoff
[93] Figure 4 depicts daily runoff and rainfall over the analyzed period of record, with wet seasons identified by gray background. During the wet season, the MOS and PAS catchments on average produce more runoff during rainy periods than the FOR catchment. In the dry season, when the runoff from the MOS catchment is less than 0.8 mm d$^{-1}$, the runoff from the FOR catchment is greater than that from the MOS catchment 80% of the time. During dry-season rain events, the PAS catchment produces more runoff than both the FOR and MOS catchments. However, the sum of the dry-season runoff between 1 January and 1 May for the three water years analyzed was 344 mm from the FOR catchment, 297 mm from the MOS catchment, and 301 mm from the PAS catchment.
[94] Differences in the behavior of the three catchments are most apparent at the level of individual runoff events. Figure 4 shows that for most rainfall events, peak runoff from the FOR catchment was considerably less than that from the PAS and MOS catchments. From August through the end of the wet season, the base flow from the MOS catchment is greater than that from the FOR and PAS catchments. However, beginning about mid-way through the dry season, base flow from the MOS catchment recedes at a faster rate than in both the PAS and FOR catchments (Figure 4). The comparison between the MOS and FOR catchments is most appropriate, given that they are much closer in size, and that the PAS catchment is contained within the MOS catchment. During dry season rainfall events, the PAS catchment produces more runoff, as shown in the dry season of 2010, which had a significant number of rainfall events. However, during extended nonrainy periods such as those shown in the dry seasons of 2011 and 2012, the PAS catchment runoff falls well below that for the other two catchments.
[95] Results of recession analyses in Table 4 show that the three catchments behave differently. Recession constants from the integrated-moving-average (IMA) analysis are −1.3, −2.4, and −2.2% per day for the FOR, MOS, and PAS catchments, respectively. Within minor differences, the results from the AR(1) recession analysis were the same. In the FOR catchment, the 10th percentile runoff rate ($R_{10}$) is about half that of the MOS and PAS catchments, while the amounts of water stored in the FOR and MOS catchments are about the same at the end of the wet season. However, in the FOR catchment, base flow is released more slowly because its recession constant is smaller in magnitude. By the time the runoff rate corresponding to 1% of the annual volume ($R_1$) is reached, runoff rates in all catchments are similar.

Table note: The applied empirical base-flow filter parameters [Boughton, 1993] used in equation (4) are given in the last two rows.

Figure 5. Runoff-duration curves for the study catchments over the three water years examined in this study. Orange line denotes percentage increase in FOR runoff relative to MOS when $R_{FOR} > R_{MOS}$. Violet line denotes percentage increase in FOR flow relative to PAS when $R_{FOR} > R_{PAS}$.
[96] Results based on hydrograph separations performed using the Boughton [1993] two-parameter digital filter, shown in Table 5, indicate that annual runoff is dominated by base flow in all three catchments. However, the PAS catchment consistently produces more direct runoff and a smaller percentage of the annual runoff as base flow compared to both the MOS and FOR catchments.
[97] Runoff-duration curves (RDC) for the study catchments, derived from 3 years of daily runoff (1101 values) from 27 June 2009 through 2 July 2012 using equation (6), are shown in Figure 5. Runoff from the MOS catchment exceeds that from the PAS and FOR catchments for the highest 45% of flows. The PAS catchment produces the highest flows between 45% and 80% exceedance probability. The outflows from the FOR catchment exceed those from the PAS and MOS catchments by approximately 0.1 mm d$^{-1}$, or about 10% in relative terms, for exceedance probabilities between about 0.8 and 0.9. For flow exceedance probabilities greater than 0.92, the PAS and FOR catchments produce the same flows, which are about 0.1-0.15 mm d$^{-1}$ greater than those of the MOS catchment. The PAS catchment, while a subcatchment of the MOS, is highly sensitive to dry-season rainfall events, as shown in Figure 4. Using the Richards [1990] classification of RDC, based on the flashiness indices listed in Table 6, all three catchments would be called "event responsive." The flashiness indices show that the MOS catchment is most flashy, followed by the PAS and FOR catchments.

Evapotranspiration
[98] Over the 21 day calibration period, 4-25 October 2011, the difference between calibrated P-T and eddy-covariance measurements was −0.1%. The average daily evapotranspiration rate was 3.16 mm d$^{-1}$ during the calibration period. This value is very similar to those reported by Wang and Georgakakos [2007]. With reference to equation (1), the calibrated Priestley-Taylor $\alpha$ value was 0.79, which is considerably below the commonly assumed value of 1.26. However, both Vourlitis et al. [2002] and Kumagai [2005] found P-T $\alpha$ values as low as 0.6 in the tropical rain forests of Brazil and Malaysia, respectively. In the tropics low vapor pressure deficit and energy limitations due to

Water Balance
[99] A water balance is presented in Table 7, both for the 21 day ET calibration period and for the entire 2009-2010 water year, which is shown because it is closer to average in terms of rainfall. Water balance residual errors reported in Table 7 for the FOR, MOS, and PAS catchments were 8, −7, and −8%, respectively, during the ET calibration period. For the entire 2009-2010 water year, the water balance residual errors were 13, −18, and 12%, respectively. The rainfall measurement error is ±8% and the flow measurement error is ±2%. The error in PET estimation is near ±2% during the calibration period, and may be as large as ±5% during the 2009-2010 water year. Assuming that the direct-runoff estimation error associated with base-flow separation is on the order of ±10%, and that the errors are additive, then the estimated total direct-runoff error is approximately ±22% during the calibration period and ±25% during the 2009-2010 water year. The magnitudes of the water balance residuals are below this magnitude in each of the three catchments, indicating that substantial leakage of deep groundwater was not likely.
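The additive error budget described above can be checked with a few lines of arithmetic. The component values are those stated in the text (rainfall ±8%, flow ±2%, PET ±2% or ±5%, base-flow separation ±10%), and the residuals are read as 8, −7, −8% and 13, −18, 12%; the variable and function names are hypothetical.

```python
def additive_error(*components_pct):
    """Worst-case error bound when component errors are assumed to add linearly."""
    return sum(components_pct)

# Rainfall, flow, PET, and base-flow-separation errors, in percent.
calibration_bound = additive_error(8, 2, 2, 10)
water_year_bound = additive_error(8, 2, 5, 10)

# Residuals for the FOR, MOS, and PAS catchments, in percent.
residuals = {
    "calibration": (8, -7, -8),
    "water_year": (13, -18, 12),
}
bounds = {"calibration": calibration_bound, "water_year": water_year_bound}

# A residual inside the bound gives no indication of substantial leakage.
within_bounds = {
    period: all(abs(v) < bounds[period] for v in values)
    for period, values in residuals.items()
}
```

The sums reproduce the stated ±22% and ±25% bounds, and every residual magnitude lies inside the corresponding bound.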

Runoff Peaks
[100] Runoff peaks, $R_p$, in both the MOS and PAS catchments (5 min temporal resolution) are typically higher than in the FOR catchment (Figure 6). The correlation of $R_p$ is highest between the PAS and MOS catchments. The $R_p$ values are typically greatest in the PAS catchment. Based on skewness and kurtosis, the ratios $R_{p,MOS}/R_{p,FOR}$ and $R_{p,PAS}/R_{p,FOR}$ are nearly log-normally distributed (Table 8). On average, the $R_p$ in the MOS and PAS catchments exceed the $R_p$ from the FOR catchment by factors of 1.4 and 1.7, respectively. The probability that, during a particular event, the $R_p$ will be higher in the MOS catchment than in the FOR catchment is 67%. The $R_p$ from the PAS catchment exceeds that from the FOR catchment in 75% of events.

Runoff Efficiencies
[101] Calculated average runoff efficiencies show that the PAS catchment produced considerably more direct runoff than the MOS catchment, which, in turn, produced more direct runoff than the FOR catchment. There is considerable variation and overlap of $E_R$ among the catchments. Calculation of average efficiencies as the rainfall-weighted average of individual storms (equation (5)) eliminates the effect of this variation. Average runoff efficiencies increase with increasing total storm rainfall (Figure 7; Table 9). For all rainstorms in aggregate, the PAS catchment average runoff efficiency was 1.5 times that of the MOS catchment and 2.7 times that of the FOR catchment, while the average runoff efficiency from the MOS catchment was 1.8 times that of the FOR catchment. This disparity is first seen for rain storms with more than about 10 mm rainfall and increases with increasing storm total precipitation. For rainfall events between 100 and 316 mm, the average runoff efficiencies for the FOR, MOS, and PAS catchments are 19, 31.5, and 49.5%, respectively. Analysis of the ratio of $E_{R,MOS}$ to $E_{R,FOR}$ for 286 storms revealed that this ratio is very nearly lognormally distributed, with a mean, standard deviation, and skewness of the log$_{10}$-transformed values equal to 0.17, 0.31, and −0.067, respectively.

Figure 7. Storm runoff efficiencies for the three study catchments versus storm total rainfall. Bars denote 90th percentiles of the observations. Bins: 0.316-1.0 mm, 1.0-3.16 mm, 3.16-10 mm, 10-31.6 mm, 31.6-100 mm, and 100-316 mm.
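As a reading aid for the log$_{10}$ statistics of the MOS-to-FOR efficiency ratio, the sketch below converts a mean and standard deviation of log$_{10}$ values into a geometric-mean ratio and the probability that the ratio exceeds 1. It assumes the ratio is exactly lognormal, which the text describes only as very nearly the case, and the function name is hypothetical.

```python
import math

def lognormal_summary(mean_log10, sd_log10):
    """Return (geometric mean, P(ratio > 1)) for a lognormally distributed ratio."""
    geometric_mean = 10.0 ** mean_log10
    # P(log10(ratio) > 0) for a normal variable with the given mean and sd.
    z = mean_log10 / sd_log10
    p_greater_than_one = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return geometric_mean, p_greater_than_one

gm, p_gt1 = lognormal_summary(0.17, 0.31)
```

Under this assumption the geometric-mean ratio is roughly 1.5 and the MOS efficiency exceeds the FOR efficiency in roughly 70% of storms; these are derived illustrations, not values reported in the tables.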

Land Use and Land Cover Dependent Response to Extreme Events
[102] The most extreme rainfall recorded to date in the Agua Salud study catchments occurred during December 2010, which was also the wettest month in the more than 100 year recorded history of the PCW [Espinosa, 2011]. Monthly rainfall recorded by two rain gage clusters around the study catchments for December 2010 was 1135 mm. Hydrographs, rainfall hyetograph, cumulative runoff, and rainfall are plotted in Figure 8 for the period 7-12 December 2010. The storm of 7-8 December produced 300 mm of rainfall and was followed by two pulses of heavy rain on 11 December that dropped a further 150 mm of rain. After subtracting the prestorm base-flow recession, the direct runoff from the FOR, MOS, and PAS catchments was 245 mm, 377 mm, and 365 mm, respectively. The FOR catchment produced smaller peak runoff rates and 125 mm (34%) less total runoff than the MOS and PAS catchments, despite the system being wet and the large rainfall amounts. Losses to ET (<3.2 mm d$^{-1}$) and to groundwater during this 6 day rainy period are likely negligible compared to rainfall.

Discussion
[103] The most uncertain components of our water balance are evapotranspiration and changes in groundwater storage. We currently have no deep groundwater monitoring wells in the study catchments. Instead we rely on baseflow recession as it is indicative of aquifer transmissivity and storage [Arnold and Allen, 1999].
[104] Errors associated with streamflow measurements were quantified and generally small. Rainfall measurement errors were minimized using in situ calibration, and the use of three tipping-bucket gauges in clusters. We applied throughfall plus stemflow values from local studies augmented with literature values. The strong seasonality of the climate in Panama minimizes the effect of changes in soil moisture storage. In the mid-to-late wet-season, soil moisture is high. Because the dry season is 4 months long and typically quite dry, with high potential ET, water-year based analyses start and end with little soil moisture storage.
[105] Niedzialek and Ogden [2012] suggested that the calibrated Priestley-Taylor method underestimates ET during dry periods. Furthermore, the use of a single P-T $\alpha$ value calibrated using wet-season EC data will likely underestimate ET in the dry season. This is confirmed in one instance, wherein Wohl et al. [2012] presented an ET map produced using the METRIC algorithm [Allen et al., 2007] during a cloudless period in the dry season (27 March 2000) of the Agua Salud Project study area [Figure 3 of Wohl et al., 2012]. The average dry-season daily ET rates shown in Wohl et al. [2012] were 4.4, 5.3, and 5.8 mm d$^{-1}$ for pasture, young secondary forest, and forest over 80 years old, respectively. Future efforts should focus on more accurate measurements of these water balance components.
[106] The three water years examined in this study each had above-average rainfall. We calculated the joint probability that the flow from the FOR catchment was greater than that from the MOS catchment provided that the flow from both is less than 0.8 mm d$^{-1}$, which is the flow where the MOS and FOR runoff-duration curves cross with a probability of exceedance of 80% (Figure 5). Runoff rates with 80% or greater probability of exceedance occur only during the dry season. The condition $R_{FOR} > R_{MOS}$ occurred in 174 out of 216 days with $R_{FOR} < 0.8$ mm d$^{-1}$, yielding a joint probability of 0.81. The joint probability that $R_{FOR} > R_{PAS}$ given $R_{FOR} < 0.8$ mm d$^{-1}$ is only 0.38, because of the influence of dry-season rainfall events on the PAS catchment. In the future, testing for the sponge effect following a wet season with below-normal rainfall is an important goal of the Agua Salud project, to confirm the result reported in PMCC [1999] for the very dry 1997-1998 water year associated with a major El Niño.
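The low-flow comparison described above amounts to counting days. A minimal sketch, assuming paired daily runoff series in mm per day and a hypothetical function name, is:

```python
def low_flow_exceedance(r_for, r_other, threshold=0.8):
    """Fraction of days with R_FOR > R_other among days with R_FOR < threshold."""
    low_days = [(a, b) for a, b in zip(r_for, r_other) if a < threshold]
    if not low_days:
        raise ValueError("no days fall below the threshold")
    hits = sum(1 for a, b in low_days if a > b)
    return hits / len(low_days), hits, len(low_days)
```

With the counts reported in the text, 174 of 216 low-flow days, the fraction is 174/216, or about 0.81.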
[107] Water-balance results (Table 7) show that in terms of annual runoff, the FOR catchment produces less total streamflow than both the MOS and PAS catchments. This is attributable to greater interception of rainfall and increased transpiration by the old secondary forest. With reference to Table 7, interception from the FOR, MOS, and PAS catchments was estimated to be 442, 221, and 69 mm, respectively, during the 2009-2010 water year. During this same period, ET was estimated to be 1451, 1217, and 874 mm, respectively, in the FOR, MOS, and PAS catchments. The FOR catchment produced less total runoff (1164 mm) than the MOS (2280 mm) and PAS (1686 mm) catchments. The FOR catchment also produced significantly less base flow over this water year (940 mm) than the MOS (1945 mm) and PAS (1273 mm) catchments (Table 5). However, hydrographs shown in Figure 4, together with runoff-duration curves (RDCs) for the FOR and MOS catchments in Figure 5, show significant differences between the behavior of these two catchments during the dry season. Both the FOR and MOS catchments start the dry season with similar subsurface storage, but the runoff rate from the FOR catchment (10% runoff percentile rows, Table 4) was about half that of the MOS catchment. The MOS catchment drains faster, and once lower runoff rates are reached (1% runoff percentile rows, Table 4), more water remains in storage in the FOR catchment and runoff rates are similar. With further drying, the runoff rates in the FOR catchment exceed those in the MOS catchment (Figure 5). Daily hydrographs for the driest water year (2009-2010) clearly show that the FOR catchment produces more dry-season base-flow runoff than the MOS catchment (Figure 4).
[108] Base flows from the FOR catchment exceed those from the MOS catchment for 80-99% probability of exceedance by approximately 0.1-0.2 mm d$^{-1}$, or 1-50%. The base flow from the FOR catchment for the 96% probability of exceedance is 50% greater than that from the MOS catchment. The larger (less negative) recession constant in Table 4 indicates that stored water is released to streams more slowly from the FOR catchment. We suggest two hypotheses to explain this behavior. The first is that uptake of water by trees lowers the water table slope. The second is that mature forest enhances recharge to deeper stores in the catchment, enabling flow through deeper and slower flow paths. Our data did not allow us to distinguish between these hypotheses. Given the increased dry-season ET in the FOR catchment compared to the others, we expect that the reduced water-table slope is the most likely explanation, although our data cannot confirm it.
[109] In terms of the sponge hypothesis, the PAS catchment is much smaller than the other two catchments and is not a rigorous paired basin. We include the PAS catchment in the analysis for the sake of completeness, and we believe that differences in the PAS catchment response during events are indicative of the effects of grazing on the hydrologic behavior.
[110] The response of our catchments to large storms offers evidence that illustrates the importance of land cover and land use in runoff generation. Considering the relatively short duration of large storms, the more uncertain components of the water balance (evaporation, transpiration, and deep groundwater fluxes) are small. For example, ET in the FOR catchment, which averages 3.2 mm d$^{-1}$, is considerably less during especially rainy weather because of low net radiation, low vapor-pressure deficit, and evaporation of intercepted water. The largest groundwater storage residual in our water budget is 1.3 mm d$^{-1}$. Compared to the storm of December 2010, with 6 day rainfall of 520 mm, these uncertainties are negligible.
[111] Differences in peak runoff rates and runoff totals among catchments for bigger storms must therefore reflect heterogeneity of rainfall inputs, errors in measurement, and differences due to land use and land cover. If we make the reasonable assumption that heterogeneity of rainfall and measurement errors are random, then the observed differences in runoff must be related exclusively to differences in land use and land cover.
[112] We considered the possibility that rainfall interception might account for differences in storm total runoff between the MOS and FOR catchments. Niedzialek [2007] measured the relation between throughfall (TF) and storm total rainfall (P) in a nearby old regrowth forest that is very similar to the FOR catchment land cover:

$$TF = 0.92P - 0.78 \qquad (r^2 = 0.97,\ n = 73) \qquad (10)$$

[113] Using this equation on the 300 mm of rainfall that fell 7-8 December, we calculate 25 mm of interception. Assuming that interception in the MOS catchment is half this amount (Table 8), the difference in interception is 12.5 mm, while the difference in observed runoff during this period is 65 mm. This difference in interception accounts for 19% of the difference in runoff. Thus, most of the difference is due to the hydraulic function of the soil as affected by land use and land cover.
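The interception arithmetic in this paragraph can be retraced in a few lines, reading the regression as $TF = 0.92P - 0.78$ with both quantities in mm. The halving of interception for the MOS catchment and the 65 mm runoff difference are taken from the text; the variable names are hypothetical.

```python
def throughfall_mm(storm_rain_mm):
    """Throughfall from storm rainfall using the regression TF = 0.92 P - 0.78 (mm)."""
    return 0.92 * storm_rain_mm - 0.78

storm_rain = 300.0                      # rainfall of 7-8 December 2010, mm
interception_for = storm_rain - throughfall_mm(storm_rain)
interception_mos = 0.5 * interception_for   # assumption stated in the text
interception_gap = interception_for - interception_mos

runoff_gap = 65.0                       # observed runoff difference, mm
share_explained = interception_gap / runoff_gap
```

The regression gives about 24.8 mm of interception, which the text rounds to 25 mm; half of that is about 12.4 mm, or roughly 19% of the 65 mm runoff difference.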
[114] The additional water infiltrated in the FOR catchment during this December 2010 event did not result in higher runoff in the following dry season compared to the other years (Figure 4). This indicates that much of the additional infiltrated water in the FOR catchment was removed by transpiration.
[115] Calculated average runoff efficiencies indicate a significant influence of both land use and rainfall amount (Figure 7; Table 9). Runoff efficiencies are greatest for the PAS catchment, and we have observed significant overland flow there from areas regularly trodden upon by cattle. MOS catchment runoff efficiency values are 1.6 times larger than those from the FOR catchment. We have rarely observed overland flow in the MOS or FOR catchments during rainfall. In grass-covered portions of the MOS catchment, approximately the upper 2 cm of the soil profile consists of a high-density root mat, making observation of overland flow there difficult. We have occasionally observed flow in swales in all three catchments, although the mechanism is as yet unknown.
[116] The runoff efficiencies shown in Figure 7 for the MOS and FOR catchments do not suggest significant overland flow generation during most events, because otherwise we would expect to see some runoff ratios above 0.5. These ratios are only observed in the PAS catchment for the largest rainfall events. Baumgartner [1984] reports median values of runoff ratios for temperate forest, grass, cropland, and bare soil to be 30%, 35%, 50-60%, and 70%, respectively. The FOR, MOS, and PAS catchment median runoff ratios are 0.06, 0.09, and 0.13, respectively. The 90th percentile runoff efficiencies are 0.15, 0.18, and 0.22 in the FOR, MOS, and PAS catchments, respectively. Ogden and Dawdy [2003] report 90th percentile runoff ratios of 0.7 in a small agricultural watershed in Mississippi that experiences significant overland flow. Fujieda et al. [1997] reported that in two small (56 and 37 ha) catchments with Oxisols, near São Paulo, Brazil, direct runoff is only about 0.6% of annual rainfall. Vertessy and Elsenbeer [1999], in a very small (0.75 ha) rainforest catchment in western Amazonia with Ultisols and some Inceptisols, reported runoff ratios of 30-50% with observed spatially discontinuous overland flow. The soils in that catchment had measured $K_s$ in the range of 1 cm d$^{-1}$ at a depth below 0.1 m [Vertessy and Elsenbeer, 1999]. Some of the outliers, particularly those in the MOS and PAS catchments, may indicate overland flow runoff, particularly those events with runoff ratios >0.5, of which there were 7 events in the MOS catchment, 16 events in the PAS catchment, and none in the FOR catchment over the period analyzed. The data suggest that overland flow is not common in the three AS catchments. It is most likely to occur in the PAS catchment during the biggest storms, and has been observed from cattle paths. Overland flow is extremely unlikely to be a significant contributor to runoff in the FOR catchment except for some saturation-excess runoff near streams.
[117] Storm runoff response shows that the MOS and PAS catchments produce more direct runoff and higher peak runoff during storms, suggesting that the runoff generation mechanisms in these two catchments are different from those in the FOR catchment. We attribute this behavior to differences in infiltration and flow path. Hassler et al. [2010] showed a significant decrease in saturated hydraulic conductivity with depth using 7 cm diameter soil cores. However, our observations indicate that soil-matrix permeability is not the controlling factor; biologically created preferential flow paths are important. Soil-pipe flow has been observed in both the FOR and MOS catchments hours after the end of wet season storms. Our infiltration tests with several m² of ponded area measured much higher infiltration rates than the permeabilities measured by Hassler et al. [2010]. The difference is most likely due to preferential flow paths. The results of the present study suggest that the presence of old-growth and more mature secondary forest enhances infiltration, reduces direct runoff, reduces flood peaks, and increases base-flow runoff during dry periods. Our results also suggest that compaction by grazing, as well as disconnection of preferential flow paths from the soil surface, enhances runoff ratios and peak runoff from the PAS catchment.
[118] Van Dijk and Keenan [2007], based largely on CIFOR [2005], which reported results from the unreferenced first table in Kiersch [2000], wrote that land cover and land use do not affect peak flows during large floods in large watersheds (>1000 km²). Our results suggest that this might not be a generally valid conclusion, at least in the PCW. Other studies also suggest that peak flows in response to medium to large rain events may be reduced by reforestation in smaller catchments [Bruijnzeel, 2004; Scott et al., 2005; Waterloo et al., 2007]. In the specific setting of the PCW, the geology, soils, vegetation, and land-use patterns of much of the Canal watershed east of the Canal are similar to those of our study catchments [Stallard and Kinner, 2005; Autoridad del Canal de Panama, 2012]. Analysis of the PCW topography indicates that 96% of the entire PCW is contained in catchments that are 150 ha in size. The land-use and land-cover dependent effects that we observed in this study therefore occur at a catchment scale that accounts for approximately 96% of the PCW.
[119] Our data show an increase in runoff efficiency with increasing storm size (Figure 7; Table 9). Our data also show that, as storm total rainfall increases, runoff efficiencies become considerably greater in the MOS catchment, and more so in the PAS catchment, than in the FOR catchment. This disparity demonstrates that the soil hydraulic function of mature forests mitigates the volume of flood runoff, not just peak flows. Results shown in Figure 8 clearly demonstrate this.
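One simple way to examine how runoff efficiency varies with storm size is to group events into storm-rainfall classes and compare the median efficiency in each class. The following Python sketch shows this under stated assumptions: the class boundaries, function name, and event values are hypothetical and are not taken from Figure 7 or Table 9.

```python
from bisect import bisect_right
from statistics import median


def median_efficiency_by_storm_size(events, bin_edges):
    """Median runoff efficiency per storm-rainfall class.

    events    : iterable of (rainfall_mm, runoff_mm) pairs for one catchment
    bin_edges : ascending rainfall class boundaries in mm (hypothetical);
                class i covers [bin_edges[i], bin_edges[i + 1])
    Events outside the outermost edges, or with no rainfall, are ignored.
    """
    n_classes = len(bin_edges) - 1
    groups = {i: [] for i in range(n_classes)}
    for rainfall_mm, runoff_mm in events:
        if rainfall_mm <= 0:
            continue
        i = bisect_right(bin_edges, rainfall_mm) - 1
        if 0 <= i < n_classes:
            groups[i].append(runoff_mm / rainfall_mm)
    return {
        (bin_edges[i], bin_edges[i + 1]): median(ratios)
        for i, ratios in groups.items()
        if ratios
    }


if __name__ == "__main__":
    # Hypothetical events: (storm rainfall in mm, storm runoff in mm).
    example_events = [(5, 0.1), (8, 0.4), (30, 3.0), (45, 9.0), (120, 48.0)]
    for rain_class, eff in median_efficiency_by_storm_size(
        example_events, [0, 10, 50, 200]
    ).items():
        print(rain_class, round(eff, 3))
```

Running the same grouping for each catchment and comparing the class medians would show whether the differences among land covers widen as storm totals increase.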
[120] The land-use history of the MOS catchment is complex (Figure 1). However, much of the area covered by secondary forest in the MOS catchment has not experienced significant grazing in at least a decade. Yet, the differences between the MOS and FOR catchment behavior are striking. This result suggests that simply excluding cattle from a catchment might not result in significant changes in hydrologic response over this time scale. This remains an important research question.
[121] Given that the topographic-index distributions of the MOS and FOR catchments are very similar, one would expect similar hydrological response in both of these catchments if the saturation-excess mechanism were dominant. Because the runoff response in these two catchments is quite different, the saturation-excess runoff generation mechanism is not dominant in one or both catchments.

Conclusions
[122] We used a paired catchment methodology to analyze hydrologic data on three catchments of contrasting land cover and land use. Data were collected during three wetter-than-average water years, including the wettest year recorded in the >100 year history of hydrologic observations in support of the Panama Canal. The catchments have similar topography, topographic-index distributions, soils, and underlying geology, and have highly correlated rainfall. Aside from land cover and land use, the only other significant difference among these three catchments is the considerably smaller area of the pasture catchment. Large differences were seen in annual water balance, peak runoff rates, runoff duration, runoff ratios, and dry-season recessions that are attributable to land cover and land use. Our observations are relevant to many practical questions related to land-use management, payment for ecosystem services schemes, water quality management, reservoir sedimentation, and fresh water security in similar settings throughout the tropics.
[123] Our observations support the "sponge-effect hypothesis" that forested catchments produce more base flow in the dry season compared to disturbed catchments, despite reduced annual runoff caused by evapotranspiration. These observations include that the base-flow runoff from the forest catchment is greater than that from the mosaic catchment 20% of the time, when the runoff exceedance probability is greater than 80%. These runoff probabilities occur entirely during the dry season, when water shortages most commonly occur, and the amount of increased base flow can be up to 50% at runoff rates corresponding to 96% probability of exceedance. Median runoff efficiencies in the mosaic and pasture catchments were 1.8 and 2.7 times those seen in the forest catchment, respectively, and increased with increasing storm total rainfall. The mature secondary forest consumes a considerable amount of water, which reduces wet-season runoff as well as runoff during the early dry season.
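The comparison of runoff at a given exceedance probability can be sketched with a simple flow-duration calculation. The Python example below is a minimal illustration under stated assumptions: it uses the Weibull plotting position rank/(n + 1) and linear interpolation, which may differ from the procedure used in this study, and the function names and runoff values are hypothetical.

```python
def flow_duration_curve(flows):
    """Return (exceedance probability, flow) pairs, highest flow first.

    Uses the Weibull plotting position rank / (n + 1); this choice is an
    assumption of the sketch.
    """
    ordered = sorted(flows, reverse=True)
    n = len(ordered)
    if n < 2:
        raise ValueError("at least two flow values are required")
    return [(rank / (n + 1), q) for rank, q in enumerate(ordered, start=1)]


def flow_at_exceedance(flows, p):
    """Flow equaled or exceeded with probability p, by linear interpolation."""
    curve = flow_duration_curve(flows)
    if not curve[0][0] <= p <= curve[-1][0]:
        raise ValueError("p lies outside the range of the plotting positions")
    for (p0, q0), (p1, q1) in zip(curve, curve[1:]):
        if p0 <= p <= p1:
            return q0 + (q1 - q0) * (p - p0) / (p1 - p0)
    raise ValueError("no bracketing interval found")


def relative_excess(forest_flows, mosaic_flows, p):
    """Fractional excess of forest over mosaic runoff at exceedance p."""
    q_forest = flow_at_exceedance(forest_flows, p)
    q_mosaic = flow_at_exceedance(mosaic_flows, p)
    return (q_forest - q_mosaic) / q_mosaic


if __name__ == "__main__":
    # Hypothetical runoff rates (any consistent unit), not study data.
    forest = [9, 8, 7, 6, 5, 4, 3, 2, 1.5]
    mosaic = [12, 10, 8, 6, 5, 4, 3, 1.5, 1.2]
    for p in (0.1, 0.5, 0.9):
        print(p, round(relative_excess(forest, mosaic, p), 3))
```

In this hypothetical series the forest record is lower at small exceedance probabilities (high flows) and higher at large exceedance probabilities (low flows), which is the pattern the sponge-effect hypothesis predicts.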
[124] Our data demonstrate that land use and land cover have a significant effect on peak runoff rates during events. Median peak runoff rates from the mosaic and pasture catchments were 1.4 and 1.7 times greater than from the forest catchment, respectively.
[125] The largest storm on record, 7-13 December 2010, which produced 520 mm of rainfall, did not overwhelm the ability of the forest catchment to store rainfall. During this large event, the forest catchment produced about 35% less total runoff than the mosaic catchment and lower peak runoff rates on average.
[126] The Agua Salud Project is ongoing, with a significant emphasis on determining how tropical land management, including forest restoration, silvipastoral practices, and grazing, affects the provisioning of hydrological ecosystem services. The observations made here suggest that the sponge effect will be more apparent during drier dry seasons, and more years of measurement are needed. Moreover, we recognize the need for improved measurements of hillslope-scale infiltration, flow path, evapotranspiration, and changes in groundwater storage to clarify the mechanisms behind the behavior seen in our data. At the same time, we will also be observing the effects of different styles of reforestation to see if, and at what rate, different management practices can produce desired hydrologic behaviors.
[127] Acknowledgments. The Agua Salud project is a collaboration between the Smithsonian Tropical Research Institute (STRI), the Panama Canal Authority (ACP), the National Environmental Authority of Panama (ANAM), and other institutions. This research was funded by the HSBC Climate Partnership through the Smithsonian Tropical Research Institute from 2007 to 2012. Additional funding to the first author from the U.S. Army Research Office through grants 55573EVRIP, 52454EVDPS, and 61481EVRIP, and from the U.S. National Science Foundation through grants EAR-1045166 and EAR-1123468, is acknowledged. The U.S. Geological Survey and the Smithsonian Tropical Research Institute funded the third author. Other research support for laboratory improvements to enable our research on weir sedimentation effects was provided by a generous gift from Roy and Caryl Cline to the University of Wyoming. We are grateful to the Smithsonian Tropical Research Institute and the following STRI personnel for their support: Daniela Weber, Federico Davis, Aquilino Alveo, Jorge Bautista, Michiel van Breugel, Juan Carlos Briceño, and Milton Solano. We gratefully acknowledge the support of and collaboration with the Panama Canal Authority, particularly Jorge Espinosa, Chief of the Hydraulics Works Section in the Water Division, and his staff, and Oscar Vallarino, the Manager of the Environmental Division during the time of this study, and his staff. We also acknowledge cooperation with the National Environmental Authority of Panama (ANAM). Other financial support for the STRI Agua Salud Project came from the Frank Levinson Family Foundation and the Motta Family Foundation. The 1 m resolution LiDAR topography data were provided by NSF DEB 0939907 to J. Dalling, S. Hubbell, and S. Dewalt. Edward Kempema, Jesse Creel, and Guy Litt of the University of Wyoming provided essential support in data analysis and quality control.
We acknowledge fruitful discussions with Sibylle Hassler, formerly of Potsdam University, pertaining to this study. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the funding institutions. Finally, we acknowledge constructive reviews by three anonymous reviewers and John Moody of the U.S. Geological Survey.