Mean May–September Potomac River streamflow was reconstructed from 950–2001 using a network of tree ring chronologies (n = 27) representing multiple species. We chose a nested principal components reconstruction method to maximize use of available chronologies backward in time. Explained variance during the period of calibration ranged from 20% to 53% depending on the number and species of chronologies available in each 25 year time step. The model was verified by two goodness of fit tests, the coefficient of efficiency (CE) and the reduction of error statistic (RE). The RE and CE never fell below zero, suggesting the model had explanatory power over the entire period of reconstruction. Beta weights indicated a loss of explained variance during the 1550–1700 period that we hypothesize was caused by the reduction in total number of predictor chronologies and loss of important predictor species. Thus, the reconstruction is strongest from 1700–2001. Frequency, intensity, and duration of drought and pluvial events were examined to aid water resource managers. We found that the instrumental period did not represent adequately the full range of annual to multidecadal variability present in the reconstruction. Our reconstruction of mean May–September Potomac River streamflow was a significant improvement over the Cook and Jacoby (1983) reconstruction because it expanded the seasonal window, lengthened the record by 780 years, and better replicated the mean and variance of the instrumental record. By capitalizing on variable phenologies and tree growth responses to climate, multispecies reconstructions may provide significantly more information about past hydroclimate, especially in regions with low aridity and high tree species diversity.
 The Potomac River is the primary water resource for the Washington, DC, Metropolitan Area (WMA) supplying ∼75% of the water demand for nearly 4 million residents [Kame'enui et al., 2005]. In the Potomac River Basin (PRB), a variety of human and natural factors influence water quantity and quality, but the underlying climatic variability of the region is likely the most important component [Neff et al., 2000; Polsky et al., 2000]. Previous drought planning operations in the WMA have utilized the 1930 drought event to assess the ability of the water supply system to withstand future droughts. In 2002, the water supply system was tested by a drought that rivaled the 1930 event in intensity. While the 2002 drought was less severe than expected, below normal precipitation, record low groundwater levels, and record low daily streamflows in the winter and spring necessitated the augmentation of Potomac River streamflow from three reservoirs in the Potomac River Basin, reducing two reservoir storage levels to ∼65% of baseline [Kame'enui et al., 2005; Lorie and Hagen, 2007]. The instrumental record of Potomac River streamflow, which extends back to the 1890s, includes not only important drought events but also pluvial events that have affected the water supply system. In 1996, record high flood events occurred in January and September, making it the first time during the instrumental period that two large flooding events happened in a single year (U.S. Geological Survey, available at http://waterdata.usgs.gov/md/nwis/uv?site_no=01638500). The instrumental record is useful for assessing the water supply system's ability to operate under short-term drought and pluvial events, but it may not represent the full range of climatic variability over past centuries. 
Water resources in the region are generally abundant, but periodic drought and pluvial events require careful management of the water resources and the surrounding watersheds to minimize the negative impact of changes in water quantity and quality [Neff et al., 2000; Najjar et al., 2000]. Managers face additional challenges because the mid-Atlantic Region is predicted to become warmer and wetter in the coming decades [Polsky et al., 2000]. Increasing demands combined with climate variability may stretch the current water resource system in the PRB beyond levels experienced during the 20th century.
In regions with short records of instrumental data, tree rings may be used as a proxy to extend the streamflow record, which has important implications for water resource management [Rice et al., 2009; Woodhouse and Lukas, 2006a]. Previous streamflow studies indicate that the instrumental gauge records of the 20th and 21st centuries represent only a portion of the full range of streamflow variability in the past several centuries [Meko et al., 1995; Stockton and Jacoby, 1976; Woodhouse et al., 2006]. In the PRB, Cook and Jacoby [1983] reconstructed July–September streamflow for the Potomac River (Point of Rocks, Maryland) from 1730–1977 using five tree ring chronologies from multiple species. In Cook and Jacoby's reconstruction, the 1930 drought was surpassed several times, but the prolonged regional drought of the 1960s was the most severe since 1730. The results of Cook and Jacoby suggest that water supply models calibrated on low-flow periods during the 1930s may not adequately forecast the ability of the water supply system to withstand more extreme drought events recorded in the reconstructed streamflow record. Additionally, several long periods (∼50 years) generally above and below the long-term median were noted. Cook and Jacoby's results clearly indicate that the instrumental record of streamflow is not sufficiently long to determine the frequency, intensity, and duration of long-term drought and pluvial events; however, their record represents less than 300 years of streamflow variability.
The use of multiple species in the reconstruction of climate and streamflow is common in locations across the globe [e.g., Frank and Esper, 2005; Meko et al., 2001; Pederson et al., 2001], but multispecies methods rarely have been used in the eastern United States [i.e., Cook and Jacoby, 1977, 1983; Cook et al., 1999]. In the eastern deciduous forest, tens of tree species grow together across a variety of sites, each with a different response to climate resulting from differing locations (elevation, soils, topography) as well as species and population level phenological variation. While site history and stand dynamics can affect tree growth in different ways, the careful standardization of individual series can produce tree ring chronologies with a common climatic signal. In this paper, we use a set of existing chronologies including nine different species growing in or near the PRB to reconstruct mean May–September Potomac River streamflow from 950–2001. Nested principal components regression models were calculated at 25 year time steps to maximize the use of available predictor chronologies as the model moved backward in time. Such time-varying reconstruction models are useful because the longevity of species ranges from 300–900 years in our study. Further, we discuss the use and effect of multiple species on the calibration and verification of the streamflow reconstruction model. To facilitate the use of the reconstruction for water resource management, we analyze the frequency, intensity, and duration of drought and pluvial events in the reconstruction and compare the reconstruction to the instrumental record. Finally, we compare our mean May–September Potomac River streamflow reconstruction to Cook and Jacoby's [1983] original July–September reconstruction of the Point of Rocks, Maryland gauge.
2.1. Streamflow Data
The PRB extends across >37,000 km2 including parts of Virginia, Maryland, West Virginia, Pennsylvania, and the District of Columbia (Figure 1). The headwaters of the Potomac River begin in the mountains of West Virginia (North Branch) and Virginia (South Branch), from which the river flows 616 km to the Chesapeake Bay, making it the fourth longest river on the Atlantic Coast. Instrumental Potomac River streamflow data (1895–2007) were collected for the Point of Rocks, Maryland gauge from the U.S. Geological Survey (USGS) (Figure 2; 39°16′24.9″N, 77°32′35″W). The Point of Rocks record is the longest available data set of Potomac River streamflow and is relied upon for water resource planning in the Washington, D.C., metropolitan area [Kame'enui et al., 2005]. The average flow of the Potomac River at Point of Rocks, Maryland for the period of record is 270 m3/s. The maximum flow (13,600 m3/s) occurred on 19 March 1936 and the minimum flow (15 m3/s) was recorded on 11 September 1966. The low-flow period extending from July to November is a primary concern of water resource managers in maintaining the water quantity and quality in the Potomac River Basin (Figure 2). Two reservoirs are located upstream of the Point of Rocks gauge. The Savage and Jennings-Randolph reservoirs were completed in 1952 and 1982, respectively, and are used to augment Potomac River streamflow when water demand exceeds supply during the low-flow period. In addition to low-flow events, pluvial events challenge the ability of water resource managers to maintain adequate levels of water quality for human and environmental needs [Neff et al., 2000].
Additional adjusted streamflow records for the Point of Rocks gauge were obtained from the Interstate Commission on the Potomac River Basin (ICPRB), a collaborative water resource management agency for the WMA. Streamflow records were adjusted for reservoir outflows, and two monthly time series were created, one with and one without consumptive use. Preliminary correlation and response function analysis between tree ring chronologies and the three gauge records (i.e., 1 unadjusted and 2 adjusted) showed that the unadjusted USGS data had the strongest and most time stable relationship to tree growth (data not presented). Our selection of the unadjusted streamflow record is somewhat circular because it relies on the relationship to tree growth. Yet, the central premise behind tree ring reconstructions of streamflow is that the same inputs (precipitation and runoff) that affect tree growth also affect streamflow. A significant and time stable correlation between growth and streamflow is a prerequisite for reconstruction. It is possible that adjustments to the streamflow record may provide more information for water resource management than streamflow reconstructions. Further investigation is needed to determine the effect of streamflow adjustments on the correlation with tree growth. Subsequent analysis used the unadjusted USGS streamflow record, the same record used successfully by Cook and Jacoby [1983].
The USGS reported two gauge changes in 1902 and 1929 that may have affected the homogeneity of the streamflow record (U.S. Geological Survey, available at http://waterdata.usgs.gov/md/nwis/uv?site_no=01638500). Additionally, Brooks documented extensive fire and logging activity around the turn of the 20th century in the eastern counties of West Virginia that may have affected Potomac River streamflow. The effects of land clearing activities on streamflow in the region are well documented [Lull and Sopper, 1966; Patric and Reinhart, 1971]. Cook and Jacoby [1983] previously investigated the homogeneity of the Point of Rocks gauge using double mass analysis to determine if the Point of Rocks gauge changes or logging events in the early 1900s created anomalous values. The double mass analysis compared the streamflow record to regional precipitation records and showed no effect of the gauge changes. However, the period of intense logging activity created a departure from the expected flow prior to 1907, leading to inhomogeneity in the instrumental streamflow record. We truncated the instrumental streamflow record (1907–2007) to avoid possible spurious results associated with land clearing and disturbance. We also investigated the streamflow record for an increasing monotonic trend following logging activities in the late 19th century and farm abandonment in the early 20th century that led to an increase in forest cover. An examination of the mean May–September instrumental streamflow record showed no trend through the 20th century, confirming Cook and Jacoby's original double mass analysis. Next, monthly streamflow data were examined for normality using normal quantile plots and the Shapiro-Wilk W goodness of fit test of the normal distribution. None of the months was adequately modeled by the normal distribution (W = 0.54 to 0.92; p < 0.0001), so all months were log transformed. Streamflow data were later back-transformed into the original units (m3/s).
Log transformations of data are necessary to meet the assumptions of multiple linear regression, but the back transformation process causes a reduction in the mean and variance.
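The effect of back transformation on the mean can be illustrated with a minimal sketch. The flows below are synthetic lognormal values, not the Point of Rocks record, and the function name is hypothetical. The sketch shows only that the back-transformed mean of the logs (the geometric mean) falls below the arithmetic mean of the original values.

```python
import numpy as np

def back_transform_bias(flow):
    """Return (arithmetic mean, back-transformed mean of log flow)."""
    flow = np.asarray(flow, dtype=float)
    log_flow = np.log(flow)
    # exp(mean(log x)) is the geometric mean, which cannot exceed the arithmetic mean
    back_mean = float(np.exp(log_flow.mean()))
    return float(flow.mean()), back_mean

# Synthetic, positively skewed "streamflow" values (assumed units: m3/s)
rng = np.random.default_rng(0)
synthetic_flow = rng.lognormal(mean=5.0, sigma=0.6, size=100)
arith_mean, back_mean = back_transform_bias(synthetic_flow)
print(f"arithmetic mean = {arith_mean:.1f}, back-transformed mean = {back_mean:.1f}")
```

The gap between the two values grows with the skewness of the data, which is one reason a reconstruction calibrated in log space tends to underestimate the mean in original units.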
2.2. Tree Ring Network
Tree ring data for the streamflow reconstruction came from both unpublished collections and published chronologies freely available from the International Tree-Ring Data Bank [International Tree-Ring Data Bank, 2010] (Figure 1). Chronologies were selected from locations in the Appalachian Mountains and east to the Atlantic Coast in Maryland, Pennsylvania, West Virginia, and Virginia. The collection sites vary from coastal lowlands to dry upland slopes and from closed to open canopy forests. At a few sites, chronologies for more than one species were developed. A total of 27 chronologies with the common period 1700–1977 were compiled from the region (Table 1). We chose the 1700–1977 common period to exclude chronologies with an abundance of young trees (<200 years old) and minimize problems caused by juvenile growth and European settlement disturbance. The screening process for inclusion in the streamflow reconstruction model is described in section 2.3. The species used in our study include Carya ovata Mill., Juniperus virginiana L., Liriodendron tulipifera L., Magnolia acuminata L., Picea rubens Sarg., Quercus alba L., Q. prinus L., Taxodium distichum L., and Tsuga canadensis L. While many chronologies are located outside of the PRB, chronologies several hundred kilometers away from a climate or streamflow recording station may be significantly correlated with streamflow because of regional climate patterns [Cook et al., 1999; Woodhouse and Lukas, 2006b].
Table 1. Chronologies Used in the Potomac River Streamflow Reconstruction
Number of series and median series length were calculated following removal of series shorter than 125 years.
[Table body not recoverable. Surviving site names: Blue Ridge Parkway; Otter Creek Natural Area; Alan Seeger Natural Area; Hemlocks Natural Area; Sweetroot Natural Area; Savage River State Forest (two entries); Gaudineer Scenic Area. Surviving contributor entries: Pederson, N., Cook, E.R.; Maxwell, R.S., Wixom, J.A.; Stahle, D.W., Cleaveland, M.K., Hehr, J.G.; Stahle, D.W., Cleaveland, M.K.]
 Each chronology was examined prior to standardization. Individual tree ring series were removed if they were less than 125 years in length to preserve low-frequency signals at multidecadal time scales associated with climate and streamflow [Cook et al., 1995]. After short series were removed, the number of series per sampling site ranged from 13 to 152 (8 to 81 trees) and median series length at a site ranged from 169 to 376 years (Table 1). The computer program ARSTAN was used to standardize each tree ring series using a smoothing spline that operates in the same way as a low-pass digital filter [Cook and Peters, 1981]. For removing biological growth trends and disturbances associated with closed canopy forests, a spline with a 50% frequency response cutoff equal to two-thirds the length of a series (“two-thirds spline”) was used for detrending. This level of detrending preserves low-frequency variance in the detrended series that is potentially resolvable given the length of the series being detrended [Cook, 1985; Cook et al., 1995]. Thus, a two-thirds spline used on series with a minimum 125 year segment length retains variability at periods up to ∼80 years in duration and produces chronologies suitable for analysis of multidecadal trends in reconstructed streamflow. Low-order autocorrelation (e.g., 1–3 year lag) was removed from each series with an autoregressive model. Then, the tree ring series for each site were averaged into residual chronologies using a robust mean [Cook, 1985], and the variance of the chronologies was stabilized with the Briffa RBAR-weighted method to account for changes in variance due to the reduction in sample size backward in time [Osborn et al., 1997].
2.3. Reconstruction Methods
To determine the seasonal window for reconstructing streamflow, we conducted a correlation function analysis for individual streamflow months (log transformed) of the current growing season (May–September) with the 27 available chronologies for the common period 1700–1977. Chronologies in years t and t + 1 were included in the correlation analysis, for a total of 54 candidate predictors, because of the known preconditioning effect of previous year water availability on current year growth in the mid-Atlantic region [Cook et al., 1999; Stahle et al., 1998]. First, individual streamflow months were tested for autocorrelation using a low-order autoregressive (AR) model [Box and Jenkins, 1970]. The minimum Akaike information criterion (AIC) was used to choose the AR model order p [Akaike, 1974]. Only May and June streamflow showed significant autocorrelation with model orders AR(3) and AR(1), respectively, and these months were prewhitened prior to the correlation analysis. We found that a large subset of the 54 candidate predictors (years t and t + 1), ranging from 50–52, was significantly correlated (r > |0.025|; p < 0.05) with streamflow in each month of the growing season (data not presented). Our results build on Cook and Jacoby's [1983] previous analysis of mean July–September Potomac River streamflow by expanding the reconstructed season of streamflow to May–September. Therefore, we chose to include the full pool of 54 predictors in our modeling process of mean May–September Potomac River streamflow. Mean May–September streamflow showed no significant autocorrelation and was not prewhitened prior to calibration and verification of the reconstruction model.
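The AR order selection and prewhitening step can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: the AR models are fit by ordinary least squares, the AIC is computed as n ln(residual variance) + 2p on a common sample, and the function names and the synthetic AR(1) series are hypothetical.

```python
import numpy as np

def _lagged(x, p, offset):
    """Design matrix whose column k holds x lagged by k + 1 steps."""
    n = len(x)
    return np.column_stack([x[offset - k - 1 : n - k - 1] for k in range(p)])

def select_ar_order(x, max_p=5):
    """Pick the AR order (0..max_p) with the minimum AIC, using a common sample."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = x[max_p:]
    best_aic, best_p = None, 0
    for p in range(max_p + 1):
        if p == 0:
            resid = y
        else:
            X = _lagged(x, p, max_p)
            coef, *_ = np.linalg.lstsq(X, y, rcond=None)
            resid = y - X @ coef
        aic = len(y) * np.log(np.mean(resid ** 2)) + 2 * p
        if best_aic is None or aic < best_aic:
            best_aic, best_p = aic, p
    return best_p

def prewhiten(x, p):
    """Return the residuals of a least-squares AR(p) fit (the prewhitened series)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    if p == 0:
        return x.copy()
    X = _lagged(x, p, p)
    coef, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    return x[p:] - X @ coef

def lag1_autocorr(x):
    return float(np.corrcoef(x[:-1], x[1:])[0, 1])

# Synthetic AR(1) series standing in for a persistent monthly flow record
rng = np.random.default_rng(42)
series = np.zeros(500)
for t in range(1, 500):
    series[t] = 0.6 * series[t - 1] + rng.normal()

order = select_ar_order(series)
white = prewhiten(series, order)
```

In this sketch the prewhitened series loses the first p values, so any series it is correlated with must be trimmed to the same years.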
We chose a nested principal components regression (PCR) model to reconstruct mean May–September Potomac River streamflow and account for the decrease in the number of predictor chronologies backward in time [Cook et al., 1999, 2002; Meko, 1997]. Such time-varying models utilize the available tree ring chronologies in a period to gain greater predictive skill while lengthening the reconstruction. In our study, the first model was calibrated for the common period of all chronologies (1700–1976). A second model was calibrated one 25 year time step earlier (1675–1976), or when the next chronology drops out, using fewer chronologies available for the longer period, and so forth. An additional time step from 1700–2001 was modeled to include more recent tree ring collections. This resulted in a total of 12 separate regression model runs, each with its own set of calibration and verification statistics. Then, the models were spliced together to estimate streamflow backward in time utilizing the near-maximum number of predictor chronologies available for each time step.
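The splicing of nested models can be sketched as below. The rule used here, that each year takes its value from the covering nest with the most predictor chronologies, is an assumption chosen to mirror the description above; the tuple layout, function name, and constant-valued example nests are hypothetical.

```python
import numpy as np

def splice_nests(nests):
    """Splice nested reconstructions into one series.

    nests: list of (start_year, end_year, n_predictors, values), where values
    holds one estimate per year from start_year to end_year inclusive.
    Each year receives the estimate from the covering nest with the most predictors.
    """
    first = min(n[0] for n in nests)
    last = max(n[1] for n in nests)
    years = np.arange(first, last + 1)
    spliced = np.full(years.shape, np.nan)
    # Apply nests from fewest to most predictors so better-replicated nests overwrite
    for start, end, _, values in sorted(nests, key=lambda n: n[2]):
        mask = (years >= start) & (years <= end)
        spliced[mask] = np.asarray(values, dtype=float)
    return years, spliced

# Three hypothetical nests with constant values so the splice points are easy to see
nests = [
    (1650, 1976, 5, np.full(1976 - 1650 + 1, 1.0)),
    (1700, 1976, 27, np.full(1976 - 1700 + 1, 2.0)),
    (1700, 2001, 10, np.full(2001 - 1700 + 1, 3.0)),
]
years, spliced = splice_nests(nests)
```

With these inputs the spliced series uses the 5-predictor nest before 1700, the 27-predictor nest for 1700–1976, and the 10-predictor nest for 1977–2001.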
 The PCR method used here is described in detail by Cook et al. [1999, 2002] and, therefore, we will give only a brief explanation of the model procedure. Predictors entered the PCR model if they were significantly correlated (r > |0.025|; p < 0.10) with mean May–September streamflow during the calibration period (1931–1976). A principal components analysis (PCA) was calculated on the remaining pool of predictors. Following the Kaiser-Guttman rule, the first n eigenvectors with eigenvalues >1 were retained for the multiple regression, further reducing the dimensionality of the data. The final subset of principal components in the regression model was determined using the minimum AIC that includes a penalty term for increasing the number of predictors in the model [Akaike, 1974].
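The sequence of screening, PCA, eigenvalue truncation, and AIC-based selection can be sketched with NumPy. This is a simplified illustration, not the implementation of Cook et al. [1999, 2002]: the correlation threshold `r_crit` is a hypothetical stand-in for the significance test, principal components are added in eigenvalue order rather than searched in all subsets, and the data are synthetic.

```python
import numpy as np

def pcr_fit(X, y, r_crit=0.25):
    """Principal components regression of y on the columns of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # 1. Screen predictors on their correlation with the predictand
    r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    keep = np.abs(r) > r_crit
    Z = X[:, keep]
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0, ddof=1)
    # 2. PCA on the correlation matrix of the retained predictors
    eigval, eigvec = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigval)[::-1]
    eigval, eigvec = eigval[order], eigvec[:, order]
    # 3. Kaiser-Guttman rule: keep components with eigenvalue > 1
    n_kg = int(np.sum(eigval > 1.0))
    scores = Z @ eigvec[:, :n_kg]
    # 4. Choose the number of leading components by minimum AIC
    n, yc = len(y), y - y.mean()
    best = None
    for k in range(1, n_kg + 1):
        b, *_ = np.linalg.lstsq(scores[:, :k], yc, rcond=None)
        resid = yc - scores[:, :k] @ b
        aic = n * np.log(np.mean(resid ** 2)) + 2 * k
        if best is None or aic < best[0]:
            best = (aic, k, b)
    _, k, b = best
    # Coefficients expressed on the standardized predictors (beta weights)
    beta = eigvec[:, :k] @ b
    return {"keep": keep, "n_pcs": k, "beta": beta, "fitted": y.mean() + Z @ beta}

# Synthetic calibration data: 46 years, 10 predictors, 6 sharing a signal with y
rng = np.random.default_rng(1)
signal = rng.normal(size=46)
X = rng.normal(size=(46, 10)) * 0.5
X[:, :6] += signal[:, None]
y = signal + 0.5 * rng.normal(size=46)
model = pcr_fit(X, y)
```

Multiplying the retained eigenvectors by the regression coefficients in principal component space, as in the last step, returns one weight per original predictor.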
The period of overlap between the instrumental record of streamflow and tree ring chronologies (1907–1976) was split into two periods for calibration (1931–1976) and verification (1907–1930) of the nested PCR models. The calibration models were verified with two rigorous tests of fit, the reduction of error statistic (RE) and the coefficient of efficiency (CE) [Fritts, 1976; Cook et al., 1999]. The RE ranges from −∞ to +1. When RE exceeds zero, the calibration model shows greater skill than the mean of the instrumental data from the calibration period. The CE has the same range and calculation except that it relies on the verification period mean as a baseline of predictive skill, making the CE more difficult to pass. Finally, we determined the relative influence of each predictor chronology in the common period nest (1700–1976) of the reconstruction by taking the absolute value of the standardized regression coefficients or beta weights [Cook et al., 1999, 2002]. The beta weights represent the principal component loadings of the predictor chronologies in the model and are calculated by multiplying the matrix of retained eigenvectors by the vector of beta weights in principal component space [Cook et al., 1994]. We summed the beta weights for chronologies where the t and t + 1 series were included as predictors, and then divided by the total sum of the beta weights for all predictors in the calibration model to calculate a measure of relative variance explained (0%–100%) for each chronology [Frank and Esper, 2005]. Results are mapped and discussed below in terms of species and site importance.
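The verification statistics and the relative variance measure described above can be written compactly. The sketch below follows the verbal definitions in the text; the function names, the toy numbers, and the chronology labels are hypothetical.

```python
import numpy as np

def reduction_of_error(obs_ver, pred_ver, calib_mean):
    """RE: skill over the verification period relative to the calibration period mean."""
    obs, pred = np.asarray(obs_ver, float), np.asarray(pred_ver, float)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - calib_mean) ** 2)

def coefficient_of_efficiency(obs_ver, pred_ver):
    """CE: same form as RE, but the baseline is the verification period mean."""
    obs, pred = np.asarray(obs_ver, float), np.asarray(pred_ver, float)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

def relative_variance_explained(beta, chronology_ids):
    """Sum |beta| per chronology (t and t + 1 predictors share an id), as a percentage."""
    abs_beta = np.abs(np.asarray(beta, float))
    total = abs_beta.sum()
    shares = {}
    for b, cid in zip(abs_beta, chronology_ids):
        shares[cid] = shares.get(cid, 0.0) + 100.0 * b / total
    return shares

# Toy verification data (not from the study)
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.0, 2.5, 4.0]
re_stat = reduction_of_error(obs, pred, calib_mean=3.0)
ce_stat = coefficient_of_efficiency(obs, pred)

# Chronology "A" contributes both a year t and a year t + 1 predictor
shares = relative_variance_explained([0.2, -0.1, 0.3, 0.4], ["A", "A", "B", "C"])
```

Because the sum of squares about the verification mean is never larger than the sum of squares about any other constant, CE cannot exceed RE for the same predictions, which is why CE is the harder test to pass.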
3. Reconstructed Potomac River Streamflow
3.1. Analysis of the Reconstruction
 Twelve nested PCR models were calculated using the minimum 25 year time step to reconstruct mean May–September Potomac River streamflow from 481–2001. The reconstruction calibration and verification statistics remained significant (p < 0.05) for the entire period and the RE and CE statistics never became negative, suggesting that the model provided more information than the calibration or verification means. While the reconstruction demonstrates statistical strength for the entire 481–2001 period, we have truncated the reconstruction at 950 because of weakening verification statistics and a reduction in sample size. Models before 950 were calculated on only two Juniperus virginiana chronologies from West Virginia, representing at most nine individual tree ring series and five trees. The reduction in sample size resulted in a decrease in the variance explained (r2) and poor model performance with RE and CE near zero.
 For the 950–2001 reconstruction period, the explained variance of the model during the calibration period (1931–1976) ranged from 20% to 53% (Figure 3). The reduction in explained variance backward in time is attributed to the decrease in the number and species type of predictors available for reconstruction. The explained variance during the verification period (1907–1930) ranged from 14% to 50%. The RE and CE were nearly identical in each time step and both remained positive. However, the verification statistics during the common period of all predictors (1700–1977) were not as strong as expected, given the number of chronologies available. Going backward in time, RE and CE increased briefly during the 17th century and fluctuated for approximately 175 years until the number of predictor chronologies fell to five in 1525. The increase in RE and CE suggests that a more parsimonious model may be constructed by lengthening the common period and excluding shorter chronologies. While a reduction in the number of chronologies strengthened the verification statistics, the nested PCR models saw an accompanying decrease in the calibration r2. The number of available predictor chronologies leveled off prior to 1525 with a corresponding plateau in explained variance during the verification period. Therefore, reconstructed streamflow prior to 1525 should be interpreted with some caution.
Further inspection of the nested PCR models during the calibration and verification periods demonstrated the effect of the changing availability of predictor chronologies on the correlation with instrumental streamflow (Figure 4). First, decadal trends present in the instrumental record were adequately modeled across nested models. Most notably, the duration and intensity of the mid-1960s drought were well replicated in most models. Second, single year drought and pluvial events, such as the 1930 drought, were replicated, but models tended to underestimate extreme values. Third, a 1 year lag between the instrumental record and the proxy records occurred several times. For example, the 1949 peak in streamflow carried into 1950 for some of the nested models. Although chronologies were prewhitened prior to modeling, the carryover likely reflects the influence of prior year climatic conditions on current year growth. In this case, abundant moisture in 1949 likely resulted in extra photosynthate production and growth in 1949 that carried over to 1950, suggesting that not all of the persistence in the predictor chronologies was removed. Fourth, annual to multiyear periods of disagreement between nested models occurred throughout the record (e.g., ∼1910, 1920, and 1940). The reduction in available chronologies backward (and forward) in time and the changing species composition in the predictor pool create discrepancies between nested models during the calibration/verification period. From our initial correlations with monthly streamflow, we know that some species are better correlated with early growing season streamflow than late growing season streamflow. Overall, small differences in predicted streamflow were seen in any given year, but multiyear and decadal trends were preserved reasonably well. In cases of disagreement, differences in the number of predictor chronologies and type of tree species likely affected the nested models.
3.2. Species Importance
 The beta weights for the common period 1700–1976 are shown in Figure 1 (bottom). Fifty-two predictors (27 chronologies) from years t and t + 1 were retained for modeling, and weights were distributed across the region and species type. The mean relative variance explained for all chronologies was 3.7%, ranging from 0.48% to 7.24%. The most important species were Q. prinus, T. canadensis, and T. distichum in terms of abundance and relative variance explained. However, species represented only once (i.e., C. ovata, M. acuminata, L. tulipifera) had some of the highest beta weights (Figure 1). The single species collections were all located at the Fiddler's Green site in the Blue Ridge Mountains of Virginia, presenting a possible conflation between species and site importance. Future collections of these species across the region likely would strengthen the reconstruction. It could be argued that collections should concentrate in the South Branch subbasin because it contributes more to the measured flow of the Point of Rocks, Maryland gauge. However, the trees growing in/near that subbasin do not have a direct proportional relationship to flow. Rather, trees are responding to regional climate signals that control precipitation input into streams and availability of moisture at individual sampling sites. Often in dendroclimatological sampling, investigators select sampling sites that limit moisture to trees (e.g., rock outcrops) and potentially enhance the growth response to climate variation regardless of the subbasin in which the tree grows. Our reconstruction results confirm that precipitation is not distributed evenly across the region. Future examination of the regional climate teleconnections in the PRB might help identify sampling sites that are more sensitive to variation in moisture inputs.
 The majority of tree ring chronology predictors had greater loadings in year t but J. virginiana, C. ovata, T. canadensis, and P. rubens had greater loadings in year t + 1, demonstrating the differential species response to moisture variability through the growing season and potentially differences in phenological site type interactions. For species with large beta weights for t + 1 predictors, late season moisture may influence the next year's growth. The resulting species differences in t + 1 predictors might be exploited further to strengthen reconstructions of August–October streamflow. Approximately 27% of the beta loadings were negative, suggesting that the tree ring response to May–September moisture is not homogeneous across the region and that the distribution of moisture across sites is not even.
 The loss of key predictor species and chronologies as the nested PCR model was shifted backward (and forward) in time resulted in fluctuating calibration and verification statistics (Figure 3 and Table 1). The shift from 1700 back to 1650 lost five predictor chronologies including the L. tulipifera and M. acuminata chronologies from Fiddler's Green, Va. that together explained 14% of the relative variance in the common period 1700–1976. At each 25 year time step, additional species and chronologies that explained 5%–7% each of the relative variance during the common period were not available in the predictor pool. By 1550, only eight predictor chronologies representing four species remained. The eight predictor chronologies collectively explained 24% of the common period relative variance. As previously mentioned, the increase in verification statistics in the 1625–1700 period might be an effect of the reduction in the number of available chronologies from the common period. We experimented with a more stringent rule for inclusion into the predictor pool in which the correlation with streamflow had to be significant at p < 0.05, but found little difference between models through time.
The reduction in sample size in individual chronologies ending from 1550–1700 is a second explanation for the fluctuation in the calibration and verification statistics. Typically, fewer than five individual series were used to develop the chronologies in early decades, resulting in a decrease in the expressed population signal (EPS) in each chronology. The EPS is a measure of the common variance in a chronology and weakens as sample size decreases [Briffa, 1984; Wigley et al., 1984]. Future modeling efforts may be strengthened by closer examination of individual chronologies and adjustment of chronology length based on the EPS. Also, the location of individual chronologies with respect to the Point of Rocks gauge may influence the strength of the reconstruction when the number of tree ring chronologies is very small (i.e., 950–1550). The distal locations of the Juniperus virginiana and Taxodium distichum chronologies differ climatically from the Point of Rocks gauge and do not fully represent the Potomac River Basin. Our reconstruction results suggest that additional sampling should take place to expand the spatial, temporal, and species coverage of the tree ring network in the mid-Atlantic Region. Alternatively, we could incorporate younger chronologies (<1700) and archeological samples to expand the predictor pool and shorten the common period, taking advantage of the diversity of younger species. Incorporation of shorter tree ring series would likely present bias from juvenile growth and introduce the segment-length curse [Cook et al., 1995]. The segment-length curse refers to the maximum time span of recoverable climatic or streamflow information as it relates to the length of individual tree ring series that are used to build site chronologies. We sought to examine annual to multidecadal periods of streamflow and excluded series shorter than 125 years to avoid the loss of information.
Also, we experimented with a shorter common period (<276 years), and thus, an expanded predictor pool of tree ring chronologies. However, the explanatory power of the model was not greatly improved, suggesting that a ceiling may exist for the explanatory power of streamflow reconstructions in eastern North America. While we improved on the Cook and Jacoby [1983] reconstruction, we expected that the expanded predictor pool and addition of new species would have explained more of the variance of the instrumental record. Further research is needed to investigate this potential limitation.
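The dependence of the EPS on sample depth, noted above for chronologies built from fewer than five series, follows from the formula of Wigley et al. [1984], EPS = N r̄ / (1 + (N − 1) r̄), where N is the number of series and r̄ is the mean interseries correlation. A minimal sketch follows; the value r̄ = 0.4 is hypothetical and is not taken from the chronologies in this study.

```python
def expressed_population_signal(n_series, rbar):
    """EPS = N * rbar / (1 + (N - 1) * rbar), after Wigley et al. [1984]."""
    return n_series * rbar / (1.0 + (n_series - 1) * rbar)

rbar = 0.4  # hypothetical mean interseries correlation
eps_by_depth = {n: expressed_population_signal(n, rbar) for n in (3, 5, 10, 20)}
for n, eps in eps_by_depth.items():
    print(f"{n:2d} series: EPS = {eps:.2f}")
```

For a fixed r̄ the EPS rises quickly with the first few series and then flattens, so the early, sparsely replicated portions of a chronology carry a noticeably weaker common signal.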
4. Frequency, Intensity, and Duration of Events
 Annual to multidecadal variability occurred throughout the reconstructed record (950–2001) of Potomac River streamflow, and abrupt transitions from dry to wet or wet to dry were common throughout the reconstruction (Figure 3a). In the Potomac River Basin, drought and pluvial events challenge water resource managers' ability to supply adequate clean water. By further examining the frequency (events per time period), intensity (a value's departure from the median), and duration of events (years below or above the long-term median), water resource managers may be better able to integrate reconstructed streamflow into management decisions. Intensity and duration were examined by computing the 10 lowest and highest reconstructed n year running means for n = 1, 5, 11, and 25 years (Table 2). The most severe n year events from the instrumental period were calculated for comparison to the reconstructed record of streamflow. Overall, the reconstructed record contained drought events drier than recorded in the instrumental record and pluvial events less wet than observed. Surprisingly, the instrumental record of the 20th century contains the fourth driest year (1930) in the last 1000 years. However, the multiyear drought in the 1960s was more severe in the reconstructed record than the 1930 event, ranking in the top 10 driest events in the 1, 5, 11, and 25 year periods.
Table 2. Lowest n year Moving Averages of the Reconstructed (950–2001) and Actual Streamflow^a
^a Events are nonoverlapping averages of the mean May–Sept Potomac River streamflow (m³/s) for n year periods with the center year in parentheses. Ranks 1–10 are the most severe reconstructed events and actual is the most severe observed event.
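The ranking of nonoverlapping n year events described for Table 2 can be sketched as follows. This is an illustrative outline rather than the procedure actually used: the function name is hypothetical, and the greedy rule for enforcing nonoverlap (accept the lowest running mean first, then skip any window sharing a year with an accepted event) is an assumption. It expects odd n, as in n = 1, 5, 11, and 25, so that each window has a single center year.

```python
import numpy as np


def lowest_nonoverlapping_events(years, flow, n, k=10):
    """Return up to k (center_year, mean_flow) pairs for the lowest
    n-year running means that share no years with one another."""
    years = np.asarray(years)
    flow = np.asarray(flow, dtype=float)
    # Running mean of every complete n-year window.
    means = np.convolve(flow, np.ones(n) / n, mode="valid")
    order = np.argsort(means, kind="stable")    # driest windows first
    used = np.zeros(flow.size, dtype=bool)      # years already assigned
    events = []
    for start in order:
        if used[start:start + n].any():
            continue                            # overlaps an accepted event
        used[start:start + n] = True
        center_year = int(years[start + n // 2])
        events.append((center_year, float(means[start])))
        if len(events) == k:
            break
    return events


# Synthetic series for demonstration only (not the reconstruction).
demo_years = np.arange(2000, 2010)
demo_flow = [5, 1, 1, 1, 5, 5, 0, 0, 0, 5]
print(lowest_nonoverlapping_events(demo_years, demo_flow, n=3, k=2))
```

The highest (pluvial) events could be ranked with the same function by passing the negated series and flipping the sign of the returned means.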
 The late 16th and early 17th centuries saw some of the most severe single year and multiyear drought events, which exceeded the intensity of any events in the observed record. The severe and prolonged drought of the 1620s and 1630s was exceptional in intensity and duration (Table 2). The early 17th century drought ranked twice in the top 5 year droughts, first in the 11 year droughts, and second in the 25 year droughts. Further, the 25 year drought centered on 1634 was better replicated than the top-ranked drought in the 11th century. The intensity and duration of late 16th century drought events further confirm the existence of the 16th century megadrought recorded in tree ring proxies of moisture across much of North America [Stahle et al., 2000, 2007; Woodhouse and Overpeck, 1998]. Additionally, regional drought events of historic significance around the turn of the 16th century (i.e., Roanoke Island and Jamestown droughts) were replicated well in our streamflow reconstruction [Stahle et al., 1998]. It must be noted that the T. distichum chronologies used in our analysis were also used in the work of Stahle et al., and our results are therefore at least partly influenced by those chronologies.
 Pluvial events of the late 20th century rank among the wettest in the reconstruction, but the intensity of these events was underestimated in the model, suggesting that the reconstruction was conservative in the representation of extreme wet events. This is a common feature in hydroclimatic reconstructions from tree rings. The intensity of late 20th century pluvials confirmed the trend of increased moisture in the Potomac River Basin during the past century [Neff et al., 2000; R. Maxwell et al., A 1248 year reconstruction of May precipitation for the Mid-Atlantic Region using Juniperus virginiana tree rings, submitted to Journal of Climate, 2011] and a great portion of the Midwestern United States [Cook et al., 2010; McEwan et al., 2010]. The year 1996 was exceptional in the instrumental record because it had two major flooding events of the Potomac River at Harpers Ferry in January and September. While the reconstruction of streamflow did not represent the full range of pluvial values in the observed record, the reconstruction suggests that pluvial events up to 25 years in duration may have occurred in the past. The 11th century ranked among the top wettest 1, 5, 11, and 25 year periods, coincident with the Medieval Warm Epoch (900–1300) and the associated increase in PRB moisture during that time (Maxwell et al., submitted manuscript, 2011). The 20th century has seen some of the driest and wettest annual to decadal events in the past millennium, but longer and more severe events were reconstructed in previous centuries. We examined drought and pluvial event duration and frequency in the reconstructed and instrumental records by counting n year events below or above the median streamflow (Figure 5). Both the reconstructed and instrumental records show similar counts of 1 to 4 year droughts and pluvials, but the 20th century did not fully represent the frequency of longer event durations that occurred in the reconstruction. While uncommon, the reconstruction contained 7 to 14 year drought and pluvial events that may stress the water supply system beyond levels seen in the observed record.
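The duration counts summarized in Figure 5 amount to tallying runs of consecutive years below or above the median. A minimal sketch of such a tally is given below; the function name is hypothetical, and the treatment of years exactly equal to the median (they end the current run and belong to neither category) is an assumption, since the text does not specify it.

```python
from collections import Counter

import numpy as np


def event_durations(flow):
    """Count runs of consecutive years below (drought) and above (pluvial)
    the series median. Returns two Counters mapping duration -> count."""
    flow = np.asarray(flow, dtype=float)
    median = np.median(flow)
    droughts, pluvials = Counter(), Counter()

    def close_run(sign, length):
        # Record a finished run in the matching tally.
        if sign == -1:
            droughts[length] += 1
        elif sign == 1:
            pluvials[length] += 1

    run_sign, run_len = 0, 0
    for value in flow:
        sign = -1 if value < median else (1 if value > median else 0)
        if sign != 0 and sign == run_sign:
            run_len += 1                 # current run continues
            continue
        close_run(run_sign, run_len)     # run ended; start a new one
        run_sign, run_len = sign, (1 if sign != 0 else 0)
    close_run(run_sign, run_len)         # close the final run
    return droughts, pluvials


# Synthetic example (median = 4): runs are 2 dry, 3 wet, 1 dry, 1 wet, 1 dry.
dry, wet = event_durations([1, 1, 5, 5, 5, 1, 9, 3])
print(dict(dry), dict(wet))
```

Applying the same function to the reconstructed and instrumental series separately would give the two sets of counts that are compared in the text.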
 To highlight multidecadal trends in the median and variance of the reconstructed record, we created box-and-whisker plots for 50 year periods of the back-transformed proxy streamflow record and compared them to the back-transformed instrumental data (1907–2007; Figure 6). Several observations were made about the multidecadal trends in the reconstruction. First, the median and variance of the back-transformed instrumental record were modeled well, suggesting the reconstruction was suitable for evaluation of multidecadal trends. Second, the 20th century had the greatest variability in the past 300 years, confirming previous results in the Potomac River Basin (Maxwell et al., submitted manuscript, 2011). Third, the period from 1700–1900 had below median streamflow and reduced variability compared to the entire reconstruction. The second half of the 19th century was greatly below the long-term median and had the smallest variance of any 50 year period in the reconstruction. Fourth, the period from 1250–1500 was consistently above the median, with only small fluctuations in variance and few extreme drought or pluvial events. Finally, extreme events (i.e., <10th and >90th percentiles) were not evenly distributed over the past millennium.
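The 50 year summaries underlying the box-and-whisker comparison can be approximated numerically. The sketch below is illustrative only: the function name is hypothetical, and two choices are assumptions not stated in the text, namely that periods are aligned to multiples of the period width (e.g., 950–999, 1000–1049) and that the 10th and 90th percentile thresholds are taken from the full series.

```python
import numpy as np


def period_summary(years, flow, width=50):
    """Median, sample variance, and counts of extreme years for each
    consecutive block of `width` years."""
    years = np.asarray(years)
    flow = np.asarray(flow, dtype=float)
    p10, p90 = np.percentile(flow, [10, 90])   # full-record thresholds
    first = int(years.min()) // width * width  # align blocks to multiples
    rows = []
    for start in range(first, int(years.max()) + 1, width):
        mask = (years >= start) & (years < start + width)
        if not mask.any():
            continue
        block = flow[mask]
        rows.append({
            "start": start,
            "median": float(np.median(block)),
            "variance": float(np.var(block, ddof=1)) if block.size > 1 else float("nan"),
            "n_below_p10": int((block < p10).sum()),
            "n_above_p90": int((block > p90).sum()),
        })
    return rows


# Synthetic series for demonstration only (not the reconstruction).
rng = np.random.default_rng(42)
demo_years = np.arange(950, 2002)
demo_flow = rng.lognormal(mean=5.0, sigma=0.4, size=demo_years.size)
for row in period_summary(demo_years, demo_flow)[:3]:
    print(row)
```

Uneven counts of extreme years across blocks would correspond to the uneven distribution of <10th and >90th percentile events noted above.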
 Our mean May–September reconstruction of Potomac River streamflow at Point of Rocks, Maryland (950–2001) is moderately correlated (r = 0.65) with Cook and Jacoby's [1983] mean June–August reconstruction of the same streamflow gauge (1730–1977). The moderate correlation was likely an effect of (1) the lengthened season of reconstruction; (2) differences in the reconstruction models; and (3) the increased number of chronologies and species types. Cook and Jacoby originally reconstructed June–August streamflow using canonical regression of five chronologies representing four species to determine the appropriate reconstruction period and model. Our analysis relied on a similar method of principal component analysis of 27 chronologies representing nine species during almost the same time period (1700–1976). We hypothesize that the expanded range of species aided in the lengthening of the seasonal window of reconstruction by taking advantage of the phenological traits of each species and their differential response to growing season climate variability. Additional small differences in the selection of predictor chronologies and the calibration/verification period may have contributed to differences in the reconstructions.
 The greatest difference between the reconstructions was the increase in the mean and variance from the previous reconstruction to the present model (Figure 7). Cook and Jacoby's reconstruction had an 85.9 m³/s mean and 27.8 m³/s standard deviation, while the mean and standard deviation of our reconstruction were 132.8 m³/s and 46.0 m³/s, respectively. The newer model better represents the mean (186.2 m³/s) and standard deviation (90.3 m³/s) of the untransformed 1907–2007 instrumental record. Extreme pluvial events in the observed record contribute to an instrumental standard deviation more than triple that of the Cook and Jacoby reconstruction and roughly double that of our reconstruction. The reduction in the mean and variance of the reconstructions is at least partly an effect of log transforming the streamflow data prior to modeling. Back-transformed values of streamflow may not always have the same mean and variance as the original data [Prairie et al., 2006]. Despite the differences in the mean and variance of the reconstructions, the records showed similar trends in local variance and duration of events. For example, the 1850–1900 period of below median flow and low variance and the 1900–1950 period of above median flow and high variance were present in each reconstruction. Additionally, the duration of the 1960s drought was replicated in both records. The increased variance in the instrumental period from 1977 to the present, in addition to the decrease in the number of predictor chronologies for the most recent 30+ years, may account for the reduced performance of our reconstruction in the 1700–2001 nested PCR model (Figure 3). Future modeling efforts may be challenged to reproduce the extreme pluvial events of the past 30+ years because tree growth is physiologically limited regardless of the increase in available moisture. However, the frequency and magnitude of Potomac River paleofloods may be investigated by examining floodplain trees wounded by flood debris [Sigafoos, 1964; Yanosky, 1983; Yanosky and Jarrett, 2002]. The reconstruction of streamflow in the eastern United States may be more helpful for evaluating drought probabilities than for reconstructing individual years. The mid-Atlantic Region is predicted to become wetter in the coming decades [Najjar et al., 2000], and the development of multicentury records of streamflow will help water resource managers place anticipated changes in climate and streamflow in the context of past multiyear, decadal, and centennial events.
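The effect of log transformation on the mean and variance of back-transformed values, noted above, can be demonstrated with synthetic data. The sketch below is not the reconstruction model: the lognormal series and the explained-variance fraction of 0.5 are assumed values, and shrinking the log-space deviations by the square root of that fraction is a simple stand-in for regression estimates, whose variance equals the explained fraction of the predictand variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "observed" flow; the parameters are illustrative assumptions.
observed = rng.lognormal(mean=5.0, sigma=0.5, size=1000)
log_obs = np.log(observed)

# Stand-in for a log-space reconstruction explaining half the variance:
# deviations from the log mean are shrunk by sqrt(R^2).
r_squared = 0.5
log_recon = log_obs.mean() + np.sqrt(r_squared) * (log_obs - log_obs.mean())

# Back-transform to original units.
back_transformed = np.exp(log_recon)

print("observed mean, std:        ", observed.mean(), observed.std(ddof=1))
print("back-transformed mean, std:", back_transformed.mean(), back_transformed.std(ddof=1))
```

Because the exponential is convex, compressing the variance in log space lowers both the mean and the spread after back-transformation, even though the log-space mean is unchanged. This is consistent with reconstructed means and standard deviations falling below the untransformed instrumental values.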
6. Summary and Conclusions
 Mean May–September Potomac River streamflow was reconstructed from 950–2001 using a network of tree ring chronologies (n = 27) representing nine eastern tree species. We chose a nested PCR model to maximize the number of available predictor chronologies backward in time. Our reconstruction model performed well in the common period, explaining 52% of the variance in the calibration period with a 0.25 RE statistic during the verification period. The fluctuation in calibration and verification statistics during the 1550–1700 period likely was caused by the reduction in sample size in combination with the loss of important predictor species. Surprisingly, C. ovata, M. acuminata, and L. tulipifera were strong predictors of streamflow despite being represented by only one chronology per species. Additional sampling should focus on expanding the spatial, temporal, and species coverage of the tree ring network in the mid-Atlantic Region to strengthen the reconstruction during the 16th and 17th centuries. Our reconstruction of mean May–September Potomac River streamflow was a significant improvement over Cook and Jacoby's [1983] streamflow reconstruction because it expanded the seasonal window, lengthened the record by 780 years, and better replicated the mean and variance of the instrumental record.
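For readers unfamiliar with the verification statistics cited above, the RE and CE can be written in their standard forms: both equal one minus the ratio of the squared prediction error to a reference sum of squares, with RE using the calibration-period mean as the reference and CE using the verification-period mean. The sketch below uses hypothetical function names and synthetic numbers; it does not reproduce the values reported in this study.

```python
import numpy as np


def reduction_of_error(obs_ver, pred_ver, calibration_mean):
    """RE: skill relative to predicting the calibration-period mean."""
    obs_ver = np.asarray(obs_ver, dtype=float)
    pred_ver = np.asarray(pred_ver, dtype=float)
    sse = np.sum((obs_ver - pred_ver) ** 2)
    reference = np.sum((obs_ver - calibration_mean) ** 2)
    return 1.0 - sse / reference


def coefficient_of_efficiency(obs_ver, pred_ver):
    """CE: skill relative to predicting the verification-period mean."""
    obs_ver = np.asarray(obs_ver, dtype=float)
    pred_ver = np.asarray(pred_ver, dtype=float)
    sse = np.sum((obs_ver - pred_ver) ** 2)
    reference = np.sum((obs_ver - obs_ver.mean()) ** 2)
    return 1.0 - sse / reference


# Synthetic verification data (illustrative only).
obs = np.array([4.8, 5.1, 5.6, 4.9, 5.3])
pred = np.array([4.9, 5.0, 5.4, 5.1, 5.2])
cal_mean = 5.4  # assumed calibration-period mean of the observations

print("RE =", reduction_of_error(obs, pred, cal_mean))
print("CE =", coefficient_of_efficiency(obs, pred))
```

Values above zero indicate that the model outperforms the corresponding mean as a predictor. CE can never exceed RE, because the verification-period mean minimizes the reference sum of squares, which makes CE the more stringent of the two tests.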
 Knowledge of past drought and pluvial events is important for water resource management in the Potomac River Basin. However, the instrumental record of Potomac River streamflow does not adequately represent the full range of variability in the past millennium, as evidenced by our reconstruction of streamflow. Extreme drought and pluvial events ranging from 1 to 25 years in duration have occurred during the period of the instrumental record, but more severe drought events were represented by the reconstruction. Multidecadal variability in streamflow also must be an important consideration in management practices. Half-century periods of below and above median streamflow were common in the past millennium and may require new strategies to secure the water supply system. Further modeling of streamflow events longer than 50 years may be possible in the mid-Atlantic Region using millennial-length J. virginiana and T. distichum chronologies. With the expanded season of streamflow reconstruction, water resource managers may be better prepared to meet water demand and maintain water quality throughout the low-flow season from July to November. Currently, tree ring reconstruction models have difficulty modeling fall streamflow events, but some portion of the variance may be explained with collection of additional sites and species, particularly those species for which the previous year's moisture has a preconditioning effect on growth.
 In the coming decades, precipitation and streamflow are projected to increase in amount and variability in the mid-Atlantic region, impacting the delivery of adequate and quality water for human use and ecosystem services [Najjar et al., 2000; Neff et al., 2000]. Extreme short-term pluvial events in the observed record of streamflow already have exceeded all reconstructed pluvial events. These positive extremes present difficulty for tree ring reconstructions because trees are physiologically limited in their uptake of water from storm events and do not track subannual high-flow periods well. However, the duration of longer pluvial events was well replicated during the instrumental period, suggesting that the reconstruction may be useful in assessing the probability of multiyear or longer events. Future work will concentrate on (1) expanding the temporal, spatial, and species coverage of the tree ring network in the mid-Atlantic Region for water resource modeling; (2) reconstructing low-frequency events over the past millennium; (3) strengthening the later period of the low-flow season; and (4) communicating and integrating our results into water management practices and water demand modeling through collaboration with regional water resource managers.
 We would like to thank Rob Wilson and ITRDB contributors for kindly providing their tree ring data and the Interstate Commission on the Potomac River Basin for providing streamflow data. Funding for the project was provided by the NASA WV Space Grant Consortium, West Virginia University Eberly College of Arts and Sciences, and the National Science Foundation grant 0925114. Lamont-Doherty Earth Observatory contribution 7452.