Laboratory experiment
We used baited minnow traps to collect live fish from the Mpanga River drainage. Collections were from four sites in June 2006 (Bunoga, Rwebakwata, Kahunge, Kantembwe; Fig. 1) and two sites in May–June 2008 (Kamwenge, Kanyantale; Fig. 1). Kanyantale is a hypoxic swamp site with average oxygen levels of 0.28 mg L^{−1} (±0.03 SE) and an average per cent oxygen saturation of 3.5% in May 2008. All other sites are described in Crispo & Chapman (2008). Briefly, Kantembwe is a hypoxic swamp site with dissolved oxygen levels similar to those observed in Kanyantale. Rwebakwata and Kahunge are river sites adjacent to the swamp and experience seasonal fluctuations in dissolved oxygen because of flooding of the adjacent swamp and the influx of organic debris during wet seasons. Bunoga is a river site that also experiences some seasonal fluctuation in dissolved oxygen, possibly because of surrounding intensive land use. However, oxygen at these three river sites never reaches levels as low as those in the swamp. Although we have no long-term oxygen data for Kamwenge, we expect that this river site experiences relatively high oxygen levels year-round, given its large geographical distance from the swamp and consistently high oxygen levels during our expeditions to this area. We know of no physical barriers to dispersal that exist among sampled sites, suggesting that contemporary gene flow is possible among all sites, at least during the wet seasons when flooding occurs (Crispo & Chapman, 2008, 2010). Habitat patchiness probably restricts dispersal in the swamps during the dry seasons (L. Chapman, personal observation). A large waterfall separates the Kamwenge site from downstream swamps (L. Chapman, personal observation). We expect dispersal between river and swamp environments to be greatest among the sites located closest to the junction between these two environments.
All experiments took place at McGill University following shipment from Uganda of the collected adults. The experiment was divided into two parts, with Bunoga, Rwebakwata, Kahunge, and Kantembwe broods raised in 2007–2008, and Kamwenge and Kanyantale broods raised in 2008–2009. This split was necessary because of spatial constraints in the laboratory and the logistics of transferring live fish from Uganda. For each collection site, seven full-sib families were raised, where a family consisted of the brood of one male–female pair (i.e. each parent used only once in the experiment). An exception was Kanyantale, for which nine full-sib families were used. Pseudocrenilabrus multicolor is a mouth brooder, meaning that fry develop in the mouth of the female parent. Brood sizes reached values of > 60, although minimum brood sizes could not be determined because of cannibalism by the mothers. F_{1} broods from each family were split between high-oxygen and low-oxygen treatments 1 week after release from the mouth. A separate 14-gallon tank was used for each combination of family and treatment. Filtration was performed using Hagen Fluval underwater filters. Low-oxygen conditions (tank averages of 0.54–1.29 mg L^{−1}) were maintained using a commercial oxygen controlling system (Point Four Systems Inc., Coquitlam, British Columbia). High-oxygen conditions (tank averages of 7.27–8.07 mg L^{−1}) were maintained via constant bubbling of air through the water column. Fry were fed Hikari First Bites fry food for the first 3 weeks after release from the females’ mouths. TetraMin Pro Tropical Crisps were gradually introduced 2 weeks after release and were then fed to the growing offspring for the remaining duration of the experiment. Mortality during the experiment was related primarily to aggression among mature siblings and was not related to oxygen. At approximately 1 month of age, broods were culled to 10 fish per tank. If fewer than 10 fish were present in a tank, we did not perform culling. Ambient temperature remained constant at 25.5 °C.
Gills
At an age of approximately 1 year, the largest two males in each tank (or one male if there was only one mature male in the tank) were harvested for analysis of morphology. We used only males because mouth brooding in females might have affected the development of the gills (Schwartz, 1995; C. O’Connor, E. Reardon & L. Chapman, unpublished data). Some, but not all, females brooded during the experiment, but whether or not an individual female had brooded was not recorded. Fish were euthanized using buffered tricaine methanesulfonate (MS-222; pH = 7.0) and were preserved in 4% paraformaldehyde (buffered with phosphate-buffered saline; pH = 7.0). Gills were dissected out from the branchial basket on one side of the fish (normally the right side, unless the gills were damaged during the dissection process). The four gill arches were separated, and both sides of each arch (hemibranch) were photographed using a Lumenera Scientific Infinity camera attached to a dissecting microscope. Measurements of five gill metrics were made using Motic Images Plus version 2.0, including total gill filament length (TGFL), average gill filament length (AFL), total number of gill filaments (TNF), total hemibranch area (THA) and total perimeter of the hemibranches (TP) (Muir & Hughes, 1969; Hughes, 1984; Langerhans et al., 2007; Chapman et al., 2008). Units were mm or mm^{2}. TGFL was quantified by measuring the length of every fifth filament, using the average of two measures to estimate the length of the four filaments between them, summing the lengths of the filaments for one gill, and multiplying by two to obtain the overall length of the filaments for both gills. Similarly, AFL was quantified using this procedure, but the total filament length was divided by the number of filaments on one gill, and the final value was not multiplied by two. TNF was quantified by counting the number of filaments on one gill and multiplying by two. THA was quantified by estimating the area around the filaments for each side of each hemibranch, summing these values and multiplying by two (see Fig. 3 in Langerhans et al., 2007). TP was quantified in a similar way, but using the perimeter of the area measured for THA. All of these metrics were positively correlated with total gill surface area measured in a subset of the fish from Bunoga and Kantembwe (log_{10}-transformed values; Pearson two-tailed correlation: TGFL, r = 0.908, P < 0.001; THA, r = 0.923, P < 0.001; TNF, r = 0.639, P = 0.002; AFL, r = 0.927, P < 0.001; TP, r = 0.851, P < 0.001). These metrics were also correlated with total gill surface area in previous studies (Chapman et al., 2000, 2007).
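The filament-based estimates (TGFL and AFL) described above can be illustrated with a minimal sketch. It assumes that the first and last filaments of each hemibranch are among those measured, so that every unmeasured filament lies between two measured ones; the text does not state how any trailing filaments were handled. The function names and the example lengths are hypothetical and are not taken from the study.

```python
def hemibranch_filament_length(measured, gap=4):
    """Estimated summed filament length (mm) for one hemibranch.

    measured: lengths of every fifth filament, in order along the arch.
    gap: number of unmeasured filaments between two measured ones.
    """
    total = sum(measured)
    # Each unmeasured filament is assigned the mean of its two measured neighbours.
    for a, b in zip(measured, measured[1:]):
        total += gap * (a + b) / 2.0
    return total


def hemibranch_filament_count(measured, gap=4):
    """Filament count implied by the sampling scheme (assumes len(measured) >= 1)."""
    return len(measured) + gap * (len(measured) - 1)


def total_gill_filament_length(hemibranchs):
    """TGFL: sum over the hemibranchs of one side, doubled for both sides."""
    return 2.0 * sum(hemibranch_filament_length(m) for m in hemibranchs)


def average_filament_length(hemibranchs):
    """AFL: one-side total length divided by one-side filament count (not doubled)."""
    length = sum(hemibranch_filament_length(m) for m in hemibranchs)
    count = sum(hemibranch_filament_count(m) for m in hemibranchs)
    return length / count


# Hypothetical measurements (mm) for two hemibranchs from one side of a fish.
example_side = [[1.0, 2.0, 1.0], [2.0, 2.0]]
print(total_gill_filament_length(example_side))  # 56.0
print(average_filament_length(example_side))
```

A real fish would contribute eight hemibranchs per side (four arches, two hemibranchs each); two are used here only to keep the arithmetic easy to follow.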
To test for population, treatment and population-by-treatment effects on gill size, we performed a mixed-model analysis of variance (ANOVA), instead of a multivariate analysis of variance (MANOVA), so that ‘family’ could be included as a random factor. We standardized gill metrics to a common body mass so that we could perform a principal component analysis (PCA) and used the PCA scores as the response variable in the univariate test. Each of the above gill metrics was standardized to a common body mass using the allometric equation: Y_{std} = Y_{obs}(M_{avg}/M_{obs})^{β}, where Y represents the gill metric; M represents body mass; subscripts std, obs, and avg refer to the standardized, observed (actual) and average (for all fish) measures, respectively; and β represents the slope of the relationship between the gill metric and body mass across all populations (Reist, 1986; Hendry & Taylor, 2004; Chapman et al., 2008). The β values were obtained from analyses of covariance (ANCOVAs) including population and treatment as fixed factors, family (nested within population) as a random factor, the population-by-treatment interaction, and log_{10}-transformed body mass as a covariate. All analyses were performed using Type III sums of squares in SPSS version 16.0. All gill metrics were log_{10}-transformed for the ANCOVAs, but non-transformed trait values were used in the above allometric equation. Size standardization via this approach was appropriate because interactions with body mass were non-significant in ANCOVAs (results not shown).
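As an illustration of the allometric standardization, the following sketch applies the equation above to a list of fish. It assumes that M_{avg} is the arithmetic mean body mass across all fish and that β has already been estimated elsewhere (here it is simply passed in as an argument). The function name and the example values are hypothetical.

```python
import math


def standardize_to_common_mass(y_obs, m_obs, beta):
    """Apply Y_std = Y_obs * (M_avg / M_obs) ** beta to each fish.

    y_obs: observed (non-transformed) trait values, one per fish.
    m_obs: observed body masses, in the same order.
    beta:  allometric slope, assumed to be estimated separately.
    """
    m_avg = sum(m_obs) / len(m_obs)  # assumption: arithmetic mean over all fish
    return [y * (m_avg / m) ** beta for y, m in zip(y_obs, m_obs)]


# Hypothetical fish whose trait follows Y = 3 * M ** 0.8 exactly.
masses = [2.0, 4.0, 6.0]
traits = [3.0 * m ** 0.8 for m in masses]
standardized = standardize_to_common_mass(traits, masses, beta=0.8)

# With a perfect allometric fit, every fish maps to the value expected
# at the mean mass, so all standardized values coincide.
expected = 3.0 * 4.0 ** 0.8
print(all(math.isclose(v, expected) for v in standardized))
```

In this idealized case the body-mass effect is removed completely; with real data, the residual scatter around the allometric line is what remains for the downstream analyses.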
We performed a principal components analysis on the log_{10}-transformed body mass–standardized gill metrics. We used the correlation matrix and regression method to obtain composite scores. PCA extracted one component (reflecting gill size) with an eigenvalue > 1 (Table 1), and we performed a mixed-model ANOVA using the scores from this single composite variable as the response variable. We included population and treatment as fixed factors, the population-by-treatment interaction, and family (nested within population) as a random factor (Model 1).
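The PCA step can be sketched as follows. This is an illustrative NumPy version and not the SPSS procedure used in the study. It assumes that, for an unrotated PCA on the correlation matrix, regression-method scores reduce to the standardized data projected onto each eigenvector and divided by the square root of its eigenvalue. The function name and the simulated data are hypothetical.

```python
import numpy as np


def pca_correlation(data):
    """PCA on the correlation matrix, keeping components with eigenvalue > 1.

    data: 2-D array, rows = fish, columns = (log-transformed) gill metrics.
    Returns all eigenvalues (descending), loadings and scores of kept components.
    """
    x = np.asarray(data, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)  # standardize each metric
    corr = np.corrcoef(x, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)           # returned in ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                              # eigenvalue > 1 criterion
    # Loadings are the correlations between each metric and each component.
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    # Unit-variance component scores (assumed equivalent to regression-method scores).
    scores = z @ eigvecs[:, keep] / np.sqrt(eigvals[keep])
    return eigvals, loadings, scores


# Simulated data: five metrics driven by one shared "size" signal plus noise.
rng = np.random.default_rng(0)
size = rng.normal(size=40)
demo = np.column_stack([size + 0.3 * rng.normal(size=40) for _ in range(5)])
eigvals, loadings, scores = pca_correlation(demo)
print(eigvals.round(3))
print(loadings.round(3))
```

The sign of each component is arbitrary in an eigendecomposition, so loadings and scores may need to be multiplied by −1 to give a component that increases with gill size.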
Table 1. Loadings (correlations) of each gill metric on the components extracted from the principal component analyses. Only components with eigenvalues > 1 are shown. Gill metrics were body mass–standardized and log_{10}-transformed as described in the text.

| Gill metric | Both treatments PC1 | Low oxygen PC1 | Low oxygen PC2 | High oxygen PC1 | High oxygen PC2 |
| --- | --- | --- | --- | --- | --- |
| TGFL | 0.990 | 0.969 | −0.056 | 0.954 | −0.154 |
| THA | 0.983 | 0.964 | −0.228 | 0.963 | −0.191 |
| TNF | 0.610 | 0.576 | 0.780 | 0.566 | 0.757 |
| AFL | 0.944 | 0.851 | −0.508 | 0.663 | −0.746 |
| TP | 0.967 | 0.835 | 0.307 | 0.752 | 0.528 |

We performed additional analyses to gain insight into family-level variation in plasticity, effects on individual gill metrics and oxygen effects within treatments. First, family-by-treatment interactions could not be tested in the above model because family is a random factor nested within population. Thus, we also ran the analysis without the population term, but including family (a random factor), treatment and the family-by-treatment interaction (Model 2). This analysis allowed us to determine whether gill size plasticity varies among families, irrespective of population. We were able to perform this analysis because the population term was non-significant (see Results). Second, we performed univariate tests on each gill metric separately to see if treatment and population effects, and their interaction, influence individual gill metrics. These tests included the log_{10}-transformed non-standardized gill metrics as dependent variables, log_{10}-transformed body mass as a covariate, population and treatment as fixed factors, family (nested within population) as a random factor, and the population-by-treatment interaction. Interactions with body mass were non-significant (results not shown) and were therefore not included in the model.
Within the low-oxygen treatment, the oxygen controlling system recorded the dissolved oxygen concentration every hour for the duration of the experiment. This was necessary for precise control of oxygen in the low-oxygen tanks. For the high-oxygen treatment, we had weekly oxygen readings that were taken using a handheld device (OxyGuard Polaris). A commercial controlling system was not used for the high-oxygen treatment because these levels of oxygen were easier to achieve. We performed ANCOVAs within each treatment separately to test whether the fine-scale variation in oxygen concentration that occurred within treatments influenced gill size. For these analyses, the standardization and PCA methods described earlier were repeated using only fish from the high-oxygen or low-oxygen treatments. Two PCA components with eigenvalues > 1 were extracted for each treatment (Table 1), and thus two ANCOVAs were performed for each treatment. We performed ANCOVAs with the family mean PCA scores as the response variables, population as a fixed factor, the log_{10}-transformed mean oxygen concentration (mg L^{−1}) as a covariate and the population-by-oxygen interaction. All effects were non-significant for the high-oxygen treatment, and so we do not report the results. These analyses were used to reveal whether observed family effects are due, at least in part, to tank effects related to small differences in dissolved oxygen concentration.
Brain
Brains were extracted from the same fish that were used for the gill metrics, using standard dissection methods (Chapman et al., 2008), and were stored in 4% buffered paraformaldehyde. We obtained the blotted weight to the nearest 0.1 mg and used the average of five measurements per brain for the analyses. We performed an ANCOVA with log_{10}-transformed brain mass included as the response variable, population and treatment as fixed factors, family (nested within population) as a random factor, the population-by-treatment interaction and log_{10}-transformed body mass as a covariate. Interactions with body mass were not significant (results not shown), and so they were not included in the model. We were not able to perform a post hoc test for multiple comparisons with the covariate (body mass) included in the model. We therefore also standardized the brain mass to a common body mass, as we did for the gill metrics. We then performed an ANOVA as earlier, using the log_{10}-transformed body mass–standardized brain mass as the response variable, without the body mass covariate (results not shown). Because population effects were significant (see Results), we followed this with a Tukey honestly significant difference post hoc test for multiple comparisons to identify homogeneous subsets with respect to population. We did not test for family-by-treatment interactions, as we did for the gills, because population and population-by-treatment effects on brain mass were strong (see Results). Therefore, we could not remove the population effect from the model to test for interactions with family. Using the within-treatment adjusted family means (from the above ANCOVA including the body mass covariate) as the response variables, we also performed ANCOVAs for each treatment separately, including the oxygen covariate, as we did for the gills. Within the high-oxygen treatment, the population effect was significant, but the oxygen covariate was not, and so we do not report the results from the analyses within the high-oxygen treatment.