Multiple evolutionary origins of legume traits leading to extreme rhizobial differentiation


Author for correspondence:
Ryoko Oono
Tel: +1 612 6266463


  • When rhizobia differentiate inside legume host nodules to become nitrogen-fixing bacteroids, they undergo a physiological as well as a morphological transformation. These transformations are more extreme in some legume species than others, leading to fundamental differences in rhizobial life history and evolution. Here, we analysed the distribution of different bacteroid morphologies over a legume phylogeny to understand the evolutionary history of this host-influenced differentiation.
  • Using existing electron micrographs and new flow cytometric analyses, bacteroid morphologies were categorized as swollen or nonswollen for 40 legume species in the subfamily Papilionoideae. Maximum likelihood and Bayesian frameworks were used to reconstruct ancestral states at the bases of all major subclades within the papilionoids.
  • Extreme bacteroid differentiation leading to swelling was found in five out of the six major papilionoid subclades. The inferred ancestral state for the Papilionoideae was hosting nonswollen bacteroids, indicating at least five independent origins of host traits leading to swollen bacteroids.
  • Repeated evolution of host traits causing bacteroid swelling indicates a possible fitness benefit to the plant. Furthermore, as bacteroid swelling is often correlated with loss of reproductive viability, the evolution of bacteroid cooperation or cheating strategies could be fundamentally different between the two bacteroid morphologies.


Mutualistic symbioses often vary in net benefits depending on partner genotypes and the environment (Bronstein, 1994; Johnson et al., 1997; Denison & Kiers, 2004; Heath & Tiffin, 2007). Some mutualistic species have evolved adaptations that impose selection for more beneficial partners (Kiers et al., 2003; Jander & Herre, 2010). Others have abandoned symbiosis (Hibbett et al., 2000; Sachs & Simms, 2006), which makes ancient mutualisms particularly challenging to understand. Their coevolving symbiotic strategies and life histories may shed light on mechanisms for evolutionary stability in mutualisms.

In the legume–rhizobia symbiosis, one such ancient system, we find two different rhizobial life histories, which depend on the species of legume host. In nodules of some hosts, including alfalfa (Medicago sativa) and peanut (Arachis hypogaea), rhizobia that differentiate into nitrogen-fixing bacteroids undergo major transformations, including swelling or branching and sometimes amplification of the bacterial genome (Mergaert et al., 2006). As discussed later, this extreme differentiation most likely prevents bacteroids from resuming normal cell division (free-living states), even if they are released from nodules during senescence (Sutton & Paterson, 1983). The next generation of symbiotic rhizobia for these hosts is presumably descended from rhizobia that had reproduced within the same nodules but not yet differentiated into bacteroids. By contrast, in hosts such as cowpea (Vigna unguiculata) and birdsfoot trefoil (Lotus corniculatus), the rhizobia still undergo differentiation into bacteroids but this process is not as irreversible. Bacteroids in these hosts are less swollen and have no genome amplification and, therefore, continue to reproduce after leaving the nodules (Mergaert et al., 2006).

The effect of legume host species on bacteroid differentiation was studied extensively by Sutton & Paterson (1980, 1983), but specific plant mechanisms were unknown and only a handful of closely related species were investigated. Currently, it is widely accepted that the size and shape of nitrogen-fixing bacteroids vary widely and are controlled by the legume host rather than the genotype of the rhizobia (Oke & Long, 1999). For example, a single rhizobial strain will differentiate into spherical swollen bacteroids in peanut but remain rod-shaped in cowpea (Sen et al., 1986). Similarly, Mergaert et al. (2006) have shown that recombinant rhizobial strains will transform into different bacteroid morphologies depending on the host species; transgenic rhizobia, which never evolved with one host, nonetheless showed the same level of bacteroid differentiation as the host’s wild type rhizobia.

Extreme bacteroid differentiation has recently been shown in Medicago truncatula to be imposed by nodule-specific cysteine rich (NCR) plant peptides (Van de Velde et al., 2010). These compounds have properties similar to antimicrobial defensins that block bacterial cell division, often causing genomic endoreduplication and alteration of cell shape (Latch & Margolin, 1997). Differentiation may also involve extreme alteration of the synthesis of peptidoglycan, an elastic polymer of bacterial cell walls known to regulate osmotic pressure and cell shape (Lam et al., 2009). The NCR peptide-coding sequences are also present in genera closely related to Medicago that host swollen bacteroids, but not in Lotus japonicus, Phaseolus vulgaris or Glycine max (Alunni et al., 2007 and references therein), all of which host nonswollen bacteroids.

Similarly, the correlation between swelling and loss of bacteroid reproductive viability has been consistent among many tested species, such as G. max (Gresshoff & Rolfe, 1978; Zhou et al., 1985), P. vulgaris (W. C. Ratcliff unpublished), Macroptilium atropurpureum (W. C. Ratcliff unpublished), L. japonicus (Müller et al., 2001), Trifolium repens (Zhou et al., 1985) and M. sativa (McRae et al., 1989; Vasse et al., 1990; Ratcliff et al., 2008). Although Khetmalas & Bal (2005) characterized rod-shaped rhizobia inside senescing Arachis pintoi nodules as dedifferentiated formerly spherical bacteroids, the preponderance of evidence suggests that swollen bacteroids rarely dedifferentiate back into free-living forms and that it is the undifferentiated cells that repopulate the soil for future symbiosis. Bacteroid morphology is therefore considered a reasonable proxy for reproductive viability until we find a more effective means of observing bacteroids in their natural states to see how they either dedifferentiate and reproduce or are broken down (by plant, rhizobial, or exogenous enzymes).

Some previous studies also correlated bacteroid type to nodule type, although it has not been shown that this correlation is universal. Medicago, Pisum, Vicia and Trifolium (closely related genera in the Inverted Repeat-Lacking Clade (IRLC)) all have indeterminate nodule types (those with persistent meristems) and swollen bacteroids, whereas Glycine, Phaseolus, Macroptilium and Lotus all have determinate nodule types (transient meristems) and nonswollen bacteroids. It has been observed in many determinate nodules (within Phaseoloid and Dalbergioid clades) that the rhizobia-infected cells divide and enlarge along with the rhizobia inside them (Chandler et al., 1982; Sprent & Thomas, 1984), whereas host cells of indeterminate nodules (IRLC) are each infected with rhizobia from a branch of the infection thread and do not divide further, and neither do the infecting rhizobia (Sprent & Thomas, 1984).

While we have begun to understand the relationship between bacteroid swelling and loss of reproductive viability, we have little understanding of how widely distributed swollen bacteroids are in the legume family. We also know little about the evolutionary effects of swollen bacteroids on the legume–rhizobia interaction. Selection pressures for symbiotic strategies may differ between bacteroid morphologies with implications for rhizobial evolution. For example, when bacteroids themselves are reproductive, hoarding high-energy lipid polymers, such as polyhydroxybutyrate (PHB), inside their cells at the expense of nitrogen fixation may have a direct benefit to the bacteroid fitness. Conversely, when bacteroids are nonreproductive, it is the undifferentiated rhizobia that must reap the benefits of the mutualism in ways that may interact with nitrogen fixation by the bacteroids. Rhizopines, for example, are simple sugar-like compounds produced (possibly by diverting resources from nitrogen fixation) by some swollen bacteroids and catabolized by the undifferentiated rhizobia (Murphy et al., 1995). Only rhizobial strains commonly found in legume species hosting swollen bacteroids are known to possess rhizopine-synthesis genes (Wexler et al., 1995), suggesting that rhizobia with evolutionary dead-end bacteroids may have evolved this alternative strategy for cheating.

Furthermore, if rhizobial cheating strategies or intensities vary, the optimal level of host sanctions (Kiers et al., 2003) may also vary among legume species. Without host-imposed selection, rhizobia could lose the ability to fix nitrogen (West et al., 2002). However, host sanctions have not been adequately demonstrated in legume species with nonreproductive bacteroids and other mutualistic systems have shown that levels of sanction can vary among species (Jander & Herre, 2010). In the legume–rhizobia system, the exact mechanisms of host sanctions are still unknown, although limiting oxygen diffusion into nodules may play some role (Kiers et al., 2003) and reduce the fitness of the undifferentiated reproductive rhizobia inside (Oono et al., 2009). However, at present, we have no evidence that sanctions are universal or that sanctions always work at the whole-nodule level. One could conceive of symbiosome-level sanctions that only target bacteroids or that sanctions differ physiologically between species with nonreproductive and reproductive bacteroids.

Given that host-imposed swelling of bacteroids may affect the coevolution of legume–rhizobium symbiosis, we explore here the taxonomic distribution of this trait and its possible evolutionary history in the legume phylogeny. By mapping the life-history characteristics of the rhizobia onto the host phylogeny, ecophylogenetic hypotheses can be tested to understand how the interaction evolved (Armbruster, 1992). This has been previously done in studies pertaining to the evolution of nonmutualists among mutualist lineages (Pellmyr et al., 1996; Hibbett et al., 2000) as well as new life-history adaptations among partner lineages, such as fragrance-collecting vs resin-collecting bee pollination (Armbruster, 1992) or dioecy vs monoecy in figs pollinated by fig-wasps (Weiblen, 2000; Greef & Compton, 2002; Harrison & Yamamura, 2003).

In our study, we characterized bacteroid swelling in various legume hosts, to test whether host traits leading to swollen bacteroids were gained or lost among different lineages of legumes. This could indicate whether there are benefits or costs to legume hosts with this trait. Our specific questions are: are host traits that cause swollen bacteroids ancestral or derived? How common is bacteroid swelling beyond the well-studied model legume species and can we assess a broader phylogenetic pattern? We also tested whether there is correlated evolution between nodule type and bacteroid type over a wider set of legume species.

Materials and Methods

Determining which legume species host swollen bacteroids

Multiple methods were used to assess bacteroid swelling, as no single method could be used with all the legume species in our phylogeny (Table 1). Methods for investigating bacteroid type included flow cytometry (FC), fluorescence microscopy (FM), scanning electron microscopy (SEM), and transmission electron microscopy (TEM). All FM and FC data were newly collected for this study. All unpublished SEMs were collected previously by J. I. Sprent.

Table 1.   Bacteroid morphology assessment, nodule type and sequence data sources for legume species in the ancestral character reconstruction analysis
Species | Nodule type I/D | Nodule sub-type | Bacteroid data acquisition method (nodule source or reference) | Bacteroid S/N | rbcL (NCBI accession) | matK (NCBI accession) | 5.8S rRNA (NCBI accession) | trnL (NCBI accession)

Genistoids s.l. (7/83)
Baptisia australis | I | I | FC, FM^Ur; SEM^JS | S | | AY386900 | AY091572 | AF309831
Cyclopia genistoides | I | I | TEM (1) | N | Z70124 | | AJ409895 |
Cytisus scoparius | I | I -i | FC, FM^CH | S | Z70086 | AY386902 | AF351120 |
Genista tinctoria | I | I -i | TEM (2) | S | Z70099 | | AF007471 | DQ417001
Lupinus angustifolius | I | L -i | TEM (3) | S | Z70064 | | AF007477 | DQ417006
Maackia amurensis | I | I | FC, FM^Du | S | Z70137 | AY386944 | |
Poecilanthe parviflora | I | I | TEM (4) | N | | AF142687 | AF187089 | AF208897

Dalbergioids s.l. (7/53)
Amorpheae (2/8)
Amorpha fruticosa | I | I | FC, FM^PMN | N | U74212 | AY391785 | AY426774 | AF208899
Dalea purpurea | I | I | FC, FM^WNS | N | | AY391798 | AY426794 |
Dalbergioids (5/45)
Aeschynomene indica | D | A -i | TEM (5) | S | AF308701 | AF272083S2 | AF068141 | AF208927
Arachis hypogaea | D | A -i | FC, FM^TA; SEM, TEM (6) | S | U74247 | EU307349 | AF156675 | DQ131546
Discolobium pulchellum | D | A -i | TEM (7) | N | | AF270873 | AF189059 | AF208963
Pterocarpus indicus | D | A -i | SEM, TEM (8) | N | | AF142691 | AF269177 | AF208953
Stylosanthes hamata | D | A -i | TEM (9) | S | | AF203594 | AF203550 | AJ131247

Mirbelioids (2/32)
Aotus ericoides | I | I | TEM (10) | N | | AY386884 | |
Gompholobium minus/knightianum | I | I | SEM (11) | S | | AY386891 | AY233086 |

Millettioids (13/168)
‘Core Millettieae’ (1/56)
Tephrosia heckmanniana/virginiana | I | I | FC, FM^SH | S | U74211 | AF142712 | AF467497 |
Phaseoloids (12/112)
Amphicarpaea bracteata | D | Des-U | FC, FM^CC | N | AF181930 | AY582971 | AF417015 | EF543424
Cajanus cajan | D | Des-U | SEM^JS | N | AB045790 | EU307315 | EU288918 | EF200131
Calopogonium mucunoides | D | Des-U | TEM (12) | N | AB045792 | | AY293845 |
Centrosema virginianum | D | Des-U | FC, FM^SH | S | AF308706 | | |
Erythrina crista-galli | D | Des-U | SEM^JS | N | Z70170 | AY386869 | |
Glycine max | D | Des-U | TEM (13) | N | Z95552 | AF142700 | AF144654 | DQ131547
Kummerowia stipulacea | D | Des-U | FC, FM^SH | N | U74229 | | |
Lespedeza cuneata | D | Des-U | FC, FM^SH | N | U74215 | | |
Macroptilium atropurpureum | D | Des-U | TEM (19) | N | | AY509938 | AF115138 |
Oxyrhynchus volubilis | D | Des-U | SEM^JS | N | AF308717 | AY509935 | AF069114 |
Phaseolus vulgaris | D | Des-U | FC, FM^HF; TEM (14, 16) | N | EU196765 | DQ445990 | AF069128 | EF543430
Vigna unguiculata | D | Des-U | FC, FM^HF; SEM, TEM (6) | N | Z95543 | AY589510 | AY748433 | AB304074

Robinioids (4/34)
Coronilleae, Robinieae (2/11)
Coronilla varia | I | I | FC, FM^GM, SEM^JS | N | U74222 | AF543846 | AF218537 |
Robinia pseudoacacia | I | I | FC, FM^CH | N | U74220 | AF142728 | EF494737 | AF529391
Loteae (2/23)
Anthyllis vulneraria | D | Des-A | SEM^JS | N | | AF543845 | AF218499 |
Lotus japonicus | D | Des-A | FM (15), TEM (16) | N | NC_002694 | NC_002694 | DQ311975 | DQ311703

IRLC (5/54)
Cicer arietinum | I | I | TEM (17) | N | AF308707 | AY386897 | AJ237698 | DQ315487
Glycyrrhiza lepidota | I | I | FC, FM^PMN | N | AB126685 | AF142730 | | AF124238
Medicago sativa | I | I | FC, FM^UM (18) | S | Z70173 | AY386881 | AF053142 | DQ131554
Pisum sativum | I | I | FC, FM^HF (15) | S | X03853 | AY386961 | AY143486 | DQ311717
Vicia hirsuta | I | I | FC, FM^GM (15) | S | | AF522157 | DQ351827 |

Species in other major lineages
Indigofera suffruticosa | I | I | TEM (12) | N | | AF142697 | AF467051 |
Sophora secundiflora | I | I | SEM^JS | N | Z70141 | AF142693 | U59885 |
Caesalpinioid, Chamaecrista fasciculata | I | I | | | U74187 | AY386955 | EF590760 |
Mimosoid, Pentaclethra macrophylla | I | I | | | AM234250 | AF521853 | | AF365051

  1. Legume species are categorized within their subclades with fractions indicating the proportion of genera represented in the analysis, out of the total recognized genera in the clade (Lewis et al., 2005). Methods for investigating bacteroid type were flow cytometry (FC), fluorescence microscopy (FM), scanning electron microscopy (SEM) and transmission electron microscopy (TEM). All nodules prepared for FM and FC were collected for this study. Seed sources or plant location are indicated by superscripts: CC, Cedar Creek, MN, USA; CH, Chapel Hill, NC; Du, Durham, NC; GM, Garden Makers, Rowley, MA 01969; HF, Henry Field’s Seed and Nursery Co., Aurora, IN 47001; JS, previously collected by Janet I. Sprent; PMN, Prairie Moon Nursery, Winona, MN 55987; SH, Sandhills, NC; Ur, Urbana, IL; TA, Texas AgriLife Research and Extension Center, Lubbock, TX 79403 (Mark Burow); UM, University of Minnesota, Dept of Agronomy & Plant Genetics, St Paul, MN 55108 (Keith Henjum); WNS, Western Native Seeds, Coaldale, CO 81222. Nodule types: I, indeterminate; D, determinate; A, aeschynomenoid; L, lupinoid; -i, lacking interstitial cells; Des-U, desmodoid exporting ureide; Des-A, desmodoid exporting amide; P, persistent infection thread. References for nodule types are indicated for each species, respectively, or based on Sprent (2001) or Pueppke & Broughton (1999). Bacteroid state ‘S’ stands for swollen and ‘N’ stands for nonswollen. (1) Elliott et al. (2007); (2) Kalita et al. (2004); (3) Dart & Mercer (1966); (4) Sprent et al. (1987); (5) Fleischman & Kramer (1998); (6) Sen et al. (1986); (7) Loureiro et al. (1994); (8) Higashi et al. (1987); (9) Chandler et al. (1982); (10) Lawrie (1983); (11) Sprent (2001); (12) Izaguirre-Mayoral & Vivas (1996); (13) Hahn & Studer (1986); (14) Cevallos et al. (1996); (15) Mergaert et al. (2006); (16) Banba et al. (2001); (17) Lee & Copeland (1994); (18) Ratcliff et al. (2008); (19) Price et al. (1984).

We obtained nodules for 19 species either from the field or plants grown in growth chambers. Bacteroid swelling in these hosts was assessed by both FC and FM; bacteroids were considered swollen if rhizobial cells showed a bimodal distribution in flow-cytometric forward scatter (Fig. 1a,b), which measures size of individual cells. A bimodal distribution suggests a mixture of smaller undifferentiated rhizobia and larger swollen bacteroids within a single nodule. Additionally, the size and shape of the bacteroids were assessed by FM. Size measurements are averages based on more than five of the largest individual bacteroids found in a random view. Bacteroids were confirmed to be swollen if they were either > 4 μm long, or wider than 1.5 μm (for spherical bacteroids) or branched (regardless of size). Bacteroids were assessed as nonswollen if they were smaller than 2.5 × 1.5 μm. Free-living rhizobia are usually < 2 μm long (Sprent, 2001), and hence would probably not exceed 4 μm long even during cell division. Even bacteroids that are considered nonswollen in our analyses do become slightly larger than free-living rhizobia inside plant nodules, perhaps owing to accumulation of carbon resources or osmotic pressure. Legume hosts with bacteroids of intermediate sizes do exist but these species were not included in the analyses (see the Supporting Information Table S1) because of a coincidental lack of sufficient molecular sequences. When nodules could not be obtained, we examined electron micrographs from unpublished and published studies. We characterized bacteroids as swollen using the same criteria as with the FM method. Some published micrographs had observational statements within the article that suggested either swollen or nonswollen bacteroids as well. Electron micrographs and published data were available for a few of the species for which we already had FM and FC data. Bacteroid data and their individual methods of evaluation are summarized in Tables 1, S1.
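The size criteria above can be written as a small decision rule. The following Python sketch is illustrative only and is not the procedure used in the study: the function name, the argument names and the handling of values that fall exactly on a threshold are assumptions, and the 'intermediate' label is taken from the intermediate size class mentioned in the text.

```python
def classify_bacteroid(length_um, width_um, spherical=False, branched=False):
    """Return 'swollen', 'nonswollen' or 'intermediate' from the size criteria.

    Hypothetical helper: names and boundary handling are assumptions.
    """
    # Branched bacteroids are treated as swollen regardless of size.
    if branched:
        return "swollen"
    # Bacteroids longer than 4 um are treated as swollen.
    if length_um > 4.0:
        return "swollen"
    # Spherical bacteroids wider than 1.5 um are treated as swollen.
    if spherical and width_um > 1.5:
        return "swollen"
    # Bacteroids smaller than 2.5 x 1.5 um are treated as nonswollen.
    if length_um < 2.5 and width_um < 1.5:
        return "nonswollen"
    # Everything else falls between the two criteria.
    return "intermediate"


# Lengths below are mean lengths reported in the text; the widths are
# hypothetical placeholders because widths are not reported there.
print(classify_bacteroid(5.6, 1.0))  # Gompholobium knightianum length
print(classify_bacteroid(2.1, 1.0))  # Aotus ericoides length
```

Treating the two criteria as separate tests, rather than a single cut-off, leaves room for the intermediate class that was excluded from the analyses.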

Figure 1.

Cytisus scoparius nodule run through the flow cytometer. (a) Two distinct populations of rhizobia were detected, based on forward scatter (cell volume) and side scatter (inner complexity). (b) The bimodal distribution of size indicates small undifferentiated rhizobia and larger swollen bacteroids inside a single nodule. (c) DNA fluorescence owing to SYTO13 staining reveals several distinct populations within a single nodule. (d) The three peaks for SYTO13 detection signify approximate doubling of fluorescence expected from genomic endoreduplication; geometric mean FL1 of 275, 558 and 1033.

Bacteroid preparation and FC  Some nodules were collected from the field, surface-sterilized and dried (Somasegaran & Hoben, 1994) for transport to the laboratory where the nodules were rehydrated in water overnight. Other nodules were harvested from plants grown in the growth chamber and were never dried. Rehydrated or fresh nodules were crushed in ascorbic acid buffer (Arrese-Igor et al., 1992) and centrifuged at 100 g for 10 min to separate rhizobia from plant material. The supernatant, containing bacteroids and undifferentiated rhizobia, was fixed for 30 min in 30% ethanol, pelleted at 5000 g for 5 min and resuspended in phosphate buffer solution (Somasegaran & Hoben, 1994). Nodule rhizobia were diluted (107–10cells ml−1) for flow cytometric sampling and stained with SYTO 13 (final concentration of 625 nM; Molecular Probes, Eugene, OR, USA), which binds to nucleic acids, both DNA and RNA. SYTO 13 was measured on FL1 (530 nm BP filter) on a Becton Dickinson FACSCalibur (Masonic Cancer Center, University of Minnesota, Minneapolis, MN). Many flow cytometer runs had some background noise, possibly owing to staining of RNA or DNA contents from burst cells, but this noise could easily be distinguished from the rhizobial cells. If genome endoreduplication has occurred in swollen bacteroids, SYTO 13 fluorescence can distinguish cells that differ in genomic content. We could often detect two distinct clouds of 1C and 2C cells, and occasionally 4C clouds when bacteroids were swollen (Fig. 1c,d).
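As a rough check on the doubling pattern, the three geometric mean FL1 values given in the Fig. 1 caption (275, 558 and 1033) can be compared with the steps expected for 1C, 2C and 4C cells. This is a minimal Python sketch, not part of the published analysis; the function names are hypothetical and rounding the log2 ratio to the nearest integer is an assumption made for illustration.

```python
import math

# Geometric mean FL1 values of the three SYTO 13 peaks (Fig. 1d caption).
fl1_geometric_means = [275, 558, 1033]


def fold_changes(peaks):
    """Ratio of each peak to the previous one (about 2 if DNA content doubles)."""
    return [later / earlier for earlier, later in zip(peaks, peaks[1:])]


def doubling_steps(peaks):
    """Number of doublings of each peak relative to the first, rounded."""
    base = peaks[0]
    return [round(math.log2(peak / base)) for peak in peaks]


print(fold_changes(fl1_geometric_means))
print(doubling_steps(fl1_geometric_means))  # 0, 1 and 2 doublings: 1C, 2C, 4C
```

Both successive ratios lie close to 2, which is consistent with the interpretation of the peaks as 1C, 2C and 4C populations.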

Fluorescence and scanning electron microscopy  For FM, after surface-sterilization, nodules were crushed directly on top of slides with forceps allowing a high density of bacteroids to be released. Sterile water was added to the slide with 1 μl of SYTO13 (62.5 μM) and cells were observed under an Olympus IX70 Inverted Fluorescence Microscope (CBS Imaging Center, University of Minnesota, St. Paul, MN).

Preparation of the nodules for the unpublished scanning electron micrographs is described elsewhere (de Faria et al., 1986).

Published electron micrographs and observational data  Bacteroid swelling data from previous studies were all determined from either scanning or transmission electron micrographs and sometimes complemented by published observations suggesting swelling or lack of swelling. For example, multiple bacteroids per symbiosome suggests bacteroids were able to divide after an initial fragmentation of the symbiosome from the infection thread, consistent with nonswollen bacteroids, whereas a single bacteroid per symbiosome suggests that bacteroids cannot resume division after differentiation. Sutton & Paterson’s (1983) compiled dataset suggests that there is a correlation between bacteroid viability and number of bacteroids per symbiosome. In their dataset, all species with low bacteroid viability had a single bacteroid per symbiosome, whereas those with high bacteroid viability always had multiple bacteroids per symbiosome. However, multiple bacteroids per symbiosome may be found if the symbiosome initially engulfed more than one cell or if the rhizobia divided before differentiating into a bacteroid within the symbiosome.

Taxon sampling, phylogenetic analyses

We restricted our analysis to those taxa (40 species in the subfamily Papilionoideae representing 40 different genera) for which molecular sequences (rbcL or matK) were available and morphological data (bacteroid size/shape) could be generated. Both rbcL and matK are genes previously used to construct phylogenies (Doyle et al., 1997; Kajita et al., 2001; Wojciechowski et al., 2004). No new genes were sequenced for this study. In addition to the rbcL and matK genes, 5.8S rRNA genes and parts of the trnL gene were also used, if available, to increase the number of molecular characters (Table 1). We focused on the papilionoids because it is the only subfamily in which swollen bacteroids are known to exist (Sprent, 2001) and because the origin of nodulation for the Papilionoideae may be independent from the other two subfamilies – Caesalpinioideae and Mimosoideae (Doyle, 1994). Reconstruction of ancestral states would be meaningless if analysed across species with different origins of nodulation.

Six supported nodulating subclades (Wojciechowski et al., 2004) were sampled in proportion to species diversity in each clade (Table 1). Some legume species were not included in the analysis, even though bacteroid characteristics are known (Table S1), because including them would lead to a disproportionate representation of one clade over another. Indigofera suffruticosa was included in the analysis despite being outside of the six major subclades because it is a member of the second largest genus (> 700 species; Lewis et al., 2005) within the subfamily, after Astragalus. Sophora secundiflora was also included because of its inferred basal position in the Papilionoideae (Kajita et al., 2001). Two outgroup species were selected, one from each of the other two subfamilies: Chamaecrista fasciculata from the Caesalpinioideae and Pentaclethra macrophylla from the Mimosoideae.

Finding species in the Mirbelioids and core Millettioids that had both available DNA sequences and bacteroid data was not possible, but we believed these two lineages were important for phylogenetic diversity and should not be excluded from our analysis. Therefore, bacteroid traits were assessed based on Gompholobium knightianum and Tephrosia virginiana, whereas the sequence data came from Gompholobium minus and Tephrosia heckmanniana, respectively. Because we recognized the slight possibility of misaligning bacteroid properties to the sequenced species, we also modeled the ancestral character reconstruction with the alternative bacteroid trait and assessed the robustness of our results.

Sequences were aligned based on the amino acid sequence and reverted back to nucleotides using bioedit (Hall, 1999). We analysed 1391 positions for rbcL, 1598 for matK, 164 for 5.8S rRNA, and 347 for trnL. Two intron regions (60 and 317 nucleotides long) in trnL sequences were excluded from the analyses. We assessed potential conflict between the data portions by checking 75% maximum parsimony bootstrap consensus trees for conflicting topologies (Lutzoni et al., 2004).

The program mrmodeltest (distributed by the author, J. A. A. Nylander) was used to determine the best-fitting model for each gene following the more conservative Akaike Information Criterion test: rbcL was analysed by GTR + I + G, matK by TVM + G, trnL by TIM + G, and 5.8S rRNA by TrNef + I + G.

The Bayesian Markov chain Monte Carlo (B/MCMC) analyses were conducted using mrbayes v3.1.1 (Huelsenbeck & Ronquist, 2001). Four simultaneous chains (temperature 0.2) were run three times starting with a random tree, resulting in 12 million generations. Every 100th tree was saved into a file and the first 0.1% of the trees was discarded as burn-in for each run. Log-likelihood scores of sample trees against generations were plotted using tracer v1.4 (Rambaut & Drummond, 2007) to ensure that the Bayesian analysis had reached a stable equilibrium value. For analyses of character evolution we discarded 20 000 trees as burn-in and sampled every 20 000th post burn-in tree for a total of 1000 trees to avoid autocorrelation (Pagel & Meade, 2006).
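The tree counts in this paragraph can be cross-checked against the 108 000 post burn-in trees reported in the Fig. 2 caption. The following Python sketch is simple bookkeeping, not part of the published analysis; it assumes that the 12 million generations are the total over all runs, and the variable names are hypothetical.

```python
# Values taken from the text and from the Fig. 2 caption.
generations_total = 12_000_000   # assumed to be the total over all runs
sample_every = 100               # every 100th tree was saved
post_burnin_fig2 = 108_000       # post burn-in trees in the consensus tree

# Number of trees written to file under the assumption above.
trees_saved = generations_total // sample_every

# Burn-in implied by the number of trees used for the consensus tree.
trees_discarded = trees_saved - post_burnin_fig2
burnin_fraction = trees_discarded / trees_saved

print(trees_saved, trees_discarded, burnin_fraction)
```

Under this assumption, the consensus tree count corresponds to discarding a fraction of 0.1 (that is, 10%) of the saved trees as burn-in.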

Analyses of character evolution

The evolution of bacteroid type and of nodule type was traced over 1000 post burn-in trees from the B/MCMC analysis. A combined maximum likelihood (ML) and Bayesian inference approach was implemented using the program package bayestraits v1.0 (Pagel & Meade, 2006) and a fully Bayesian approach was carried out using simmap v1.0 (Bollback, 2006). In each analysis, we defined constraints for 23 nodes of interest. All of these were supported by a posterior probability of 0.95 or higher except the most basal node of the papilionoids (node 23; Fig. 2), which has a posterior probability of 0.79. The consensus tree has a polytomy and cannot resolve the relationship among the Genistoids, Dalbergioids sensu lato (s.l.) and the other major clades. Although this node has a lower posterior probability than conventionally accepted for constructing ancestral traits, all genera in this clade are monophyletic and the ancestral state of this node is the ancestral state for all the papilionoid species in the analysis.

Figure 2.

 A 50% majority rule consensus tree of 40 Papilionoid species based on 108 000 post burn-in trees from a Bayesian analysis. This phylogeny is based on combined matK, rbcL, partial trnL and 5.8S rRNA sequences. Thickened branches indicate support of posterior probability ≥ 0.95. There were 23 nodes analysed for ancestral character states of swollen (open circles) or nonswollen bacteroids (closed circles). Trait states are indicated on the respective nodes if reconstructed with significant statistical support (probability ≥ 0.95 as well as Maximum likelihood (ML) bootstrap of ≥ 80%). Numbered nodes with no symbol indicate insignificant reconstructions. Extant character states for swollen and nonswollen bacteroids as well as indeterminate (infinity symbol) and determinate nodules (crosses) are indicated next to species names.

The trees were rooted with C. fasciculata as the outgroup for the ML analysis in bayestraits whereas both C. fasciculata and P. macrophylla were used as outgroups for the Bayesian analysis in simmap. Outgroup taxa were excluded from the analyses in simmap and coded as ‘2’ for trait states in bayestraits so that the outgroup trait state would not influence the analyses. For bacteroid type ancestral reconstruction, we coded ‘0’ for nonswollen bacteroids and ‘1’ for swollen bacteroids. Scoring was based on the methods available for the particular species as already described. We also reconstructed the ancestral state for nodule type and coded ‘0’ for determinate nodules and ‘1’ for indeterminate nodules.

In bayesmultistate, we reconstructed the ancestral character states using maximum likelihood. The transition parameters were estimated with 10 attempts per tree and an ancestral state probability was calculated for each post burn-in tree based on the estimated parameter values. The ancestral state probabilities for a particular node were then averaged over all 1000 post burn-in trees. We also reconstructed ancestral character states using uniform (0, 100), gamma and exponential priors in a Bayesian framework using bayestraits.
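The averaging step can be illustrated with a short Python sketch. It is not the bayestraits implementation: the function name and the per-tree probabilities are hypothetical, and only the 0.95 probability threshold is taken from the Fig. 2 caption (the additional ML bootstrap criterion given there is not modeled).

```python
from statistics import mean


def summarize_node(prob_nonswollen_per_tree, threshold=0.95):
    """Average P(nonswollen) for one node over trees and call its state.

    Hypothetical helper: returns the mean probability and a label.
    """
    avg = mean(prob_nonswollen_per_tree)
    if avg >= threshold:
        call = "nonswollen"
    elif 1.0 - avg >= threshold:
        call = "swollen"
    else:
        call = "unresolved"
    return avg, call


# Hypothetical per-tree probabilities for a single node (three trees shown;
# the study averaged over 1000 post burn-in trees).
print(summarize_node([0.97, 0.99, 0.96]))
print(summarize_node([0.60, 0.50, 0.70]))
```

Averaging over the sampled trees lets uncertainty in the phylogeny carry through to the reconstructed state of each node.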

In simmap, we used multiple combinations of gamma prior values to assess the robustness of our results, including flat priors for the bias and rate parameters. The ancestral character posterior probabilities reported here are from a model using the following evolutionary rate parameters: α = 3.0, β = 2.0, k = 60. For each tree sampled, 100 draws were carried out from the prior distributions for modeling the rate of evolution.

Analyses of correlated traits

Nodule trait (determinate vs indeterminate) and bacteroid trait (swollen vs nonswollen) were tested for correlated evolution using the program bayesdiscrete in the bayestraits package (Pagel & Meade, 2006). Two maximum likelihood models were run: an independent and a dependent trait evolution model. The independent evolution model allows the binary nodule trait and the binary bacteroid trait to evolve independently of each other, requiring four transition rate parameters (two for each trait). The dependent evolution model assumes the two traits do not change states independently and the transition rate parameter for each trait is dependent on the other trait’s original state. This model requires eight transition rate parameters between the four possible states of trait combinations. We computed under each model the log-likelihood for each of the 1000 post burn-in trees. We then compared the independent and dependent (correlated) evolution models by assessing the likelihood-ratio statistic, 2 × (log-likelihood of the dependent model − log-likelihood of the independent model), against a χ2 distribution with four degrees of freedom (the difference in the number of parameters between the two models; four and eight, respectively). The log-likelihood difference was measured as the average difference between the log-likelihood of the dependent model and the log-likelihood of the independent model over 1000 trees.
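The model comparison can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the bayestraits procedure: the function names and the log-likelihood values are hypothetical, and the test statistic is taken as twice the mean difference in log-likelihood (dependent minus independent), which is equivalent to −2 times the log of the ratio of the independent to the dependent likelihood. The χ2 tail probability uses the closed form that holds for even degrees of freedom, so no external library is needed.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # Sum of (x/2)^j / j! for j = 0 .. df/2 - 1.
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def correlated_evolution_test(loglik_dependent, loglik_independent,
                              n_params_dependent=8, n_params_independent=4):
    """Likelihood-ratio test averaged over trees (hypothetical helper)."""
    diffs = [d - i for d, i in zip(loglik_dependent, loglik_independent)]
    mean_diff = sum(diffs) / len(diffs)
    statistic = 2.0 * mean_diff
    df = n_params_dependent - n_params_independent
    p_value = chi2_sf_even_df(max(statistic, 0.0), df)
    return statistic, df, p_value


# Hypothetical log-likelihoods for two trees (the study used 1000 trees).
print(correlated_evolution_test([-40.0, -41.0], [-46.0, -45.0]))
```

With four degrees of freedom the statistic must exceed about 9.49 for the dependent model to be preferred at the 0.05 level.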


Legume species hosting swollen bacteroids

Of 40 papilionoid species used in the phylogeny, 14 were identified as hosting swollen rhizobial bacteroids in their root nodules. These 14 species belong to five of the six major subclades: IRLC, Mirbelioids, Dalbergioids s.l., Millettioids and Genistoids (Fig. 2). Each of these clades contained species hosting both nonswollen and swollen bacteroids but we did not find a single species in the Robinioids hosting swollen bacteroids.
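These counts can be reproduced from the bacteroid states listed in Table 1. The Python sketch below is illustrative only: the dictionary layout and the variable names are assumptions, and each string simply lists the S (swollen) or N (nonswollen) state of the species in a clade in the order in which they appear in Table 1.

```python
# Bacteroid states (S = swollen, N = nonswollen) transcribed from Table 1,
# one character per species, in table order.
table1_states = {
    "Genistoids": "SNSSSSN",
    "Dalbergioids s.l.": "NNSSNNS",
    "Mirbelioids": "NS",
    "Millettioids": "SNNNSNNNNNNNN",
    "Robinioids": "NNNN",
    "IRLC": "NNSSS",
    "Other lineages": "NN",  # Indigofera and Sophora, outside the six subclades
}

swollen_per_clade = {clade: states.count("S")
                     for clade, states in table1_states.items()}
total_species = sum(len(states) for states in table1_states.values())
total_swollen = sum(swollen_per_clade.values())

# The six major subclades exclude the "Other lineages" group.
major_clades = [c for c in table1_states if c != "Other lineages"]
clades_with_swollen = [c for c in major_clades if swollen_per_clade[c] > 0]

print(total_species, total_swollen)
print(clades_with_swollen)
```

The tally gives 14 swollen hosts among 40 species, spread over five of the six major subclades, with none in the Robinioids.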

Medicago sativa, Pisum sativum (Fig. 3a) and Vicia hirsuta, all in the IRLC, are well-known for their swollen nonreproductive bacteroids (Mergaert et al., 2006). However, Cicer arietinum (1.1 μm ± 0.11 SD) and Glycyrrhiza lepidota (2.4 μm ± 0.49 SD), also in the IRLC, host nonswollen bacteroids, as does Biserrula pelecinus (Nandasena et al., 2004; Table S1).

Figure 3.   SYTO13-stained rhizobia (bacteroids and undifferentiated cells) harvested from nodule of (a) Pisum sativum nodulated by Rhizobium leguminosarum A34, (b) Arachis hypogaea nodulated by Rhizobium sp. 32H1, (c) Robinia pseudoacacia wild nodule, (d) Tephrosia virginianum wild nodule. Bar, 5 μm.

Mirbelioid species host various sizes of bacteroids, including large swollen (> 4 μm in length), intermediate (between 2.5 and 4 μm in length), and small nonswollen bacteroids (< 2.5 μm in length) (Table S1). Bacteroids in G. knightianum nodules, designated G. minus in the phylogeny (as explained in the Taxon Sampling methods section), were 5.6 μm (± 0.88 SD) in length, which is significantly swollen, whereas bacteroids of Aotus ericoides averaged 2.1 μm (± 0.72 SD), about double the size of free-living bacteria (Lawrie, 1983).
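
The length classes quoted above can be written as a simple rule. The following Python sketch applies the three size classes (> 4 μm, 2.5–4 μm, < 2.5 μm) to mean bacteroid lengths reported in this section. Applying the Mirbelioid size classes to species from other clades, treating the boundary values 2.5 and 4 μm as intermediate, and the function name `classify_bacteroid` are all assumptions made for illustration; this is not a restatement of the full categorization criteria used in the study.

```python
def classify_bacteroid(length_um, small_max=2.5, large_min=4.0):
    """Assign a size class from mean bacteroid length in micrometres.

    Thresholds follow the size classes quoted in the text; treating the
    boundary values themselves as 'intermediate' is an assumption.
    """
    if length_um > large_min:
        return "swollen"
    if length_um < small_max:
        return "nonswollen"
    return "intermediate"


# Mean lengths (micrometres) as reported in this section.
reported_lengths = {
    "Cicer arietinum": 1.1,
    "Glycyrrhiza lepidota": 2.4,
    "G. knightianum": 5.6,
    "Centrosema virginianum": 6.9,
    "Sophora secundiflora": 1.7,
}

for species, length in reported_lengths.items():
    print(f"{species}: {length} um -> {classify_bacteroid(length)}")
```

For these five species the rule reproduces the categories given in the text. A mean length alone ignores the spread within a nodule, so a species with a bimodal size distribution would need more than this rule.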

Arachis hypogaea (Fig. 3b), Aeschynomene indica and Stylosanthes hamata, all in the Dalbergioids s.l., have been recognized for hosting unusual spherical bacteroids (Chandler et al., 1982; Sen et al., 1986; Fleischman & Kramer, 1998), but other related legume species within the same clade (Pterocarpus indicus and Discolobium pulchellum) do not share this trait (Higashi et al., 1987; Loureiro et al., 1994). Dalea purpurea (1.5 μm ± 0.56 SD) and Amorpha fruticosa (2.3 μm ± 0.31 SD) also did not host swollen bacteroids.

None of the four observed Robinioid legume species hosted swollen bacteroids based on our criteria. These included Robinioid species with indeterminate nodules (Coronilla varia, 2.1 μm ± 0.66 SD, and Robinia pseudoacacia, 2.3 μm ± 0.31 SD; Fig. 3c) as well as those with determinate nodules (L. japonicus (Mergaert et al., 2006) and Anthyllis vulneraria, 1.6 μm ± 0.27 SD).

Centrosema virginianum (6.9 μm ± 0.72 SD) and T. virginianum (7.0 μm ± 0.71 SD; Fig. 3d), in the Millettioids, have swollen bacteroids based on our criteria. These two appear to be more closely related to each other than to the other Millettioids (Tables 1, S1), but this relationship has weak support in our reconstructed phylogeny as well as in previous studies (Kajita et al., 2001). All other Millettioid members were newly shown or reconfirmed to host nonswollen bacteroids, as was I. suffructicosa (Izaguirre-Mayoral & Vivas, 1996), in the close sister clade of the Millettioids.

Three of the Genistoid legume species sampled (Cytisus scoparius (Fig. 1), Maackia amurensis and Baptisia australis) showed a bimodal size distribution of nodule rhizobia, and two others (Lupinus angustifolius and Genista tinctoria) have some branching bacteroids (Dart & Mercer, 1966; Kalita & Malek, 2004); both observations are consistent with swollen bacteroids. However, three other Lupinus species that were not used in the phylogenetic analysis had nonswollen bacteroids (Table S1). Hence, host effects on bacteroid states apparently changed during the evolution of this very diverse genus. This within-genus variability was not considered in the analysis because our balanced sampling protocol limited us to one species per genus. Lupinus angustifolius was the best Lupinus species to include because of its available matK sequence, the longest sequence of the four genes used in the analysis. The implications of ignoring this variability within Lupinus are discussed later. Sophora secundiflora (1.7 μm ± 0.42 SD), an early branching species from the Genistoids, was categorized as hosting nonswollen bacteroids.

Phylogenetic analyses

Phylogenetic analyses of the four gene partitions combined included 3500 base pairs, of which 1390 were variable. The tree inferred in the current study does not conflict with previously published topologies that used single molecular loci and greater species numbers (Kajita et al., 2001; Pennington et al., 2001; Hu et al., 2002; Wojciechowski et al., 2004). Six previously reported major lineages are supported by posterior probabilities > 0.95. The most recent common ancestor node for all the papilionoids has a lower posterior probability (0.79; Fig. 2, node 23). The position of Indigofera as sister group to the Millettioids agrees with previous studies (Wojciechowski et al., 2004). The relationships among the six major clades within the Papilionoideae have never been strongly supported (McMahon & Sanderson, 2006) and were not resolved in this study.

Evolution of legume traits affecting bacteroid swelling

Bacteroid states were mapped onto a 50% majority rule consensus tree derived from a Bayesian analysis (Fig. 2). Fifteen analysed nodes had high probabilities for both Bayesian (posterior probability ≥ 0.95) and maximum likelihood analyses (average P > 80%) for either swollen or nonswollen bacteroid states (Fig. 2, Table 2). Seven of the nodes were particularly robust because they had high posterior probabilities even with uniform gamma and exponential prior distributions using bayestraits (Table 2). The node for the most recent common ancestor of the Papilionoideae subfamily (node 23) was highly supported as hosting nonswollen bacteroids using all methods. According to the two nonuniform prior analyses, host traits leading to swollen rhizobial bacteroids have evolved at least five times (within the IRLC, Mirbelioid, Millettioid, Dalbergioid s.l. and Genistoid clades).

Table 2.   Ancestral character state probabilities for nonswollen bacteroids at 23 nodes in the phylogeny in Fig. 2

Even when G. minus and T. heckmanniana were coded with the opposite traits, the ancestral state for the papilionoids (node 23) remained hosting nonswollen bacteroids with a posterior probability of > 0.95 and an average likelihood of > 80%.

Correlated evolution of nodule and bacteroid traits

We also analysed whether a nodule trait (determinate vs indeterminate growth) and bacteroid trait (swollen vs nonswollen bacteroids) evolved in a correlated fashion. To assess the potential correlation of the two traits statistically, we compared the likelihood fit of two models that allowed the traits to evolve either independently or dependently, using the program bayesdiscrete. The likelihoods of the two models were not significantly different. The average log-likelihood for the dependent model was loge 33.08 vs loge 34.39 for the independent model. The likelihood ratio statistic was 2.62, which is not significant against a χ2 distribution with four degrees of freedom. This implies that the dependent evolution model does not explain nodule and bacteroid trait evolution significantly better than the independent model.
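
The reported test statistic can be checked with a few lines of Python. The sketch assumes that the two reported values are log-likelihoods of −33.08 (dependent model) and −34.39 (independent model); this sign convention is an assumption, chosen because it reproduces the reported statistic of 2.62. For even degrees of freedom the χ2 upper-tail probability has a closed form, so no statistics library is needed. The function name `chi2_sf_even_df` is hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)**i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    terms = (half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)


# Assumed signs: the text reports the magnitudes 33.08 and 34.39.
lnl_dependent = -33.08
lnl_independent = -34.39

lr_stat = 2.0 * (lnl_dependent - lnl_independent)
p_value = chi2_sf_even_df(lr_stat, df=4)

print(f"LR statistic = {lr_stat:.2f}, P = {p_value:.2f}")
```

Under these assumptions the statistic is 2.62 and the P-value is about 0.62, well short of the critical value of 9.49 required for significance at the 0.05 level with four degrees of freedom.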


Our results suggest that legumes inducing bacteroid swelling evolved independently at least five times from an ancestral papilionoid legume hosting nonswollen bacteroids. Extreme bacteroid differentiation is not significantly correlated with indeterminate nodule types. This finding suggests that generalization from the model species M. truncatula (hosting swollen bacteroids in indeterminate nodules) or L. japonicus (hosting nonswollen bacteroids in determinate nodules) is not possible. Moreover, bacteroid morphology is linked to the reproductive viability of bacteroids (Mergaert et al., 2006) and can have different implications for rhizobial evolution (Denison, 2000; Oono et al., 2009) depending on the host species.

Legume traits leading to swollen rhizobial bacteroids evolved at least five times

We reconstructed the ancestral state of legume-host effects on bacteroid morphology using maximum likelihood and Bayesian approaches. Our legume phylogeny consists of 40 species, representing only 40 of the 478 genera of papilionoid legumes, but including lineages that have not previously received much attention for their symbiotic properties. Ancestral character reconstruction analysis can depend on the sample size and phylogenetic position of included taxa (Heath et al., 2008). We attempted to sample randomly and diversely in the Papilionoideae, but it is possible that we still missed enough lineages of legumes with swollen bacteroids to overturn our ancestral character state. However, based on the data currently available, the most recent common ancestor of the papilionoids has a very high probability of having hosted nonswollen rhizobial bacteroids.

An ancestral state hosting nonswollen bacteroids suggests five likely independent origins for host-imposed bacteroid swelling. Four of these five lineages already have published cases of swollen bacteroids (IRLC, Dalbergioids s.l., Genistoids and Mirbelioids), but the cases found in the Millettioids were first discovered in this study. Some of the Dalbergioid legumes are known to have spherical bacteroids (Fig. 3b) as opposed to elongated (Fig. 3d) or branched ones (Fig. 3a) in the other clades. This morphological difference had already suggested at least two independent origins and two different underlying mechanisms for swelling. However, whether all five lineages had independent origins was uncertain before the present analysis. That swelling apparently arose repeatedly suggests that inducing swelling in bacteroids might confer a host fitness advantage. Furthermore, it would be interesting to investigate NCR or antimicrobial-like peptides in the other four lineages to see if the host molecular mechanisms imposing bacteroid swelling are similar.

We also confirmed that closely related legume species often host the same type of bacteroids, typically with no variation within a genus, suggesting that changes in this trait are relatively rare. However, a closer inspection of Lupinus, or of the Genistoid clade in general, reveals at least some examples of closely related species that induce different levels of bacteroid differentiation. Species within Lupinus may host either swollen or nonswollen bacteroids (Table S1; Dart & Mercer, 1966) and Cyclopia genistoides appears to have nonswollen bacteroids, based on TEM (Elliott et al., 2007), yet it is closely related to other Genistoids hosting swollen bacteroids.

Several species not included in the analysis (Table S1), owing to a lack of adequate DNA sequences, happen to host bacteroids of intermediate lengths. These intermediate morphologies may represent transitional stages between reproductive and nonreproductive forms, which would indicate that bacteroid differentiation is a continuous rather than a binary variable.

Why would hosting swollen bacteroids be a derived trait in legume species?

The multiple independent origins we found for host-imposed bacteroid swelling suggest some fitness benefit to hosts. One hypothesis is that swollen bacteroids fix nitrogen more actively than nonswollen ones (Oono et al., 2009), which could explain the greater production of PHB in some nonswollen bacteroids (Lodwig et al., 2005). Polyhydroxybutyrate accumulation and nitrogen fixation compete for the same carbon resources, as confirmed by greater nitrogen fixation in PHB-negative mutants (Cevallos et al., 1996) and greater PHB accumulation in nonfixing mutant bacteroids (Hahn & Studer, 1986). However, some legume species host nonswollen bacteroids with very little PHB, such as Lotus sp. (Banba et al., 2001). More PHB tends to accumulate when respiration and growth are limited by oxygen or other resources (Anderson & Dawes, 1990, and references therein), indicating that nodule physiology, which is unrelated to bacteroid morphology, could be important for PHB synthesis. Genomic endoreduplication or differences in surface to volume ratio might also affect the efficiency of swollen bacteroids (Oono et al., 2009).

Although cheating options may be limited by swollen nonreproductive bacteroids (as discussed in the Introduction) and this may presently benefit the hosts, this is unlikely to be the reason why inducing swelling in bacteroids first evolved among legume hosts. An evolutionary change in rhizobial cheating strategies (e.g. less PHB-hoarding) most likely takes several rhizobial and host generations and would not be an immediate host benefit for inducing swelling and loss of reproductive viability in bacteroids (Oono et al., 2009). However, if inducing swelling has an immediate effect on a bacteroid’s ability to cheat (e.g. blocking PHB synthesis and accumulation), this may cause further selection for this host trait.

In hosts where nonswollen bacteroids may have been regained, such as within the genus Lupinus, perhaps bacteroid swelling is no longer beneficial for unknown reasons. Conversely, some rhizobial strains of Lupinus hosts may have evolved traits to overcome host-induced swelling and loss of reproductive viability.

Bacteroid differentiation and its correlation to indeterminate vs determinate nodule types

To evaluate the effect of nodule type, we grouped legume species into two categories, depending on whether their nodules have determinate or indeterminate growth. There was no consistent relationship between nodule type and host effects on bacteroid swelling, in contrast to some previous generalizations based on fewer species (Denison, 2000). A dependent evolution model for the two traits was not significantly better than an independent one, thus precluding statements about whether the two traits evolve in a correlated fashion.

For simplicity, we used only two categories for nodule type, but we recognize that there are more distinct types of nodules (Sprent, 2007). The determinate nodules of the Dalbergioids are often called aeschynomenoid and have crack entry for infection rather than infection threads. Even determinate nodules with infection threads differ in whether they export amide or ureide to the host. The indeterminate nodules of the Genistoids (including Lupinus sp.) lack interstitial cells, which are often found among the infected zone of indeterminate nodules of the IRLC. Some nodules have persisting infection threads where bacteroids reside, as in the indeterminate nodules of Poecilanthe parviflora. The bacteroids within persisting infection threads are not highly differentiated and may resemble the initial stages of the ancient symbiosis.

It is easy to understand the perceived correlation between nodule and bacteroid types as these two traits are both conserved in closely related legume species. However, assuming this correlation would have suggested an ancestral state of swollen bacteroids with multiple origins of legume hosts releasing bacteroids from inhibition of reproduction, which would drastically change some of our views on the legume–rhizobia symbiosis.

In conclusion, we find multiple origins of swollen bacteroids hosted by different legume species. This suggests swollen bacteroids confer some host fitness benefits, such as optimization for nitrogen fixation efficiency (Oono et al., 2009), which remain to be clarified. Rhizobial strains with a nonreproductive bacteroid life history may also have evolved alternative cheating and cooperation strategies leading to different mechanisms in different legume host species that maintain stability of the mutualism.


We would like to give special thanks to Bruce A. Sorrie, Alan Weakley, Carol A. McCormick, Will Cook, Mark T. Buntaine and Troy Mielke for helping us find legume species in North Carolina and Minnesota. We would like to acknowledge the College of Biological Sciences’ Imaging Center at the University of Minnesota for their assistance in fluorescence microscopy and the assistance of the Flow Cytometry Core Facility of the University of Minnesota Cancer Center, a comprehensive cancer center designated by the National Cancer Institute, supported in part by P30 CA7759. This material is based in part upon work supported by the National Science Foundation under Grant Number 0918986.