Flowering plant immune repertoires expand under mycorrhizal symbiosis

Abstract Immune perception in flowering plants is mediated by a repertoire of cytoplasmic and cell‐surface receptors that detect invading microbes and their effects on cells. Here, we show that several large families of immune receptors exhibit size variations related to a plant's competence to host symbiotic root fungi (mycorrhiza). Plants that do not participate in mycorrhizal associations have significantly smaller immune repertoires, while the most promiscuous symbiotic hosts (ectomycorrhizal plant species) have significantly larger immune repertoires. By contrast, we find no significant increase in immune repertoire size among legumes competent to form a symbiosis with nitrogen‐fixing bacteria (rhizobia). To explain these observations, we hypothesize that plant immune repertoire size expands with symbiote species diversity.

Typically, the fungal hyphae form a continuum from the soil into the outer layers of the root, effectively expanding the soil volume explored by the root system and accumulating critical elements like phosphorus for use by the plant.
In this paper, we will focus on two categories of mycorrhizal associations: arbuscular mycorrhiza and ectomycorrhiza. The association between arbuscular mycorrhiza (AM) and plants is ubiquitous, with 80% of terrestrial plants competent to form these symbioses (reviewed in Wang & Qiu, 2006;Smith & Read, 2008). Morphologically, AM are distinguishable from other mycorrhizal fungi by the presence of densely coiled or ramified hyphae located inside living cells of the root cortex.
AM are named after these ramified hyphae, which are called "arbuscles." Ectomycorrhizal (ECM) symbioses are more recently evolved and much less common among plant species (3%), but they are especially important for forest trees (Smith & Read, 2008). The hyphae of ECM fungi form a network that densely occupies the spaces between the cells of the outer plant root, but they do not penetrate living host cells.
An additional category of plants comprises those species that have lost their competence for mycorrhizal symbiosis (reviewed in Wang & Qiu, 2006;Brundrett, 2017). Nonmycorrhizal species (NM) have evolved multiple times from AM species. They are generally short-lived, non-rhizobial, and nonwoody. Notably, NM species include the model species Arabidopsis thaliana and other members of the Brassicaceae.
In this paper, we examine the possible connection between plant symbiotic competence and the innate immune repertoire.
Plants lack mobile defensive cells, so their resistance to microbial pathogens is mediated by an innate immune response (reviewed in Chisholm, Coaker, Day, & Staskawicz, 2006;Jones & Dangl, 2006;Bent & Mackey, 2007;Cook, Mesarich, & Thomma, 2015;Hacquard, Spaepen, Garrido-Oter, & Schulze-Lefert, 2017). Pathogen perception can take place outside the cell or in the cytoplasm. Outside the cell, transmembrane receptor kinases are sensitive to pathogenassociated or damage-associated molecular patterns (Brutus, Sicilia, Macone, Cervone, & De Lorenzo, 2010;Gomez-Gomez & Boller, 2000;Miya et al., 2007). These receptor kinases signal to downstream targets, eventually triggering a range of immune responses from the plant. A second signaling pathway involves events in the cytoplasm. To evade the host immune response, pathogens secrete effector proteins that interfere with immune signaling or downstream events (reviewed in Lo Presti et al., 2015). Host proteins in the Nucleotide Binding Site family (NBS, also known as NB-ARC) monitor the cytoplasm for these effectors, and probably also for changes in effector targets, and thereby provide a second mode of pathogen detection (Botella et al., 1998;Collins et al., 1999;Wang et al., 1999). The majority of disease-resistance loci (R-genes) discovered in plants map onto genes with NBS domains, and the term "Rgenes" is sometimes used synonymously with NBS genes (reviewed in Sekhwal et al., 2015). A third mechanism for plant-pathogen interactions involves the exchange of small RNAs evolved to silence plant immune responses or microbe pathogenicity, respectively (Cai et al., 2018;Weiberg et al., 2013).
While the traditional view is that the immune system evolved to defend the plant against pathogens, there is a growing list of cases in which immune signaling pathways also mediate plant interactions with nonpathogenic microbes. One important example is the LysM family of receptor kinases, which includes receptors that perceive the fungal cell wall component chitin. In rice, this family includes OsCERK1, which has a role in both fungal pathogen response and the establishment of AM symbiosis (Miyata et al., 2014). In legumerhizobia symbiosis, some LysM receptors perceive chitin-like signals called Nod factors produced by symbiotic bacteria (Madsen et al., 2003). There is also some evidence that the NBS family has a role in symbiosis. NBS genes have been shown to regulate host-rhizobia specificity in some legumes (Yang, Tang, Gao, Krishnan, & Zhu, 2010). These and other examples have prompted recent reviews (Cook et al., 2015;Hacquard et al., 2017) to describe the plant immune system as a "microbe management" or "surveillance" system, rather than a defense system per se.
In this paper, we initially focus on the NBS gene family. The NBS gene family is remarkable for both its overall size -often comprising several percent of the entire genome -and for the wide size variation between species. Counts of NBS range from ~40 in eelgrass (Zostera marina) to ~1,000 in apple (Malus domestica) (Olsen et al., 2016;Velasco et al., 2010). Beginning with the sequencing of the first tree genome, Populus trichocarpa (black cottonwood), several authors observed that woody perennials, especially trees, have relatively large NBS families (see Tuskan et al., 2006;Yang, Zhang, Yue, Tian, & Chen, 2008 and the opinion article Tobias & Guest, 2014). More recently, Plomion et al. (2018) compared the genomes of nine woody and seven nonwoody species and found several large clades of NBS genes that show significant expansions in woody species. They speculated that the expansion of defense-related gene families is related to the long life-span of woody perennials.
In work preliminary to this paper, we saw hints of an alternative explanation for variations in NBS family size. We noted that species with small NBS families tend to be NM. We also noticed that tree species with especially large NBS families were typically ECM. We thus set out to establish the statistical significance of this result, to find other gene families with similar correlations, and to seek possible explanations in the ecology and evolution of flowering plants.

| Genomes
We downloaded the genomes of 39 species of angiosperms, including 9 monocots (2 NM and 7 AM) and 30 eudicots (5 NM, 20 AM, and 5 ECM). Of the 39 genomes, 38 are public. Permission to use the 39th, Salix purpurea, was generously granted prior to the end of the official embargo. A complete list of sources is provided in the Supporting Information Table S1. The majority were downloaded from the Phytozome archive (https://phytozome.jgi.doe.gov; Goodstein et al., 2012).
To provide consistency across genomes, we limited consideration to one transcript per locus, called the "primary transcript" in Phytozome annotations. This was to avoid statistical bias due to large differences in the completeness of various genome annotations. For example, Arabidopsis TAIR10 averages 1.29 transcripts per locus, while many species genomes are still in version 1.0 and are annotated with exactly one transcript per locus.

| Species data
We collected the following information for each species: mycorrhizal and rhizobial symbiotic competence, annual/perennial life cycle, and woodiness (i.e. capable of secondary growth and lignification). This information was collected from a variety of sources, as detailed in the Supporting Information. We were careful to include only species whose mycorrhizal status has been confirmed by published observation. Figure 1 summarizes some properties of the 39 species used, including a taxonomy after (APGIV 2016). The woody category includes trees and vines having woody secondary growth. It also includes Lotus japonicus, which has nonhardy stems but a perennial woody taproot, and Solanum lycopersicum, which, while typically grown as an annual crop, develops a woody stem and roots when grown as a perennial (Peralta, Spooner, & Knapp, 2008).

| Family expansion
To test whether a gene family expands between two sets of species, we calculated a p-value using the exact, one-sided Wilcoxon-Mann-Whitney rank sum test, implemented using the COIN package in R (COIN version 1.2-2, Hothorn, Hornik, van de Wiel, & Zeileis, 2006; F I G U R E 1 Summary of the 39 species of angiosperms used in this paper. Taxonomy shown on the left follows Ref. (APGIV 2016). In the color-coded boxes, brown = woody perennial, orange = rhizobial species, gray = nonmycorrhizal (NM) species, green = arbuscular mycorrhizal (AM) species, light blue = ectomycorrhizal (ECM) species. Bars on the right show the size of the NBS gene family in each species, as counted by PfamScan (Punta et al., 2012). References for mycorrhizal status may be found in Supporting Information Table S1 1,000 R version 3.5.1, Team, 2018). In cases with multiple simultaneous tests, the technique of Benjamini and Hochberg (1995) was used to limit the false discovery rate (FDR) to <5%. We used Excel (v.16,Microsoft) to do a simultaneous regression of family size on woodiness and mycorrhizal competence using the equation where i is an index that ranges over the 39 species in our data set, n i is the number of loci in the gene family of interest in species i, a, b w , and b myc are the three coefficients to be determined, x iw is assigned a value of −1 if species i is nonwoody and +1 if woody, and x i,myc is assigned a value of −1 if species i is NM, 0 if AM, and +1 if ECM.

| Linear regression
Note that x iw and x i,myc are defined to have identical ranges (from −1 to +1) so the relative size of the regression coefficients provides a measure of the relative importance of woodiness and mycorrhizal competence. We tested whether each regression coefficient was significantly different from 0 using a t test (Excel v.16, Microsoft).

| Immune response
To assess whether a gene family is over-represented in the immune response, we used transcriptome data from the following papers. Over-representation was assessed using a hypergeometric test to find the p-value (Excel v.16, Microsoft), then further restricted to FDR < 5% (Benjamini & Hochberg, 1995).

| AM-related gene lists
To assess whether a gene family is over-represented in lists related to AM competence, we used data from the following papers. Over-representation was assessed using a hypergeometric test to find the p-value (Excel v.16, Microsoft), then further restricted to FDR < 5% (Benjamini & Hochberg, 1995).

| NBS family size depends on mycorrhizal competence
We determined the size of the NBS gene family in 39 species of angiosperms for which mycorrhizal status was also available (Figure 1, Supporting Information Tables S1, S2). Figure 2 shows the distribution of NBS family size in host species with different mycorrhizal competence. The median number of NBS loci in NM, AM, and ECM species is 116, 293, and 683, respectively. Comparing these groups using nonparametric statistics, we find that NM plant species have significantly smaller NBS families than AM species (p = 3e-4) and ECM plant species have significantly larger NBS families than either NM or AM species (p = 1e-3 for both comparisons).
Note that all seven NM species in our list are nonwoody while the five ECM species are woody perennials ( Figure 1). Thus, we chose to examine the relative importance of woodiness and mycorrhizal competence for NBS family size using a simultaneous linear regression (results in Supporting Information Table S3). The regression coefficient associated with woodiness was positive, but not significantly different from 0 (p = 0.10 using a t-test). The regression coefficient associated with mycorrhizal competence was about 5× larger, and was significantly different from 0 (p = 6e-4 using a t-test).
We also tested for a dependence on woodiness using the subset of 27 AM species. The AM-competent species were divided into two cohorts of 18 nonwoody species and 9 woody species, respectively ( Figure 1). The distributions of NBS family sizes are compared in Figure 2. Median NBS counts of the nonwoody and woody AMcompetent species are 298 and 276, respectively, and nonparametric statistics find no significant difference between the two groups (p = 0.13 using an exact, one-sided Wilcoxon-Mann-Whitney rank sum test).
Since mycorrhizal competence has such a large effect on NBS family size, we also tested the possible effect of symbiosis with rhizobia. We divided our AM-competent species into two cohorts of 21 nonrhizobial and 6 rhizobial plant species, respectively (Figure 1).
While the distributions shown in Figure 2 suggest a small NBS expansion in the rhizobia-competent species, nonparametric statistics finds no significant difference between the two sets (p = 0.29 using an exact, one-sided Wilcoxon-Mann-Whitney rank sum test).

| Several other gene families also expand with mycorrhizal competence
We next searched for other gene families with properties similar to NBS, namely large size and a correlation with mycorrhizal competence. We compiled a list of gene families using the Pfam annotation of protein domains (Punta et al., 2012).   Table S4). We next required the category size differences to be large, with at least a 50% gain from NM to AM, and again from AM to ECM. The final list is readily curated into just six families

| Properties of the MycEx gene families
The MycEx gene families are prominent in several immune response transcriptomes. In A. thaliana, a nonmycorrhizal model species, all six families were over-represented in the response to P. syringae infection, and the majority were over-represented in response to the immune elicitors flagellin, chitin, and oligogalacturonides (Bernsdorff et al., 2016;Denoux et al., 2008;Wan et al., 2008) (Supporting Information   Table S7, FDR < 5%). The transcriptional response of P. tremuloides (quaking aspen) leaves to a fungal pathgogen shows a phase at 15 days after inoculation (dai) where all MycEx families except PGG-ankyrins are over-represented (Foster et al., 2015). It is notable that this response takes time to develop. At 1 dai, none of the MycEx families are over-represented, and at 4 dai only the RLP are over-represented.

| MycEx gene families have a role in defense signaling
After demonstrating that the NBS gene family shows significant expansions in comparison of AM with NM species, and ECM with AM species, we conducted a search for other large gene families with

| Bulb-type lectin receptor kinases (B-type RKs, also called G-type RKs)
This family includes the S-locus receptor kinases that play a role in self-incompatibility during fertilization (Takasaki et al., 2000), and the LORE receptor kinase that recognizes lipopolysaccharides from Gram-negative bacteria (Ranf et al., 2015). A functional role in sym- biosis has yet to be demonstrated, although Favre et al. (2014) identified three B-type RKs in the conserved genetic module common to AM species.

| Glutamate Receptor-Like (GLR)
These are the homologs of ionotropic glutamate receptors in mammals. Plant GLRs respond to a range of amino acids, and in most cases their functional roles remain to be clarified (Forde & Roberts, 2014

| Receptor-Like Proteins (RLP)
Receptor-like proteins are transmembrane proteins with an LRR domain that extends into the extracellular space, but they lack a kinase domain inside the cell. Their role in signaling is thus mediated by binding to other proteins on the cell surface. Some RLPs are known to play important roles in plant development, and others in immunity (Wang et al., 2008).

| Wall-associated receptor kinases (WAK/ WAKL)
Many WAKs have an extracellular domain that binds oligogalacturonides (OGs), products of cell wall damage. Consistent with this, WAKs have a role in both cell growth and pathogen response (Kohorn & Kohorn, 2012).
We found additional evidence for a connection between the MycEx families and innate immunity from immune response transcriptomes.
While we did not attempt a systematic survey, we found five transcrip-

| MycEx gene families do not expand significantly with woodiness
It is notable that the are not statistically different from 0, while those for mycorrhizal competence are significant. As an additional check, we tested for a dependence on woodiness just among the 27 AM species on our list.
None was found.
As mentioned in the Introduction, there has been repeated speculation that plant defense gene families may expand in woody perennials as an adaptation for a long life-span (see Tuskan et al., 2006;Yang et al., 2008;Plomion et al., 2018 and the opinion article Tobias & Guest, 2014). By contrast, our results suggest that the observed correlations of gene family size with woodiness may be an artifact of a stronger correlation with mycorrhizal competence. It should be emphasized, however, that our analysis relies on Pfam domain assignments (Punta et al., 2012), rather than the more highly resolved sub-families used by Plomion et al. (2018), so it is possible that their clades have a stronger dependence on woodiness than found by us.

| A hypothesis to explain MycEx family expansion
Our results suggest that the diversity of immune signaling components expands with mycorrhizal competence, but we have not yet considered possible explanations. In this section, we consider the hypothesis that this correlation is related to the species diversity of microbes in each kind of symbiotic association.
Arbuscular mycorrhiza fungi are monophyletic, and comprise ~150 species (Schussler, Schwarzott, & Walker, 2001). Field observation and transplant experiments suggest that most AM fungi are promiscuous, able to colonize any AM-competent host plant (reviewed in Smith & Read, 2008). ECM fungal species are much more diverse than AM fungi. The ECM character has evolved independently in dozens of distinct fungal lineages, and includes thousands of species (Tedersoo & Brundrett, 2017;Tedersoo & Smith, 2013). ECM fungi typically form assemblages of 10 or more species that co-occur on the roots of a single host plant. The composition of these assemblages varies across the geographic range of the host, so that a single ECM plant species may be competent to form symbioses with hundreds, or even thousands of distinct fungal species (Trappe, 1977;van der Linde et al., 2018;Van Geel et al., 2018). In addition, the ECM plant species we consider retain their AM competence (Figure 1; Supporting Information Table S1). Thus, nonmycorrhizal, arbuscular mycorrhizal, and ectomycorrhizal plant species lie on a continuum of symbiote diversity, NM < AM < ECM.
As discussed in the Introduction, there is a growing body of evidence that the plant immune system regulates interactions with both pathogenic and nonpathogenic microbes (see, e.g. Madsen et al., 2003;Yang et al., 2010;Miyata et al., 2014; and the reviews Cook et al., 2015;Hacquard et al., 2017). Consistent with this, our results suggest that the diversity of nonpathogenic microbes is a major driver of immune repertoire diversification. A host plant competent to form symbiotic associations with a greater diversity of symbiotic fungal species will need to rely on a greater diversity of signals to distinguish pathogenic from nonpathogenic fungi, especially in those cases where the symbiotic species is closely related to a pathogenic one. Therefore, the expansion of MycEx families may be an adaptation for symbiotic promiscuity.

| Caveats
The symbiote diversity hypothesis outlined above offers a possible explanation for our results, but there are several caveats or potential objections we wish to consider.
First, genomic and transcriptomic approaches have mostly failed to find a significant relationship between the MycEx families and mycorrhizal symbiosis. Lists of genes conserved across AM plant species have relatively few MycEx genes (Delaux et al., 2014;Bravo et al., 2016; and Supporting Information Table S8). The only statistically significant over-representation we found is six RLPs in the list compiled by Delaux et al. (2014). Similarly, transcriptomic studies of genes up-regulated in response to AM colonization find some MycEx genes, but no family is over-represented (Sugimura & Saito, 2017;Recchia et al., 2018;Vangelisti et al., 2018;and Supporting Information Table S8). Transcriptomic studies do report up-regulation of defense-related genes following AM colonization, but these genes are mostly downstream of the signaling pathways represented by the MycEx familes (Calabrese et al., 2017;Hohnjec, Vieweg, Puhler, Becker, & Kuster, 2005). We thus have the paradox that MycEx gene families do not appear to be especially active following AM colonization.
One possible explanation for this result comes from studies of legume-rhizobia symbioses, which share many signaling pathways in common with mycorrhiza and have been more thoroughly studied (reviewed in Parniske, 2008). Transcriptomic studies of rhizobial symbiosis show an up-regulation of plant defenses in response to the initial infection, followed by a suppression of those pathways after the successful establishment of symbiosis (Kouchi et al., 2004;Lohar et al., 2006;Maunoury et al., 2010). If the role of the innate immune system during AM colonization is similar to nodule forma-  (Lundberg et al., 2012). There is also a complex and poorly understood network of interactions between plant roots and the microbes that live at the root-soil interface. The fact that AM and ECM symbioses stand out against this background is somewhat surprising.
Lastly, if the observed expansion of the MycEx gene families is principally to detect signals derived from invading microbes, that implies a much higher diversity of microbe-originated signals than is currently known. One line of evidence consistent with the existence of additional signals comes from recent work sequencing the genomes of dozens of fungal species, both pathogenic and nonpathogenic (Kohler et al., 2015;Lo Presti et al., 2015). This reveals the presence of hundreds of short (<300 aa) proteins with secretion peptide signals and little similarity to known gene families. The extent to which these secreted proteins play a role in microbe-host signaling remains to be explored.
Thus, while the observed correlation between mycorrhizal competence and immune repertoire size is strong, our proposed explanation in terms of symbiote species diversity remains speculative.

| Comparisons with legume-rhizobia symbiosis
We find that the NBS and other MycEx gene families do not expand significantly in the rhizobial legumes. This is consistent with our hypothesis about symbiote species diversity, since many legume species are competent to associate with fewer than ~10 species of rhizobia (reviewed in Dilworth et al., 2008). However, there are several confounding factors that need to be considered. First, since all the rhizobial species in our data set fall in a single family (Fabaceae, the legumes), we must be cautious about generalizing to rhizobia-competent species outside the legumes. Second, the genomes of rhizobia display a high degree of genetic variation, and include evidence for the horizontal transfer of genes related to the establishment and maintenance of symbiosis (Epstein et al., 2012;Sugawara et al., 2013). Thus, the genetic diversity of rhizobia may not be well-estimated by simple species counts. Third, there are important differences in the cell biology of rhizobial and mycorrhizal symbioses. Rhizobia occupy specialized root nodules that isolate them from the soil environment (reviewed in Dilworth et al., 2008;Guinel, 2009). This contrasts with fungal symbiosis, where the mycorrhizal filament network remains in direct contact with both the soil and the cells of the host plant root (reviewed in Smith & Read, 2008). Thus, the maintenance of mycorrhizal symbiosis may present a greater ongoing challenge to the plant immune system than rhizobial symbiosis.

| CON CLUS ION
In this paper, we have shown that the plant immune repertoire expands with mycorrhizal competence, and argued that symbiote diversity is a plausible explanation for this expansion. This idea has interesting parallels with the proposal that the evolution of the adaptive immune system in vertebrates facilitated the competence for a diverse and stable gut microbiota (McFall-Ngai, 2007). The role of symbiote diversity as a driver of immune system expansion may be common to both kingdoms.
Some related work was recently published by Munch et al. (2018).
They identified a clade of NBS genes with relatively few members in the Brassicaceae, and speculated that NM plants may have evolved immune signaling pathways distinct from the NBS system to compensate. We cannot rule out this explanation, but our observation that NM species have deficits in multiple defense pathways, not just NBS, makes this explanation less likely.
There are several ways to expand on the current paper that would improve confidence in our hypothesis. To better assess the significance of rhizobial symbiosis, it would be helpful to have multiple genomes from nonleguminous rhizobial species, and also from leguminous species that do not support rhizobial symbiosis. The hypothesis may be further tested using the genomes of plant species with distinct types of mycorrhizal association, such as orchids and the Ericaceae (reviewed in Smith & Read, 2008). To clarify the relative importance of pathogens, endophytes, and mycorrhizas for immune expansions, it would helpful to have a more complete survey of microbial diversity in all three classes. Another topic of interest is the search for sub-families within the MycEx gene families that show especially strong dependence on mycorrhizal competence, or sub-families that expanded during the evolutionary development of mycorrhiza (e.g. Yue, Meyers, Chen, Tian, & Yang, 2012). These projects may point the way to an improved functional understanding of the relationship between the immune repertoire and fungal symbiosis.

CO N FLI C T O F I NTE R E S T
The authors declare no competing interests.

ACK N OWLED G EM ENTS
We

DATA AVA I L A B I L I T Y
All data are available in the main text or the Supporting information Materials.