Linking concepts in the ecology and evolution of invasive plants: network analysis shows what has been most studied and identifies knowledge gaps

In recent decades, a growing number of studies have addressed connections between ecological and evolutionary concepts in biologic invasions. These connections may be crucial for understanding the processes underlying invaders' success. However, the extent to which scientists have worked on the integration of the ecology and evolution of invasive plants is poorly documented, as few attempts have been made to evaluate these efforts in invasion biology research. Such an analysis can help to recognize well-documented relationships and to identify gaps in our knowledge. In this study, we used a network-based method for visualizing the connections between major aspects of ecology and evolution in the primary research literature. Using the family Poaceae as an example, we show that ecological concepts were more studied and better interconnected than were evolutionary concepts. Several possible connections were not documented at all, representing knowledge gaps between ecology and evolution of invaders. Among knowledge gaps, the concepts of plasticity, gene flow, epigenetics and human influence were particularly under-connected. We discuss five possible research avenues to better understand the relationships between ecology and evolution in the success of Poaceae, and of alien plants in general.


Introduction
While the science of biologic invasions has primarily focused on the description of invasive species, enumeration of invasive traits, analysis of the ecological mechanisms, and impacts associated with invasion (e.g. Vanderhoeven et al. 2005; Muth and Pigliucci 2006; Dassonville et al. 2008; Monty et al. 2008), a growing body of research has been devoted to the evolution of invasive organisms (e.g. Callaway and Maron 2006; Lee and Gelembiuk 2008; Dyer et al. 2010). Populations of exotics are increasingly reported to experience rapid evolutionary changes concurrent with or soon after their introduction to a new range (e.g. Weber and Schmid 1998; Maron et al. 2004; Tiébré et al. 2007; Montague et al. 2008; Monty and Mahy 2009). Over relatively short timescales, both ecological and evolutionary processes, and their interactions, may affect the outcome of an incipient invasion (Lambrinos 2004). Understanding these interactions has broad practical application for the control and management of invasive species. However, scientific study may not necessarily be well distributed across the breadth of disciplines that are related to biologic invasions. To what extent have scientists worked on the integration of ecology and evolution of invasive species? Which ecological concepts have been studied with respect to which evolutionary processes? To date, little effort has been made to evaluate the extent of research efforts integrating the ecology and evolution of invasive species.
We set out to characterize the connections and gaps between studies of ecology and evolution of invading species. Interactions between ecological and evolutionary concepts can be very heterogeneous, some of them being well established and well documented, whereas others remain largely unexplored. Exploring and visualizing that heterogeneity is of particular interest to the scientist who wishes to investigate the mechanisms underlying alien spread in more detail. One potential approach is the application of network theory and network analysis methods to quantitatively characterize the relationships existing in invasion science literature between different concepts in ecology and evolution. The notion of a network is used in reference to an interconnected system of various items. Studies of such sometimes highly complex systems led to the development of network theory, an area applied widely in numerous disciplines such as mathematics, physics, computer science, economics, sociology, and biology (Newman 2001; Weitz et al. 2007; Garroway et al. 2008). Formally, a network is a set of vertices, or nodes, connected in a pairwise fashion by a subset of edges (lines connecting two nodes), which can be oriented or weighted (Lesne 2006). Mapping the interactions among components of a system can be accomplished through the geometrical approach of graph theory, which allows simplification and emphasis on underlying structure and relationships (Guimerà and Amaral 2005).
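As a minimal illustration of the formal definition above, the following Python sketch stores a small weighted, undirected network as a dictionary keyed by unordered node pairs. The three node names and the edge weights are placeholders chosen for illustration only; they are not values from this study.

```python
from itertools import combinations

# Hypothetical concept nodes (illustrative only, not the study's node list)
nodes = ["plasticity", "gene flow", "human influence"]

# Undirected weighted edges: frozenset keys make (a, b) and (b, a) the same edge.
edges = {
    frozenset(("plasticity", "human influence")): 2.0,
    frozenset(("gene flow", "human influence")): 0.5,
}

def weight(a, b):
    """Return the weight of the edge between a and b, or 0.0 if there is none."""
    return edges.get(frozenset((a, b)), 0.0)

def degree(node):
    """Number of other nodes joined to `node` by an edge of non-zero weight."""
    return sum(1 for other in nodes if other != node and weight(node, other) > 0)

# Pairs of nodes that are not connected by any edge
unconnected = [pair for pair in combinations(nodes, 2) if weight(*pair) == 0]
```

In this toy network the only unconnected pair is plasticity and gene flow, which is the kind of missing edge that the approach described below treats as a candidate knowledge gap.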
In this study, we use a network analysis approach to characterize and visualize the state of the art in research integrating the ecology and evolution of invasive species. Applying this approach to the Poaceae family as an example, we identify gaps in the current research and propose a set of future research avenues.

Defining the nodes: what to visualize?
Typically in network analysis, nodes represent clearly defined, discrete entities (e.g. genes, individuals, organisms, or computers). In the present approach, the nodes represent ecological and evolutionary concepts, making the discrete definition of these concepts a key step. This process is inherently flexible, and can be tailored to the system of interest. Within this context, we used the following steps to define relevant nodes in the concept network we aimed at visualizing for plant invasion.
Based on group discussion at a workshop entitled Synthesizing Ecology and Evolution for the Study of Invasive Species, held on March 19-22, 2009 at North Lake Tahoe, we first generated a list of 18 relevant concepts in ecology and evolution, comprising four main categories (Table 1). The first category concerned the spatial and temporal scales of invasion. The second category included plant features, with demography, plant development, reproduction, and ecological niche. The third category concerned the influence of the environment on biologic invasions, distinguishing abiotic conditions (climate and soil conditions), biotic interactions (except humans), and human influence (through land use, disturbance and global change). The final category concerned evolution. It included the different sources of phenotypic variation that enable plant species or populations to cope with variability in the environment, i.e. the species' genetic background (genetic diversity and ploidy level), the mechanisms related to population differentiation (hybridization, mutation, stochastic population differentiation, deterministic population differentiation, and gene flow) and the environmental influence on genotypes through phenotypic and trans-generational plasticity.
Next, we compiled a thorough list of keywords related to ecology and evolution of invasive species (229 keywords from 100 randomly selected articles in a comprehensive list about plant invasions). Each keyword was then assigned to one of the 18 conceptual nodes based on general agreement among the authors and other scientists at the workshop (Table 1). Ambiguous and irrelevant keywords were discarded. This a priori approach allowed us to choose the most relevant concepts for the network, and resulted in a network that could be adjusted to fit the objectives, questions being addressed, and the desired amount of work and level of precision.
During the last two decades, sophisticated text-mining procedures have been developed as an automated means to manage increasing amounts of information (Ananiadou et al. 2006). These methods allow the extraction of semantic information from text without necessitating that the end user read the text. Concepts are therefore generated from text. This is fundamentally different from our method, in that our concept selection was made according to biologic significance consensually recognized by experts and independent of their occurrence in the literature. If concepts were generated from text alone, noninvestigated topics would not be identified despite their biologic importance from an ecological and evolutionary point of view.

Connecting the nodes: who studied what?
Semantic networks are typically used to represent relationships between concepts (De Deyne and Storms 2008), but these are strictly qualitative and somewhat subjective. They are represented as directed or undirected graphs composed of vertices (i.e. concepts) and edges, which qualitatively represent relationships between vertices. However, connections between concepts can also be regarded from a quantitative point of view by considering the relative intensity of interactions between concepts. We use this approach to determine the relative intensity of research integrating specific ecological and evolutionary concepts. To visually represent such integrative research, an edge can be drawn between each pair of concept nodes that have been empirically considered together. These connections can be weighted by a symmetric 'connectivity matrix' (Table 2), which quantifies every connection as the number of peer-reviewed articles (of primary research) that explicitly address each pair of concepts. The number of original publications unambiguously addressing each pair of concepts is a relevant estimate of the research effort devoted to studying the relationship between concepts, independent of the results of those studies. The connectivity matrix can then be used as input data to generate a graphic visualization of the network, in which edge thickness is proportional to quantitative matrix data. Databases are powerful tools for compiling publications relating two node concepts, and most online databases allow for comprehensive search syntaxes. For an accurate network, the selected database(s) must be comprehensive and, most importantly, representative of the current state of research. ISI Web of Science® (Thomson Reuters, New York, NY), which we used in this study, is an example of this type of large-scale database.
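The construction of such a symmetric matrix can be sketched in Python as follows. This is an illustration under simplifying assumptions: each article is represented as a set of the concepts it addresses, whereas in practice the counts come from paired database searches. The function name, concept labels and article sets are all hypothetical.

```python
from itertools import combinations

def connectivity_counts(concepts, articles):
    """Build a symmetric matrix of article counts per concept pair.

    concepts: ordered list of concept names.
    articles: iterable of sets, each holding the concepts one article addresses.
    """
    index = {name: i for i, name in enumerate(concepts)}
    n = len(concepts)
    matrix = [[0] * n for _ in range(n)]
    for addressed in articles:
        known = sorted(c for c in addressed if c in index)
        for a, b in combinations(known, 2):
            i, j = index[a], index[b]
            matrix[i][j] += 1
            matrix[j][i] += 1  # mirror the count to keep the matrix symmetric
    return matrix

# Hypothetical input: three concepts and three articles
concepts = ["genetic diversity", "hybridization", "time scale"]
articles = [
    {"genetic diversity", "hybridization"},
    {"genetic diversity", "hybridization", "time scale"},
    {"time scale"},
]
counts = connectivity_counts(concepts, articles)
```

An article that addresses a single concept, like the third one here, contributes to no cell, because only pairs of concepts are counted.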
Because search syntax complexity is limited and some terms may have different meanings or nuances, raw results from databases are likely to include a substantial proportion of irrelevant articles. These include, for instance, articles in which the concepts considered are not related to the invasive species of interest, or articles that address concepts only from a perspective viewpoint rather than experimentally. Such studies can obscure emerging patterns in the network graph. This problem can be eliminated by a comprehensive abstract verification, i.e. reading and analyzing title, abstract, and keywords.

Application to Poaceae invasions
We applied our network approach to invasions by Poaceae, a family known to include a large number of problematic aliens (Maillet and Lopez-Garcia 2000), some of them among the first invasive plants studied from an evolutionary perspective (Rice and Mack 1991). We restricted our study to empirical articles, rejecting reviews, proceedings, and perspective papers. We used the Web of Science® advanced search tool to compile the number of empirical articles that explicitly addressed pairs of nodes in the field tag 'topic', i.e. the title, abstract, and keywords. These searches were performed from May 18 to June 10, 2009. An initial universal search restricted the results to publications concerning exotic Poaceae. Search syntax was then established for each node, including the keywords determined for each concept (Table 1). The 18 node searches were subsequently combined in pairs. The number of peer-reviewed original articles listed for each of the 153 pairs of nodes was compiled in a 'raw matrix'.
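The pairing step can be illustrated with a short Python sketch. The keyword lists and the universal search string below are placeholders (the actual keywords are those of Table 1), and the query strings are only modelled loosely on a topic field tag; the exact syntax used for the searches is not reproduced here.

```python
from itertools import combinations
from math import comb

def node_query(keywords):
    """OR-combine the keywords of one concept into a single topic clause."""
    quoted = " OR ".join(f'"{k}"' for k in keywords)
    return f"TS=({quoted})"

def pair_queries(node_keywords, base_query):
    """Yield (concept_a, concept_b, query) for every unordered pair of concepts."""
    for a, b in combinations(sorted(node_keywords), 2):
        query = (
            f"{base_query} AND {node_query(node_keywords[a])} "
            f"AND {node_query(node_keywords[b])}"
        )
        yield a, b, query

# Hypothetical keyword lists; the real ones are listed in Table 1.
node_keywords = {
    "gene flow": ["gene flow", "migration"],
    "plasticity": ["plasticity", "maternal effect*"],
    "ploidy": ["polyploid*"],
}
base = "TS=(Poaceae AND (invasive OR exotic))"  # placeholder universal search
queries = list(pair_queries(node_keywords, base))

# Combining 18 concepts in pairs gives the 153 searches mentioned above.
n_pairs_in_study = comb(18, 2)
```

With n concepts the number of unordered pairs is n(n - 1)/2, which is why 18 nodes yield 153 cells in the raw matrix.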
Abstract verification, discarding all irrelevant studies, was performed to generate the 'refined' matrix (Table 2). Mean rejection rate was 80.0% (standard deviation 23.0%), but there was a strong correlation between the raw and refined matrices (Mantel test; 10 000 iterations, correlation = 0.860, P < 0.001). Even with abstract verification, the number of articles addressing pairs of concepts in the refined matrix was dependent on the number of keywords involved in the search syntax (Pearson's r = 0.377; P < 0.001). Some concepts may indeed have broader meanings than others, leading to more numerous keywords. Our intent was to consider all concept nodes equitably, i.e. assess the connections between concepts independently of the number of keywords. Therefore, each value of the refined matrix was divided by the number of keywords involved, i.e. the sum of keywords for the two corresponding nodes. This produced a connectivity matrix that is representative of the research effort devoted to connecting pairs of concepts. Within this matrix, we defined knowledge gaps as the least-connected pairs, i.e. cells for which the connectivity value (no. refined articles/no. keywords involved in the search syntax) was less than the tenth percentile. The network representation of the results (Fig. 1), based on this connectivity matrix, was generated using Ucinet® software (Borgatti et al. 2002). Figure 1 illustrates the heavily connected network of links between ecological and evolutionary concepts. It shows how interactions between aspects of ecology are far better documented than those involving aspects of evolution and genetics. Intra-ecological concepts (placed at the right-hand side of Fig. 1) show strong connections, representative of a large number of integrative publications.
Not surprisingly, the strongest of these connections are between environmental conditions (abiotic or biotic, including human influence) and plant features, and between the scales of invasion and all ecological concepts.
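The normalization by keyword number and the percentile-based definition of knowledge gaps described above can be sketched as follows. This is an illustrative Python sketch, not the procedure as actually implemented: the four concept labels, the article counts and the keyword counts are invented. Two assumptions are made and should be read as such: the percentile uses linear interpolation, and pairs are flagged when their value is at or below the threshold, since the tenth percentile reported for this dataset is 0.00 and the gaps are the null cells.

```python
def normalize(refined, n_keywords):
    """Divide each article count by the summed keyword counts of its two nodes."""
    n = len(refined)
    return [
        [
            0.0 if i == j else refined[i][j] / (n_keywords[i] + n_keywords[j])
            for j in range(n)
        ]
        for i in range(n)
    ]

def percentile(values, q):
    """q-th percentile (0 <= q <= 100) with linear interpolation (an assumption)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def knowledge_gaps(connectivity, names, q=10):
    """Return the concept pairs whose connectivity is at or below the q-th percentile."""
    n = len(names)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    threshold = percentile([connectivity[i][j] for i, j in pairs], q)
    return [(names[i], names[j]) for i, j in pairs if connectivity[i][j] <= threshold]

# Hypothetical refined matrix and keyword counts for four concepts
names = ["concept A", "concept B", "concept C", "concept D"]
refined = [
    [0, 6, 0, 2],
    [6, 0, 3, 0],
    [0, 3, 0, 1],
    [2, 0, 1, 0],
]
n_keywords = [10, 20, 5, 15]

connectivity = normalize(refined, n_keywords)
gaps = knowledge_gaps(connectivity, names)
```

In this toy example the pair A and B has 6 articles for 30 keywords, giving a connectivity of 0.2, while the two pairs with no articles are returned as gaps.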

Research visualization
However, a number of connections indicate that evolutionary studies have also placed genetic diversity, hybridization, and stochastic and deterministic population differentiation in a temporal context. This may reflect the inherent advantages of species invasions as natural experiments for examining the temporal aspects of evolutionary processes. Save for epigenetics, which is an emerging field (Liu and Wendel 2003; Bossdorf et al. 2008), all concepts examined were studied in an explicitly spatial context.
Among evolutionary concepts, genetic diversity is the most highly connected, evenly related to both ecology and evolution. Even further, it is the only concept considered in relation to all others. In contrast, other genetic concepts (epigenetics, ploidy, mutation, stochastic population differentiation) were among the most poorly connected in the network. Epigenetics, most likely because of its relative novelty, was not studied in conjunction with any ecological concepts. Another densely connected evolutionary concept was deterministic population differentiation. Indeed, common-garden and reciprocal transplant studies are generally performed in a temporal/spatial context of invasion (population provenance) and frequently consider fitness traits (plant development and reproduction) and environmental conditions (e.g. watering treatments, herbivory, etc.). An emerging pattern is that the different sources of phenotypic variation in exotic Poaceae were rarely studied. Plasticity, for instance, was not connected to hybridization, mutation, genetic drift, or gene flow, and was poorly connected to genetic diversity. Similarly, few studies explicitly addressed both deterministic and stochastic population differentiation at the same time.
Human influence on Poaceae invasion was largely studied in an ecological context. However, its evolutionary consequences have recently been addressed (Larson 2002; Bock et al. 2007; Wei et al. 2008) and potentially represent an emerging field of study.
Some particularities may be expected from the Poaceae case study compared with the same matrix performed without family restriction. For example, biologic characteristics of this family, such as the reproductive system, might influence some of the observed patterns. Wind-pollination may decrease the importance of biotic interactions in Poaceae relative to plant families that rely more heavily on insect-mediated pollination. Conversely, some research topics might be over-represented in this invasive grass example. For instance, we might expect human influence to be prominent in this analysis because fire is often associated with grass invasion (Brooks et al. 2004). Similarly, the connections with hybridization concepts may result from the relatively important research effort on Spartina, an extensively hybridizing genus (Ainouche et al. 2009).

Knowledge gaps and future research avenues
Identified gaps corresponded to connections not documented by any study, i.e. null values in the connectivity matrix (Table 2) (tenth percentile of the connectivity values = 0.00). However, not all connections have the same biologic relevance. Even if all the 153 potential interactions between the 18 concepts considered here were possible, some are less straightforward than others (e.g. gene flow · plant development; stochastic population differentiation · environmental conditions). Each researcher in the ecology and evolution of invaders, depending on her or his particular interests, can find directions for new investigations among identified knowledge gaps, especially as a connection between two concepts can raise several questions. We propose and discuss five questions raised by our results, which we believe represent promising opportunities to understand the relationships between ecological and evolutionary aspects in the success not only of Poaceae, but of most plant invaders.
Figure 1. Network representation of the research effort connecting ecological and evolutionary concepts in exotic Poaceae invasions. Edge thickness represents the ratio of the number of published articles that connect two concepts (referenced in Web of Science®, excluding reviews and proceedings) to the number of keywords involved in the corresponding search syntax, after abstract verification to discard irrelevant articles.

Concept network in invasion ecology and evolution
Vanderhoeven et al.
What is the importance of plasticity in an invader's response to human-induced disturbance? (human influence · plasticity) Despite an extensive literature on the response of grassland communities to fire regimes (e.g. Young and Evans 1978; Gutiérrez et al. 1998), we found no studies addressing the role of phenotypic plasticity or environmental maternal effects (the latter an understudied invasion mechanism; Dyer et al. 2010) in the success of invasive Poaceae in response to human influence. Providing quantitative data on the reaction norms of individual invaders, for example under variable fire regimes, is, however, of particular interest for understanding and forecasting the success of invasive plants (Grace et al. 2001). Experiments in which clones of each genotype are exposed to different human-influence treatments could be used to quantify the reaction norms of invaders in response to tillage, burning, etc.
How do human activities affect gene flow in invading populations? (human influence · gene flow) Human activities profoundly affect ecosystems and metapopulation dynamics, which in turn may affect gene flow. For instance, wind-dispersed plant invaders have been shown to disperse preferentially along roads and railways (Ernst 1998; Gelbard and Belnap 2003). The dispersal of barochorous species is enhanced by agricultural practices (Benvenuti 2007), making the resulting gene flow attributable to human activities. Many other human influences on gene migration among invading populations are possible. However, this has not been experimentally addressed for Poaceae invaders.
What is the relationship between patterns of gene flow and the temporal dynamics of invasion? (time scale of invasion · gene flow) No connections were found between the 'gene flow' concept and the 'time scale of invasion' in our Poaceae dataset. However, it is likely that the pattern of gene exchange among metapopulations is inherently dependent on invasion dynamics, especially when selection and gene flow have not reached equilibrium (Sakai et al. 2001). For instance, it would be worth documenting the variation in gene flow at different stages of invasion. Early naturalization, rapid spatial spread and the subsequent slowing of an invasion because of (a)biotic barriers may rely on different metapopulation dynamics. In the case of multiple introductions, the spatial pattern of gene flow may be profoundly modified when several early stage, locally invaded areas become connected to create a much wider invasive range. The temporal aspect of introduction also deserves further investigation in relation to its potential effect on gene flow. Indeed, patterns of gene flow, and in turn the potential for invasive populations to evolve, are likely to be affected by the period during which propagules are introduced from the native range. However, gene flow is a spatial and temporal process that is inherently difficult to characterize, and this may explain the lack of empirical work on this topic. Such a study would require an analysis of gene flow patterns among populations using molecular markers over a sufficiently long period of time to encompass different stages of invasion. In the case of invaders introduced to a few well-separated areas, such as Senecio inaequidens (Ernst 1998), it may be possible to characterize the gene flow among populations within one area before and after the convergence of these initially isolated invaded areas.
Do polyploids have higher invasiveness in human-disturbed environments? (human influence · ploidy) A growing body of research indicates that polyploidy may contribute to invasive behavior and the spread of alien plant species (Vilà and D'Antonio 1998; Ainouche et al. 2004; Soltis et al. 2004; Pandit et al. 2006), and Bennett et al. (1998) found a higher proportion of polyploids in weedy species than in other plant groups. In addition, polyploids are thought to show higher resistance to environmental stresses (Levin 1983; Bretagnolle et al. 1998). However, the extent to which polyploid invaders are facilitated by human disturbance is still poorly documented and no studies were found that experimentally addressed this question in exotic grasses. Experiments should be conducted to test for better performance of polyploids in human-influenced habitats, e.g. a better response to tillage, fire or other soil disturbance regimes.
Do epigenetic mechanisms allow invasive species to display increased phenotypic plasticity? (plasticity · epigenetics) Bossdorf et al. (2008) recently emphasized the need for considering epigenetic variation and inheritance in a broad ecological perspective. Epigenetic processes may induce heritable phenotypic variation in ecologically relevant traits, without any contribution of genetic variation. The existence of such epigenetic contribution may be assessed through several approaches, including the use of study systems lacking genetic variation. Beyond using genetically uniform organisms as a tool, one could also consider the consequences of such a link between epigenetics and phenotypic plasticity specifically for invasive species. Severe genetic bottlenecks are reported in invasive plants after introduction (e.g. Dlugosch and Parker 2008). Because of the absence of genetic variation, species displaying uniparental reproduction (selfing, clonal propagation, and apomixis) are particularly susceptible to genetic drift and are thought to suffer from consequent decreased evolutionary potential. In this context, the contribution of epigenetic processes to increased phenotypic diversity is particularly pertinent, as it might explain the broad ecological range of such invasive species. Common-garden experiments, potentially involving contrasting soil conditions, may be able to demonstrate heritable phenotypic variation in such uniparentally reproducing species.

Discussion and perspectives
Studies synthesizing the impacts of multiple factors have been increasingly recognized as vital to the understanding of biologic invasions (Lambrinos 2004; Hastings et al. 2005). Network analysis represents an innovative method for visualizing the current state of research, enabling scientists to identify knowledge gaps among disciplines. As shown in our application example, input matrices can be easily obtained thanks to current bibliographic tools such as online databases. Several software programs can be used for graphic visualization of networks (Huisman and Van Duijn 2005). The time-consuming step of abstract verification allows a finer visualization of the network structure (with on average 80% of articles rejected) and, most importantly, provides a comprehensive list of relevant articles linking concept pairs. This is an important step if one aims to detect gaps among a concept list established a priori. However, the strong correlation we found between raw and refined matrices suggests that this step is not necessary to generate a general picture of the research, expanding the applicability of the method for simple visualization.
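For readers who wish to check the agreement between a raw and a refined matrix on their own data, a permutation-based Mantel test can be sketched in plain Python as below. This is a generic textbook formulation (Pearson correlation of the off-diagonal cells, with node labels of one matrix permuted), not necessarily the exact procedure or software used in this study. The two 4 × 4 matrices are invented for illustration, and the function names are hypothetical.

```python
import random
from statistics import mean

def upper_triangle(matrix, order=None):
    """Flatten the cells above the diagonal, optionally after relabelling nodes."""
    n = len(matrix)
    order = list(range(n)) if order is None else order
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def mantel(m1, m2, permutations=9999, seed=0):
    """One-sided Mantel test; returns (observed r, permutation P value)."""
    rng = random.Random(seed)
    x = upper_triangle(m1)
    observed = pearson(x, upper_triangle(m2))
    order = list(range(len(m2)))
    hits = 1  # the observed arrangement is counted as one permutation
    for _ in range(permutations):
        rng.shuffle(order)  # permute rows and columns of m2 together
        if pearson(x, upper_triangle(m2, order)) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)

# Hypothetical raw and refined matrices for four concepts
raw = [[0, 30, 4, 10], [30, 0, 15, 2], [4, 15, 0, 5], [10, 2, 5, 0]]
refined = [[0, 6, 0, 2], [6, 0, 3, 0], [0, 3, 0, 1], [2, 0, 1, 0]]
r, p = mantel(raw, refined, permutations=999)
```

Rows and columns are permuted together because the cells of a pairwise matrix are not independent observations, which is the reason a Mantel test is preferred over an ordinary correlation test for this comparison. With only four nodes there are just 24 distinct relabellings, so a toy example of this size cannot yield a very small P value.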
In the case of evolution and ecology, the concepts delimited in the present article provide a foundation for further investigation. However, definition of these concepts is an inherent limitation to our method, as it is impossible to avoid subjectivity entirely. Quantitative studies are always difficult to apply to qualitative concepts, and problems like ambiguity or concept overlap are inherent. Nevertheless, this can be as much an advantage as a drawback. Concepts can be defined ad hoc for a specific system or research interest, and can be based on consensus among a given group of experts. Another possible approach is to use text-mining methods. In the past two decades such methods have become very sophisticated, particularly in their application to the medical and molecular biologic literature (Krallinger et al. 2005). Two important features of advanced text mining are the extraction of semantic information and the use of automation over manual methods. As opposed to our a priori approach based on knowledge of the discipline, extraction of semantic information present in the publication analyzed would allow the discovery of words most closely associated with different concepts and most likely to be used in publications that combined concepts. The use of automation over manual methods would also allow the analysis of much larger databases. However, as no algorithm could replace a comprehensive reading by an expert, such methods cannot completely avoid irrelevant co-occurrences of concepts. In the future, both approaches could be valuable in completely characterizing the research structure and semantic patterns in ecology and evolution of invasive plants.
The visualization approach developed here must be seen as complementary to literature reviews. In review articles, the information within empirical studies is summarized so that general trends in data can emerge. In our case, the goal was not to stress general trends in the results, but trends in the research effort, that is, the number of studies addressing a particular combination of concepts. However, network analysis could also be applied to reviews and meta-analyses, notably by assigning directions and signs to edges. Conversely, meta-analytical methods could be applied to network analysis to reduce subjectivity and improve accuracy (Arnqvist and Wooster 1995). Both approaches are complementary, and further investigation should be made to assess the utility of a combined approach.
The network presented in this study reveals the amount of primary research, in terms of publications, devoted to connecting concepts. Network structure is defined by many factors in addition to the a priori parameters described above. In addition to the biologic relevance of the connections, the interests of specific research teams and the funding available in various sub-disciplines are likely to influence connection architecture and strength. The costs associated with different areas of research can also be inherently different, influencing the number of resulting publications. Also, the number and relevance of journals devoted to sub-disciplines can have a profound influence on the publications of results connecting concepts. In addition, network analysis may be impacted by the underrepresentation in the literature of integrative studies that were carried out, but whose results were never published (perhaps because of negative results).
Including a relevance index (e.g. times cited; journal's impact factor) to balance the contribution of each publication is a possible improvement to the methods used here. Another possible improvement is the inclusion of a chronological dimension to the network by weighting each included study by its publication year. Including a temporal analysis of the research structure among ecological and evolutionary concepts, as well as applying our method to both invasive and native plants, may represent the most promising perspectives of the present work. Invasive species may be highly detrimental, but they nonetheless present superb research opportunities to evolutionary biologists (Lee 2002; Callaway and Maron 2006). One wonders whether invasive species will lead the way for evolutionary and ecological studies, with fundamental research carried out on invasive plant models before natives. Comparing the knowledge gaps between (i) the literature about invasive plants and (ii) the whole (or a subset of the) literature about plant evolution and ecology would help show whether invasive species research follows or leads the scientific effort to understand the ecology and evolution of plants.
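The two improvements suggested at the start of the preceding paragraph amount to replacing a simple article count with a weighted sum. The Python sketch below shows one way this could look. Both weighting functions are hypothetical choices made for illustration (one plus the number of citations, and an exponential decay with publication age); they were not used or evaluated in this study, and the two publication records are invented.

```python
def weighted_connection(publications, weight):
    """Sum a per-publication weight instead of counting each article as 1."""
    return sum(weight(pub) for pub in publications)

def citation_weight(pub):
    """Hypothetical relevance index: one plus the number of times cited."""
    return 1 + pub["times_cited"]

def recency_weight(pub, reference_year=2009, half_life=10):
    """Hypothetical chronological weight that halves every `half_life` years."""
    return 0.5 ** ((reference_year - pub["year"]) / half_life)

# Hypothetical publications linking one pair of concepts
pubs = [
    {"year": 2009, "times_cited": 0},
    {"year": 1999, "times_cited": 9},
]
unweighted = weighted_connection(pubs, lambda pub: 1)
by_citations = weighted_connection(pubs, citation_weight)
by_recency = weighted_connection(pubs, recency_weight)
```

Passing the weight as a function keeps the unweighted count used in this study as the special case in which every publication has weight 1.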

Conclusions
This study demonstrated a possible use of network analysis in biologic invasion science to quantify the research effort devoted to connecting a series of important concepts, defined a priori, in plant ecology and evolution. Applied to one of the most studied and most invasive plant families, Poaceae, the method presented here has demonstrated a greater connection among ecological concepts than among evolutionary concepts in the literature. An important heterogeneity among connections between ecological and evolutionary concepts in invasive grass research was found. Some connections were commonly addressed in the primary research, whereas others were largely unexplored. The concepts of human influence, gene flow, epigenetics and plasticity were involved in the connections identified as knowledge gaps. This application of network theory should bring further insights for integrative research into the ecology and evolution of invasive species.