Current status of the multinational Arabidopsis community

Abstract The multinational Arabidopsis research community is highly collaborative and over the past thirty years these activities have been documented by the Multinational Arabidopsis Steering Committee (MASC). Here, we (a) highlight recent research advances made with the reference plant Arabidopsis thaliana; (b) provide summaries from recent reports submitted by MASC subcommittees, projects and resources associated with MASC and from MASC country representatives; and (c) initiate a call for ideas and foci for the “fourth decadal roadmap,” which will advise and coordinate the global activities of the Arabidopsis research community.

activities of the community. These roadmaps were first published in 1990. The first was entitled "A Long-range Plan for the Genome Research Project"; the second, in 2002, was "Beyond the Whole Genome Sequence"; and the third, in 2012, was "From Bench to Bountiful Harvests" (Lavagi et al., 2012). The preparation phase for the fourth decadal roadmap has now begun, and we encourage input from the Arabidopsis community as we look to 2030 and beyond.

| ARABIDOPSIS RESEARCH REMAINS CUTTING-EDGE
In 2019, according to the U.S. National Center for Biotechnology Information's PubMed, the annual number of Arabidopsis publications increased after plateauing in 2013 (Figure 1). A significant change occurred in 2018, when, for the first time since we began tracking publication numbers, the number of publications featuring "Rice" or "Oryza" was greater than the number featuring Arabidopsis. This might indicate that technological advances in genome sequencing, bioinformatics and gene editing, amongst others, have recently facilitated research in crop species and that discoveries made in Arabidopsis are now being translated more effectively. This is reinforced by a review by Provart et al. (2016), which surveyed 54,033 Arabidopsis papers and found that, in the majority of years, more than 50% of the cited Arabidopsis papers from a given year were referenced by papers in which the research focused on a species other than Arabidopsis. This was determined by the absence of "Arabidopsis" in the taxonomic data available for each citing paper.
The publication and citation data point to the continued value of Arabidopsis research to plant sciences in general, but more importantly in the area of curiosity-driven and discovery-based science.
In 2019, many high-profile "Cell-Nature-Science" (CNS) publications featured Arabidopsis research and described several "firsts" in plant science. These included the discovery in plant nuclei of liquid-liquid phase separations of polyadenylation complexes (Fang et al., 2019), a novel mechanism of protein aggregation for controlling auxin responses (Powers et al., 2019), defining the control and organization of the cambial stem cells that are the progenitors of all woody tissues (Miyashima et al., 2019), identifying novel molecular controls of autophagy, engineering a synthetic switch to control stomatal opening (Papanatsiou et al., 2019), and defining the first complete blueprint for an immunity pan-NLRome (Van de Weyer et al., 2019).
This breadth of topics demonstrates that research in Arabidopsis continues to be influential from the perspective of both plant biology and biology more generally. In many cases, the qualities of Arabidopsis that attracted researchers over the past half century (compact size, short generation time, ease of transformation, and genomic and informatic resources) still enable breakthroughs to be made apace. We turn now to efforts to coordinate such resources, and highlight newly generated large datasets that will, in turn, enable further discoveries.

| UPDATE FROM MASC SUBCOMMITTEES
The MASC Subcommittees were established in 2002, at the beginning of the second decadal roadmap, and serve to bring together international groups of scientists who work in a thematic research area.
Over the past 18 years these subcommittees have contributed to the MASC annual report, led sessions at ICAR meetings and engaged with the wider community. There are currently eight MASC subcommittees; changes for 2019-2020 include a new "Plant Immunity" subcommittee and the activities of the "Phenotyping" subcommittee being subsumed within the International Phenotyping project reports. The other MASC subcommittees are "Bioinformatics," "Clone-based ORFeomics," "Epigenetics and Epigenomes," "Metabolomics," "Natural Variation and Comparative Genomics," "Proteomics" and "Systems and Synthetic

| Proteomics
The Proteomics subcommittee reports that over the past year a significant development in Arabidopsis research came from a study by Mergner et al. (2020) that quantified the total number of detected Arabidopsis proteins (more than 18,000), their dynamic expression range (up to 6-fold changes in abundance) and their phosphoryla- propose an international project that would catalogue the ORF clones that correspond to the 6,000 remaining uncharacterized protein-coding genes. It appears that these gene products were not identified in the Mergner et al. (2020) manuscript, so it remains to be discovered whether these genes are indeed protein-coding and, if so, where their products are localized and what their functional significance is.

| Epigenetics and epigenomics
Arabidopsis has been the workhorse for elucidating mechanistic underpinnings of numerous epigenetic phenomena. These studies have both discovered and emphasized the importance of small RNAs, histone modifications, and DNA methylation during epigenome establishment and maintenance, in detection of self from non-self, and in responding to various environmental challenges. This research is aided by a new community-facing Arabidopsis RNA-seq database, ARS, which contains gene expression data from more than 20,000 publicly available RNA-seq libraries (http://ipf.sustc.edu.cn/pub/athrna/). This resource will dovetail with a soon-to-be-released resource of whole genome sodium bisulfite sequencing datasets.

| Bioinformatics
Over the past year the Arabidopsis bioinformatics community had to overcome the unfortunate loss of funding for the Araport resource.
Fortunately, other community projects have stepped up to maintain some of its important features. These include the hosting of the JBrowse tool at TAIR (https://bit.ly/2Qhb5xC) and the Thalemine instance hosted by the BAR (https://bar.utoronto.ca/thalemine).
The third component of the revamped Araport is a new Arabidopsis-focused instance of the Genome Context Viewer (https://gcv-arabidopsis.ncgr.org), developed and maintained by Andrew Farmer and Alan Cleary at the National Center for Genome Resources (Cleary and Farmer, 2018), which enables a dynamic comparison of multiple genomes on the basis of their shared functional elements. The GCV includes two sets of newly assembled genomes from 14 different Arabidopsis ecotypes/accessions. The Bioinformatics subcommittee highlights two outstanding datasets that focus on different aspects of the plant-pathogen response (Cao et al., 2019; Laflamme et al., 2020). They also highlight a newly released "eFP-Seq Browser" that enhances exploration of RNA-seq data through the visualization of read map profiles and summary gene expression levels across two large compendia (Sullivan et al., 2019). These resources help to support the key training recommendation that future plant biologists obtain the skills to develop new software tools in order to extract value from the vast amount of available Arabidopsis 'Omic data (Argueso et al., 2019).

| Metabolomics
The Arabidopsis metabolome is the best studied of all plant species and is used as an exemplar for studies in crop species. The MASC metabolomics subcommittee continues to support the integration of genomic data from natural populations with ecologically relevant metabolomic data that can reveal how a plant has ultimately adapted to environmental stresses. The subcommittee makes the case that, as metabolomic platforms become more cost-effective and as sensitive as NGS (Next Generation Sequencing) platforms, metabolomics should be considered an equal partner to sister techniques in order to develop a "full picture of a plant."

| Natural variation and comparative genomics
During the period of the current decadal roadmap the interrogation of natural variation has been an area of clear success. The hugely successful 1001 Genomes project has led to the development of software tools for further analysis of these publicly available datasets. These include ViVa: Visualizing Variation (Hamm et al., 2019) and the AraPheno/AraGWAS tools (Togninalli et al., 2020). However, there remains plenty of Arabidopsis geographic variation that has not yet been analyzed. This is highlighted in the MASC country report from Turkey, which looks at the distribution of the accessions available for order from the Nottingham Arabidopsis Stock Centre (NASC, Figure 2).
This suggests that there are many accessions growing across diverse geographic locations (including Turkey) that are currently underrepresented in the readily accessible germplasm. In future, there are clearly many opportunities for researchers to build upon our understanding of Arabidopsis natural variation through integration of currently underrepresented germplasm.
Despite reporting a downward trend over the past decade, the ABRC distributes annually almost twice as many seed stocks as NASC (190K versus 100K). However, NASC has seen increases over the past few years that undoubtedly reflect its role as the go-to stock centre for orders from China. The continued success of the stock centres relies on donations of plant material (mostly seeds) from the community, and NASC reports that over the past few years German scientists have provided the largest number of its donations.

| Plant immunity
Despite losing government funding over 5 years ago, TAIR continues its excellent biocuration services via an innovative and successful subscription model (Reiser et al., 2016). TAIR has expanded its operations to include collaborations with both PhyloGenes (www. that facilitates the publication of brief, novel findings, negative and/or reproduced results that may lack a broader scientific narrative. Each week TAIR loads 50-90 papers with the term "Arabidopsis" in the title or abstract into its curation queue. This includes a steady number of papers that report on new functions for previously characterized genes and an increase in the number of papers that describe high-throughput experiments and contain large datasets. For a variety of reasons, curating from some papers can be challenging, so TAIR has produced a document to advise researchers how to make the details of their research more "findable" (https://conf.arabidopsis.org/pages/viewpage.action?pageId=22807345).
Oversight of the Arabidopsis informatics strategy has largely fallen to the International Arabidopsis Informatics Consortium (IAIC), which was funded by the NSF until 2020. In 2018 IAIC hosted a workshop in St Louis and its "take home" recommendation was for the establishment of a centralized "annotation authority" to advise on submissions from groups for new gene names across the Arabidopsis pangenome, to establish a consistent naming scheme, to distribute this format regularly and frequently, and to encourage its adoption (International Arabidopsis Informatics Consortium, 2019).
This article also recommends community-established guidelines and standards for data and metadata formats alongside a searchable, central repository for analysis and visualization tools (such as https://conf.arabidopsis.org/display/COM/Resources). Fortunately, the implementation of these recommendations will be facilitated by a closely linked international community and will undoubtedly be a topic discussed for inclusion within the next roadmap.
The BAR resource continues to be a central hub for researchers to interrogate and visualize their expression data. In addition to its expansion to include data from many other plant species, the BAR is far more than just an "eFP Browser." The BAR website includes access to a broad set of genomic tools and widgets that have a focus on the analysis of Arabidopsis datasets. The BAR has obtained funding from Genome Canada that will allow the development of a custom "eFP" view in ePlant for a researcher's own RNA-seq data as well as the initiation of several new ePlants.

| MASC COUNTRY UPDATES
The MASC Country Reports provide an overview on the progress of Arabidopsis research on a national scale, cataloguing important publications, new software tools and community resources. Currently, 34 countries with a MASC representative are asked to submit an annual report and 29 of these were able to submit reports in 2020, notwithstanding shutdowns related to the COVID-19 pandemic. We include some country highlights here.
China is an interesting case study, as its country report states that "Arabidopsis is the model plant of choice to many groups. However, only a small portion of these labs is solely dedicated to Arabidopsis research or using [it] as the main model plant…. A major reason behind [this] would be the current funding priority. Whereas there are dedicated grants to basic and applied research in maize, rice, wheat, and virtually each every minor crop, there are no such funding programs towards Arabidopsis research." Although the number of global Arabidopsis publications has roughly plateaued over the past 5 years, this trend is not even across all countries (Figures 1 and 3). While numbers of publications on Arabidopsis are steady or falling in some countries, this is offset by the increase in publications coming from Chinese labs, which shows no slow-down since trending upward a decade ago. Therefore, despite a general admission that there is movement toward more applied research, it is encouraging that many country representatives are positive about the state of Arabidopsis research in their home country. This is exemplified by a response from Belgium: "….plant scientists feel an increasing pressure from funding agencies, universities, and research institutes to focus on more applied research aspects. This being said, it is likely that Arabidopsis will remain a major tool to generate and test hypothesis even in applied research projects." Globally, Arabidopsis clearly remains a critical experimental model for understanding "how plants work," which will lead to technological advances and knowledge increases that feed into applied projects in a variety of crop plants.

| MOVING TOWARD THE FOURTH ROADMAP
It is gratifying that progress has been made in each of these areas, yet work remains to be done. At the end of this period we now understand much more about the molecular and biochemical events that control how a plant grows and senses its environment.
However, there remain significant gaps in our knowledge, including a lack of understanding of the complex linkages between available 'omic datasets and of the simpler question of how plants sense many of their required nutrients. Improvements in this area are needed to build a fully predictive model that is accurate across time.
Fortunately, there is an acknowledgment that bioinformatics training and the development of digital infrastructures are key for the future in-depth analysis of Arabidopsis-derived datasets.
The loss of funding for Araport was disappointing for the community and, although its key activities have been picked up by TAIR, the BAR and the Genome Context Viewer, efforts are needed to ensure that community resources have longevity. This requires the integration of international infrastructures, particularly between Western and Eastern hemispheres. In some areas international cooperation is excellent, such as in the coordination of conference planning, yet elsewhere it can be improved. These challenges include, but are not limited to, implementation of effective mechanisms of data sharing, cultural and language differences, and availability of global funding initiatives.
With this update, MASC calls on all Arabidopsis researchers to consider areas for inclusion in the next decadal roadmap. We expect participation from long-time community leaders, such as the North American Arabidopsis Steering Committee (NAASC), and collaborators from the United States, Germany, and Japan.
We also expect that there will be contributions toward decadal priorities from a broader group of MASC members, especially those representing countries with significant and increasing Arabidopsis research, such as China or India. The positive sense of East-West collaboration that was felt by those who attended ICAR2019 in Wuhan was a promising beginning to these discussions.

| POSSIBLE AREAS FOR INCLUSION IN THE NEXT DECADAL ROADMAP
• What are the strategies that might be used to build globally sustainable digital infrastructures to support the integration of multi-omic datasets?
• How can both the data and metadata from complex multi-omic experiments be collated and shared for the benefit of the wider community in order to feed into translational pipelines?
• How can we integrate mechanistic and quantitative genetic insights to enable plant acclimation to vastly different climates, within a very short time period?
• How can the community build internationally cohesive and diverse collaborative teams of scientists to answer important questions in plant science?
Over the next year, these ideas will be developed and will coalesce during a MASC-supported discussion session at ICAR2021 in Seattle. The roadmap will be launched and published prior to ICAR2022 in Belfast and will, we hope, guide the planning of community-driven projects over the coming decade.

ACKNOWLEDGMENTS
GP is supported by UKRI-BBSRC grant GARNet2020 (BB/M004376/1). SMB is partially funded by an HHMI Faculty Scholar Fellowship. BU acknowledges support of his work by The Scientific and Technological Research Council of Turkey (TÜBİTAK) (Grant No: 118Z137).

AUTHOR CONTRIBUTIONS
GP and NP conceived the manuscript, GP, NP, SB, and BU con-