wallace 2: a shiny app for modeling species niches and distributions redesigned to facilitate expansion via module contributions

(e.g., niche overlap). This expansion was paired with outreach to the biogeography and biodiversity communities, including international presentations and workshops that take advantage of the software’s extensive guidance text. Additionally, the advances extend accessibility with a cloud-computing implementation and include a suite of comprehensive unit tests. The features in wallace 2 greatly improve its expandability, breadth of analyses, and reproducibility options, including the use of emerging metadata standards. The new architecture serves as an example for other modular software, especially those developed using the rapidly proliferating R package shiny, by showcasing straightforward module ingestion and unit testing. Importantly, wallace 2 sets the stage for future expansions, including those enabling biodiversity estimation and threat assessments for conservation.


Introduction
As the complexity and breadth of analyses in ecology and evolution continue to expand, it is increasingly difficult for researchers to implement new methodological tools and ensure reproducibility. This is particularly true for models of species niches and distributions (hereafter ENMs: ecological niche models; also termed SDMs: species distribution models). Such models have a multitude of basic and applied uses (Stanton et al. 2012, Uden et al. 2015), and this research area has seen a flurry of new theoretical and methodological developments (Peterson et al. 2011, Guisan et al. 2017). Unfortunately, the pace at which new methods emerge has made entry into the field difficult. Further, programming scripts that accompany published ENM studies can be difficult to repurpose, and tools facilitating reproducibility and providing interactive visualizations to inspect data and models remain scarce. To help lower these barriers, the ecological modeling application Wallace EcoMod ver. 1 was produced four years ago (R package wallace; Kass et al. 2018; CRAN: https://cran.r-project.org/src/contrib/Archive/wallace/wallace_1.0.tar.gz; GitHub: https://github.com/wallaceEcoMod/wallace/releases/tag/v1.0.0).
Written in R and providing a graphical user interface (GUI), wallace was designed to be an accessible tool for researchers, as well as a resource for conservation practitioners and educators. Inspired by the 2015 Nielsen Challenge of the Global Biodiversity Information Facility, the package wallace 1 was developed using two packages that enable interactive analyses: shiny for development of applications with GUIs (Chang et al. 2021) and leaflet for map display and navigation (Cheng et al. 2021). Additionally, it harnesses packages with tools for ENMs documented in the literature and available on CRAN. wallace walks users through a modeling analysis, providing visualizations and extensive guidance text and documentation, with clear references to the underlying R packages used. It is hierarchically structured in discrete analysis steps ('components'), each with different methodological options ('modules') organized in a customizable workflow. At any point, users can download R code in an .Rmd file that reproduces the analysis. wallace 1 has seen great user interest, with > 51 000 downloads from CRAN (October 2022). The original paper provided a brief overview of recent SDM software applications, several of them interactive, and how each compared with wallace (Kass et al. 2018). Since then, the utility of R-based interactive applications in this research area has increased, as evidenced by the development of other GUI-based software for ENMs that provide different functionalities: ntbox (Osorio-Olvera et al. 2020) and shinyBIOMOD (https://gitlab.com/IanOndo/shinybiomod).
The characteristics of wallace (accessible, open, expandable, flexible, interactive, instructive, and reproducible) do not exist together in any other currently available software. Notably, these characteristics also make it an ideal resource for education. The application offers users the ability to build and explore both models and mapped predictions without having to stitch together programming code, which is time-consuming, tedious, and error-prone. Since its release, the application has been used globally in university coursework and by professionals in governmental and NGO positions. Additionally, wallace was featured in a free and comprehensive ENM web course (ENM 2020, Peterson et al. 2022) and educational webinars in multiple languages (Supporting information). Thanks to vibrant worldwide user feedback, mostly via its Google Group (https://groups.google.com/g/wallaceecomod), various software bugs were identified and addressed and several requested new features added (ver. 1.1; CRAN archive: https://cran.r-project.org/src/contrib/Archive/wallace/wallace_1.1.0.tar.gz; GitHub tag: https://github.com/wallaceEcoMod/wallace/releases/tag/v.1.1).
Building on the success of wallace to date and user feedback, we identified six key areas for advancement, leading us to restructure and expand the software. These key areas were: simplified addition of new modules (and unit tests), ability to make models for multiple species in the same session, production of detailed analysis metadata, increased options for inputting and downloading data, and ability to save and load unfinished analysis sessions. To do so, the core development team made major changes to the underlying structure of wallace (concomitant with a major redesign for the R package ENMeval (Kass et al. 2021) that builds and evaluates models) and honed the process of module addition with collaborators while developing extensions that reflected their areas of expertise. This expanded the application broadly (Fig. 1), with new functionalities centering on the key areas for advancement and including nine new modules (Fig. 2) documented via an updated vignette (https://wallaceecomod.github.io/wallace/articles/tutorial-v2.html). The new modules include: two for new data acquisition (download paleontological occurrences and climate simulations), two for more flexible user inputs (draw background extent and model transfer to user-specified environmental data), two for metadata (download model metadata and citations for R packages used), and three composing a new environmental space comparison component. Below, we introduce the key advances of wallace ver. 2; highlight the new component, modules, and functionalities; explain outreach activities and novel accessibility options; and discuss future directions for growth.

Key advances of wallace 2

Simplified module addition
Open software development in ecology, evolution, and biogeography thrives via community partnerships like the flagship R Project for Statistical Computing (https://www.r-project.org/). Such partnerships have enabled modular toolsets that cover a wide breadth of analyses (e.g., BEAST (Bouckaert et al. 2019), Bioconductor (Huber et al. 2015)). Modularity allows for new features to be shared among groups. Collaborative module addition is fundamental to the Wallace EcoMod project, which emphasizes community involvement as an engine to make the software pluralistic and track advancements in the field. When wallace 1 was developed, shiny programming in R was relatively new, with few modular examples available and no illustrations of functionality for straightforward addition of modules by users. For the first release, the application was engineered to allow the insertion of shiny modules, which necessitated modifying the main shiny application scripts. Although this process allowed for expansion, it nonetheless posed significant barriers because of the specific programming knowledge needed.
Therefore, we set out to improve the module-addition process. With dual goals of gaining crucial insight and expanding the functionality of wallace, the core development team invited collaborators representing four different research labs (Author contributions) to co-develop new modules. Based on this experience, we engaged with a specialist in shiny (DA) to develop a new framework for module addition that significantly reduces the programming burden on contributors (Fig. 3). Unlike the previous implementation, the modifications necessary to contribute new modules to wallace 2 are almost exclusively restricted to module-specific ancillary R scripts written without shiny functions. This reduces the chances of introducing errors as well as the knowledge required regarding both shiny and the structure of wallace (module-addition vignette: https://wallaceecomod.github.io/wallace/articles/module-addition.html). These architectural changes also facilitate unit testing that automates tests of code functionality, which are crucial to preventing errors during development. Although writing unit tests for R functions is straightforward using available tools, as modules in wallace 1 were based on shiny modules, standard R unit tests could not be written for them. In contrast, wallace 2 modules are specified as standard R functions, allowing for typical unit tests (R package testthat; Wickham 2011). wallace 2 now has a comprehensive suite of documented unit tests for all module functions, which serve as templates for future modules (found in the R package in wallace/tests/testthat).
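The design point in this paragraph, that a module's computation lives in a plain function with no interface code so that ordinary unit tests can exercise it, is language-agnostic. The following minimal sketch illustrates the idea in Python rather than R. The function `remove_duplicate_occurrences` and its record format are hypothetical and are not part of wallace, whose own tests are written with testthat.

```python
def remove_duplicate_occurrences(records):
    """Drop records that repeat a (longitude, latitude) pair, keeping the first.

    Hypothetical stand-in for a module function: it takes plain data,
    returns plain data, and never touches any user-interface object.
    """
    seen = set()
    unique = []
    for rec in records:
        key = (rec["longitude"], rec["latitude"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def test_remove_duplicate_occurrences():
    """A unit test needs only inputs and expected outputs, no running app."""
    records = [
        {"species": "sp_a", "longitude": -78.5, "latitude": 0.1},
        {"species": "sp_a", "longitude": -78.5, "latitude": 0.1},  # duplicate
        {"species": "sp_a", "longitude": -77.0, "latitude": 1.2},
    ]
    result = remove_duplicate_occurrences(records)
    assert len(result) == 2
    assert result[0]["longitude"] == -78.5
    assert result[1]["latitude"] == 1.2


test_remove_duplicate_occurrences()
```

Because the function has no dependency on the interface layer, the same call can be reused by the application, by a test suite, and by a downloaded analysis script.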

Managing multiple species in a single session
We restructured the application to allow users to manage multiple species in the same wallace session. While the analysis unit in wallace releases prior to ver. 2 was data from a single species, it is now a list of species that allows for batch processing. wallace 2 still exclusively makes single-species ENMs (i.e., not joint models). However, users can now download or upload data for multiple species and either make species-specific methodological choices or, using the batch option for given modules, apply the same choices to all species. Additionally, as a product of one of the collaborations, we added a new 'Characterize Environmental Space' component with three new modules ('Environmental Ordination', 'Occurrence Density Grid', 'Niche Overlap') that enable comparisons between two species in environmental space using the R package ecospat (Di Cola et al. 2017; Fig. 1, 2).

Generation of analysis metadata
The ability for researchers to document key methodological details and reproduce published analyses is crucial for ENMs (Merow et al. 2019, Fitzpatrick et al. 2021), as reflected in metadata protocols such as ODMAP (Zurell et al. 2020) and best practices and guidelines (Araújo et al. 2019, Sofaer et al. 2019). wallace 1 generated R Markdown scripts (R package rmarkdown; Allaire et al. 2021), which embedded R code within narrative text and could be run to reproduce results. This allowed non-programmers to conduct a reproducible analysis, a first for ENM software. However, there was also a desire for metadata describing analysis decisions, following emerging community platforms.
Therefore, via module partnerships, we made two new additions to generate different kinds of metadata. One partnership dovetailed with the development of the first programming toolset for ENM metadata: the Range Model Metadata Standards (RMMS), implemented with the rangeModelMetadata R package (Merow et al. 2019) as an editable R object that wallace now uses to record methodological decisions by the user. Notably, RMMS provided the technical foundation upon which the ODMAP reporting framework operates (Zurell et al. 2020). The RMMS object is now available for download at any stage of a wallace analysis via a new module 'Metadata' under the 'Reproduce' component tab (Fig. 1). The standardized metadata that wallace now produces can be used to help populate ODMAP (Zurell et al. 2020) and even facilitate scoring of ENM analyses for applied uses (Araújo et al. 2019, Sofaer et al. 2019). Another partnership was with the developers of the R package occCite (Owens et al. 2021), which gives users the option to generate a unique and persistent identifier for occurrence data downloads from the Global Biodiversity Information Facility (GBIF). When users download occurrence data from GBIF and check the 'Include Data Source Citations' option (Fig. 1), they now receive a DOI reference for the occurrence dataset. Finally, a new module 'Reference Packages' produces citations for the R packages used in the analysis, promoting documentation and giving credit to their original developers (Fig. 2). These new metadata functionalities help users achieve documentation and reproducibility of both methods and results, though final processed data must still be archived separately (Anderson et al. 2020).

New database connections and download options
wallace now offers database connections for downloading cleaned occurrence data for plants via the Botanical Information and Ecology Network (BIEN, R package BIEN; Maitner et al. 2018) and a 'Paleo' module for paleontological records using the Paleobiology Database (R package paleobioDB; Varela et al. 2020). Complementing the latter, the new module 'EcoClimate' also downloads environmental data from the ecoClimate database (Lima-Ribeiro et al. 2015; R package ecoClimateR (https://github.com/ecoClimate/ecoClimateR)), consisting of bioclimatic variables from 0.5 degree resolution climate simulations derived from general circulation models (GCMs) for multiple time steps for the past, present and future (Fig. 1, 2). Moreover, expanded options now exist in a dynamic 'Save' tab to help users download data in each analysis step (details below).

Figure 3. Steps for adding a module in wallace 2 using the new simplified process, showing mandatory steps (white shapes) and recommended ones (gray shapes; see module authorship vignette https://wallaceecomod.github.io/wallace/articles/module-addition.html). 1) Pick short module name. 2) Write R function with same name as module, including required arguments. 3) Create required module template files using create_module(). 4) Modify configuration file (.yml) that specifies module parameters. 5) Create module controls and functionality: i) define user interface; ii) define server functionality; iii) create user-interface output for results (e.g., plot, table); iv) write functionality to plot results on leaflet map; v) enable module to record information for saving and reloading. 6) Compose guidance text. 7) Write to downloadable 'Session Code' .Rmd file to enable reproducibility (requires modifying two files (dotted line)). Final) Register module to be included in the interface via register_module() (dashed line).

Ability to save and load unfinished analysis sessions
wallace 2 has the option to save a session in progress and then later continue it. Via the new 'Save' tab within an analysis component (Fig. 1), users can save current progress by storing all session data (e.g., occurrences, environmental data, predictions, plots) and user decisions (e.g., parameters, options) in a .rds file (format for storing single R objects). Users can later continue an unfinished analysis by loading the .rds file in the new 'Load Prior Session' tab under the 'Intro' component.

New component, modules and functionalities
This release premieres a 'Characterize Environmental Space' component with three modules, and six other new modules spread across existing components. Most analyses in wallace are spatial in nature and thus operate in 'geographic space', but the new component features analyses for pairs of species in the space defined by the environmental predictor variables, termed 'environmental space' (Peterson et al. 2011). The module 'Environmental Ordination' defines the environmental space as the first two axes of a principal component analysis (PCA) on the predictor variables. Within this space, the module 'Occurrence Density Grid' generates a kernel density grid for the occurrence localities, where the densest regions have the most occurrences relative to that region's environment. The third module 'Niche Overlap' can calculate the overlap between the occurrence densities of two species in environmental space and then run permutation tests to determine the significance of the empirical overlap value. The analyses featured in these modules are based on methods described in Warren et al. (2008) and Broennimann et al. (2012) implemented in the package ecospat.
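To make the overlap calculation concrete, the sketch below computes Schoener's D, an overlap metric discussed in Warren et al. (2008), between two occurrence density grids and runs a simple pooled-label permutation test. It is a language-agnostic illustration in Python, not the ecospat implementation that wallace calls. It assumes the points are already scores on the first two PCA axes, it swaps the kernel density estimate for a plain two-dimensional histogram, and all function names and default values are hypothetical.

```python
import random


def density_grid(points, n_bins, lo, hi):
    """Bin 2-D points into an n_bins x n_bins grid whose cells sum to 1.

    A histogram is used here in place of a kernel density estimate.
    """
    grid = [[0.0] * n_bins for _ in range(n_bins)]
    width = (hi - lo) / n_bins
    for x, y in points:
        # Clamp indices so points on or beyond the edges fall in border cells.
        i = max(0, min(int((x - lo) / width), n_bins - 1))
        j = max(0, min(int((y - lo) / width), n_bins - 1))
        grid[i][j] += 1.0
    total = float(len(points))
    return [[cell / total for cell in row] for row in grid]


def schoener_d(p, q):
    """Schoener's D = 1 - 0.5 * sum(|p - q|); 0 is no overlap, 1 is identical."""
    diff = sum(abs(a - b) for row_p, row_q in zip(p, q) for a, b in zip(row_p, row_q))
    return 1.0 - 0.5 * diff


def permutation_test(pts_a, pts_b, n_perm=200, n_bins=10, lo=-3.0, hi=3.0, seed=1):
    """Pool both species, reshuffle labels, and compare permuted D to observed D.

    Returns the observed D and the share of permutations with D at or below
    it, so a small p-value suggests lower overlap than expected by chance.
    """
    rng = random.Random(seed)
    observed = schoener_d(density_grid(pts_a, n_bins, lo, hi),
                          density_grid(pts_b, n_bins, lo, hi))
    pooled = list(pts_a) + list(pts_b)
    n_a = len(pts_a)
    at_or_below = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        d = schoener_d(density_grid(pooled[:n_a], n_bins, lo, hi),
                       density_grid(pooled[n_a:], n_bins, lo, hi))
        if d <= observed:
            at_or_below += 1
    p_value = (at_or_below + 1) / (n_perm + 1)
    return observed, p_value
```

Adding 1 to both numerator and denominator of the p-value counts the observed arrangement as one of the permutations, which keeps the estimate from ever being exactly zero.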
In addition, six new modules expand data acquisition, data inputs, and metadata, and other new features make improvements to existing modules. First, two new modules enable modeling with paleontological data. Paleontological occurrences and environmental data can now be downloaded using the module 'Paleo' (component 'Obtain Occurrence Data') and the module 'EcoClimate' (component 'Obtain Environmental Data'). Two other modules enable new customized inputs. In addition to the existing methods that draw shapes around the occurrence localities, background extents defined in the component 'Process Environmental Data' can now be drawn on the map with mouse clicks using the module 'Draw'. Moreover, in the component 'Model Transfer', predictions can now be made to user-specified environmental layers representing new areas or time periods with the module 'Transfer to User Environments'. Finally, two modules increase the capacity of wallace to provide metadata on modeling analyses. The module 'Metadata' under the 'Reproduce' component provides a fully documented RMMS metadata object containing all the details of the analysis; the object can be shared with collaborators or used as supporting information for publications. The module 'Reference Packages' provides a list of citations for the R packages (and their version numbers) used in the analysis, which promotes documentation and gives credit to the developers of packages that wallace uses. Being cited in papers and reports should increase the incentive for researchers to formalize their code into R packages on CRAN and join the wallace community to integrate them into future releases. New features for existing modules include the ability to: remove occurrence data that lack quantification of georeferencing uncertainty (helping users focus on the high-quality information; Anderson et al. 2020), map model predictions with a specified quantile of omission, specify categorical variables for modeling, use the new ENMeval 2 to tune models via parallel processing and report expanded performance metrics (Kass et al. 2021), visualize aggregate map tiles for downloading 30 arcsecond bioclimatic predictor rasters, and define custom extents for model transfer by drawing or uploading shapefiles.
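As an illustration of what mapping predictions "with a specified quantile of omission" can mean in practice, the sketch below derives a threshold from the suitability values predicted at occurrence localities and converts a continuous prediction grid into a binary map. It is a generic Python sketch under stated assumptions, not the code wallace runs: the function names are hypothetical, and the quantile is taken by simple rank without interpolation.

```python
import math


def omission_threshold(occ_suitability, omission_quantile):
    """Suitability value that leaves out the lowest share of occurrences.

    With omission_quantile = 0 this is the minimum predicted value at any
    occurrence; with 0.1 the lowest ~10% of occurrence values fall below it.
    """
    if not 0.0 <= omission_quantile < 1.0:
        raise ValueError("omission_quantile must be in [0, 1)")
    values = sorted(occ_suitability)
    n_omit = math.floor(omission_quantile * len(values))
    return values[n_omit]


def binarize(prediction_grid, threshold):
    """Mark cells at or above the threshold as 1 (suitable), others as 0."""
    return [[1 if v >= threshold else 0 for v in row] for row in prediction_grid]


# Usage: ten occurrence predictions, omitting the lowest 10%.
occ_values = [0.05, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
threshold = omission_threshold(occ_values, 0.1)
binary_map = binarize([[0.1, 0.2], [0.3, 0.0]], threshold)
```

A higher omission quantile yields a stricter threshold and a smaller predicted area, at the cost of excluding more known occurrences.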

Example
As an example analysis to highlight some of the advances of wallace 2, we focused on two closely related carnivoran mammal species of the genus Bassaricyon (Family Procyonidae). To do so, we employed bioclimatic variables from the ecoClimate database to compare the species' climatic niches and model their ranges, then transferred one of the models into the deep past and near future. Bassaricyon neblina, or the olinguito, is found in tropical montane areas of western Colombia and Ecuador. Bassaricyon alleni, or the eastern lowland olingo, has a broader range throughout northern South America. The workflow, methodological choices and images of the results are shown in Fig. S1. A more detailed worked analysis of key prior and new functionalities can be found in the current vignette (https://wallaceecomod.github.io/wallace/articles/tutorial-v2.html).

Transparent peer review
The peer review history for this article is available at https://publons.com/publon/10.1111/ecog.06547.

Figure 1. Highlights of wallace 2 interface. (A) Main panel showing active component 'Obtain Occurrence Data' with new species menu (dashed green line) and database DOI citation in log window (dashed red line). Also note new features (dashed yellow line): tabs for Reproduce component (offering reworked downloads of session code, metadata, and package references), support tab with help links, and 'shutdown' button. (B) New option to load prior sessions located in the 'Intro' component. (C) New 'Save' tab with expanded download options.

Figure 2. Schematic for updated wallace 2 workflow options. (A) All components or main steps of analysis (yellow for existing, orange for new). (B) R packages used for each component. (C) All modules, or major methodological options within components (blue for existing, purple for new). These new features enable environmental space comparisons for two species; data acquisition from climate-simulation, botanical and paleontological databases; custom data inputs; model metadata tracking; and citations for R packages used.