Atmospheric Sciences Perspectives on Integrated, Coordinated, Open, Networked (ICON) Science

This collaborative article discusses the opportunities and challenges of adopting integrated, coordinated, open, and networked (ICON) principles in atmospheric sciences. Because the atmosphere is global in nature, atmospheric science has always needed to be an ICON science. With the help of evolving technology, it is possible to go further in implementing and spreading the ICON principles for productive global collaboration. In particular, technology transfer and applications could be approached with reproducibility in mind, and data-sharing infrastructure could enable easier and better international collaboration. There are, however, various challenges in following the ICON principles in the acquisition, quality control, and maintenance of data, and in the systematic publication of results. Moreover, the extent of these issues varies geographically, so different regions face different challenges in implementing ICON principles. In this commentary article, we briefly state our perspectives on the state of ICON, the challenges we have met, and future opportunities. Furthermore, we describe how atmospheric science researchers have benefited from collaborative multi-dimensional approaches that fulfill the core goal of ICON.


Current State of ICON in Atmospheric Science
The implementation of ICON principles is reflected across atmospheric science research, including in-situ and remote sensing observations, laboratory experiments, real-time data availability, and numerical modeling. Field campaign results, in particular, cannot be appreciated in isolation. For example, the Boreal Ecosystem-Atmosphere Study (BOREAS) (Sellers et al., 1995) aimed to better constrain the role of the boreal forest in the global carbon balance. This required a coordinated and integrated approach that brought together atmospheric chemists, turbulence experts, ecologists, soil scientists, microbiologists, and satellite experts. Data were openly shared during and after the field campaign to maximize the program's benefit. The success of the BOREAS program resulted from adherence to the ICON philosophy, which remains apparent in today's continental-scale, long-term carbon and water flux measurement networks (e.g., AmeriFlux, AsiaFlux, EuroFlux) (Novick et al., 2018). Networking was involved in data collection and analysis, and in validating and improving the numerical models presently used to understand land-atmosphere interactions.
Satellite observation and reanalysis data are increasingly assimilated in numerical models. Remote sensing protocols are well coordinated with United States Geological Survey (USGS) and Local Climate Zones (LCZ) schemes (Stewart & Oke, 2012). Many data platforms of the National Aeronautics and Space Administration (NASA) and the USGS (e.g., EarthExplorer, GloVis) provide open data. While it is possible to analyze some data on web servers (e.g., the National Oceanic and Atmospheric Administration (NOAA) Air Resources Laboratory (ARL) and NASA Giovanni), the cost of obtaining licenses for standard software remains a barrier, especially in developing countries.
Finally, as natural hazards increase, the application of real-time data has recently spread from local to regional scales. Many datasets are open, and the acquisition, transfer, and assimilation of real-time data are fairly well coordinated, open, and networked (e.g., following the World Meteorological Organization (WMO) guidelines for climate forecasts). Real-time data are contributed through networking among different agencies (space, meteorological, oceanographic, etc.) of various countries around the globe, making them accessible to a broader scientific community.

Challenges to Implementing ICON and Potential Solutions
Field measurement programs such as the Tropospheric Ozone Assessment Report (TOAR), the International Global Atmospheric Chemistry (IGAC) project, and the Global Atmosphere Watch (GAW) have a history of an integrated, coordinated, and networked approach. Ground-based observational data are collected through networking among scientists, which helps to identify gaps and challenges in various aspects of atmospheric studies related to source apportionment, impacts, and possible solutions.
There are challenges in data collection, both in field campaigns and in the laboratory (e.g., the high cost of instrumentation and its maintenance, and the lack of sufficient funding and trained personnel). Timely data release is a further challenge, because it depends on the required quality control and often on the publication of initial results before the data are released. These issues affect countries differently and hinder the formation of an integrated atmospheric field data network that follows ICON principles. Uncertainties also exist in many components of climate models, where a lack of data and understanding can alter simulated processes (George et al., 2015). Moreover, regional differences in meteorological conditions and climate variability make it difficult to design models that can be applied effectively across the globe. Overcoming these challenges requires a networked approach among atmospheric scientists, computational modelers, and engineers, in coordination with funding agencies. In addition, the demand from funding agencies for solution-driven research has sometimes diverted the focus away from basic process-based science, weakening the application of ICON principles in research.
Similarly, there are challenges in integrating updated remote sensing data into global and regional climate models. While focusing heavily on model physics and parameterization schemes, most models use a single or outdated land-use representation, resulting in simulation biases, especially over cities and deforested regions. This remains a challenging task, and a coordinated, open, and networked approach is needed to address these biases. Sharing databases, codes, and algorithms should be encouraged under the Findable, Accessible, Interoperable, and Reusable (FAIR) principles, which are part of the Open component of ICON. Finally, real-time data availability is still challenging because of inadequate infrastructure for observations, data storage, and transfer, particularly in developing countries. Consequently, open exchanges of data, code, and software are not always maintained, greatly challenging ICON principles. Real-time data need to be better coordinated by improving and systematizing such processes (Vannitsem et al., 2021). Because real-time data have only recently become available, the resulting climate time series are short, which challenges the use of machine and deep learning applications for climate forecasting (Jones, 2017), hazard forecasting, and assimilation of global real-time data (Kadow et al., 2020).

Opportunities for Scientific Progress by Implementing ICON Approaches
Field campaigns could benefit from adopting ICON approaches, beginning with the planning stages. Although it is challenging to reach across traditional disciplinary boundaries, adopting ICON principles would maximize the benefits of large and costly field campaigns. Summary publications in the "Notes/Correspondence" sections currently offered by several journals could be systematically set up at the planning stages as a venue to help integrate and coordinate such efforts. Publications should reach stakeholders, underrepresented groups, traditional landowners, and the general public. Protocols for minimal instrumentation requirements (e.g., placement, accuracy, resolution, and calibration) and measurements (e.g., sampling frequency, time stamps, and statistics) should be discussed and agreed upon before the start of the campaign. All participants should agree upon clearly articulated policies and deadlines for data sharing to ensure seamless and transparent data transfer. Adherence to ICON approaches should be checked regularly throughout such campaigns.
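As an illustration of how an agreed measurement protocol could be checked in practice, the following minimal Python sketch tests whether a series of time stamps is in UTC and follows a fixed sampling interval. The 30-minute interval, the UTC requirement, and the function name `check_time_stamps` are hypothetical choices made for this example; they are not taken from any campaign protocol cited here.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical campaign protocol: one record every 30 minutes, stamped in UTC.
PROTOCOL_INTERVAL = timedelta(minutes=30)


def check_time_stamps(stamps, interval=PROTOCOL_INTERVAL):
    """Return a list of protocol violations (an empty list means none found)."""
    problems = []
    for i, stamp in enumerate(stamps):
        # Naive time stamps (no offset) and non-UTC offsets are both rejected,
        # because they are ambiguous when data from several sites are merged.
        if stamp.utcoffset() != timedelta(0):
            problems.append(f"record {i}: time stamp is not in UTC")
    if problems:
        # Spacing cannot be compared reliably until the time zone issue is fixed.
        return problems
    for i in range(1, len(stamps)):
        step = stamps[i] - stamps[i - 1]
        if step != interval:
            problems.append(
                f"records {i - 1} to {i}: step is {step}, expected {interval}"
            )
    return problems


if __name__ == "__main__":
    start = datetime(2021, 7, 1, tzinfo=timezone.utc)
    series = [start + k * PROTOCOL_INTERVAL for k in range(4)]
    del series[2]  # simulate one missing record
    for problem in check_time_stamps(series):
        print(problem)
```

Running a check of this kind when data are submitted, rather than after the campaign, would let participants find clock or logger problems while they can still be corrected.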
Atmospheric science researchers have benefited from collaborative multi-dimensional approaches, a core goal of ICON. For example, laboratory projects can lead to extensive field research, promoting scientific ideas and experimental methodologies. Studies of the changing atmosphere demand an integrative and collaborative method. For instance, assessing the impact of air pollutants on biota requires collaboration between atmospheric chemists and toxicologists (Saxena & Sonwani, 2020).
Opportunities exist in creating processed, cross-validated, and open-access remote sensing data platforms. Cloud computing and web server Geographic Information System (GIS) algorithms can process satellite data directly under standard, coordinated, and consistent protocols, aligning with ICON principles and sparing researchers from downloading and storing huge raw data files. This is especially helpful for computationally intensive regional and global studies, and it reduces an individual's computing demands and costs. Open access to global satellite data centers has made it possible to coordinate with other disciplines in atmospheric science, because satellite data cover wide areas and can be used to search for promising study sites for field data collection. Cloud-based platforms for online visualization, analysis, and processing of large amounts of global-scale data benefit traditional remote sensing experts as well as a broader audience (Gorelick et al., 2017).
Implementing ICON approaches in real-time data collection and applications may provide an unprecedented opportunity in extreme weather and hazard forecasting. However, successful implementation depends critically on the amount of available data, which are often sparse. Recent coordinated and integrated efforts have focused on the development of low-cost sensors (e.g., 3-D printed weather stations), crowd-sourced observing locations (e.g., the Citizen Weather Observer Program, or CWOP), and volunteer-supported intensive data gathering (e.g., the NOAA heat watch campaign). These efforts have used ICON principles to fill many such data gaps for real-time applications.

Way Forward/Recommendations
Early planning with clearly articulated ICON strategic goals and consensus among participating parties is recommended. Dedicated efforts to engage underrepresented groups, traditional landowners, and stakeholders, and to train early-career scientists, are also recommended. Implementing ICON in experimental research requires a multi-disciplinary approach involving atmospheric physicists, meteorologists, biologists, computer programmers, data analysts, and chemists. Interdisciplinary and collaborative research would benefit the scientific community, policymakers, stakeholders, and the general public. Collaborative networks among universities, non-governmental and governmental agencies, and multiple stakeholders are also recommended.
Remote sensing can benefit greatly from open-access data and programs. Open-access tools for visualizing, processing, and analyzing large remote sensing datasets are recommended. Coupled remote sensing-atmospheric model forecasts and warning systems are recommended to assist city planners, disaster management agencies, and local law enforcement. Overall, these ICON-based suggestions and recommendations can help achieve research objectives and their broader impacts more efficiently.

Current State
An essential component of global collaboration is the exchange of atmospheric observational data. The WMO Integrated Global Observing System (WIGOS) provides guidelines (WMO, 2018) and regulations for the international dissemination of meteorological and earth observations by national meteorological and hydrological centers and partners. In practice, however, several networks and data platforms provide scientific data outside the WIGOS framework. For example, the FLUXNET community (Baldocchi et al., 2001; Pastorello et al., 2020) has some of the characteristics of ICON. More specifically, FLUXNET integrates atmospheric, physical, and chemical processes related to the exchange of mass, momentum, and energy in the atmospheric boundary layer; the data produced are coordinated, with fixed variable names and units across the sites; a vast amount of data is shared openly following specified data-use policies (including citation of the data source); and data sharing and processing are networked, including limited support from the community in solving practical issues (e.g., setting up new sites or upgrading instrumentation). One element of this success is the "critical mass" that the FLUXNET community has achieved. Researchers who operate an eddy-covariance site without supplying their data will be less visible in the atmospheric community, and researchers who require data are best served by the FLUXNET community. These strong incentives for data providers and users to benefit from the FLUXNET infrastructure reinforce it as an example of the successful application of ICON principles. At larger spatial scales, remote sensing data from satellites provide valuable information from the surface up to the upper atmosphere.
Data from many missions, such as TIMED (Christensen et al., 2003; Niciejewski et al., 2006) (data website: http://www.timed.jhuapl.edu/WWW/scripts/mdc_rules.pl), COSMIC (Feltz et al., 2017) (data website: https://cdaac-www.cosmic.ucar.edu/), and CHAMP (Park et al., 2020; Reigber et al., 2002) (website: https://isdc.gfz-potsdam.de/champ-isdc/), are openly available to scientists all over the world for research purposes. However, more effort is needed to fully integrate and coordinate these data sources while allowing greater participation in mission development, for the benefit of the whole network of scientists.
Besides observations, models are an essential tool for studying atmospheric science, from short-term weather predictions to seasonal and climate projections. The Weather Research and Forecasting (WRF) atmospheric model (Skamarock et al., 2019) is a success story of collaborative model development; it is used extensively across academia, industry, and some operational meteorological centers in lower-income countries. Likewise, following the ICON principles, the Subseasonal to Seasonal (S2S) international project (Vitart et al., 2017) integrates knowledge from several disciplines, coordinates efforts across weather centers, makes data openly accessible to everyone, and builds networks of scientists who contribute to outlining future research directions.

Challenges
Software and data represent concrete artifacts around which global collaboration can revolve, but challenges related to their use in atmospheric sciences remain. As numerical models and remote sensing observations have large data storage and computational requirements, it is often necessary to use High-Performance Computing (HPC) infrastructure to perform the data processing and atmospheric modeling. HPC and the heavy reliance on legacy and non-portable Fortran code bring extra challenges for collaboration, as it is often hard to gain access to HPC outside large government agencies and academic institutions. In addition, in many cases it is difficult to transfer data and run code outside the HPC environment. Working on local machines limits data download, storage, and processing, but it is often the only option available to many scientists. Moreover, the lack of universal standards for data formats, naming conventions, and units hinders interoperability between different data sources. A cultural shift away from a closed in-house HPC environment is necessary to foster global collaboration.
Even when this cultural shift is adopted by both organizations and individual scientists, practical technical challenges remain. Individual scientists might not have access to computing infrastructure that can store large data sets, the bandwidth to download all the data, or the computing power to run complex atmospheric models and data analytics pipelines. Many datasets have in-house data formats, which are often poorly documented. Even open data portals follow FAIR principles inconsistently, with some releasing maps as non-georeferenced pictures and tables in documents that are not machine-readable. Another challenge is the steep learning curve needed to adopt best practices concerning data formats, data access, and open-source code. For example, commercial cloud services can help solve the reproducibility problem in computational science, but the complex cost structure and tedious setup of cloud environments are a barrier to adoption by individual scientists. Permissive open-source licenses such as MIT and BSD are important for fostering code reusability and global collaboration, including with commercial weather companies, but licensing can be confusing for individual scientific software developers.
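The interoperability problem described above, in which providers use different variable names and units for the same quantity, can be made concrete with a short Python sketch. The provider labels (`agency_a`, `agency_b`), the in-house variable names, and the common names are all hypothetical and serve only to illustrate the idea of a documented mapping to a shared convention.

```python
# Hypothetical mapping from each provider's in-house names to a shared convention.
# Each entry maps: in-house name -> (common name, function converting to the common unit).
CONVERSIONS = {
    "agency_a": {
        "T_air_degC": ("air_temperature_K", lambda v: v + 273.15),  # degC -> K
        "P_hPa": ("air_pressure_Pa", lambda v: v * 100.0),          # hPa -> Pa
    },
    "agency_b": {
        "temp_K": ("air_temperature_K", lambda v: v),               # already in K
        "pres_kPa": ("air_pressure_Pa", lambda v: v * 1000.0),      # kPa -> Pa
    },
}


def harmonize(record, source):
    """Rename and convert one record (a dict of name -> value) to the common convention."""
    table = CONVERSIONS[source]
    harmonized = {}
    for name, value in record.items():
        if name not in table:
            # An undocumented variable cannot be converted safely, so fail loudly.
            raise KeyError(f"no documented mapping for {name!r} from {source!r}")
        common_name, convert = table[name]
        harmonized[common_name] = convert(value)
    return harmonized


if __name__ == "__main__":
    print(harmonize({"T_air_degC": 20.0, "P_hPa": 1013.25}, "agency_a"))
    print(harmonize({"temp_K": 293.15, "pres_kPa": 101.325}, "agency_b"))
```

A mapping table of this kind has to be written and maintained for every pair of provider and convention, which is the overhead that a universally adopted standard would remove.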
In addition to technical challenges, global collaboration can be limited in developing countries for several reasons. For example, early-career scientists from developing countries struggle to obtain funding and conduct research because funding is preferentially available to senior scientists. Language is also an important barrier, because non-native English speakers have the extra challenge of communicating science and writing papers in another language. Researchers from developing regions, for example, Latin America or India, also face obstacles to conducting research because they often cannot afford costly subscriptions to international geoscience journals, travel to international conferences and meetings, or even elementary resources (e.g., data storage systems and office and laboratory supplies). Achieving truly global collaboration requires breaking both technical and non-technical barriers in atmospheric research.

Opportunities
In terms of computing, the emergence of cloud computing platforms with user-friendly interfaces (Gentemann et al., 2021) can overcome limitations in the data storage, preprocessing, and postprocessing steps of working with atmospheric models. Some international projects, such as the Coupled Model Intercomparison Project Phase 6 (CMIP6) (Eyring et al., 2016) and the S2S project (Vitart et al., 2017), have begun sharing their data on Google and Amazon cloud systems. International platforms such as GitHub and GitLab also benefit the scientific community because codes, models, and methods can be shared under an open license, following ICON principles. For example, atmospheric models developed for studying trace gases related to surface exchange were made available through software repositories (e.g., Koren et al., 2019; Vilà-Guerau de Arellano et al., 2019). The NOAA Whole Atmosphere Model (WAM) Ionosphere Plasmasphere Electrodynamics (IPE) model, which describes the possible coupling between the upper and lower atmosphere, is also hosted on GitHub (Akmaev, 2011; Sun et al., 2015). Another opportunity to build an ICON basis for global collaboration is to fully implement, expand, and facilitate the use of "The Climate and Forecast metadata standard" (Eaton et al., 2020) to maximize interoperability between datasets. The use of a self-describing universal standard for atmospheric sciences would unify the data format and help scientists share atmospheric data and develop seamlessly interoperable software tools. The Copernicus Climate Data Store (Thépaut et al., 2018) is an excellent example of open data infrastructure following the FAIR principles. Datasets include model reanalyses, such as ERA5 (Hersbach et al., 2020), seasonal forecasts, climate projections, satellite-derived products, and historical in situ observations, which can be explored from a web interface or downloaded with a Python API.
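To give a sense of what self-describing data means in the Climate and Forecast metadata standard, the minimal Python sketch below checks that each variable carries a `units` attribute and at least one of `standard_name` or `long_name`. This is a deliberately simplified check under stated assumptions, not a full compliance checker: the variables are represented as plain dictionaries rather than read from a file, and the function name `missing_metadata` and the example variables are hypothetical.

```python
def missing_metadata(variables):
    """Map each variable name to the list of descriptive attributes it lacks.

    `variables` is a dict of variable name -> dict of attributes, standing in
    for the attributes that would normally be read from a data file.
    """
    report = {}
    for name, attrs in variables.items():
        missing = []
        # Units are needed to interpret any dimensional quantity.
        if not attrs.get("units"):
            missing.append("units")
        # A standard_name or long_name tells the user what the variable represents.
        if not (attrs.get("standard_name") or attrs.get("long_name")):
            missing.append("standard_name or long_name")
        if missing:
            report[name] = missing
    return report


if __name__ == "__main__":
    # Hypothetical dataset: one well-described variable, two incomplete ones.
    dataset = {
        "tas": {"standard_name": "air_temperature", "units": "K"},
        "LE": {"long_name": "latent heat flux"},
        "x": {},
    }
    for name, missing in missing_metadata(dataset).items():
        print(f"{name}: missing {', '.join(missing)}")
```

In this sketch the metadata travel with the data, so a check like this can run automatically before a dataset is shared.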
Open and free programming languages, such as Python (see Table 1), and the associated software tools for managing packages, sharing code, and running fully reproducible code and computing environments on the cloud enable more collaborative development of atmospheric data analysis tools.
It would be ideal to create incentives or mechanisms that allow early-career scientists to participate in diverse international workshops and openly collaborate with scientists from different countries, and to establish international funding for the different branches of atmospheric science that is not restricted by nationality. Additionally, the COVID-19 pandemic has given new insight into global collaboration. Important international conferences (e.g., the EGU and AGU annual meetings) have been held online, and the format is planned to continue as hybrid (in-person and virtual) for the near future. These changes let scientists from developing countries attend these conferences more easily and at a lower cost. However, some barriers still exist, such as different time zones, limited internet access, and the constraints that online events place on face-to-face human interaction.

Call to Action
Finally, we call on everyone involved in atmospheric sciences to make further progress toward global collaboration following the ICON principles:
1. For scientists: Put more effort into sharing your data and code in an open and permissive way, aiming for full reproducibility of published results. Invest time in coordinating your datasets by following conventions and by including the necessary documentation.
2. For organizations: Coordinate science by promoting open and FAIR data and by providing easy-to-use infrastructure (such as cloud computing) to host large datasets and enable data processing pipelines.
3. For funders: Encourage globally networked and multi-disciplinary collaboration in project calls, for example, by requiring international partners and integrated research in funding applications.
4. For societies and publishers: Keep meetings open and accessible in the post-COVID period (e.g., by offering virtual participation), make journals open-access and available to lower-income or middle-income countries, and provide a flexible cost for publications from scientists in developing countries. Engage local communities and stakeholders as much as possible.