Data sharing in modeling and simulation of biomechanical systems in interdisciplinary environments

All digital objects that result from the modeling and simulation field are valid sets of research data. In general, research data are the result of intense intellectual activity and are worth communicating. This communication is an essential research practice that, whether with the aim of understanding, critiquing, or further developing results, naturally leads to collaboration, which involves not only discussions and the sharing of institutional resources but also the sharing of data and information at several stages of the research process. Data sharing is intended to improve and facilitate collaboration, but it quickly introduces challenges concerning reproducibility, reusability, interoperability, and standardization. These challenges are deeply rooted in an apparent reproducibility standard, about which there is a debate worth considering before describing how the modeling and simulation workflow commonly unfolds. Although that workflow is almost second nature to practitioners, sharing practices still require special attention because the principles that guide research practices towards data sharing (known as the FAIR principles) also define the requirements for machine-actionable results. The FAIR principles, however, do not address the actual implementation of the data sharing process. This implementation requires careful consideration of the characteristics of the sharing platforms in order to benefit the most from the data sharing activity. This article serves as an invitation to integrate data sharing practices into the established routines of researchers and elaborates on the perspectives and guidelines surrounding data sharing implementation.

Algorithms, computational models, simulations, and compilations are typical outcomes of modeling and simulation research; these sources are a key outcome of academic research and are therefore also included under the term research data [14].
The advancement of science and engineering depends strongly on the communication of new developments, techniques, and discoveries. Communication is intended to (i) facilitate the sharing of ideas, knowledge, and findings between researchers, (ii) promote the dissemination of research outcomes to wider audiences, (iii) encourage constructive debates, critical feedback, and the validation of findings, and (iv) enable collaboration.
i. The sharing of ideas and knowledge in science plays a key role in transparency, innovation, and progress. While sharing ideas increases the citation of publications and improves the visibility and recognition of researchers, through democratized knowledge the scientific community ensures that research findings are accessible to wider audiences. Sharing ideas and knowledge allows the exchange of expertise, helps to avoid duplication of effort, and ultimately enhances the quality of science.

ii. Reaching wide audiences is crucial in science as it expands the potential user base and broadens the impact of research. By disseminating scientific knowledge to a broader audience, including practitioners and the general public, scientists can attract attention and foster engagement with their work. Reaching wider audiences opens avenues for discussing the economic and ethical implications of scientific advancements, encouraging informed public discourse. By breaking down complex concepts and making science accessible to all, researchers can bridge the gap between the scientific community and society, cultivating a culture of appreciation and support for scientific endeavors.

iii. Debate, feedback, and validation are integral to the scientific process as they ensure the strength and credibility of scientific arguments. Engaging in debates and receiving feedback from peers helps researchers identify the strengths and weaknesses of their arguments, promotes critical thinking, and encourages researchers to consider alternative perspectives. The challenging viewpoints that result from debates support a rigorous validation process that ensures transparency and credibility. In this way, debates enable researchers to refine and strengthen their hypotheses and conclusions, enhancing the overall quality of scientific investigations.

iv. Collaboration is particularly important when addressing complex and interdisciplinary problems. By enabling collaboration, researchers gain access to a wealth of valuable resources, including data, methods, and expertise from colleagues in diverse fields. Collaboration allows scientists to pool their knowledge and skills, leveraging collective intelligence to tackle challenges that require multidimensional approaches. Through collaboration, researchers can benefit from different perspectives, innovative ideas, and complementary strengths, which ultimately leads to more robust and comprehensive solutions. By sharing resources and working together, scientists can accelerate the pace of discovery and enhance the reliability of results.
Collaboration among distributed experts is not only a common practice in engineering research but also a must in the engineering industry. For instance, the exchange of simulation data and finite element analysis data in the context of the aeronautics industry was investigated by Charles et al. [9]. With the aim of facilitating collaboration between different teams and providers, the authors proposed the development of a data standard as a common format for data representation and exchange, and the development of data translators to interface with specific software tools. The challenges in following this strategy include interoperability issues between different software tools, the complexity of finite element analysis data, the validation and verification of exchanged data, and the need for customization of the standard. Despite the promising potential of the strategy, the authors acknowledged that data validation, customization, and integration with other standards and tools remain challenging. This example illustrates the breadth of challenges encompassed by the term data sharing.
From an academic perspective, the challenges and practices of data sharing and reuse were explored by Wallis [39]. The author identifies two roles in data exchange: data producers and data reusers. Data producers are willing to share their data when they see a clear opportunity and a specific user who requests it, but they also face uncertainties and risks, such as losing credit or having their data misused for purposes different from the original ones. Data reusers, on the other hand, need to spend time and resources to evaluate and prepare the data for their own use. To overcome these difficulties, the author suggests that data producers should create opportunities themselves to involve data reusers in their projects, so that they can support the interpretation and reuse of their data. This strategy, called courting data reusers, can enhance the visibility, impact, and quality of the data, but it also requires careful consideration of the costs and expectations involved.
In recent years, the scientific community has recognized the importance of making data more reusable and accessible. With the aim of promoting reusability and testing reproducibility, Erdemir et al. [16] proposed initiatives that begin with the publication review process. Erdemir et al. presented two cases, one on full-body human gait and another on a computational representation of the knee joint, and asked the reviewers to reproduce the simulation results. Although the reviewers eventually reproduced the results with only minor discrepancies, they were not able to do so in the first evaluation. During the review process, the authors tested software versions, operating systems, and control algorithm parameters, and concluded that the quality of the original experimental data set was the cause of the difficulties in the reproducibility tests. Erdemir et al. acknowledge that the workload of reviewers and authors increased with the reproducibility evaluation request, but argue that adopting such requests in scientific review significantly increases the credibility of the modeling and simulation research field.
Human gait, knee joints, and all computational biomechanics models rely on a large amount of data, not only for the parameters of the model equations but also for the selection and preprocessing of geometries and other data sources. A good understanding of the components of a simulation system enables researchers to build increasingly detailed models. The components of a system, whether at the scale of cells or organs, interact under different conditions that must be defined partly in the model equations and partly during the preprocessing stage, which includes specifying boundaries of structures, closed volumes, meshes, particular regions for each type of tissue and the interactions between them, the assembly of the components, the definition of material properties, boundary conditions, and so forth [10]. Providing all intermediate derivative data, along with the source data and the final model, is crucial for facilitating collaboration and the advancement of research. By making the complete workflow and associated data available, future users can focus on their scientific or clinical goals instead of building the model from scratch. This approach supports the execution, reuse, and modification of previous results, enabling more efficient and impactful research endeavors [17]. An example of this approach can be found in the work of Chokhandre et al. [10].
Chokhandre et al. [10] presented a complete workflow and a database of digital assets generated during the development of a cohort of eight knee models for finite element analysis. The authors provided all products of the modeling workflow, such as tissue segmentation labels, surface geometries, surface and volume meshes, template and customized finite element models of the knee joints, and simulation results of passive flexion. The authors aim to overcome the challenges related to model development, customization, calibration, and benchmarking, which often require considerable time and resources. By making the data, models, guidelines, and tools publicly accessible, the study facilitates data sharing and reuse in musculoskeletal research and clinical decision making, and encourages clinical innovation. The results and data shared by the authors are part of the Open Knee(s) project, a growing database repository of specimen-specific finite element models of the human knee joint. The Open Knee(s) project is one example of a repository platform developed in the field of biomechanics: an open-source project that provides free access to data, models, software, and analysis and visualization tools for researchers and developers. Some of the most popular model repository platforms that share these characteristics are (i) the Physiome Model Repository, (ii) SimTK, and (iii) FEBio [17].
i. The Physiome Model Repository incorporates semantic annotation of computational models, allowing models to be accessible, comprehensible, reusable, and discoverable. This repository was designed to store and manage CellML models. CellML provides a standardized format for describing the structure, behavior, and interactions of biological entities, such as cells, subcellular components, and biochemical reactions. The open standard CellML is a markup language to encode mathematical models of biological processes that are based on systems of ordinary differential equations (ODEs) and differential algebraic equations (DAEs); a minimal illustration of such an ODE model appears after this list. Although CellML models ignore spatial gradients, spatial information can be handled by a complementary standard called field modeling language (FieldML), which allows wider applications such as constitutive material laws. CellML is designed to support the design and definition of biological processes, and to provide a repository of validated models [34,2].

ii. SimTK is a broad collaborative platform that includes various software projects (including the Simbody project, an open-source software package for multibody dynamics and biomechanics simulations). SimTK provides a platform for sharing computational models and tools across different disciplines, including biomechanics, molecular dynamics, systems biology, and medical image analysis [15].

iii. FEBio is a specific tool for 3D biomechanical applications. It is a software tool that offers modeling scenarios, constitutive models, and boundary conditions relevant to a variety of applications. FEBio focuses on solving problems that involve nonlinear large deformations, multiphasic materials, and multiphysics simulations; it provides a software suite for running, analyzing, and visualizing simulations, and it gives access to a model repository [28].
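As a language-neutral illustration of the kind of model the CellML ecosystem stores, the following sketch integrates a small, hypothetical two-variable ODE system with SciPy; the rate constants and initial conditions are arbitrary assumptions for illustration only. CellML itself would encode the same equations in XML, together with metadata about components, units, and connections.

```python
from scipy.integrate import solve_ivp

# A minimal, hypothetical two-variable ODE system (arbitrary units):
# the kind of equations a CellML document encodes, plus metadata.
def rhs(t, y, k1=0.5, k2=0.1):
    a, b = y
    da = -k1 * a + k2 * b   # species A: first-order decay, feedback from B
    db = k1 * a - k2 * b    # species B: produced from A, converts back
    return [da, db]

sol = solve_ivp(rhs, (0.0, 20.0), [1.0, 0.0], max_step=0.1)
print(sol.y[:, -1])          # state of (A, B) at t = 20
```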
We have discussed how communication and collaboration lead to technological challenges and solutions in the field of modeling and simulation. Repository platforms are proposed solutions that present models in a curated and well-administered manner, with the main purpose of facilitating reusability. However, the models available on these platforms need to be verified for proper functioning before testing customized scenarios. This critical testing process is comparable to the procedures followed by Erdemir et al. and Chokhandre et al. [16,10], and is related to the general debate on reproducibility.
In Section 2 we elaborate on the concepts that guide the discussion around reproducibility, and arrive at a suitable demarcation line between replicable and non-replicable results that allows us to highlight the need for modeling and simulation in biological systems. Section 3 provides a summary of the usual research practices in the modeling and simulation research field and recommends the additional practice of data sharing. In Section 4, we introduce the FAIR principles (findable, accessible, interoperable, reusable) for data sharing, mention common misinterpretations of them, and comment on the scope of the NFDI (National Research Data Infrastructure) initiative, which focuses on the implementation of the FAIR principles in data management practices in Germany. Section 5 outlines key characteristics of data repositories that enable researchers to share their data under the FAIR principles. Finally, in Section 6, we present discussion and concluding remarks.

PERSPECTIVES ON REPRODUCIBILITY
Along with validation, replication is a standard approach to building knowledge upon previous findings. Although replication is used across different fields like medicine, the sciences, engineering, social studies, and the humanities, not all scientific results have to be replicable: numerous research areas investigate unique events such as climate change, supernovas, volcanic eruptions, or past events [13]. For all phenomena that can be repeated at will, replication means repeating experiments and studies to test whether techniques and results are reliable. However, many findings in biomedical research are not reproducible or consistent [22,3], and this has led to a major debate about the overall trustworthiness of scientific findings and its impact on the scientific community. The debate, known as the "replicability crisis," refers to concerns raised about the quality of research after some studies could not be successfully reproduced.
In a survey of 1576 researchers, Baker [1] showed that reproducibility is a widespread concern in science. Of the respondents, more than 70% reported failing to reproduce the experiments of other scientists, and more than 50% reported failing to reproduce their own experiments, see Figure 1. In contrast, less than 20% of the respondents reported that they had ever been contacted by another researcher who could not reproduce their work. Approximately 38% of the researchers attempted to publish successful replications, failed replications, or both; 24% published successful replications and 13% published failed replications, see Figure 2. The survey findings also revealed a lack of consensus regarding the definition and criteria for reproducibility. Additionally, the study highlighted several obstacles and discouraging factors associated with publishing replication attempts. These challenges involve difficulties in contacting original authors, the lack of interest from journals, and concerns about potential damage to one's reputation or relationships within the scientific community. Nevertheless, the majority of researchers maintained confidence in the authenticity of the original findings and the credibility of the published literature.
Reproducibility in digital medicine was examined by Stupple et al. [37]. In this field, which employs algorithms and data analysis to generate new insights for clinical practice, reproducibility refers to the ability to produce consistent results through an independent study that follows the same procedures as the original. The authors argue that reproducibility is affected and overlooked due to flaws in the conventional peer review process: reviewers analyze a paper voluntarily, with little quality control, transparency, or incentives aligned toward thoroughness. In spite of those flaws, the importance of reproducibility in digital medicine is underscored by the growing access to diverse datasets and more powerful computing capabilities. The availability of extensive and varied data, combined with advanced computing tools, offers researchers unprecedented opportunities to verify and replicate findings across different contexts. These opportunities, however, encounter several challenges, including the vulnerability of clinical data, the requirement for robust research practices such as pre-publication and advance registration of analyses, and the promotion of open and accessible research. Although data and code sharing are already common practices in the field, reproducibility is currently not guaranteed. Whether reproducibility is a universal requirement or depends on the particular field is still under debate.
The debate on replicability and replication in science depends on how we define and use these terms. Replication is the attempt to reproduce a previous finding, while replicability is the property of a finding or result that allows it to be reproduced. Central to this discussion are "direct" and "conceptual" replications: the former repeats the original experimental protocol and materials to produce similar outcomes, while the latter tests the same effect using different experimental protocols or materials, focusing on generalization and robustness [21]. "Scoping reproducibility," a notion introduced by Leonelli [26], underscores the significance of re-running experiments to detect sources of outcome variation; this viewpoint proposes that the lack of variance identification is not necessarily indicative of a crisis. The true origin of the crisis narrative is the reported failures to reproduce existing results. Although the perceived crisis is amplified by publication biases, such as studies with small effects being less likely to be published, a positive narrative centered on challenges and transformation emerges from the debate [18]. The evolving perspectives indicate that when dealing with replicability and replication, it is vital to consider three key factors: research questions, experimental setups, and subjects of interest [21].

FIGURE 2 Responses to the question "Have you ever tried to publish a reproduction attempt?" (published successfully, failed to publish). In spite of the results in Figure 1, only a fraction of participants tried to publish replications. Source: Adapted from Baker [1].
Replicability is influenced by the wide-ranging research questions and experimental setups within different research disciplines. The humanities, which study meaning and style, accept multiple valid answers because they depend on historical and social contexts; for example, global migration is a complex topic that needs different perspectives to formulate viable solutions. This example shows that the research setup is important for evaluating replicability. Leonelli [26] observes that experimental setups are diverse, ranging from standardized experimentation through semi-standardized experimentation to participant observation, and identifies standardization and control as the main issue: computer science achieves high replicability because of its controlled setups, while participant observation has low replicability because of its uncontrollable variables. In contrast, fields like biology and chemistry have semi- or non-standardized setups, and they require the consideration of a third factor, namely, the nature of the subject of interest.
The concept of "localism" is related to the nature of the subject of interest. According to this concept, replicability is not a universal rule for all sciences, especially for the observational sciences and the humanities. Localism holds that failure rates depend on the specific fields or subfields. Under this perspective, research with less standardization is not less valuable, but more sensitive to the local context; therefore, defining the boundaries of different subfields becomes crucial, since high standardization allows direct replication, while the complexity of living things makes it impossible. The nature of the subject of interest, that is, the distinction between living and non-living entities, serves as a useful demarcation line.
Based on the concept of "uniformity of nature," direct replicability is identified as a "reasonable proposition" in certain contexts, or constrained to subjects classified as "interactive" or "indifferent" [35,11,32]. All of these denominations concern subjects that exhibit complexity, historicity, and dependence on the environment. Though direct replication is viable in physics or chemistry, its effectiveness declines in fields like social psychology, where historical and socio-economic factors can influence outcomes. Subjects with memory, like humans, can break the uniformity assumption and change the level of replicability. In response to the challenges posed by time and environmental influences, researchers working with animal models have incorporated systematic heterogenization into their experimental setups; this involves utilizing groups of animals with diverse characteristics. This departure from rigid standardization aims to improve replicability.
We discussed how several aspects, including review processes, experimental setups, and the classification of subjects, have a strong effect on replicability. In addition, considering that modern technological progress shows that context and historical factors are relevant even at the molecular level in the life sciences [20], deciding which research fields can and cannot achieve replicability requires a careful, case-by-case approach. In this context, the distinction between living and non-living subjects is only a general guideline for demarcation, and it suggests that to understand and predict the behavior of biological systems, it is not enough to rely on experimental observations alone, as they may be limited, incomplete, or inconsistent. Therefore, it is important and necessary to use mathematical models and simulations to complement and enhance the experimental data. The next section describes the common practices of the modeling and simulation research field.

RESEARCH PRACTICES
Modeling and simulation in biomechanics contribute to a deeper understanding of biological mechanics and drive advancements in various applications, from medical treatment to sports performance. Some of the key features that require careful consideration to make modeling worthwhile are: (i) complex research questions, which necessitate modeling due to multidisciplinary collaboration that aims to integrate expertise from several research fields, including medicine, engineering, and image processing; (ii) limited data sources, which modeling compensates for when experimental techniques are invasive, unethical, or prone to errors; (iii) tradeoffs in the outcomes, which balance objectives and constraints that result from different levels of detail, complexity, accuracy, and computational cost, and help evaluate strengths and limitations of the outcomes; and (iv) the potential impact on research and clinical practice, which reflects the relevance, novelty, and significance of the research question and its outcomes, by enhancing the understanding of biomechanical phenomena, such as injury, disease, or treatment, at different levels of organization, from molecules to organs. According to the Agency for Healthcare Research and Quality [12], there are several principles that researchers follow in conducting and reporting modeling and simulation studies (Figure 3); they are briefly described below. The principles help communicate and identify strengths and limitations of the model, as well as areas for improvement and future research.

1. Define the scope clearly. The modeler should specify the context, purpose, objectives, and target audience of the modeling and simulation study. The context should include relevant background information, such as the problem statement, the research question, the hypothesis, or the decision problem. Defining the scope clearly ensures that the modeling and simulation study is relevant, appropriate, transparent, and credible.

2. Define model structure and assumptions. The modeler should describe and explain the logic, components, relationships, and parameters of the model, as well as the rationale and evidence behind them. The model structure and assumptions should be consistent with the scope of the modeling and simulation study. The model structure and assumptions determine the behavior, performance, and validity of the model. They also influence the interpretation, generalization, and applicability of the model results.

3. Define the model components and the relationships between them. The modeler should identify and describe the main elements, variables, or factors that constitute the model, as well as the logic, rules, equations, or functions that describe how they interact with each other. The relationships between components should be supported by relevant data, evidence, or theory. In addition, the modeler should explain and justify why the chosen components and relationships are appropriate, valid, and sufficient for answering the research question or solving the decision problem. The decisions made in accordance with this principle significantly impact the model's behavior and performance. Classifying models as structural, micromechanical, or multiscale is not a common practice, but it facilitates the definition of the scope and of the relationships between model components, as well as of data sources and interpretations [7].

4. Define data sources and interpretations. The modeler should use relevant, reliable, and sufficient data to support the model development, analysis, and validation. The data sources should be described in terms of their origin, quality, availability, and limitations. The modeler should not only collect and use the data but also thoroughly interpret its implications within the context of the research question or problem being addressed.

5. Assess uncertainties in the inputs of the model. The modeler should acknowledge and account for the variability, imprecision, or incompleteness of the data used to build the model. The modeler should use appropriate methods and tools to quantify and propagate the uncertainty in the inputs. Uncertainty in the inputs affects the reliability, robustness, and accuracy of the model. Incorporating input uncertainty provides a more realistic representation of the inherent variability and unpredictability in biological responses.

6. Conduct sensitivity and stability analyses on the model (see the sketch below). The modeler should perform and document systematic and comprehensive analyses to test the robustness and reliability of the model results. Sensitivity analyses involve varying the model inputs, such as the data, parameters, or assumptions, within plausible ranges or scenarios, and observing the effects on the model outputs. Stability analysis, on the other hand, ensures that the model remains valid and consistent over time or under changing conditions. These analyses reinforce the credibility of the model and increase its capacity to support informed decision-making.

7. Assess the model for its alignment with the research question and scope. The modeler should evaluate and demonstrate how well the model answers the research question or solves the decision problem that motivated the modeling and simulation study. The assessment prevents the risk of producing outputs that are either overly broad or too narrow in their applicability, and ensures that the outcomes of the model are directly relevant to the intended research objectives.

FIGURE 3 Principles for good practice in modeling and simulation.
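To make principle 6 concrete, the following minimal sketch performs a one-at-a-time sensitivity analysis on a deliberately simple, hypothetical stiffness surrogate; the model formula, the parameter names E, nu, and t, and the plus/minus 10% ranges are illustrative assumptions, not a prescribed method.

```python
# One-at-a-time sensitivity analysis (principle 6): perturb each input
# within a plausible range and record the relative change of the output.

def stiffness(params):
    """Hypothetical stiffness surrogate; illustrative formula only."""
    E, nu, t = params["E"], params["nu"], params["t"]
    return E * t / (2.0 * (1.0 + nu))

baseline = {"E": 10.0, "nu": 0.45, "t": 3.0}   # assumed units: MPa, -, mm
y0 = stiffness(baseline)

for name in baseline:
    for factor in (0.9, 1.1):                  # plus/minus 10% scenarios
        perturbed = dict(baseline, **{name: baseline[name] * factor})
        y = stiffness(perturbed)
        print(f"{name} x {factor:.1f}: output change {(y - y0) / y0:+.2%}")
```

Documenting such a table of relative output changes alongside the model is one lightweight way to satisfy the reporting part of the principle.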
As researchers in the modeling field, we adopt a comprehensive perspective to accomplish our objectives in the most rigorous way. Most of the time, we follow the principles without needing to consciously focus on them. However, in some cases, collaboration with colleagues highlights deficiencies in one or more principles, prompting us to improve our work and to ensure that the principles are fulfilled. Thus, the principles for good practice in modeling and simulation integrate naturally into the daily routine of modelers. There is, however, an additional aspect that is usually overlooked: the need to incorporate practices that help fellow researchers build upon our work, especially those with different backgrounds or newcomers to the field. Data sharing is the additional aspect that needs to become an integral part of our working routines. The next section elaborates on the data sharing process, presents the FAIR guiding principles, and discusses misinterpretations and implementation of these principles.

DATA SHARING PRINCIPLES
As mentioned before, complex research questions usually necessitate a modeling approach, and the results, that is, algorithms, computational models, simulations, and compilations, are research data. To ensure transparency, reproducibility, and reusability, all components of the research process must be available; therefore, the communication of modeling results requires that data are findable, accessible, interoperable, and reusable, the four pillars of the FAIR principles. By following the FAIR principles, data producers and publishers can ensure that their data are properly collected, annotated, archived, and documented, and that the data can be easily discovered, accessed, integrated, and analyzed by others [40].
The increasing demand for enriched and updated information relies heavily on computational tools. The data retrieval procedure usually involves researchers searching for data online. Often, they need to be creative, formulating multiple ways to describe their search and potential sources. This indicates that computational availability is not yet optimized, and that common practices for submitting new data collections should prioritize machine-usability, allowing both computational resources and researchers to effortlessly benefit from standardized protocols for reporting, storing, and accessing data. Thus, the FAIR principles serve as guidelines for humans as well as for machines: any interpretation and implementation of the FAIR principles should produce machine-actionable results. While this observation gives freedom to all research fields, this freedom is producing incompatible solutions across research fields [23].
Data and metadata are the two key technical concepts that guide the interpretation of the FAIR principles. Data refers to any digital resource, including data itself, software tools, and code files, while metadata refers to any description that enables the findability and reusability of the specific data: the data is the object described by the metadata. Accordingly, the term "(meta)data" is used when the principles apply to both data and metadata. (Meta)data is intended to provide machines the capacity to (i) identify the type of data, (ii) determine whether it is useful within a given context, (iii) determine whether it is usable, in terms of license, consent, or other use constraints, and (iv) take appropriate action [40,23].
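As an illustration of capacities (i)-(iv), the sketch below builds a minimal machine-readable metadata record in Python; the field names loosely follow common schemas such as DataCite and Dublin Core, and every identifier and value is a hypothetical placeholder rather than a mandated FAIR vocabulary.

```python
import json

# A minimal, machine-readable metadata record for a simulation dataset.
# All values are hypothetical placeholders.
metadata = {
    "identifier": "https://doi.org/10.1234/example-knee-model",
    "resourceType": "dataset",                                 # (i) type of data
    "title": "Finite element model of the human knee (passive flexion)",
    "creators": ["Example, A.", "Example, B."],
    "subjects": ["biomechanics", "finite element analysis"],   # (ii) useful in context?
    "license": "CC-BY-4.0",                                    # (iii) usable under which terms?
    "formats": ["application/x-hdf5", "text/csv"],             # (iv) how to act on it
    "relatedIdentifiers": [
        {"relation": "IsDerivedFrom",
         "identifier": "https://doi.org/10.1234/source-mri"},
    ],
}

# Serialized as JSON, the record can be indexed and evaluated by a
# harvester without human intervention.
with open("dataset.metadata.json", "w") as fh:
    json.dump(metadata, fh, indent=2)
```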
The term FAIR was initially coined at a Lorentz workshop in Leiden, the Netherlands, in 2014. During this workshop, experts from different fields discussed the challenges and opportunities of data-intensive science. The initial draft of the FAIR principles emerged from this discussion and is accessible at [27]. Subsequently, the principles were refined and officially published in 2016 by Wilkinson et al. [40]; they are reproduced in Table 1. An extensive discussion of the interpretation and implementation of the FAIR principles was presented by Jacobsen et al. [23], misinterpretations were discussed by Mons et al. [29], and the most common misinterpretations are briefly described below.
• The FAIR guiding principles are not prescriptive like standards. While standards dictate specific approaches, the FAIR principles are guidelines that permit various methods to make data and services Findable, Accessible, Interoperable, and Reusable.
• In the context of data sharing, FAIR implementation may be confused with RDF, linked data, or the Semantic Web; FAIR is none of them, although all of them are related concepts that aim to make data on the web more understandable and interoperable for both humans and machines. RDF (Resource Description Framework) is a standard model for representing and exchanging data on the web [8]. Linked data is a set of best practices for publishing and connecting data on the web using RDF and other standards, which allows data from different sources to be linked and integrated [6,4]. Finally, the Semantic Web is a vision of a web enhanced by machine-readable semantics and logic. RDF, linked data, and other technologies like ontologies (controlled vocabularies that belong to specific domains and that, beyond being lists of agreed terms, also capture the relationships between these terms) and inference languages are all Semantic Web technologies that enable applications to reason about data and provide intelligent services, such as search, recommendation, or analysis [5]. The FAIR principles do not mandate any specific technology.
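To ground these related concepts, the sketch below expresses one dataset description as RDF triples using the Python library rdflib (assumed to be installed) and Dublin Core terms; the dataset URI and literal values are hypothetical, and, as noted above, FAIR does not require this particular technology.

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS

g = Graph()
dataset = URIRef("https://example.org/datasets/knee-model")  # hypothetical URI

# Each statement is a subject-predicate-object triple; Dublin Core terms
# supply the shared vocabulary that makes the description interoperable.
g.add((dataset, DCTERMS.title,
       Literal("Finite element model of the human knee")))
g.add((dataset, DCTERMS.creator, Literal("Example, A.")))
g.add((dataset, DCTERMS.license,
       URIRef("https://creativecommons.org/licenses/by/4.0/")))

print(g.serialize(format="turtle"))   # human- and machine-readable output
```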
• FAIR is not just about humans being able to find, access, reformat, and finally reuse data. The FAIR principles emphasize the vital role of computers in accessing data publications independently. The goal is to minimize the extensive time researchers spend on data discovery and preparation.
• FAIR emphasizes providing clear conditions for data access, reuse, and citation without mandating that data be open or free. FAIR promotes transparent yet controlled accessibility, ensuring data reusability.
While the FAIR principles provide a solid foundation for data sharing, it is important to note that they do not encompass the specific implementation aspects of the data sharing process. In Germany, the National Research Data Infrastructure (NFDI) was created to establish data management standards in alignment with the FAIR principles. The primary goal of NFDI is to ensure the long-term accessibility and availability of research data by creating and advancing services, organizations, and infrastructures that facilitate the effective handling of research data [30]. The NFDI is organized as a network of consortia, each of them focusing on a specific research field and developing its own objectives and work program, which may differ from those of other consortia depending on the level of development of the infrastructure in the consortium and its interconnections with other national and international infrastructures [31]. The research fields are grouped into four main areas: engineering sciences, humanities and social sciences, life sciences, and natural sciences. Of particular interest to the field of modeling and simulation is the engineering sciences area, which contains consortia for data science and artificial intelligence (NFDI4DataScience), interdisciplinary energy system research (NFDI4Energy), engineering sciences (NFDI4Ing), materials science and materials engineering (NFDI-MatWerk), and computer science (NFDIxCS); additionally, due to its cross-disciplinary nature, the mathematical research data initiative (MaRDI) is also a consortium of interest for the field. To avoid parallel developments, ensure interoperability among the projects, and provide common services and solutions for all consortia, NFDI also supports a cross-cutting project called Base4NFDI. The NFDI initiative is a general implementation of the FAIR principles that encompasses all aspects of data management; however, the specific implementation of data sharing involves a careful assessment of the characteristics and functionalities of the sharing platforms or data repositories. We describe those characteristics in the next section.

DATA REPOSITORY CHARACTERISTICS
All data repositories are FAIR to varying degrees, but not all of them provide the same features. Some repositories are offered only as a hosted service (like FigShare, Zenodo, or EUDAT), while others can also be installed and operated locally (like Dataverse and DSpace). Several aspects make an academic data sharing repository desirable. Besides fees, storage, and sustainability (see Table 2), the following features are essential when choosing a data sharing repository [33,25].
• Infrastructure: To support operations, the repository offers basic features such as open-source tools, local or remote storage options, installation, or service frameworks.
• Data licensing: To meet legal and ethical standards, the repository offers different license and authorization options to state the rights and obligations of each party. Licensing allows clear, compatible, and customizable terms for sharing data, ensuring proper usage, acknowledgement, and resolution of disputes.
• Version control: The repository allows users to access and compare various iterations of a dataset. Users can select and download specific versions of the dataset for their particular needs.
• Data security: By setting different levels of permissions and user roles, the repository allows the owners of datasets to control who can access and modify their datasets. The repository applies encryption, authentication, authorization, auditing, and anonymization techniques.
• Data curation: To ensure that the data is accurate, consistent, complete, understandable, and reusable by others, the repository provides services that improve the quality, usability, and preservation of the data.
• Content organization and control: The repository structures data in a linear or hierarchical manner, facilitates efficient access control through user profiles, and implements embargoes for regulated data releases.
• Data discovery: For users to find, explore and access the data they need, the repository offers tools such as search engines, filters, categories, recommendations, or visualizations, and helps to narrow down search criteria and refine results.
• Data citation: To attribute data sources, assess impact, and enhance data visibility, the repository offers tools like DOIs (Digital Object Identifiers), citation generators, exporters, and trackers (a minimal example follows this list).
• Interoperability: For data and metadata to be easily exchanged and used by different systems and applications, the repository adopts common standards and formats for representing data and metadata, as well as common vocabularies and ontologies for describing their meaning and context.
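As an example of machine-actionable data citation, the sketch below uses DOI content negotiation, a convention supported by DOI registrars, to request a formatted citation for a dataset instead of its landing page; the DOI shown is a hypothetical placeholder.

```python
import requests

# DOI content negotiation: asking doi.org for a formatted citation
# instead of the landing page. The DOI below is a placeholder.
resp = requests.get(
    "https://doi.org/10.5281/zenodo.0000000",
    headers={"Accept": "text/x-bibliography; style=apa"},
)
if resp.ok:
    print(resp.text)   # a ready-to-paste citation string for the dataset
```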
The popularity of a data repository varies depending on the research field, and researchers often have multiple options to choose from. This diversity of choices offers both advantages and disadvantages in relation to the FAIR principles. Due to the lack of standardization in repository features, desirable elements like customization can potentially lead to compatibility issues between repositories, undermining the core FAIR principles. Therefore, while standardization is not the primary aim of implementing the FAIR principles, careful consideration of repository features, in alignment with the trends of each specific field, significantly impacts the outcomes of the data sharing process.

TABLE 2 General characteristics of common data repositories [38,33].

DISCUSSION AND CONCLUSIONS
The perceptions and practices of data sharing among researchers indicate that while data sharing enhances advancement, transparency, and collaboration, it also poses challenges due to legal and ethical concerns, incentives, and technical barriers. Legal and ethical considerations, such as privacy and intellectual property, can create difficulties in the pursuit of data dissemination; incentives to encourage data sharing are difficult to establish, and their absence is often an obstacle; and technical barriers make the process intricate and demanding. Data sharing practices vary based on the data types of specific fields and resources, the commitment to incorporate those practices, and expectations that are influenced by community norms, policies, and individual motivations [24]. Data sharing practices were analyzed by Fecher et al. [19]. They systematically reviewed research papers and surveyed secondary data users, and proposed that, from the perspective of the original researchers, data sharing practices work under the influence of six factors: data donor, research organization, research community, norms, data infrastructure, and data recipients. Each factor may support data sharing for the reasons explained in this article; however, each factor may also hinder data sharing practices: data donors may face barriers such as lack of time, skills, resources, or incentives to share; research organizations may impose restrictions, regulations, or sanctions on data sharing; research communities may discourage data sharing by creating competition, distrust, or fear of misuse among researchers; norms may complicate data sharing by creating conflicts, uncertainties, or inconsistencies among different norms or jurisdictions; data infrastructure may limit data sharing by creating technical difficulties, costs, or incompatibilities among different infrastructures; and finally, data recipients may affect data sharing by misusing, misinterpreting, or ignoring the shared data. The relation between the six factors clarifies the motivations driving researchers to share their data.
Researchers may share their data in a one-to-one collaboration scenario; however, this approach limits the exchange of valuable insights and hinders the collective progress of scientific knowledge. Data generated in one research project typically remains confined to that specific endeavor, leading to missed opportunities for synergy and comprehensive understanding. Although this way of collaborating is productive, researchers who rely on it alone miss the benefits of the wealth of information others may possess. To increase the chances for collaboration, expanding visibility and findability is paramount. For this reason, effective data sharing practices must be machine-actionable. This means that data, along with the associated metadata, should be structured and formatted in a way that allows seamless integration with artificial intelligence (AI) tools. These tools can identify patterns, correlations, and insights across vast datasets that would be challenging to uncover manually; they also help achieve an integral understanding of complex phenomena. The benefits of sharing our data are achieved by following a few simple steps (a sketch of these steps against a repository API follows the list):

1. Data preparation: Organize, clean, and anonymize data, eliminating sensitive information. Document the data with comprehensive metadata, including title, authors, collection dates, methodology, variables, units, and other pertinent details.
2. Selecting an appropriate data-sharing platform: Follow the recommendations of Section 5, considering community norms.
3. Establishing access and sharing preferences: Specify access control levels (open or restricted) and provide a data license for potential reuse.
4. Uploading and describing the data: Create folders, directories, or datasets to group related files or data. Use standardized data formats whenever possible.
5. Publishing and citing the data: Share the data by DOI or URL. Increase visibility and discoverability by promoting your data through publications, presentations, or announcements. Make use of cross-referencing.
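As a concrete illustration of steps 1-5, the following sketch drafts, describes, and publishes a dataset on Zenodo through its REST API using the requests library; the token, file name, and metadata values are hypothetical placeholders, and the endpoints should be verified against Zenodo's current API documentation before use.

```python
import requests

# Hypothetical credentials and file; endpoints follow Zenodo's documented
# REST API, but verify them against the current documentation.
TOKEN = "YOUR-ZENODO-TOKEN"
BASE = "https://zenodo.org/api"

# Steps 1-2 (prepare data, choose platform) happen before this script.
# Step 4: create a draft record (deposition) and upload the data file.
r = requests.post(f"{BASE}/deposit/depositions",
                  params={"access_token": TOKEN}, json={})
r.raise_for_status()
dep = r.json()

bucket = dep["links"]["bucket"]
with open("knee_model_v1.zip", "rb") as fh:        # hypothetical file
    requests.put(f"{bucket}/knee_model_v1.zip", data=fh,
                 params={"access_token": TOKEN}).raise_for_status()

# Step 3 and step 4 (describe): access level, license, and metadata.
meta = {"metadata": {
    "title": "Finite element model of the human knee (passive flexion)",
    "upload_type": "dataset",
    "description": "Meshes, material parameters, and simulation results.",
    "creators": [{"name": "Example, A.", "affiliation": "Example University"}],
    "access_right": "open",
    "license": "cc-by-4.0",
}}
requests.put(f"{BASE}/deposit/depositions/{dep['id']}",
             params={"access_token": TOKEN}, json=meta).raise_for_status()

# Step 5: publish; the repository mints a DOI that can be cited and
# cross-referenced in publications and presentations.
requests.post(f"{BASE}/deposit/depositions/{dep['id']}/actions/publish",
              params={"access_token": TOKEN}).raise_for_status()
```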
The FAIR Principles provide a framework for guiding data sharing practices and implementation [40]. These principles allow researchers to make implementation choices based on their specific requirements while ensuring a high degree of automation in data management. However, the freedom to operate under the FAIR Principles has resulted in the development of various technical solutions, which can sometimes lead to compatibility issues between different research fields. Despite initiatives such as the NFDI, the European Strategy Forum on Research Infrastructures, and the Research Data Alliance driving the adoption of FAIR practices, coordinating a widely accepted FAIR implementation approach remains a global challenge [36,39].
FIGURE 1 Responses to the question "Have you failed to reproduce an experiment?" (someone else's, my own). A large number of participants from all fields of research have failed to reproduce experiments. Source: Adapted from Baker [1].


TABLE 1 The refined FAIR principles [40].

Findable
F1. (Meta)data are assigned a globally unique and persistent identifier
F2. Data are described with rich metadata (defined by R1 below)
F3. Metadata clearly and explicitly include the identifier of the data they describe
F4. (Meta)data are registered or indexed in a searchable resource

Accessible
A1. (Meta)data are retrievable by their identifier using a standardized communications protocol
A1.1. The protocol is open, free, and universally implementable
A1.2. The protocol allows for an authentication and authorization procedure, where necessary
A2. Metadata are accessible, even when the data are no longer available

Interoperable
I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
I2. (Meta)data use vocabularies that follow FAIR principles
I3. (Meta)data include qualified references to other (meta)data

Reusable
R1. (Meta)data are richly described with a plurality of accurate and relevant attributes
R1.1. (Meta)data are released with a clear and accessible data usage license
R1.2. (Meta)data are associated with detailed provenance
R1.3. (Meta)data meet domain-relevant community standards