In‐depth analysis of patterns in selection of different physiologically based pharmacokinetic modeling tools: Part I – Applications and rationale behind the use of open source‐code software

PBPK applications published in the literature support a greater adoption of non‐open source‐code (NOSC) software as opposed to open source‐code (OSC) alternatives. However, a significant number of PBPK modelers are still using OSC software; understanding the rationale for the use of this modality is important and may help those embarking on PBPK modeling. No previous analysis of PBPK modeling trends has included the rationale of the modeler. An in‐depth analysis of PBPK applications of OSC software is warranted to determine the true impact of OSC software on the rise of PBPK. Publications focusing on PBPK modeling applications, which used OSC software, were identified by systematically searching the scientific literature for original articles. A total of 171 articles were extracted from the narrowed subset. The rise in the use of OSC software for PBPK applications was greater than that of the general discipline of pharmacokinetics (9‐fold vs. 4‐fold), but less than the overall growth of the PBPK area (9‐fold vs. 43‐fold). Our report demonstrates conclusively that the surge in PBPK usage is primarily attributable to the availability and implementations of NOSC software. Modelers preferred not to share the reasons for their selection of certain modeling software and no ‘explicit’ rationale was given to support the use of OSC software in the articles analysed by this study. As the preference for NOSC versus OSC software tools in the PBPK area continues to be divided, initiatives to add the rationale for using one form over another to every future PBPK modeling report would be a welcome and informative addition.


| INTRODUCTION
In the past two decades, physiologically based pharmacokinetic (PBPK) modeling has become the fastest growing subfield in pharmacokinetics (El-Khateeb et al., 2021). Industry and regulators alike have acknowledged the importance of model-informed drug development (MIDD), which has been hailed as an indispensable tool for personalised medicine (Hartmanshenn et al., 2016; Jamei, 2016; Marsousi et al., 2017; Sager et al., 2015). A recent invited review in this journal (El-Khateeb et al., 2021) estimated the growth rate for PBPK modeling publications in the past 20 years to be greater than 40-fold.
This comes as no surprise, given that the mechanistic nature of PBPK models lends itself to a multitude of applications which enable them to act as surrogates for clinical trials and to provide informative insights into novel drug disposition (Zhuang & Lu, 2016). The virtues of PBPK modeling have been reported previously (Rostami-Hodjegan, 2012; Wang & Ouyang, 2022), highlighting how the incorporation of known variations in drug metabolism and propagation of interindividual variability to drug pharmacodynamics within the context of quantitative systems pharmacology is achievable. In addition to overcoming several clinical challenges, PBPK modeling can help with regulatory submission decision-making (Grimstein et al., 2019; Zhang et al., 2020). PBPK modeling has evolved from simple rate equations and mathematical descriptions to standardised whole-body PBPK simulations, which distinguishes it from traditional pharmacokinetic modeling (Jamei, 2016). At present, PBPK modeling is typically carried out using specialised computer software which addresses the challenges mentioned previously (Kuepfer et al., 2016). The model equations are often computer-coded and solved using a variety of different algorithmic frameworks and computer languages that are available. At present, two distinct modalities of software exist: open source-code (OSC) and non-open source-code (NOSC). Both approaches offer versatility to the user and can answer questions from multiple disciplines (Clewell III et al., 2008).
Initially, due to a previous lack of specialised PBPK modeling software, PBPK models were developed inadvertently using OSC software which, in essence, were general-purpose programming tools originally utilised for engineering and mathematical systems (Andersen et al., 2005). The would-be modeler is required to depend upon specific expertise and inter-disciplinary programming skills to write custom mechanistic models from scratch via computer code (Rimvall & Cellier, 1985). These models require a substantial amount of time and effort to build from the ground up, and in the past have had standardisation issues with regard to regulatory submissions (Chiu et al., 2007; Paini et al., 2017; Tan et al., 2018). However, an immense degree of flexibility and transparency is provided to the user. Over the years, OSC software has evolved to include PBPK-specific modules and equation libraries, paired with a graphical user interface (GUI) enabling the instant generation of model frameworks that already contain the standard code, allowing the user to expand and customise the frameworks further (Jamei, 2016).
The turn of the century saw the development of PBPK software which incorporated system-dependent data in addition to pre-coded model frameworks for a variety of population types (e.g., pregnancy, paediatric). This removes the need for programming or mathematical knowledge, an understanding of critical concepts (e.g., mass balance, flow parameters), and any previous expertise in coding (Willmann et al., 2005). This software is generally referred to as NOSC or 'closed-source', whereby the programme source code and subsequent differential equations are hidden from the user. Commercial NOSC software packages traditionally provide modelers with a user-friendly GUI for processing functions such as simulation, parameter estimation, and sensitivity analysis. The caveat here is that the modeler cannot freely alter the tissue compartments, metabolic reactions, or subsequent complexities included within the model frameworks, and predetermined differential equations (which cannot be modified) are used to create the model. It should be highlighted that no software is completely binary in terms of its ability to accommodate modifications that users seek to apply. NOSC software has a degree of flexibility, allowing the user to pick from a select range of models and/or to enter bespoke equations during model construction (Rostami-Hodjegan & Bois, 2021). The integration of Lua, a high-level and versatile scripting language, in Simcyp serves as an example, albeit with rules imposed for valid script executions and limited parameter estimation facilities (Abduljalil et al., 2016). Nonetheless, the majority of the code in NOSC software is protected, with certain aspects that the user is unable to modify; these primarily involve essential mathematical functions, core user interface elements, etc.
(Rostami-Hodjegan & Bois, 2021). A recent request was issued to the modeling community to devote more time and resources to OSC models in MIDD in the interest of promoting a wider use of OSC models and soliciting user and developer feedback (Lippert et al., 2019). Whilst these are ideal goals, linking them with the OSC modality is not substantiated, and many arguments are often conflated with those pertaining to the cost versus free debate, which is not the same as the OSC versus NOSC debate (Rostami-Hodjegan & Bois, 2021). There is a common misconception that all OSC software is free, but in reality, a significant proportion of OSC software packages (with a few exceptions, such as R packages and OSP) require the payment of a licence fee.
For example, the cost of an institutional site-wide licence for MATLAB at prominent academic institutions in the United Kingdom can exceed $100,000 per annum (obtained under the Freedom of Information Act July 2022). However, this expense is not imposed on the end-user (i.e., researcher) who receives free access; this leads to the misunderstanding that OSC software is free for all researchers.
In the same vein, the price of NOSC software may exceed tens of thousands of dollars, though in certain situations research licences are entirely free (or accompanied by minimal administrative fees) with limits on commercial consultancies, whilst in other situations they are accompanied by substantially discounted prices (with reduced or complete functionalities) (Rostami-Hodjegan & Bois, 2021).
RAJPUT ET AL.
As indicated in this journal (El-Khateeb et al., 2021), based on the latest data, PBPK applications published in the literature support a greater adoption of NOSC software as opposed to OSC. However, a substantial proportion of PBPK modelers are still using OSC software. Understanding the main rationale for using OSC software for PBPK applications is of interest to modelers. Albeit insightful, the recent review (El-Khateeb et al., 2021) did not include toxicology, a sector that is mandated to employ OSC software by regulators and was prominent in early PBPK modeling (Tan et al., 2018). Additionally, no previous analysis of PBPK modeling trends has included the rationale of the modeler. Therefore, an in-depth analysis of PBPK applications utilising OSC software is warranted to determine the true impact of OSC software on the rise of PBPK.
Reproducibility is another aspect of multi-layered models in the systems biology space that must be considered. A recent study (Tiwari et al., 2021) sampled 455 models and found that approximately 50% of OSC models were not reproducible. In contrast, proponents of NOSC modalities highlight robustness, user-friendliness, and scalability as benefits of using NOSC software. This piques interest in the reusability of both modalities, as analysing reusability may help to instill confidence in the acceptance of certain PBPK modalities by regulatory agencies. It should be mentioned that reusability and reproducibility are not synonymous (Aldibani et al., 2022) and have not previously been investigated. Therefore, in this two-part analysis of the literature, there will be a focus on:
1. How modelers rationalise the PBPK software they use, with a specific emphasis on OSC software, to determine whether NOSC alternatives could be used
2. How this affects reusability of PBPK software
Due to the sheer complexity of the topic, reusability will be discussed in the second part of this report (Aldibani et al., 2022). Part one aims to address the paucity of available literature on the use of OSC software in PBPK modeling applications by providing insight into the modeler's rationale in order to determine the foundation of their reliance on OSC and whether it can be justified. A variety of reasons are anticipated due to the sensitive nature of the topic as the debate on the modalities of PBPK modeling software is escalating.

| Data collection
Publications focusing on PBPK modeling applications that used OSC software were identified by systematically searching the scientific literature for peer-reviewed original articles using the Web of Science™ database (https://www.webofscience.com/wos/woscc/basic-search).
The search was based on two subsets: (1) OSC software and (2) PBPK model. The specific search terms used for each subset are outlined in Publications using population pharmacokinetic models were excluded as well as those that did not contain the word 'PBPK'. Studies that used PK-Sim version 6.9 or less were omitted as they were classed as

Software
Programming tools for a computer that allow the user to create models involving data analysis and/or simulations.

Model
A collection of equations and algorithms created within software (through a modeling language) to analyse datasets or simulate certain scenarios.

Data
User-specific model parameter values or sets of observations that models aim to mimic.

Open-source code (OSC)
Computer software with major sections of the computer code available under a license that permits any user to analyse, use, edit, and redistribute the code for any purpose.

Non-open-source code (NOSC)
A proprietary code whose use, modification, and distribution are licensed by the publisher. The code is built and provided to the user as a non-editable set of files.

Open science
If components of a model's code are not publicised, the model is termed a 'Black Box'. Despite the user's inability to modify or retrieve sections of the code, this does not mean the algorithm of the model is not widely accessible by the public. Contrastingly, the phrase 'Glass Box' denotes a model whereby the algorithms are visible to the consumer but are incapable of modification.
If an OSC software or PBPK application had five or fewer usages, the articles were classified as 'Other' to distinguish them from the listed software and applications (Table 2). If a PBPK model was reused but the OSC software was not specified, the original publication mentioned by the author that produced the model was screened to identify the software used. For all stratifications, overlapping entries were allowed.
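The grouping rule described above can be illustrated with a minimal sketch. It assumes each article is recorded as a list of the software (or application areas) it reports; the function name `group_rare_entries`, the parameter `max_rare_uses`, and the example records are hypothetical and are not taken from the study dataset, which was processed in Microsoft Excel.

```python
from collections import Counter


def group_rare_entries(article_entries, max_rare_uses=5):
    """Relabel entries used in five or fewer articles as 'Other'.

    Overlapping entries are allowed: an article that names two tools
    contributes one usage to each of them.
    """
    # Count each entry at most once per article.
    usage = Counter(
        entry for entries in article_entries for entry in set(entries)
    )
    grouped = []
    for entries in article_entries:
        labels = {e if usage[e] > max_rare_uses else "Other" for e in entries}
        grouped.append(sorted(labels))
    return grouped, usage


# Hypothetical example records (illustration only).
example = [["MATLAB"]] * 6 + [["MATLAB", "STELLA"], ["ADAPT"], ["ADAPT"]]
grouped, usage = group_rare_entries(example)
for labels in grouped:
    print(labels)
```

In this illustration, an article reporting both a frequently used and a rarely used tool keeps the former label and gains 'Other' for the latter, which mirrors the allowance for overlapping entries.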

| Determining the organisation affiliated with each article
The affiliations of the authors and corresponding addresses provided in the article preview were used to establish and assign the type of organisation that contributed to each publication. Hospitals, educational institutions, and affiliations with research centres, organisations, institutes, and foundations were classified as 'Academia or Research Organisation'. Any commercial enterprise licensed to conduct research on, develop, sell, and/or distribute petrochemicals, pharmaceutical drugs, pesticides, cosmetics, and household products was grouped under 'Industry'. Any public or governmental entity responsible for overseeing compliance with the legislative norms governing the medication development process or dealing with chemicals in the environment and food (e.g., the Food and Drug Administration (FDA) and the Environmental Protection Agency (EPA), USA) was defined as 'Regulatory Agency'.

| Determining application areas of PBPK models
The extracted articles were classified based on their application areas. A single study may be assigned to multiple application areas.
To facilitate the representation of the data, some applications were combined, resulting in 10 distinct categories (Table 2).

| Geographical mapping of the collected literature
The address of the primary author provided in the article preview on the Web of Science database (https://www.webofscience.com/wos/woscc/basic-search) was used to assign a geographic location to each article. Each article preview was screened for the country of residence, which was mapped to the corresponding continent and documented.
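The country-to-continent assignment can be sketched as a simple lookup followed by a tally. This is an illustration under stated assumptions: the mapping `CONTINENT_BY_COUNTRY` is a small hypothetical excerpt, the function name `continent_counts` is invented for this example, and unmapped countries are labelled 'Unassigned' rather than guessed.

```python
from collections import Counter

# Hypothetical excerpt of a country-to-continent lookup (illustration only).
CONTINENT_BY_COUNTRY = {
    "Germany": "Europe",
    "United Kingdom": "Europe",
    "USA": "North America",
    "China": "Asia",
    "Japan": "Asia",
}


def continent_counts(countries):
    """Tally articles per continent from the primary author's country."""
    return Counter(
        CONTINENT_BY_COUNTRY.get(country, "Unassigned") for country in countries
    )


# Hypothetical primary-author countries for four articles.
counts = continent_counts(["Germany", "USA", "Germany", "Japan"])
print(counts)
```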

| Identifying rationales for using OSC software
The full text of each article was examined for an explicit indication by the author(s) as to why the OSC software was used for PBPK

| Assessing reproducibility in non-OSC software
Representatives of commercial M&S (modeling and simulation) companies, Certara® (Masoud Jamei) and Simulations Plus (Viera Lukacova), were contacted and supplied with a randomised batch of the articles from the dataset. They were asked to comment on whether each article in the randomised batch could be reproduced entirely in their NOSC software, Simcyp and GastroPlus, respectively.

| Data analysis
Microsoft Excel 2019 was used to perform descriptive, thematic, and trend analysis of the data. All graphical illustrations were created using Vizzlo (https://vizzlo.com/).

| RESULTS
A total of 171 original articles satisfied the inclusion criteria and were extracted from the narrowed subset. As expected, a wide range of publications employing OSC software were identified (Figure 3). The major players were OSP, MATLAB, GNU MCSim, and ACSLXtreme, with infrequent use of STELLA, ADAPT, NONMEM, WinNonlin,

TABLE 2 Definitions of PBPK application areas and the merged groups used in data analysis.

Application Definition Group
Bioavailability: Drug performance studies used to determine the impact of changes in the physicochemical properties of a drug or manufacturing process

Convenience to the author as PBPK model reused: ACSLXtreme (2), GNU MCSim (1), ADAPT (1), MATLAB (2); 29%
Free software: GNU MCSim (1), OSP (7); 38%
Regulatory requirement: ACSLXtreme (1), GNU MCSim (1), R-CODE (2); 19%
User-friendly: MATLAB (1); 5%
Internal endorsement: OSP (2); 9%

the most frequently used OSC software in the dataset. Further analysis revealed that there are local preferences, since the majority of its users are German (53%). A contrasting trend was seen in the further analysis of publications from East Asia, where preferences were observed for traditional/homemade software, for instance ACSLXtreme and Phoenix WinNonlin (Figure 5).

| General landscape of OSC PBPK modeling
The growing significance of PBPK applications has been documented in previous quantitative analyses and reviews (El-Khateeb et al., 2021; Kanter et al., 2012; Sager et al., 2015), albeit with limited scope. As stated previously, the recent report in this journal (El-Khateeb et al., 2021) excluded the toxicology area, a sector that was prominent in early PBPK applications (Tan et al., 2018). ACSLXtreme and GNU MCSim were the preferred software for toxicology PBPK primarily due to a long-standing pattern of PBPK use within the community of environmental toxicologists (Mitchell & Gauthier, 2016). This persisted until the 1996 introduction of GNU MCSim, whose capabilities were unique in certain areas of toxicology, and which was freely distributed (Bois & Maszle, 1997). and 'learning by doing' (Allbritton, 2003). It is also argued that OSC systems offer the ability and flexibility to make rapid changes to the
Although this may explain in part why OSC is more popular in academia (34% of overall use) than in industry (16% of overall use) in the case of PBPK (El-Khateeb et al., 2021), cost and affordability are important parameters that cannot be overlooked. Hence, commercial PBPK software providers offer alternative arrangements for licence fees in the case of not-for-profit applications.
An attempt was made to explore 'lack of availability' or 'feasibility' as a potential rationale for using OSC models instead of NOSC software (see Supplementary Material). Despite including a small sample size of 10 PBPK models, the survey revealed that >70% of cases could be conducted using NOSC alternatives, eliminating this justification as the primary factor.

| The tale of PBPK modeling in toxicology linked to OSC systems
In the 1980s, the area of toxicology witnessed a dramatic shift in PBPK applications, paving the way for numerous subsequent advancements (Tan et al., 2018). Recent discussion has centred on the decelerating rate of increase in toxicological applications of PBPK during the past two decades, in contrast to its rapid expansion in the pharmaceutical sector (Tkachenko et al., 2018). (Federal Register, 2016). OSC models frequently encounter significant obstacles during model assessment (Federal Register, 2016; Tiwari et al., 2021), in addition to issues related to a lack of trained, competent assessors to evaluate the code and to validate the modeling outcome (Chiu et al., 2007; Paini et al., 2017; Tan et al., 2018). This has put natural hurdles in the path of using more mechanistic models and thus has favoured simpler PBPK models, which are relatively easy to assess (El-Masri et al., 2016). Our analysis supports the view that the OSC approach to PBPK modeling has been pre-eminent in toxicology, with the counterintuitive effect of keeping the applications of PBPK in toxicology well below those in pharmacology, contrary to how it was before the turn of the century and the original wishes of the EPA.

| The philosophical conclusions backed by observed data
The topic of 'Open Science' is not novel and can be traced back to the early 17th century, when the idea was a distinctive aspect of the Scientific Revolution, making science accessible to a broader audience and signalling a break from the previously dominant ethos of secrecy (David, 2007) whilst the latter is generally accepted by the overwhelming majority of researchers and does not require any debate. Reproducibility is a common challenge not just in PBPK modeling but for many other fields of computational science (Mendes, 2018; Papin et al., 2020; Peng, 2009, 2011). This is mainly because the majority of reports and manuscripts fail to provide sufficient details, such as mathematical equations, to reconstruct the model code from the ground up (Peng, 2009, 2011; Tiwari et al., 2021). A previous assessment (Tiwari et al., 2021) of reproducibility in systems biology modeling reported missing parameter values, discrepancies in model structure, and the omission of starting conditions as reasons for why models fail to be reproduced. From this analysis, it was clear that modelers preferred not to share the reasons for their selection of certain modeling software. If PBPK models are to be effectively reproducible, it is essential to re-examine the peer-review process of modeling research, especially in the drug development and regulatory space (Schnell, 2018; Tiwari et al., 2021).
Additionally, in sectors such as industry and regulatory bodies, while there are constant efforts to innovate, an incumbent dependence remains on model consistency and reliability for quality assurance (Frechen & Rostami-Hodjegan, 2022). When it comes to OSC software, these sectors are seeking a level of reliability and security that is simply not achievable. This is because standard operating procedures (SOPs) become focal (Frechen & Rostami-Hodjegan, 2022). It becomes much more reasonable to include the logical justification for software choice when the value of time and money is considered, especially for companies that are paying high-end prices.

| Limitations of the study
The analysis presented in this report is by no means comprehensive or unbiased. Although we made every effort to provide a representative sample for the analysis, there may be bias towards certain categories of software use. The analysis may not fully capture the complexity of reasons why users select a particular software and provides only hints that need to be explored further.
We will provide the database for other investigators upon request to re-evaluate the angles that we might have missed or provide other interpretations that can be extracted. We acknowledge that the process of cross-checking by multiple individuals may not be sufficient to address consistency of assigning reports to certain reuse cases. The process of inferring rationales for software use, in the absence of explicit statements by authors, would be subjective and open to interpretation of implicit comments in the reports. Therefore, we encourage other researchers to build on our initial work to investigate this topic further and to put an end to claims and counterclaims regarding rationale in using various modalities without being backed by data.
We also acknowledge that the full economic cost of conducting modeling is not restricted to the cost of software, and we encourage further research into the costs of human capital and expertise for OSC versus NOSC PBPK modelers.

| CONCLUSION
Despite the limitations mentioned above, this report represents a valuable initial step towards understanding the rationale behind the use of the OSC modality of PBPK tools. The findings in this report are presented with an open mind. We have seen that many users of PBPK modeling software do not appreciate the significance of providing justification and rationale for the choices they make. Potential reasons may include cost, capability, internal personal convenience, or adaptability. In order to gain in popularity, PBPK modeling needs to be simplified and made more transparent for everyone involved. This cannot be done without explaining why a particular software is chosen. Only in rare cases are individuals interested in the equations, algorithms, etc. Recent years have seen a dramatic increase in attention paid to the applications, and their significance is now being recognised more than ever.