Global meta-analysis of over 50 years of multidisciplinary and international collaborations on transmissible cancers

Although transmissible cancers have, so far, only been documented in three independent animal groups, they not only impact animals that have high economic, environmental and social significance, but they are also one of the most virulent parasitic life forms. Currently known transmissible cancers traverse terrestrial and marine environments, and are predicted to be more widely distributed across animal groups; thus, the implementation of effective collaborative scientific networks is important for combating existing and emerging forms. Here, we quantify how collaborative effort on the three known transmissible cancers has advanced through the formation of collaborative networks among institutions and disciplines. These three cancers occur in bivalves

Therefore, in the event of outbreaks of new lineages of existing transmissible cancers, or the discovery of new transmissible cancers in the future, the rapid formation of international collaborations spanning relevant disciplines is vital for the efficient management of these diseases.
A collaboration is defined as two or more scientists from the same or different institutions who co-author a paper (Newman, 2001). Examples of ground-breaking developments through large-scale, international collaborations in the field of science include the discovery of the Higgs boson, an elementary particle in the Standard Model of particle physics theorized in the 1960s (ATLAS Collaboration, 2012), the mapping of the human epigenome, the chemical compounds and proteins that attach to DNA and switch genes on and off (Stunnenberg et al., 2016), and the NASA Twins Study investigating how the human body adapts to and recovers from long-term exposure to the extreme environment of space (Garrett-Bakelman et al., 2019). The benefits of such international collaborations have been investigated in a range of scientific fields, including conservation (Kark et al., 2015; Mazor, Possingham, & Kark, 2013), ecology (Goring et al., 2014; Leimu & Koricheva, 2005) and medicine (Årdal et al., 2016; Deeks et al., 2016). These studies have highlighted the importance of collaboration for obtaining key insights into complex systems, particularly with respect to cancers in humans (The International Cancer Genome Consortium, 2010) and infectious diseases (Årdal et al., 2016; Deeks et al., 2016). Deciphering complex biological scenarios requires strong collaborations between multidisciplinary groups, often at an international scale.
Cancer is one such complex system that has been described in most of the main groups of multicellular organisms, including plants (Metzger et al., 2016; Murgia, Pritchard, Kim, Fassati, & Weiss, 2006; Pearse & Swift, 2006, see Table 1 for an overview). While transmissible cancers seem rare, the recent discovery of two transmissible cancers in Tasmanian devils (DFT1 in 1996, Hawkins et al., 2006; Pearse & Swift, 2006; and DFT2 in 2016, Pye et al., 2016), as well as a new lineage of DN in two new bivalve species (increasing the number of DN lineages to 6, Yonemitsu et al., 2019), raises the question of whether transmissible cancers are more common than previously thought (Metzger et al., 2016; Ujvari, Gatenby, & Thomas, 2016b). Several environmental, host and cell factors must converge for the emergence and persistence of transmissible clonal cell lines (e.g. survival during transit, a permissive host environment and the propagule pressure, i.e. the number of times and the frequency with which a host is exposed to a potential infection; see the "perfect storm theory," Ujvari et al., 2016a). Once the neoplastic process has crossed the threshold of contagiousness, malignant cells become new parasitic "species," and their ecological consequences can be major (e.g. >85% population decline in 20 years in Tasmanian devils; epizootic outbreaks and mass population declines in marine mollusc populations, Mateo, MacCallum, & Davidson, 2016), making these cancers one of the most virulent parasitic life forms. Contagious cancers have likely evolved and gone extinct over evolutionary time in various species. However, due to our limited ability to detect transmissible cancers across evolutionary timescales, it is currently not possible to determine how common they were in the past, their current prevalence or their potential prevalence in the future.
Such complex systems require the construction of efficient collaborative networks across institutions and disciplines. It is, therefore, important to establish how such networks are best structured. While knowledge about CTVT has accumulated over a 150-year timeframe (Novinsk, 1876), DFTD has only been studied for about 13 years (from 2006, Pearse & Swift, 2006). These different timeframes provide a unique opportunity to investigate how scientists have organized themselves in collaborative networks to obtain insights into these cancers, and to delineate how groups should organize themselves in the event of a new transmissible cancer emerging.
Here, we conducted a meta-analysis of the three currently known transmissible cancers, using bibliometric and social network analyses to quantify: (1) how collaborations are organized, (2) how the organization of these networks has changed over time and (3) the efficiency of information sharing in these networks. We applied our results to suggest how future collaborations should be structured to respond to outbreaks of new lineages of existing transmissible cancers, or to the discovery of new transmissible cancers in the future; these suggestions could also be applied by scientists working on other infectious diseases.

| Selection of studies for the meta-analysis
Following the protocol of Dujon and Schofield (2019), we searched the [...] database and Google Scholar for relevant publications with specific terms in the topic field, which included the title, abstract, keywords and keywords plus (i.e. words that frequently appear in the titles of the articles cited within a publication). For DFTD, we used the following terms: "Tasmanian devil cancer," "Tasmanian devil facial tumour" and "Tasmanian devil tumour." For CTVT, we used the following terms: "dog transmissive tumour" and "canine transmissible venereal tumour." For DN, we used the following terms: "bivalve neoplasia," "disseminated neoplasia," "bivalve disseminated sarcoma" and "bivalve haemic neoplasia." Until 2016, it was not known whether cases of disseminated neoplasia in marine bivalves were transmissible cancers (Metzger et al., 2016; Yonemitsu et al., 2019); however, due to the phenotypic similarities between the cancerous haemocytes in studies published before and after 2016, we assumed these older cases were also transmissible cancers and pooled them into a single publication group. In addition, for all three groups of transmissible cancers, and to locate additional articles that might not have been identified by the initial search, we checked the reference lists of relevant papers based on the predefined keywords. In addition to original research articles, literature reviews were included in our study, because they also facilitate substantive, thorough, sophisticated research to advance our collective understanding of complex topics (Boote & Beile, 2005). [...] institutions from more than one country and (5) the great circle distance (in km) between the location of the institution and the site at which the disease was first described.
As reference points, we used Saint Petersburg, where the first experiment demonstrating that CTVT was transmissible was conducted (Novinsk, 1876), Tasmania for DFTD (Pearse & Swift, 2006) and Oregon Bay for DN (Farley, 1969). We expected institutions located close to these sites to be the first to study the respective transmissible cancers.
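The great circle distance between an institution and a reference site can be illustrated with the haversine formula. The sketch below is only an illustration and not necessarily the computation used in the study: the Earth radius, the approximate coordinates for Saint Petersburg and the example institution location are all assumptions introduced here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance (km) between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # min() guards against rounding pushing the square root slightly above 1
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Approximate coordinates of the CTVT reference point (assumption)
SAINT_PETERSBURG = (59.94, 30.31)
# Hypothetical institution location, for illustration only
example_institution = (48.85, 2.35)

distance_km = great_circle_km(*SAINT_PETERSBURG, *example_institution)
print(f"Distance to reference site: {distance_km:.0f} km")
```

The haversine form is chosen here because it stays numerically stable for short distances; other spherical or ellipsoidal formulas would give slightly different values.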
In addition, to determine the scope of the journals in which studies on transmissible cancer are published, the subject area(s) of each journal in which the articles were published was determined using the Scopus subject area classification (which classifies journals into 27 major thematic areas, Elsevier, 2012).

| Analysis of temporal and geographical trends
The citation data collected for each of the three cancer types represent a cross-sectional study. We, therefore, used linear regression models to investigate how publications accumulated citations over time, Poisson regression models to investigate trends in the number of institutions involved in a publication and logistic regression models to investigate the percentage of publications involving an international collaboration or published in a journal with more than one subject area (Zuur, Ieno, Walker, Saveliev, & Smith, 2009). Full details of model fitting and validation are provided in Supplementary Method 1.
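The three model families differ in how a linear predictor is mapped to the expected response. The sketch below shows only these standard link functions; the function names, the coefficient values and the choice of publication year as predictor are hypothetical, and the actual model specifications are those given in Supplementary Method 1.

```python
import math


def linear_mean(intercept, slope, x):
    """Linear regression: the expected value is the linear predictor itself."""
    return intercept + slope * x


def poisson_mean(intercept, slope, x):
    """Poisson regression (log link): the expected count is exp(linear predictor)."""
    return math.exp(intercept + slope * x)


def logistic_probability(intercept, slope, x):
    """Logistic regression (logit link): the probability is the inverse logit."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


# Hypothetical coefficients; x is years since a reference year (assumption)
years_since_start = 10
print(linear_mean(2.0, 1.5, years_since_start))            # e.g. expected citations
print(poisson_mean(0.5, 0.03, years_since_start))          # e.g. expected number of institutions
print(logistic_probability(-1.0, 0.08, years_since_start)) # e.g. P(international collaboration)
```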

| Social network analysis
A social network is a collection of social actors, each of which is acquainted with some subset of the others (Newman, 2001). [...] (Supplementary Methods 2).
To quantify the circulation of information within each of these collaborative groups, we computed the average path length, which measures the average shortest distance between two nodes (i.e. by how many institutions the nodes are separated from each other on average). The average path length is an indication of the speed at which information sequentially travels in the network. In addition, we computed the network clustering coefficients, which range between 0 (no connection between any of the nodes) and 1 (all the nodes are connected to each other). These were interpreted as the probability that two institutions within a collaborative group are involved in a published study over a given period of time (Barabási et al., 2002; Bunn, Urban, & Keitt, 2000; Minor & Urban, 2008; Opsahl et al., 2017). Furthermore, to investigate whether social networks could be classified as small-world networks, we computed a small-world-ness coefficient by comparing the clustering and path length of a given network to those of an equivalent random network with the same average degree (following Humphries & Gurney, 2008). A small-world-ness coefficient ≫ 1 indicates a network with small-world properties.
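These three metrics can be sketched for a small unweighted, undirected network as follows. This is an illustration in plain Python rather than the igraph routines used in the study; the institution names are hypothetical, the clustering coefficient shown is the global (transitivity) variant, which is an assumption about the variant used, and the small-world-ness ratio follows the form S = (C / C_rand) / (L / L_rand) of Humphries and Gurney (2008).

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Undirected adjacency sets from a list of (institution, institution) links."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def average_path_length(graph):
    """Mean shortest-path length over all connected ordered pairs (breadth-first search)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1  # reachable nodes, excluding the source itself
    return total / pairs if pairs else float("nan")


def clustering_coefficient(graph):
    """Global clustering: closed triplets divided by all connected triplets."""
    closed, triplets = 0, 0
    for neighbours in graph.values():
        for a, b in combinations(neighbours, 2):
            triplets += 1
            if b in graph[a]:
                closed += 1
    return closed / triplets if triplets else 0.0


def small_worldness(c_obs, l_obs, c_rand, l_rand):
    """Ratio of relative clustering to relative path length; values >> 1 suggest a small world."""
    return (c_obs / c_rand) / (l_obs / l_rand)


# Hypothetical collaboration links between four institutions
example = build_graph([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
print(average_path_length(example), clustering_coefficient(example))
```

The random-network values `c_rand` and `l_rand` passed to `small_worldness` would come from a reference network with the same average degree, which this sketch does not generate.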
Then, we used simulations to diagnose the type of networks formed between the institutions collected from the studies on the three transmissible cancers. In a simulation, the number of nodes, the number of links per node, the number of links per publication and the starting point of each link are kept identical to the observed network; however, the end point of each link is allowed to connect to any institution, generating a simulated network in which any institution randomly collaborates with any other institution (following Opsahl et al., 2017). For the circulation of information between scientists, such simulated random networks are inefficient; thus, comparing the metrics calculated from the observed network to these networks allows the efficiency of the scientific collaboration network to be quantified (Opsahl et al., 2017).
Simulations were repeated 1,000 times. For each iteration, the average path length, the clustering coefficient and the small-world-ness coefficient of the simulated network were computed. These three metrics were also computed from the observed networks, and compared with the distributions obtained from the simulations.
This approach allows a probabilistic interpretation of the metrics.
Values falling outside the distributions generated from the simulations show that the network properties deviate from those of a random network (Opsahl et al., 2017). To identify possible temporal changes to network structure, the whole procedure was repeated
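The simulation procedure described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it keeps the starting point of each link and redraws the end point uniformly among the other institutions, it ignores the constraint on the number of links per publication, and it tracks only the global clustering coefficient. The institution names and links are hypothetical.

```python
import random
from itertools import combinations


def clustering_coefficient(edges):
    """Global clustering coefficient of the undirected network defined by the links."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    closed = triplets = 0
    for neighbours in graph.values():
        for a, b in combinations(neighbours, 2):
            triplets += 1
            closed += b in graph[a]
    return closed / triplets if triplets else 0.0


def randomize_end_points(edges, nodes, rng):
    """Keep each link's starting point; redraw its end point among all other institutions.

    Duplicate links can arise and collapse into a single link when the network is built.
    """
    return [(start, rng.choice([n for n in nodes if n != start])) for start, _ in edges]


def simulate(edges, n_iter=1000, seed=0):
    """Compare the observed clustering coefficient with its simulated random distribution."""
    rng = random.Random(seed)
    nodes = sorted({node for edge in edges for node in edge})
    observed = clustering_coefficient(edges)
    simulated = [clustering_coefficient(randomize_end_points(edges, nodes, rng))
                 for _ in range(n_iter)]
    # Share of random networks at least as clustered as the observed one
    share_at_least_observed = sum(s >= observed for s in simulated) / n_iter
    return observed, simulated, share_at_least_observed


# Hypothetical links: two tightly knit groups of institutions joined by one bridge
example_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"),
                 ("D", "E"), ("E", "F"), ("D", "F")]
observed, simulated, share = simulate(example_edges)
print(observed, min(simulated), max(simulated), share)
```

An observed value lying in the tail of the simulated distribution (a small `share`) would indicate clustering beyond what random collaboration produces; the same comparison could be repeated for the average path length and the small-world-ness coefficient.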

| Reporting of statistical results and software
All statistical analyses were performed in the Bayesian framework. Throughout, we report the estimated parameters followed by their 95% credible intervals in parentheses (Kruschke, 2015). All Bayesian models were computed using the MCMCglmm package (Hadfield, 2010) in R software version 3.3.2 (R Development Core Team, 2013), and the models were fitted using noninformative priors (Hadfield, 2010). Social network metrics and simulations were computed using the igraph R package (Csardi & Nepusz, 2006). Geographical data were assimilated using the rworldmap package (South, 2011).
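As an illustration of the reporting convention, a point estimate and a 95% credible interval can be derived from posterior samples as sketched below. The samples here are simulated stand-ins for MCMC output, the function name is hypothetical, and the equal-tailed interval is an assumption: the study may have used a different interval type (for example, a highest posterior density interval).

```python
import random
import statistics


def credible_interval_95(samples):
    """Equal-tailed 95% interval: the 2.5th and 97.5th percentiles of the samples."""
    # n=40 yields cut points every 2.5%; the first and last are the interval bounds
    cut_points = statistics.quantiles(samples, n=40, method="inclusive")
    return cut_points[0], cut_points[-1]


# Simulated stand-in for posterior samples of a regression slope (not real results)
rng = random.Random(42)
posterior_slope = [rng.gauss(0.3, 0.05) for _ in range(5000)]

estimate = statistics.mean(posterior_slope)
lower, upper = credible_interval_95(posterior_slope)
print(f"{estimate:.2f} ({lower:.2f}, {upper:.2f})")
```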

| Temporal trends in studies on transmissible cancer
The

| Geographical trends in studies on transmissible cancers
The geographical distribution of the institutions varied among the three transmissible cancer types (Figures 1f, 2)

| Multidisciplinary aspect of studies on transmissible cancer
A total of 19 subject areas were identified from the journal scope

| SOCIAL NETWORK ANALYSIS
The social networks built for institutions studying the three transmissible cancers clearly differed at a global scale, due to variation in the number and geographical distribution of institutions, but also of the diseases (Figure 2b, d, e).
[...]ics or even the field of biology as a whole (Newman, 2004; Opsahl et al., 2017). Those metrics were, however, similar to those of the field of psychiatry or the field of physics as a whole (Newman, 2004; Wu & Duan, 2015), [...]
[...] technologies that allow quick and long-distance (face-to-face) communication between scientists located in different countries (Wagner & Leydesdorff, 2005). The year 2006 was also the year DFTD was described, and the novelty of this discovery may partially explain the increase in the number of publications for this cancer type (while it may be too early to see any effect for DN, which was only shown to be a transmissible cancer in 2015). In addition, increased mobility through cheaper travel options likely contributed to this phenomenon (Scellato, Franzoni, & Stephan, 2015). This is especially well [...]
[...] institutions studying DFTD are not optimal and that there is more potential for collaborations between these institutions (Opsahl et al., 2017). A possible explanation is that Tasmanian devils are only found in a relatively small geographical area, making it more difficult to obtain data and to establish collaborations than for DN and CTVT, which are globally distributed. This unexploited potential might not be compatible with the rapid decline of Tasmanian devil populations (Lachish, McCallum, & Jones, 2009; Lazenby et al., 2018) and with the relatively high risk of emergence of new types of tumours (two independent transmissible cancers appeared between 1996 and 2019, Pye et al., 2016; Stammnitz et al., 2018). This species will likely require collaborations that are as efficient as possible to mitigate the effect of transmissible cancers.
There are multiple benefits in forming collaborative networks to obtain insights into transmissible cancers. Efficient and highly con[...]
Current key challenges in obtaining insights into transmissible cancers include determining how many actually exist (Ujvari et al., 2016b), as well as their evolutionary and ecological impact, especially in the context of increased pressure on ecosystems and the economy (Preece et al., 2017). Transmissible cancers can only emerge under the confluence of specific conditions, termed the "perfect storm" (Ujvari et al., 2016b). For contagious cancer cell lines to emerge, several micro- and macro-environmental factors (e.g. permissive immune system, presence of transmission routes, optimal conditions to survive in transport) and tumour cell traits (high proliferation rate, genetic and phenotypic plasticity, shedding of high numbers of cells, etc.) must align. Transmissible cancers exert a selective force on the host akin to that of parasites and may have been critical drivers of major transitions during the evolution of multicellular organisms, such as the origin of sexual reproduction and the development of the immune system (Ujvari, Gatenby, & Thomas, 2017). Thus, transmissible cancers represent an essential, but so far understudied, selective force during the evolution of organisms, and ultimately in ecosystem functioning.

| CONCLUSIONS
This study demonstrated that, despite exhibiting differences in their global geographical distribution, institutions working on transmissible cancers organize themselves into highly connected small-world networks. It is likely that scientists establish collaborations with specialists in the target area, as well as in supporting fields of research, to develop effective action strategies.

AUTHORS' CONTRIBUTION
AMD, GS and BU designed the study, AMD and GB collected the data, AMD performed the statistical and social network analyses, and AMD, GS and BU led the writing of the manuscript with inputs from NR, TF and RHR.

DATA AVAILABILITY STATEMENT
The sources of the data used in this publication are described in the methods section (Web of Science, Scopus and Scimago websites).