A standard protocol to report discrete stage‐structured demographic information

Stage‐based demographic methods, such as matrix population models (MPMs), are powerful tools used to address a broad range of fundamental questions in ecology, evolutionary biology and conservation science. Accordingly, MPMs now exist for over 3000 species worldwide. These data are being digitised as an ongoing process and periodically released into two large open‐access online repositories: the COMPADRE Plant Matrix Database and the COMADRE Animal Matrix Database. During the last decade, data archiving and curation of COMPADRE and COMADRE, and subsequent comparative research, have revealed pronounced variation in how MPMs are parameterized and reported. Here, we summarise current issues related to the parameterisation and reporting of MPMs that arise most frequently and outline how they affect MPM construction, analysis, and interpretation. To quantify variation in how MPMs are reported, we present results from a survey identifying key aspects of MPMs that are frequently unreported in manuscripts. We then screen COMPADRE and COMADRE to quantify how often key pieces of information are omitted from manuscripts using MPMs. Over 80% of surveyed researchers (n = 60) state a clear benefit to adopting more standardised methodologies for reporting MPMs. Furthermore, over 85% of the 300 MPMs assessed from COMPADRE and COMADRE omitted one or more elements that are key to their accurate interpretation. Based on these insights, we identify fundamental issues that can arise from MPM construction and communication and provide suggestions to improve clarity, reproducibility and future research utilising MPMs and their required metadata. To fortify reproducibility and empower researchers to take full advantage of their demographic data, we introduce a standardised protocol to present MPMs in publications. 
This standard is linked to www.compadre‐db.org, so that authors wishing to archive their MPMs can do so prior to submission of publications, following examples from other open‐access repositories such as DRYAD, Figshare and Zenodo. Combining and standardising MPMs parameterized from populations around the globe and across the tree of life opens up powerful research opportunities in evolutionary biology, ecology and conservation research. However, this potential can only be fully realised by adopting standardised methods to ensure reproducibility.


| INTRODUCTION
Population ecology has come of age. The development of theories, experimental approaches and statistical methodologies has resulted in the publication of demographic information for an increasingly representative sample of the world's biodiversity (De Magalhães & Costa, 2009; Levin et al., 2022). These data span the taxonomic tree from microbes (Jouvet et al., 2018) to macro-vertebrates (Fujiwara & Caswell, 2001), and cover virtually all continents and biomes, though with important taxonomic biases (Conde et al., 2019; Römer et al., 2021). The potential of this impressive and rapidly increasing amount of information is starting to be realised. Indeed, through combining these demographic models, researchers have identified functional traits that explain variation in plant life history strategies (Adler et al., 2014; also see Bernard et al., 2023), short-term (transient) characteristics that drive the demographic dynamics of plant populations in variable environments (McDonald et al., 2016), and ways in which life history strategies allow species to persist alongside a changing climate (Jelbert et al., 2019; Paniw et al., 2019).
One of the most widely used tools for describing and analysing species' complex life histories is the matrix population model (MPM, hereafter). Briefly, in an MPM, individuals of a population are classified by discrete stages and/or ages (st/age hereafter) according to some biological (Caswell, 2001, p. 31) or statistical/sampling criteria (Salguero-Gómez & Plotkin, 2010). These individuals are followed in discrete time steps, typically adjusted by the generation time of the species. Indeed, time steps can vary from 12 to 24 h as in nematode worms Caenorhabditis elegans and aphids Myzus persicae (Bruijning et al., 2019; Li et al., 2014), to monthly/annual periods in mammals and plants (Coulson et al., 2001; Ferreira et al., 2016), all the way to 50 years in slow-growing redwoods (Namkoong & Roberds, 1974).
From these data, researchers estimate losses through mortality, transition probabilities among st/ages and their per-capita a/sexual contributions via reproduction (Nordstrom et al., 2021; Omeyer et al., 2021).
A single MPM can be used to calculate a vast repertoire of biologically meaningful outputs. These outputs include proxies for the performance and viability of populations, such as deterministic (λ) or stochastic population growth rates (λ_s) (Doak et al., 2005), quasiextinction risk (Davis, 2022), population response to perturbations of underlying vital rates such as survival or reproduction (Caswell, 2001, p. 206), transient dynamics (Capdevila et al., 2020; Ezard et al., 2010; Stott et al., 2011) and life history traits, such as rates of senescence (Baudisch et al., 2013), degree of iteroparity and age at maturity (Caswell, 2001, p. 124). This wealth of demographic inference highlights why many advances in demography and life history theory utilise MPMs (Franco & Silvertown, 1996; Pfister, 1998; Saether et al., 2013; Tuljapurkar, 1989). In the latest data release, COMPADRE v. 6.22.5 [COMADRE v. 4 Gascoigne, pers. obs.). However, one of the challenges of the digitization process is the tremendous variation in how data are collected, presented and used to parameterize MPMs.
Data standardisation improves reproducibility and promotes data sharing across research disciplines (Reichman et al., 2011). Data standardisation, and the associated detailed metadata, is therefore key for research to be replicated, validated, openly discussed and ultimately for science to advance (Powers & Hampton, 2019; Reichman et al., 2011; Salguero-Gómez et al., 2021). Examples of these standards include reporting sample size and variance of estimates and detailing the full list of original sources of data (Gerstner et al., 2017). In this context, standards can be used as checklist items to improve publication quality and reproducibility and to aid the peer-review process (Reichman et al., 2011). Furthermore, meta-analyses (Gurevitch et al., 2018) and phylogenetic comparative analyses (Healy et al., 2019; Salguero-Gómez et al., 2017), which offer valuable opportunities to examine general patterns and identify gaps in knowledge, rely on data conforming to certain standards.
MPMs are being adapted, extended and applied beyond their original, species-specific context in comparative demography.
However, not all MPMs are built and reported equally. The current presentation of MPMs in COMPADRE and COMADRE may give the false impression that all MPMs are published in a homogeneous format, despite differences in how and why the MPMs are produced (Caswell, 2001). This impression may have emerged from the amount of verification the COMADRE and COMPADRE digitisation team does behind the scenes (e.g. validating model outputs, author correspondence for additional information). While verification is an inevitable aspect of database curation, most of our efforts are spent communicating with authors rather than digitising data. Our goal here is to (i) present the current standard of MPM communication in the literature, (ii) identify common issues in MPM communication and their impacts, (iii) suggest ways to support the clear communication of MPM data and metadata, (iv) highlight advantages for authors and the scientific community at large and (v) introduce a standard method for sharing MPM data and metadata.

| MPM COMMUNICATION: CURRENT STATE OF AFFAIRS
To present the current practices in MPM data and metadata communication, with the ultimate goal to evaluate the need for standardised data and metadata reporting, we performed a survey of researchers and screened a subset of papers that have been used to generate MPMs stored in COMPADRE and COMADRE.

| A survey on matrix communication
We surveyed expert population ecologists, whom we identified as having published peer-reviewed papers that include MPMs, regarding our current ability to communicate MPM data and metadata for reproducibility purposes. Specifically, we asked how well peer-reviewed publications relay the attributes of MPMs necessary for reproducibility. Additionally, we asked if researchers thought a standardised method of matrix communication is 'necessary for the coherent communication of MPMs in the literature' (the full list of 11 questions can be found in Supporting Information). The survey was distributed using Google Forms. We identified 1390 potential participants based on the criterion of being the lead and/or corresponding author of a publication containing at least one MPM. Over 50% of corresponding email addresses were outdated and were not pursued further. Of the remaining approximately 650 researchers who were contacted, 60 completed the survey. As expected, researchers report a great deal of heterogeneity in components of MPM communication (Figure 1). The best communicated attributes according to these survey participants are trait names (i.e. the phenotype by which the MPM was structured: stage/age/size classes), census duration and projection interval, while the worst communicated attributes are life cycle graphs, formulae defining the vital rates and population vectors (i.e. number/frequency of individuals in each st/age). Importantly, 83% of survey participants agreed that the discipline needs a standardised method for MPM communication.

| A screen of papers in COMPADRE and COMADRE
To quantify how well MPM data and metadata are communicated in peer-reviewed publications, we screened 300 randomly sampled papers containing MPMs already digitised in COMPADRE and COMADRE (150 papers each). Across the different key attributes of MPMs that we examined, there was considerable variation in how reliably authors provided the data and metadata necessary for digitising, archiving, and performing comparative analysis (Figure 2). Plant studies (COMPADRE) contain overall more explicit data and metadata than animal studies (COMADRE; Figure 2). Furthermore, we used this information to categorise the quality of each of the examined 300 papers according to their reproducibility, defined as their inclusion of components of MPM communication (Figure 3). The distribution of component communication across kingdoms is similar. Crucially, only 13.9% of papers in COMADRE and 15.8% of papers in COMPADRE contain all the information necessary for comparative analyses and accurate projections (Figure 3). Thus, approximately 85% of papers require emailing authors to request undisclosed information.

| COMMON ISSUES IN MATRIX CONSTRUCTION
Here, we identify key issues in the parameterization of MPMs to illustrate the impact of methodology on demographic inference. To do so, we draw on the findings from the previous section and our experience curating COMPADRE and COMADRE. We outline the following issues for two reasons: (i) to advise demographers on how to identify them in the literature and (ii) to prevent these issues persisting in future publications. We note that a comprehensive list was recently made available by Kendall et al. (2019; see also Che-Castaldo et al., 2020). Here, we add to these previous papers by outlining steps for researchers to avoid/mitigate these issues in their own research. A summary of these issues, from occurrence to impact, is detailed in Figure S1.
Census type directly affects matrix construction, and census timing and frequency can inadvertently influence demographic outputs (Emery & Gross, 2005).
Typically, an MPM comes in two forms regarding the spread of reproduction between censuses: birth-flow or birth-pulse (see Caswell, 2001, p. 22). The distinction is based on whether reproduction occurs continuously (i.e. birth-flow) or in a narrow temporal window (i.e. birth-pulse). Birth-pulse MPMs are further categorised into pre- versus post-reproductive census. Although both pre- and post-reproductive censuses often lead to similar demographic inference (see Cooch et al., 2003), their difference lies in when populations are censused relative to the position of the narrow reproductive window. In the former, populations are censused immediately before a reproductive window, while post-reproductive censuses follow on from a reproductive window. A pre-reproductive census requires the inclusion of offspring survival in reproductive matrix elements, while a post-reproductive census requires the inclusion of parent survival in reproductive matrix elements. We often encounter mistakes in the accommodation of offspring or parent survival in reproductive matrix elements (see also Kendall et al., 2019). A key step in matrix construction that can prevent the incorrect accommodation of survival is drawing the life cycle graph (as per Ebert, 1999, p. 61) with respect to census timing (demonstrated in Ellner et al., 2016, p. 13), as well as explicitly detailing the census type used to parameterize the MPM. However, sometimes drawing the life cycle graph may be unfeasible or uninformative. For example, the graph for an age-classified model with 100 age classes is too large to draw and too redundant to be useful; however, such graphs can be simplified with a dashed line if multiple adjacent classes have the same demographic rates (e.g. Ebert, 1999, p. 2). For models with many stages and highly connected transitions, drawing the life cycle graph is not feasible (e.g. the graph for Calathea ovandensis in Neubert and Caswell (2000)).
But even in complex situations (e.g. the series of seasonal graphs for the emperor penguin in Jenouvrier et al. (2010)) the graph may be helpful in organising the structure of the model.

FIGURE 2 The y-axis indicates the percentage of peer-reviewed publications in COMPADRE and COMADRE that contain a given attribute necessary for the clear communication of MPM information and its reproducibility from a random subset of 150 papers of the 643 total papers from COMPADRE and 150 out of the 395 total papers from COMADRE (300 papers total). The attributes are: Location: province/city/landmark; MPM: was the MPM included in the manuscript; Census duration: start and end dates for data collection; Vital rate formulas: decomposition of matrix elements into their underlying components (i.e. contributions from survival, growth, and reproduction); Projection interval: the time period between observations; Latitude-longitude: spatial coordinates; Life cycle graph: the visual representation of demographic transitions and a/sexual per-capita contributions; Population vector: st/age distribution of individuals at time t associated with reported MPMs.

Census timing and frequency affect model construction, making a constructed MPM impractical for demographic inference if the life history of the examined organism is not considered. Consider a researcher comparing the demographic processes of fruit flies and fruit trees. The researcher first notices that there are four discrete stages to the fruit flies' life history: three juvenile stages encompassing the development from egg to instar to pupa, and one adult stage where individuals disperse and reproduce. Since development from egg to adult takes ~10 days in this species, the researcher decides to perform the census every 10 days for both the fruit flies and the fruit trees over a 3-month period. However, because neither mortality nor reproduction occurs across such a short census in the fruit tree population, the population projected from the resulting fruit tree MPM will persist forever, neither increasing nor declining. The same issue would occur the other way around. If 5-year intervals were deemed sufficient for the fruit trees, then individually measured fruit flies would never survive across time steps. A solution to this problem exists: periodic matrix models can include periods much shorter or longer than other periods. For example, Hunter and Caswell (2005) analysed the Sooty Shearwater Puffinus griseus, a species with a lifespan of decades, by including two harvesting periods of several weeks in duration and then an annual interval. Smith et al. (2005) and Shyu et al. (2013) used periodic seasonal models to accommodate life cycles in which some stages are only present for part of the annual cycle. The approach (Caswell, 2001, section 13.1) is powerful and general.
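As a minimal sketch of the periodic approach, assume a hypothetical two-stage life cycle with a short breeding phase and a longer non-breeding phase. All matrix values and names below are invented for illustration and are not taken from any of the cited studies. The matrix for a full cycle is the product of the phase matrices, applied in chronological order.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

# Hypothetical phase matrices (columns = stage at start, rows = stage at end).
# B_summer: short breeding phase with reproduction, maturation and survival.
B_summer = [[0.0, 2.0],
            [0.5, 0.9]]
# B_winter: longer non-breeding phase with survival only.
B_winter = [[0.6, 0.0],
            [0.0, 0.8]]

# Summer acts first and winter second, so the annual matrix is
# B_winter multiplied by B_summer (the later phase goes on the left).
A_annual = matmul(B_winter, B_summer)
```

The order of multiplication matters: starting the cycle in winter instead gives the product of B_summer and B_winter, which has the same eigenvalues (and hence the same annual growth rate) but a different stage structure.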

| Unrealistic stage-specific survival
Issues in parameterising stage-specific survival, whilst easy to diagnose, can result in an array of unnatural life histories. Transition and survival probabilities are bounded between 0 (i.e. the event never happens) and 1 (i.e. always occurs). As such, the stage-specific survival of an MPM, that is, the sum of the nonreproductive elements in a given column of the MPM A, must not exceed 1. When it does, individuals in that stage have an unrealistic survival probability exceeding 100%, resulting in an incorrect representation of the organism's life history.
Stage-specific survival values >1 typically arise due to rounding errors, typos, or the inclusion of unstated a/sexual reproductive events.
As such, it is generally advised to omit these MPMs in comparative analysis (Jones et al., 2014). Unstated a/sexual reproductive events occur when a given element a_{i,j} in the MPM A contains both survival-dependent processes, such as growth/shrinkage, and fertility, and these have not been reported separately. Ideally, authors would carefully identify whether various vital rates are being confounded with survival-dependent demographic processes in each MPM element. For the comparative demographer using COMPADRE and COMADRE, we recommend either avoiding MPMs where stage-specific survival >1 or altering the model so that the stage-specific survival is fixed to a maximum of 1 (e.g. Buckley et al., 2010).
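The column-sum diagnostic described above can be sketched as follows. The matrix values and function names are hypothetical. The proportional rescaling used to cap survival at 1 is only one possible choice, made here as an assumption, and is not necessarily the procedure used in the cited studies.

```python
def stage_survival(U):
    """Return the column sums of U, i.e. stage-specific survival."""
    n = len(U)
    return [sum(U[i][j] for i in range(n)) for j in range(n)]

def flag_unrealistic(U, tol=1e-9):
    """Return indices of stages whose summed survival exceeds 1."""
    return [j for j, s in enumerate(stage_survival(U)) if s > 1 + tol]

def cap_survival(U):
    """Rescale any column summing to more than 1 so that it sums to 1."""
    sums = stage_survival(U)
    return [[U[i][j] / sums[j] if sums[j] > 1 else U[i][j]
             for j in range(len(U))]
            for i in range(len(U))]

# Hypothetical matrix of survival-dependent transitions only (no fertility).
# The third column sums to 1.05, which is biologically impossible.
U = [[0.10, 0.00, 0.00],
     [0.40, 0.30, 0.15],
     [0.00, 0.50, 0.90]]

bad = flag_unrealistic(U)      # indices of stages needing attention
U_capped = cap_survival(U)     # third column rescaled to sum to 1
```

A flagged stage is a prompt to check the source for typos, rounding, or fertility hidden inside a transition element before deciding whether to cap or exclude the model.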
FIGURE 3 Across plant and animal MPM papers, most publications do not contain sufficient information for reproducibility. Proportion of papers in COMPADRE and COMADRE grouped by their open-access information in peer-review publications regarding matrix population model (MPM) data and metadata. Following the same scheme as in Figure 2, papers were ranked into six groups from 'inadequate' to 'MPM + VR + POP + ECO' (i.e. fully reproducible). 'Inadequate' refers to papers missing the MPM and/or projection interval (i.e. an MPM specific time interval necessary for projection), without which most demographic outputs cannot be calculated. 'MPM': paper contains the MPM and projection interval but no vital rate formulas describing the matrix elements. 'MPM + VR': contains all of the information for 'MPM' along with vital rate formulas for the matrix elements. 'MPM + VR + POP': contains all of the information for 'MPM + VR' along with the population vector. 'MPM + VR + ECO': contains all of the information for 'MPM + VR' along with latitude-longitude coordinates and census duration of the examined population. 'MPM + VR + POP + ECO': contains all of the information for 'MPM + VR' along with population vector/distribution, latitude-longitude coordinates and census duration.

In many published MPMs, some life stages have an estimated survival probability of 1 or an incomplete life cycle, likely the result of small sample size or a rare event along the life history of the species. Perfect survival (i.e. mortality = 0) is unlikely to be accurate, and may need to be estimated or imputed (Johnson et al., 2018). A reproducible approach to infer realistic survival and transition values was recently proposed by Tremblay et al. (2021), using a Bayesian approach to estimate parameter values using priors in addition to the observed data to obtain posterior MPMs. An advantage of this approach is that the confidence intervals of the parameters that represent probabilities (i.e. stasis, transition, survival) are obtained from a beta distribution. The advantage of using a Bayesian-inferred multinomial Dirichlet distribution for estimating the mean values is that researchers can infer the variance and skew of the posterior distributions to further inform MPM construction and demographic inference (e.g. Tremblay et al., 2009a, 2009b). Finally, since sample size can be a key driver of unrealistic stage-specific survival, sample size and uncertainty (e.g. confidence interval, standard deviation) must be reported to (1) relay the precision of the estimated survival value to the audience and (2) allow accurate inclusion of survival values in meta-analyses and comparative methods.
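To illustrate why a prior helps when no deaths are observed in a small sample, the sketch below computes the posterior mean of a Dirichlet-multinomial model for the fates of one stage. This shows the general idea only. It is not the exact procedure of Tremblay et al. (2021), and the counts, the uniform prior and the function name are assumptions made for the example.

```python
def dirichlet_posterior_mean(counts, prior):
    """Posterior mean of multinomial fate probabilities under a Dirichlet prior.

    counts[k]: observed number of individuals with fate k
    prior[k]:  Dirichlet prior weight for fate k
    """
    total = sum(counts) + sum(prior)
    return [(c + a) / total for c, a in zip(counts, prior)]

# Hypothetical fates of 8 large adults over one projection interval:
# [shrink, stasis, death]. No deaths were observed, so the raw
# estimate of survival would be exactly 1.
counts = [1, 7, 0]
prior = [1.0, 1.0, 1.0]   # uniform prior, assumed for illustration

post = dirichlet_posterior_mean(counts, prior)
survival = post[0] + post[1]   # probability of any non-death fate
```

With these numbers the posterior mean survival is 10/11 rather than 1, and the pull towards the prior weakens as the sample size grows, which is one reason sample sizes need to be reported alongside the estimates.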

| Incorrectly parameterizing fertility
Fertility often presents a challenge to constructing accurate MPMs.
This challenge is partly due to the ambiguity of the term 'fertility'. The issue arises when the per-capita contributions of reproductive adults to new recruits (e.g. eggs, neonates, seeds, etc.) do not represent the links over the full projection interval of the study. Remember that the entry a_{i,j} in an MPM is the (expected) number of stage i individuals at t + 1 per stage j individual at time t. If stage i is some kind of 'newborn' individual, then a_{i,j} must include all the processes between time t and time t + 1 (Caswell, 2001, p. 61). Reproductive output, in turn, is a composite demographic process of the number of offspring produced in a reproductive event and the relevant survival that will penalise how many new offspring will actually make it to the next observation. Failure to accommodate this vital rate decomposition can result in the introduction of a one-timestep lag into the organism's life cycle, as newly created offspring spend a projection interval 'in limbo' before their onward transitions. The best-known example is in the classic model of teasel Dipsacus sylvestris by Werner and Caswell (1977), in which flowering plants at time t were described as producing seeds at time t + 1, which only germinated to seedlings at time t + 2. The issue was discussed and corrected in Caswell (2001).
Furthermore, this issue has been reported, for instance, in reproductive structures such as seeds that do not actually undergo a permanent seed bank. An MPM with this issue will typically (Kendall et al., 2019), but not always (Nguyen et al., 2019), underestimate the asymptotic population growth rate, λ. Naturally the challenge will then be in estimating the relative importance of the seed bank and the lifespan of nongerminated seeds. The effect of incorrectly parameterizing fertility on λ is greatest in cases of extreme growth, such as invasive species, or extreme decline, such as critically endangered species (Rueda-Cediel et al., 2018). Furthermore, this issue can also cause overestimation of the transient envelope (see Ezard et al., 2010). Thus, we recommend reporting the fertility vital rate formulas with the associated MPMs and clearly identifying the values of these underlying vital rates (as in Box 1).
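A fertility vital rate formula for a birth-pulse model can be written out explicitly, as in the sketch below. It follows the rule stated earlier that a pre-reproductive census includes offspring survival and a post-reproductive census includes parent survival. The function name, argument names and numerical values are hypothetical.

```python
def fertility_element(fecundity, census, offspring_survival, adult_survival):
    """Per-capita contribution to recruits over one projection interval.

    fecundity:          offspring produced per adult per birth pulse
    census:             "pre" or "post", relative to the birth pulse
    offspring_survival: survival of newborns from birth to the next census
    adult_survival:     survival of parents from census to the birth pulse
    """
    if census == "pre":
        # Adults breed just after the census; newborns must then survive
        # until the next census to be counted.
        return fecundity * offspring_survival
    if census == "post":
        # Adults must survive the interval to breed just before the
        # next census, when their newborns are counted.
        return adult_survival * fecundity
    raise ValueError("census must be 'pre' or 'post'")

# Hypothetical vital rates.
F_pre = fertility_element(20.0, "pre", offspring_survival=0.1, adult_survival=0.8)
F_post = fertility_element(20.0, "post", offspring_survival=0.1, adult_survival=0.8)
```

The two census types give very different fertility elements for the same vital rates because they count different first stages, which is why the census type and the underlying rates both need to be reported.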

| Indirectly calculating vital rates
Estimating vital rates often involves combinations of direct and indirect measurements. Direct measurement empirically derives vital rates from individual-based data where identified individuals are censused multiple times, as in cohort life table studies, mark-recapture methods and many quadrat studies of marked plants. However, vital rates can be hard to observe in species with high offspring production, complex phenology and/or small population sizes (Beissinger & Westphal, 1998).
Since some MPM methods require a full life cycle to obtain key metrics (e.g. transient metrics: Stott et al., 2011), external study sites or literature sources are often used to parameterize components of the MPM to 'close the loop' in incomplete life cycles (Omeyer et al., 2021).
Another method to indirectly estimate vital rates involves using ex-situ methods to obtain upper and lower bounds on recruitment (or other vital rates) and explore the parameter space within those bounds. The approach was introduced by Caswell et al. (1998) in a study of the effects of bycatch mortality on the harbour porpoise.
Age-specific survival and fertility schedules were selected from other species with similar life cycles, re-scaled to match the longevity of the harbour porpoise, and used to produce uncertainty distributions for population growth and the effects of the measured bycatch. Reporting the distribution and associated parameters provides a measure of uncertainty from which to inform the construction of an MPM (Tenhumberg et al., 2008). Furthermore, the use of hierarchical models to estimate missing values and borrowing strength from other populations or species may improve parameter estimation (James et al., 2021; Tremblay & McCarthy, 2014).
Lastly, integrated population models represent a valuable framework for indirectly estimating demographic rates and population dynamics (size and structure) by combining data sources, particularly longitudinal individual data with population census data (Plard et al., 2019; Schaub & Kéry, 2021). Integrated population models allow for the construction of population models (including MPMs) by (1) combining data sources, (2) defining a life history a priori (this is often some form of stage-structured population model) and (3) quantifying the maximum likelihood of demographic rates encoded in the life history given the data sources. Integrated population models are particularly useful when uncertainty around data acquisition is known (e.g. in capture-mark-recapture studies) (Riecke et al., 2019).

| Population vector
BOX 1 Example presentation of a hypothetical three-stage plant matrix population model (MPM) using a clear and explicit presentation of data applicable to most MPM construction techniques.

(A) Matrix type
A simple deterministic density-independent matrix.*
*This free text field allows for the brief description of matrix type. If the matrix is structured by one variable the matrix is simple. If not, the matrix is considered general (e.g. age x stage). Deterministic refers to whether the demographic rates that build the MPM are held constant (deterministic) or drawn from a distribution (stochastic). Density-independent versus density-dependent indicates whether the demographic rates are influenced by population density.

(B) Life cycle diagram

(C) Census description

*This measure of uncertainty may also be the estimate's standard deviation, variance or a confidence/credible interval at the discretion of the researcher.

An estimate of the structure of the population, classified by age or stage, is a useful piece of information when available, but it will only sometimes be available. Current population structure provides a logical starting point for projections of short-term and long-term population viability (Werner & Peacock, 2019). Furthermore, using the population vector (i.e. abundance and stage distribution) for projections helps to account for the effects of transient dynamics, which measure the [...] (Jones & Hubbell, 2006). However, many types of demographic rate estimation do not provide any information on numbers and structure.
Cohort life tables, which follow a cohort of individuals as they age, are blind to the structure of the population in which the cohort develops.
Indeed, there may be no such population (e.g. the entire history of laboratory cohort-based demography going back to Pearl in the 1920s (Pearl et al., 1927)). Mark-recapture estimation of rates from longitudinal data draws all its inference from the marked individuals and makes no inferences about the number and structure of the unmarked. The literature on mark-recapture methods for estimating rates recognises that estimating population numbers is thus much more difficult than estimating rates (Lebreton et al., 1992) [...] rate estimation). Therefore, if projection from an actual structure is desired, that initial condition may be more appropriately measured in a separate census, rather than extracted from the measurements of rates that inform the MPM.
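The role of the population vector as the initial condition of a projection can be sketched as follows, assuming a hypothetical two-stage MPM and a hypothetical observed vector. The values and names are invented for illustration.

```python
def project(A, n0, steps):
    """Iterate n(t+1) = A n(t) and return the list of population vectors."""
    trajectory = [list(n0)]
    for _ in range(steps):
        n = trajectory[-1]
        trajectory.append([sum(A[i][j] * n[j] for j in range(len(n)))
                           for i in range(len(A))])
    return trajectory

# Hypothetical two-stage MPM (juveniles, adults) and observed vector.
A = [[0.0, 1.5],
     [0.4, 0.8]]
n0 = [90.0, 10.0]     # counts at the first census, dominated by juveniles

traj = project(A, n0, steps=3)
totals = [sum(n) for n in traj]
# Realised growth between censuses differs from step to step until the
# stage structure settles (transient dynamics).
growth = [totals[t + 1] / totals[t] for t in range(len(totals) - 1)]
```

In this example the population first shrinks and then grows, even though the matrix itself is constant. A projection started from a different vector would follow a different short-term path, which is why the observed vector is worth reporting when it is available.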

| Omitting cryptic life stages
The identification and estimation of vital rates in cryptic stages pose a challenge in population ecology. Cryptic stages represent points along an organism's life cycle that are somewhat hidden from or overlooked by population ecologists when building population models (Doak et al., 2002). A life stage could be cryptic because it is logistically challenging to observe, or because it is observable but indistinguishable from a similar-seeming class (Nguyen et al., 2019). In plants, cryptic stages can emerge from seed banks in taxa such as orchids, where the seeds are too small to be identified in the field (Paniw et al., 2017), or in some herbaceous perennials (e.g. Astragalus scaphoides) where prolonged periods of vegetative dormancy can allow individuals to stay underground for one or more growing seasons (Gremer & Sala, 2013).

| One-sex versus two-sex models
Much of demography focuses on females, under the assumption that fertility is determined by females without limitation by males (see Caswell, 2001, p. 568; Archer et al., 2022). These studies typically assume a 1:1 sex ratio, sex-congruent vital rates and that reproduction is not male-limited (Compagnoni et al., 2017; Miller & Compagnoni, 2022). While one-sex models are common in animal MPMs (currently 77% in COMADRE v. 4.21.8), care must be taken to not make assumptions about sex-ratio dependent dynamics within these systems (Archer et al., 2022). Indeed, these assumptions may not be met when any of the following are true: there is a bias in sex ratio (Archer et al., 2022), there is reproductive skew (Sky et al., 2022), or population dynamics are highly sensitive to mating choice (Veran & Beissinger, 2009). Furthermore, sex-dependent detectability can further confound estimates of sex ratio and their associated impacts on vital rates if not taken into account. Two-sex models that do not assume dominance by one sex are nonlinear and require specification of a mating function that describes fertility as a function of male and female abundance (Caswell, 2001). Defining such mating functions is generally difficult or impossible, except in the particularly easy case of strict monogamy (Jenouvrier et al., 2010).
Reporting sex ratios can greatly expand the scope of a study (Shyu & Caswell, 2016a, 2016b), for example by evaluating the impact of sex ratio and the Allee effect (Boukal & Berec, 2002). Unfortunately, this reporting is rarely done in work archived in COMPADRE and COMADRE. Moreover, if there are differences in vital rate values between sexes, such as survival, growth, and/or reproductive output, a one-sex MPM may neglect important processes (Archer et al., 2022; Caswell, 2001, p. 568). In plants, reporting two-sex dynamics is even rarer (0.2% in COMPADRE v. 6.22.5). However, this low percentage likely reflects the rarity of dioecy or other mating systems with two or more sexes in plants (Käfer et al., 2017) and the commonness of polygamous mating systems, which makes male-limited reproduction rare (see Compagnoni et al., 2017; Miller & Compagnoni, 2022).

| Irreducibility and ergodicity
The property of irreducibility has implications for the eigenvalue spectrum of a matrix, and hence biologically relevant outputs (e.g. population growth rates, stable stage structures). These implications are well known in the literature on MPMs (Caswell, 2001). An irreducible matrix is one in which the life cycle graph is completely connected [...]
3. Spatial models in which dispersal is one-directional, as in river systems or oceanic currents.
4. Age × stage-classified models (Caswell, 2009;Caswell & Salguero-Gómez, 2013). In these models, reproduction produces (by definition) individuals in age class 1, but the model includes all combinations of age and stage, including impossible combinations of age class 1 and stages that do not exist at age 1.
Reducibility may or may not be easy to spot from the life cycle graph, but it can be tested numerically. The matrix A is irreducible if and only if the matrix $(I + A)^{s-1}$ is positive, where I is the identity matrix and s is the number of stages, that is, the dimension of A (Caswell, 2001).
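The numerical test described above can be sketched in a few lines. The version below is an illustrative implementation, not code from the paper: it works on the zero/nonzero pattern of I + A, because only the pattern determines whether the (s − 1)th power is positive, and the two example matrices are hypothetical.

```python
def is_irreducible(A):
    """Return True if (I + A)^(s-1) has no zero entries, with s = len(A)."""
    s = len(A)
    # Zero/nonzero pattern of I + A; magnitudes do not affect positivity.
    B = [[1 if (i == j or A[i][j] > 0) else 0 for j in range(s)]
         for i in range(s)]
    # Start from B^0 = I and multiply by B a total of s - 1 times.
    P = [[1 if i == j else 0 for j in range(s)] for i in range(s)]
    for _ in range(s - 1):
        P = [[1 if any(P[i][k] and B[k][j] for k in range(s)) else 0
              for j in range(s)] for i in range(s)]
    return all(all(row) for row in P)


# Hypothetical three-stage life cycle in which every stage contributes,
# directly or indirectly, to every other stage.
A_irreducible = [[0.0, 1.5, 2.0],
                 [0.4, 0.3, 0.0],
                 [0.0, 0.3, 0.8]]

# Same life cycle, but stage 3 is post-reproductive: it contributes to no
# stage other than itself, so the life cycle graph is not fully connected.
A_reducible = [[0.0, 1.5, 0.0],
               [0.4, 0.3, 0.0],
               [0.0, 0.3, 0.2]]

irreducible_result = is_irreducible(A_irreducible)
reducible_result = is_irreducible(A_reducible)
```

Under these assumptions the first matrix passes the test and the second fails it, which matches what the life cycle graphs suggest.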
Irreducibility, together with primitivity, is a sufficient condition for ergodicity, guaranteeing that the population will converge to the same stable structure regardless of the initial condition. A reducible matrix may not have this property; for example, a population started with only post-reproductive individuals will not converge to the same structure as one started with some pre-reproductive individuals. With regard to ergodicity, it is also known that an MPM is ergodic if and only if all entries of its dominant left eigenvector (v) are positive (Stott et al., 2010). In short, despite appropriate model structure and correct parameterisation, demographic data may lead to reducible and/or nonergodic matrices (Table S1).
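The left-eigenvector criterion can also be checked numerically. The sketch below is an assumption-laden illustration rather than the method of Stott et al. (2010): it approximates the dominant left eigenvector by power iteration on the shifted matrix A + I (a convenience chosen here to avoid oscillation in imprimitive matrices), and treats entries below a small tolerance as zero. In practice, an eigen-solver from a linear algebra library would normally be used. The example matrices are hypothetical.

```python
def dominant_left_eigenvector(A, iterations=500):
    """Approximate the dominant left eigenvector of A by power iteration.

    Iterates v <- v (A + I), rescaled to sum to 1. The shift by I leaves
    the eigenvectors unchanged.
    """
    s = len(A)
    v = [1.0 / s] * s
    for _ in range(iterations):
        w = [v[j] + sum(v[i] * A[i][j] for i in range(s)) for j in range(s)]
        total = sum(w)
        v = [x / total for x in w]
    return v


def is_ergodic(A, tol=1e-9):
    """Ergodic if every entry of the dominant left eigenvector is positive."""
    return all(x > tol for x in dominant_left_eigenvector(A))


# Hypothetical matrix in which all stages eventually contribute offspring.
A_ergodic = [[0.0, 1.5, 2.0],
             [0.4, 0.3, 0.0],
             [0.0, 0.3, 0.8]]

# Hypothetical matrix with a post-reproductive stage 3 whose survival (0.2)
# is lower than the growth rate of the reproductive part of the life cycle.
A_nonergodic = [[0.0, 1.5, 0.0],
                [0.4, 0.3, 0.0],
                [0.0, 0.3, 0.2]]

ergodic_result = is_ergodic(A_ergodic)
nonergodic_result = is_ergodic(A_nonergodic)
```

In the second matrix, the post-reproductive stage has a reproductive value of zero, so the dominant left eigenvector is not strictly positive and the model is flagged as nonergodic.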

| Partitioning demographic processes
It is important to define what each matrix element in an MPM represents. Various demographic processes can overlap into the same matrix element in an MPM, particularly in species with a fast and/or plastic lifecycle relative to the MPM projection interval. For example, the value in an MPM that represents the link between large individuals at time t and smaller individuals at time t + 1 might correspond to sexual reproduction, clonal reproduction, fission, retrogression, or a composite of multiple processes. The mathematical derivations of key life history traits (e.g. generation time, life expectancy, rate of senescence, degree of iteroparity) require that these processes be clearly separated (Jones et al., 2022).
This is critical for the family of analyses based on Markov chains; the matrix U defines the transient state transitions in an absorbing Markov chain (Caswell, 2011, 2013). By reporting the underlying demographic rate structure in a life cycle diagram and its consequent full matrix population model A, one can separate matrices into survival-dependent processes (e.g. progression/growth, retrogression/shrinkage, fission, fusion, stasis) in the submatrix U, sexual reproduction in the submatrix F and clonal reproduction in the submatrix C (Figure 4). Importantly, both F and C submatrix elements must incorporate survival according to census type (i.e. pre-/post-reproductive census).
Reporting the matrix A as well as the submatrices U, F and C offers two key benefits: (1) it explicitly indicates how the values in A are generated from underlying vital rates; and (2) the submatrices can be used to calculate a wide range of demographic measures that cannot be calculated from A alone, such as longevity (mean and variance), occupancy times (means and variances), lifetime reproductive output (means and variances), net reproductive rate, generation time and entropy (Keyfitz entropy (Keyfitz, 1968) and Demetrius' entropy (Demetrius, 1992)).
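As a brief illustration of why the submatrices matter, the sketch below builds A from hypothetical U, F and C matrices and derives two measures that cannot be recovered from A alone. It computes the fundamental matrix N = (I − U)^(−1) as the series I + U + U² + …, an implementation choice made here to keep the example free of external libraries. The shortcut used for the net reproductive rate assumes that all newborns enter stage 1; in general, it is the dominant eigenvalue of FN.

```python
def mat_mul(X, Y):
    """Plain matrix product of two lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def fundamental_matrix(U, tol=1e-12, max_terms=100000):
    """N = I + U + U^2 + ... ; entry (i, j) is the expected number of time
    steps spent in stage i by an individual currently in stage j."""
    s = len(U)
    term = [[1.0 if i == j else 0.0 for j in range(s)] for i in range(s)]
    N = [row[:] for row in term]
    for _ in range(max_terms):
        term = mat_mul(U, term)
        if max(max(row) for row in term) < tol:
            break
        N = [[N[i][j] + term[i][j] for j in range(s)] for i in range(s)]
    return N


# Hypothetical submatrices for a three-stage life cycle.
U = [[0.0, 0.0, 0.0],   # survival-dependent transitions
     [0.4, 0.3, 0.0],
     [0.0, 0.3, 0.8]]
F = [[0.0, 1.5, 2.0],   # sexual reproduction
     [0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
C = [[0.0] * 3 for _ in range(3)]  # no clonal reproduction in this example

A = [[U[i][j] + F[i][j] + C[i][j] for j in range(3)] for i in range(3)]

N = fundamental_matrix(U)
# Expected remaining time steps alive (counting the current one), by stage.
life_expectancy = [sum(N[i][j] for i in range(3)) for j in range(3)]
# Net reproductive rate, assuming all newborns enter stage 1.
R0 = mat_mul(F, N)[0][0]
```

Note that the element A[0][1] = 1.5 could equally have arisen from retrogression or from clonal reproduction; only the reported submatrices make clear that it belongs in F.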

| Attribution of secondary data sources
Secondary data sources are critical for reproducibility. These data sources provide information and support for methodologies used in MPM construction. In some cases, MPMs simply use secondary data to complete the life cycle, whereas others are constructed purely from secondary sources (see Table S1). Secondary sources include

FIGURE 4
Decomposing an MPM into its submatrices allows for the isolation of otherwise masked vital rates. Matrix A represents the MPM. Since individual transitions can be represented by multiple demographic rates (e.g. retrogression, sexual reproduction and clonal reproduction), decomposing A into its U, F and C submatrices allows for targeted demographic inference about which demographic transitions are driving the dynamics of the population.

| Archival of information in COMPADRE and COMADRE
We propose that the COMPADRE and COMADRE matrix databases provide the most appropriate way of archiving and accessing MPMs.

| A STANDARD PROTOCOL FOR REPORTING MPMS
Here, we introduce a proposed checklist for how to report an MPM in publications (Box 1). We recommend using the checklist when designing data collection as well as when writing up the MPM for publication. We recommend using this template as Supporting
These new methods produce models whose structure does not fit into the frameworks for reporting that seemed so comprehensive in the past. These recent advances in MPM theory and methods enable researchers to link population dynamics and demography to environmental conditions and multiple individual traits (e.g. sex and age; age and kinship (Caswell, 2019b, 2020)) rather than a single trait. These advances also offer benefits for the study of population responses to extreme climate (Jenouvrier et al., 2022), as well as more nuanced investigations of comparative and evolutionary demography. In this section, we overview some exciting areas of structured demography that can open novel research questions for the modern demographer and list some of the challenges they pose for communication and reporting.

| Nonlinear dynamics
Nonlinear MPMs are those in which entries of the projection matrix depend on the population state (numbers and structure) and may be frequency- or density-dependent. Frequency-dependent nonlinearities depend only on the relative abundance of stages; they occur in two-sex models in which mating depends on the relative abundance of males and females, and in population genetic models where dynamics depend on the relative abundance of genotypes. The analysis of nonlinear MPMs focuses on demographic outcomes different from those of linear models: equilibria, attractors, bifurcations, oscillations and stability (see Caswell, 2001, Chapters 16 and 17, and Cushing et al., 2003 for the most detailed analysis yet). However, what makes these models problematic for the current status of COMPADRE and COMADRE is that the unit of the model is not a matrix, but rather a matrix function, in which the entries of the projection matrix are functions of the state of the population.
Sensitivity analyses are available to study virtually any demographic outcome in response to any parameter (Caswell, 2019a), but reporting the functions that define the MPM is not at all standardised.
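To illustrate what a matrix function means in practice, the sketch below defines a hypothetical two-stage, density-dependent MPM in which fertility declines exponentially with total population size. The functional form, the parameter names (`f0`, `b`) and all values are assumptions made for this example; they are not drawn from any study cited here.

```python
import math


def projection_matrix(n, f0=2.0, b=0.01):
    """Return the projection matrix A(n) for population vector n.

    Fertility declines with total density as f0 * exp(-b * sum(n));
    juvenile survival (0.5) and adult survival (0.8) are held constant.
    """
    fertility = f0 * math.exp(-b * sum(n))
    return [[0.0, fertility],
            [0.5, 0.8]]


def project_one_step(n, **params):
    """Advance the population one time step: n(t + 1) = A(n(t)) n(t)."""
    A = projection_matrix(n, **params)
    return [sum(A[i][j] * n[j] for j in range(len(n))) for i in range(len(A))]


# An equilibrium requires fertility = 0.4, because at equilibrium
# juveniles = 0.4 * adults given the survival values above.
total_at_equilibrium = math.log(2.0 / 0.4) / 0.01
adults = total_at_equilibrium / 1.4
n_equilibrium = [0.4 * adults, adults]
n_next = project_one_step(n_equilibrium)
```

Reporting such a model requires the functional form and its parameter values (here `f0` and `b`) in addition to the fixed matrix entries; a single numerical matrix, such as A evaluated at equilibrium, does not capture the model.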

| Environment-dependence
A similar problem arises in environment-dependent MPMs. In such models, some or all of the demographic rates are functions of some aspects of the environment; for example, polar bears as functions of statistics of Arctic sea ice, sifaka as functions of rainfall (Lawler et al., 2009), the emperor penguin as a function of seasonal sea ice patterns in the Antarctic (Jenouvrier et al., 2012) and the North Atlantic right whale as functions of time and of trends in time (Fujiwara & Caswell, 2001). As with nonlinear MPMs, the model is not a matrix, but a function that maps from the environmental variable(s) to the entries in the matrix. Protocols for reporting such functions are not yet available but are important to develop.
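A minimal sketch of such a mapping is given below. It assumes a hypothetical two-stage model in which juvenile survival is a logistic function of a standardised environmental covariate (for instance, a rainfall anomaly); the link function, the coefficient names (`beta0`, `beta1`) and all values are illustrative assumptions rather than estimates from the studies cited above.

```python
import math


def logistic(x):
    """Inverse-logit link, mapping any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def matrix_for_environment(env, beta0=0.0, beta1=1.0, fertility=1.2,
                           adult_survival=0.85):
    """Map an environmental covariate to the entries of a 2 x 2 MPM."""
    juvenile_survival = logistic(beta0 + beta1 * env)
    return [[0.0, fertility],
            [juvenile_survival, adult_survival]]


def project(n, environments, **params):
    """Project n through a sequence of environmental states."""
    for env in environments:
        A = matrix_for_environment(env, **params)
        n = [sum(A[i][j] * n[j] for j in range(len(n))) for i in range(len(A))]
    return n


# One time step in an average environment (covariate = 0).
n_after_one_step = project([10.0, 10.0], environments=[0.0])
```

In this sketch, the information needed for reproducibility is the function itself, its coefficients and the covariate series used for projection, none of which can be stored as a single matrix.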

| Multistate models
An exciting emerging area of demographic research is the construction and analysis of multistate MPMs, in which individuals are classified by more than one state variable. This includes age and stage (Caswell & Salguero-Gómez, 2013), stage and spatial location (Hunter & Caswell, 2005), stage and genotype (de Vries & Caswell, 2019), stage and infection status (Klepac & Caswell, 2011), age and unmeasured heterogeneity (Hartemink et al., 2017), and stage-specific incidence of disease (Caswell & Van Daalen, 2021). A detailed presentation of the methods is given in  and the extension to more than two state axes (so-called hyperstate matrices) is given in Roth & Caswell (2016). The incorporation of additional states enables researchers to tease apart various sources of individual heterogeneity, the variance of life history outcomes for individuals from the same population model, and to ask deeper comparative and evolutionary questions. For example, maternal age has a strong impact on vital rates in monogonont rotifers (Bock et al., 2019).
Applying vec-permutation methods (Caswell, 2012) to build multistate MPMs has allowed researchers to quantify the population-level impacts of the observed maternal age effect and to investigate the evolutionary processes that can lead to this type of senescence in rotifers (Hernández et al., 2020). Multidimensional MPMs and Markov chain approaches have been particularly important in the study of 'luck' in life histories, which explores why some individuals live long and prosper, while others do not (Snyder & Ellner, 2018).
The within-group variation is called individual stochasticity or 'luck' and arises from the fact that vital rates are probabilistic processes.
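The vec-permutation construction of a multistate MPM mentioned above can be sketched for a stage × patch model. The example assumes that demography happens first and dispersal second within each time step, with the population vector ordered as stages within patches; the function names, the two-stage, two-patch setting and all values are hypothetical.

```python
def mat_mul(X, Y):
    """Plain matrix product of two lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def block_diag(blocks):
    """Place square blocks along the diagonal of a zero matrix."""
    size = sum(len(b) for b in blocks)
    out = [[0.0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(b)
    return out


def vec_permutation(s, p):
    """Permutation matrix K that reorders a vector arranged as stages within
    patches into one arranged as patches within stages."""
    K = [[0.0] * (s * p) for _ in range(s * p)]
    for i in range(s):        # stage index
        for j in range(p):    # patch index
            K[i * p + j][j * s + i] = 1.0
    return K


def multistate_matrix(demography_by_patch, dispersal_by_stage):
    """A = K^T M K B, with B block-diagonal demography (one block per patch)
    and M block-diagonal dispersal (one block per stage)."""
    s = len(dispersal_by_stage)
    p = len(demography_by_patch)
    B = block_diag(demography_by_patch)
    M = block_diag(dispersal_by_stage)
    K = vec_permutation(s, p)
    return mat_mul(transpose(K), mat_mul(M, mat_mul(K, B)))


# Hypothetical stage-classified matrices for two patches.
demography = [[[0.0, 2.0], [0.5, 0.8]],
              [[0.0, 1.0], [0.3, 0.6]]]
# Juveniles stay put; adults move between patches (columns sum to 1).
dispersal = [[[1.0, 0.0], [0.0, 1.0]],
             [[0.9, 0.2], [0.1, 0.8]]]

A = multistate_matrix(demography, dispersal)
```

For reporting purposes, this illustrates that a multistate MPM is most transparently communicated through its component blocks and the ordering of the state vector, because the assembled matrix A alone does not reveal which entries reflect demography and which reflect movement.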

| DISCUSSION
Demographic research has come a long way since the introduction of age-based (Leslie, 1945) and stage-based matrix models (Lefkovitch, 1965). Advances in this field have been fuelled partly by clear communication of methods and associated code. We aim to continue this expansion with MPM communication.
As the depth and breadth of the literature continue to expand, we are starting to build a comprehensive picture of demography across the spectrum of life (Adler et al., 2014; Healy et al., 2019; Salguero-Gómez et al., 2017). Through the work of the COMPADRE and COMADRE databases, we have come to appreciate the utility and opportunities of a standardised way of compiling MPMs.
Indeed, a significant portion of the time (>50%) we spend curating these databases is spent not on digitising, error-checking, and complementing data, but on contacting authors to request clarification and missing data and metadata. Through this arduous process, we have identified valuable, yet typically missing, information in MPMs. Whilst the missing data highlighted here as being particularly important primarily reflect the interests and perspectives of comparative demographers, including the data outlined in the standardised method would benefit demography as a whole.
This paper intends to act as a useful reference for authors, editors, reviewers, managers/conservationists and comparative demographers. Furthermore, we hope this manuscript will promote a constructive discussion on the purpose, construction and presentation of stage-based demographic information. Box 1 contains a comprehensive example of the key information we believe should be incorporated into the publication of any MPM. Should the methods suggested here be adopted, there will be clear benefits for the growth of the COMPADRE and COMADRE demographic databases; however, we believe these benefits extend beyond COMPADRE and COMADRE users towards the whole field of population ecology and fields that use MPMs for their own inference (e.g. conservation biology and biodiversity monitoring). A greater level of detail and transparency when describing how and why an MPM is produced will result in greater accuracy, accessibility, reproducibility and citability, which has clear benefits to the field as a whole and to individual researchers. In addition, greater consistency and transparency facilitate peer review, and indeed, these guidelines may offer a tool that can be cited by associate editors and peer-reviewers who may frequently advocate some (or all) of the steps suggested herein.
Furthermore, adoption of the steps suggested here may increase confidence in the results presented and facilitate learning/uptake of MPMs by early career researchers.
Finally, we close with a caution. We have used the term 'accurate' at points throughout this paper, applied to MPMs, but we must acknowledge that there is no such thing as an accurate model; a fully accurate model would be as complicated as the real system. That does not end well (Borges, 1999).

ACKNOWLEDGEMENTS
We thank the hundreds of population ecologists who have contributed open-access matrix population models ready for fully reproducible research, and those who have, throughout the last 15 years, answered our emails asking for additional data and metadata. We also thank Chloé R. Nater and one anonymous reviewer for their helpful comments. Lastly, this paper is in memoriam of our dear friend and colleague James W. Vaupel, who sadly passed before the submission of this manuscript. His multiple contributions to demography will no doubt outlive multiple Bristlecone pine generation times.

CONFLICT OF INTEREST STATEMENT
The authors declare no conflict of interest.

PEER REVIEW
The peer review history for this article is available at https:

DATA AVAILABILITY STATEMENT
The code used in this paper can be found at Zenodo (Gascoigne,