A sparse observation model to quantify species distributions and their overlap in space and time

Camera traps and acoustic recording devices are essential tools to quantify the distribution, abundance and behavior of mobile species. Varying detection probabilities among device locations must be accounted for when analyzing such data, which is generally done using occupancy models. We introduce a Bayesian time-dependent observation model for camera trap data ( Tomcat ), suited to estimate relative event densities in space and time. Tomcat allows to learn about the environmental requirements and daily activity patterns of species while accounting for imperfect detection. It further implements a sparse model that deals well will a large number of potentially highly correlated environmental variables. By integrating both spatial and temporal information, we extend the notation of overlap coefficient between species to time and space to study niche partitioning. We illustrate the power of Tomcat through an application to camera trap data of eight sympatrically occurring duiker Cephalophinae species in the savanna – rainforest ecotone in the Central African Republic and show that most species pairs show little overlap. Exceptions are those for which one species is very rare, likely as a result of direct competition.


Introduction
Thanks to their automated and non-intrusive nature of observation, camera traps, acoustic recorders and other devices that allow for continuous recording of animal observations have become an essential part of many wildlife monitoring efforts, especially those that aim at quantifying the distribution, abundance and behavior of mobile species (O'Brien et al. 2010, Burton et al. 2015, Caravaggi et al. 2017). However, the inference of these biological characteristics is not trivial due to the confounding factor of detection, which may vary greatly among recording locations. Animals are, for instance, more likely to trigger a picture when passing a camera trap in the open savanna than in a dense rainforest. Hence, variation in the rates at which a species is recorded (e.g. the photographic rate) may reflect differences in local abundance, but 929 might just as well reflect differences in the probabilities with which individuals are detected, or more likely a combination of both (reviewed by Burton et al. 2015, Sollmann 2018. Accounting for varying detection rates is thus critical when comparing species densities between locations. Since local detection rates are generally not known, they have to be inferred jointly with abundance or other state variables of interest. Commonly used methods to do so are variants of occupancy models that include detection probabilities explicitly (MacKenzie et al. 2002). The basic quantity of interest in these models is whether or not a particular site is occupied by the focal species. While the detection of a species implies that it is present, the absence of a record does not necessarily imply the species is absent. Since the probabilities of detection and occupation are confounded, they can not be inferred for each site individually. Detection probabilities are thus either assumed to be constant across sites or, more commonly, assumed to be a function of environmental covariates governed by hierarchical parameters (MacKenzie et al. 2002).
A problem of occupancy models is the assumption that there exists a well defined patch or site that is either occupied by a species, or not (closure assumption; MacKenzie et al. 2002). However, the notation of a discrete patch is often difficult, particularly in the case of mobile species that move between camera trap sites, complicating the interpretation of occupied versus empty sites (Efford andDawson 2012, Steenweg et al. 2018). In addition, summarizing such data by a simple presence-absence matrix ignores the information about differences in population densities and activities at occupied sites. Occupancy is therefore not necessarily a good surrogate for abundance (MacKenzie and Royle 2005, Efford and Dawson 2012, Steenweg et al. 2018, even though it has been advocated for birds (MacKenzie and Nichols 2004), and identifying an alternative measure has previously been highlighted as a key challenge in wildlife surveys (Burton et al. 2015).
To address this issue, we here introduce Tomcat, a time-dependent observation model for camera trap data, that extends currently used occupancy models in three important ways: First, we propose to quantify the rate at which animals pass through a specific location, rather than occupancy. This measure does not easily allow for an absolute quantification of density because it is not possible to distinguish mobility from abundance. But it can be readily compared in space to identify important habitat for a particular species, or in time to identify changes in abundance or activity. It therefore appears more useful than occupancy to monitor changes in species abundances over time, as for many species, changes in population size will be reflected in the rate at which individuals are detected prior to local extinction.
Second, we explicitly model daily activity patterns. Several models have been proposed to estimate such patterns from continuous recording data (Frey et al. 2017), including testing for non-random distributions of observations in predefined time-bins (Bu et al. 2016) and circular kernel density functions (Oliveira-santos et al. 2013, Rowcliffe et al. 2014, with the latter allowing for the quantification of activity overlap between species (Ridout and Linkie, 2009). Jointly inferring activity patterns with the rate at which animals pass a location allows us not only to account for imperfect detection, but also to quantify overlap between species in both time and space, shedding additional light on species interactions.
Third, we explicitly account for the sparsity among environmental coefficients. This is relevant since many environmental covariates are available and it is usually not known which ones best explain the variation in abundance of a species (Kriticos et al. 2014, Title andBemmels 2018). Enforcing sparsity on the vector of coefficients avoids the problem of over-fitting in case the number of recording locations is smaller or on the same order as the number of environmental coefficients.
In this article, we begin by describing the proposed model in great detail. We then verify its performance using extensive simulations and illustrate it by inferring spatiotemporal overlap of eight duiker species of the subfamily Cephalophinae within the forest-savanna ecotone of Central Africa.

A time-dependent observation model
We begin by describing an observation model for continuous recording devices. An illustration of the model is shown in Fig. 1.
Let Λ j (τ) be the event density at time τ: the rate at which a device at location j = 1,…,J takes observations of a particular species (or guild) at the time of the day τ ∈ [0,T], T = 24 h. We assume that this rate is affected by three processes: 1) the average rate l j at which individuals pass through location j, 2) the daily activity patterns  ( ) t that reflect differences in activity throughout the day and 3) the probability p j with which an individual passing through location j is recorded (i.e. detected by the device and properly identified downstream): The number of records W j (d,τ 1 ,τ 2 ) taken by a device at location j within the interval (τ 1 ,τ 2 ) on day d is then given by the non-homogeneous Poisson process 930 As in all models dealing with imperfect detection, the parameters related to species densities ( l j ) and detection (p j ) are confounded and can not be estimated individually for each location without extra information. However, it is possible to estimate relative densities between locations using a hierarchical model. Following others (Tobler et al. 2015), we assume that both parameters are functions of covariates (e.g. the environment), and hence only attempt to learn these hierarchical parameters. Here, we use where X j and Y j are known (environmental) covariates at location j and a, A and B are species specific coefficients. Note that to avoid non-identifiability issues, we did not include an intercept for p j . As a consequence, the absolute level of p j is not determined and hence l j is a relative measure only. Also, X should not contain covariates that are strongly correlated with covariates in Y.

Non-independent events
Another issue specific to data from continuous recordings is that not every record is necessarily reflective of an independent observation as the same individual might trigger multiple observations while passing (or feeding, resting, ...) in front of a recording device. It is often difficult and certainly laborious to identify such recurrent events. We thus propose to account for non-independent events by dividing the day where Λ j (c m−1 ,c m ) is given by Eq. 2. We note that the choice of n o must reflect the activity of the species considered: intervals must be large enough such that it is unlikely that the same individual is recorded in multiple intervals. Too large intervals, however, may impact power as many independent events will end up in the same interval.

Daily activity patterns
Here we assume that  ( ) t is a piece-wise constant function (or step function) with n a activity intervals of equal length h T n a a = (a for activity), i.e. we assume that the activity is constant within a specific interval but independent between intervals ( Fig. 1). While activity patterns are unlikely strictly piece-wise constant, we chose this function over a combination of periodic functions (Oliveira-santos et al. 2013) as they fit to complicated, multi-peaked distributions (e.g. crepuscular activity) with fewer parameters. We denote by k i the relative activity in interval i = 1,…,n h with k i = 0 implying no activity and k i = 1 implying average activity. The best way to place the activity intervals within the day (i.e. the tiling) is usually not known. To allow for more flexibility, we introduce the shift parameter δ such that the first interval is (δ,h a + δ) and the last overlaps midnight and becomes (T − h a + δ,δ) (Fig. 1). We therefore have Figure 1. Conceptual plot illustrating the model. Shown are recording devices at seven sites that differ in the rate at which animals pass l j and their detection probabilities p j . Passing animals may result in multiple records (stacked orange circles) or not be detected (black circles). Boxes illustrate the observation intervals with h o = 1 h and are hatched if at least one animal was recorded within (W j (·) > 0). The rate at which animals pass is modulated by the daily activity patterns (shown at the bottom) parameterized as a piecewise-constant function  with activity interval h a = 3 h, relative activity k = (0, 0.5, 1, 3.5, 2.5, 0.5, 0, 0) and shifted by δ = 2 h, peaking around mid-day.
where the indicator function 1 ( ) 0 , 1 ) (t t t identifies the interval i within which t falls: it is 1 if t ∈ (τ 0 ,τ 1 ) and zero otherwise.
where ( ( , , )| ) 1 W d c c j m m -q q is given by Eq. 5 and D j1 and D j2 denote the first and last day of recording at location j.

Prior distribution
Since it is usually not known which covariates X and Y are informative, nor at which spatial scale they should be evaluated, the potential number of covariates considered may be large. To avoid overfitting, we enforce sparsity on the vectors of coefficients A and B. Specifically, we introduce the indicators γ λi that indicate whether covariate X i is included in the model (γ λi = 1), or not (γ λi = 0). We then assume that A i follows a normal mixture model such that Here, Φ(0,σ 2 ) denotes the normal density centered at zero with variance σ 2 and the formulation ensures that this variance is larger in the case γ λi = 1 than in the case γ λi = 0. To ensure that σ λ0 is close to zero, we chose an exponential prior distribution σ λ0 ~ Exp(r 0 ) and set r 0 = 10 3 . In contrast, we assume an exponential prior distribution σ λ1 ~ Exp(r 1 ) with r 1 = 2. We further assume a Bernoulli distribution ( =1) = g p l l i with the exponential prior π λ ~ Exp(r π ) and used r π = 5, which implies that ( <0.5) 0.9 p l » . The prior on B i is analogous with indicators γ pi and parameters σ p0 , σ p1 and π p .
The priors for the remaining parameters were as follows: We chose a normal prior with density F(0, ) 2 s a on a and set s a 2 = 0.1. We chose uniform, improper priors on k and δ, For simplicity, we only consider cases in which h a , the length of the activity intervals, is a multiple of h o , the length of the observation intervals, and allow for discrete We note that the choices for r 0 , r 1 , r π and s a 2 may vary between applications. We also note that to estimate the parameters π λ and π p , data must be available for more sites than environmental covariates. If the number of camera trap sites is small or on the same order as the number of environmental covariates, π λ and π p should be fixed to the fraction of environmental covariates thought to explain the data.

MCMC
We used an MCMC algorithm with Metropolis-Hastings updates to generate samples from the posterior distribution ( | ) q W , as detailed in the Supporting information. We ran this MCMC for 10 000 iterations after ten successive burnins of 1000 iterations each. We found these settings to result in proper convergence as assessed using the Gelman-Rubin convergence diagnostics (Gelman and Rubin 1992) on multiple independent runs (Supporting information).

Prediction
Using a set of M posterior samples q q q q q q 1 , , ( | ) … ∼ M  W , we project event densities to a not-surveyed location i with covariates X i by calculating the mean l i of the posterior where a (m) and A (m) denote the m-th posterior sample of these parameters.

Species overlap in space and time
An important interest in ecology is to compare activity patterns among species and to see how overlapping patterns may relate to their interaction such as competition or predation (Ridout andLinkie 2009, Rowcliffe et al. 2014).
For that purpose, Ridout and Linkie (2009) introduced overlap coefficients that quantify the overlap between the activity patterns of two species f(x) and g(x), respectively, and range from 0 (no overlap) to 1 (identical activity patterns). The overlap Δ(f,g) is related to the distance measure L 1 as which justifies their visualization of overlap coefficients between k species using a multidimensional scaling (MDS) by considering 1 ( , ) -D f g as a measure of dissimilarity. In practice, the true density functions f(x) and g(x) are usually not known. Here we obtain an estimate of Δ numerically from samples m = 1,…,M of the species-specific posterior distributions obtained with Tomcat. We distinguish three types of overlap coefficients: Δ T for overlap in time, Δ S for overlap in space and Δ ST for overlap in time and space. While Δ T matches that of Ridout and Linkie (2009), the latter two are extensions made possible by the joint inference of habitat use ( l j ) and activity patterns (  ) and are informative about niche partitioning in space and time.
Overlap coefficient Δ T For a large number n T of equally spaced time values t t t Overlap coefficient Δ S For a given number n S of sites reflecting the habitat in a region, we sample D S m ( ) from the posterior distribution where l sj m s ( ) , =1,2 is computed according to Eq. 3 and normalized such as where for species s = 1, 2 we calculate  s

Implementation
All methods were implemented in the C++ program Tomcat, available through a git repository at <https://bitbucket.org/ WegmannLab/tomcat/> together with a wiki detailing its usage. As input, it requires three files: 1) a description of all camera trap sites in terms of both their start and end date D j1 and D j2 , respectively, as well as their environmental covariates X j . 2) A similar file listing the environmental covariates Y j for each camera trap site. 3) A file containing all observations of a species, consisting of a camera trap identifier and a timestamp. From this input, Tomcat will first calculate the full data matrix W based on the specific choice of the observation interval n o .

Simulations
To assess the performance of our algorithm, we generated three sets of simulations: Set 1. To assess the performance of inferring activity patterns  , we chose n a = 24 intervals of h a = 1 h each with k 1:4 = k 13:16 = (1 + α, 1 + 3α, 1 + 3α, 1 + α) and all other k 5:12,17:24 = 1 − α. We then used α = 0, 0.5, 1, with α = 0 resulting in a uniform activity pattern and α = 1 in an activity pattern with two pronounced peaks. We simulated a single site with a = 0 and no environmental covariate such that we expect 0.5 observations per day. Set 2. To assess the performance of inferring habitat preferences captured by l j and in particular of identifying the correct environmental variables affecting l j (i.e. sparsity), we simulated 50 environmental covariates X and Y, of which ten each affected l j and p j , respectively. In all cases, we set A i ,B i = 0 for all covariates with γ λi = 0 and γ pi = 0 and set A B V i i , = /10 for all covariates with γ λi = 1 and γ pi = 1, such that the total variation in l j was V = 0.1, 0.2, 0.5, 1, 2. For these simulations, we set all k i = 1 and a = 0 such that we expect 0.5 observations per day. Set 3. To assess the power of inferring overlap coefficients, we generated simulations with a single environmental variable for two species s = 1, 2 with no covariate Y and k s , A s and a s chosen to result in the desired overlap coefficients. When simulating under a specific Δ T , we set A 1 = A 2 = 0, a 1 = a 2 = 0 and chose k 1 and k 2 as described under Set 1 but with k 2 shifted by 6h compared to k 1 . We then used α = 0.8, 0.5, 0.2 to obtain Δ T = 0.2, 0.5, 0.8. To simulate under a specific Δ S , we set all k 1 = k 2 = 1 and A 1 = 0.5 and then determined the parameters A 2 , a 1 and a 2 to match Δ S = 0.2, 0.5, 0.8 as described in the Supporting information. To simulate under a specific Δ ST , we set A 1 = 0.5 and k 1 and k 2 as when simulating under a specific Δ T , and then determined the parameters a 1 , a 2 and A 2 to match Δ ST = 0.2, 0.5, 0.8 as described in the Supporting information. For these cases, we used α = 0.4, 0.25, 0.1 to give equal weight to the spatial and temporal components, In all sets, we generated simulations for different total numbers of camera trapping days between 100 and 20 000 with 100 replicates each. For cases with Set 1 and Set 3 with variation in l j , we used J = 100 or J = 1000 sites and chose D accordingly, excluding combinations for which camera traps were run for less than a full day.

Application to Central African duikers
We applied Tomcat to camera trapping data obtained during the dry seasons from 2012 to 2018 from a region in the eastern Central African Republic (CAR), a wilderness exceeding 100 000 km 2 without permanent settlements, agriculture or commercial logging (Aebischer et al. 2017(Aebischer et al. , 2020. The available data was from 1059 camera traps set at 532 distinct locations that cover the Aire de Conservation de Chinko (ACC), a protected area of about 20 000 km 2 . For more information about camera deployment and sampling design, Aebischer et al. (2017). Here, we focus on duikers Cephalophinae, which are a diverse mammalian group common in the data set and for which near-perfect manual annotation was available. We use n o = 96 observation intervals of 15 min each, and n a = 24 activity intervals of 1 h each, except for the two species with rather limited data (Cephalophus leucogaster arrhenii and Cephalophus nigrifrons), for which we chose n a = 8 of 3 h each.
To infer habitat preferences for these species, we benefited from an existing land cover classification at a 30 m resolution that represents the five major habitat types of the Chinko region: moist closed canopy forest (CCF), open savanna woodland (OSW), dry lakéré grassland (DLG), wet marshy grassland (WMG) and surface water (SWA) (Aebischer et al. 2017). Around every camera trap location, we calculated the percentage of each of these habitats in 11 buffers of sizes 30; 65; 125; 180; 400; 565; 1260; 1785; 3990; 5640 and 17 840 m. We complemented this information with the average value within every buffer for each of 15 additional environmental and bioclimatic covariates from the WorldClim database ver. 2 (Supporting information, Fick and Hijmans 2017) that we obtained at a resolution of 30 s, which translates into a spatial resolution of roughly 1 km 2 per grid cell.
The sparse priors implemented in Tomcat allows for the simultaneous use of any number of potentially correlated covariates. To aid in the interpretation of habitat requirements, however, we processed our environmental data as follows: First, we kept only the additional effect of each covariates after regressing out the habitat covariates CCF and OSW at the same buffer. This allows for a direct comparison of the inclusion probabilities of CCF and OSW, since no other covariates may serve as their proxies. Second, we kept only the additional effect of every covariate after regressing out the information contained in the same covariate but at smaller buffers. This ensured that larger buffers may not serve as proxies for smaller ones, and their inclusion in a model implies their importance at that buffer size (cf. Supporting information for details).
To predict habitat preferences in the ACC and to calculate Δ S and Δ ST between species, we determined the same habitat variables at 10 200 regular grid points spaced 2.5 km apart and spanning the entire ACC. To avoid extrapolation, we then restricted our analyses to the 2639 grid locations that exhibited similar environments to those at which camera traps were placed as measured by the Mahanalobis distance between each grid point and the average across all camera trap locations (cf. Supporting information for details).
To characterize detection probabilities, we used the binary classification of the four most common habitat types (CCF, OSW, MWG, DLG) at every location and determined the presence or absence of six additional habitat characteristics: animal path, road, salt lick, mud hole, riverine zone and bonanza.

Performance against simulations
We first used simulations to assess the performance of Tomcat to infer daily activity patterns  for patterns of different complexity. For each simulation, we quantified the estimation accuracy by calculating D T ( , ) k k between the true values used in the simulations (k) and those inferred (posterior medians k ). As shown in Fig. 2A, D T ( , ) k k converges towards 1 with increasing data and reached D T ( , ) 0.9 k k ³ in all cases as of 2000 camera trapping days, corresponding to about 1000 observations in these simulations.
We next used similar simulations to assess the performance of Tomcat in inferring the spatial rates l j , again quantifying the differences between the true ( l j ) and estimates values (posterior medians l j ) by calculating D S j j ( , ) l l . As shown in Fig. 2B, the accuracy of the inference increased with more camera trapping days or more sites surveyed. D S j j ( , ) 0.9 l l ³ was again reached at a total of 2000 camera trapping days when 1000 sites where surveyed (2 days per site), regardless of the effect of environmental variables. If only 100 sites were surveyed, a total of 5000 camera trapping days (50 days per site) was required to reach this threshold.
Interestingly, the effect of the total variation in l j (quantified by V) on the estimation accuracy varied as a function of data availability: If data was limited, the inference was less accurate in case V was large. If data was more abundant, the inference was more accurate in case V was large. This is probably best explained by the power to properly identify contributing environmental covariates, which we quantified by the area under the receiver operating characteristic curve (AUC). As shown in Fig. 2C and in the Supporting information, this power increased with more camera trapping days, more sites surveyed, as well as a higher total variance V. If data was abundant, high V thus resulted in a more accurate identification of the contributing environmental covariates, and therefore in more accurate estimates of l j . If data was limited, the identification of contributing environmental covariates was more challenging, but errors in this identification had less of an impact if V was small.
We further conducted simulations of two species to assess the accuracy in inferring overlap coefficients between species. As shown in Fig. 2D-F, all overlap coefficients were accurately inferred as of about 2000 camera trapping days, with Δ T generally requiring the less data than Δ S and Δ ST , in line with the previous findings that k was inferred more accurately than l j . If less than 1000 camera trapping days (i.e. less than about 500 observations) were used, however, estimates of overlap coefficients were biased towards prior expectations. For Δ T , the prior expectation is 0.5 as we give equal prior probability to any choice of k. For Δ S , the prior expectation is 1.0 as we prefer solutions in which all coefficients A i = 0. For Δ ST , the prior expectation matches that of Δ T as the prior expectation for Δ S is 1.0, implying no spatial effect.

Application to Central African duikers
We used Tomcat on existing camera trapping data (Aebischer et al. 2017(Aebischer et al. , 2020 to study the spatio-temporal distribution and overlap of duikers Cephalophinae in the Aire de Conservation de Chinko (ACC), a protected area of about 20 000 km 2 eastern Central African Republic (CAR) spanning the entire savanna-rainforest ecotone (Boulvert 1985, Olson andDinerstein 1998). Duikers are common in the data set and often observed in sympatry, i.e. several species were captured by the same camera trap within a few hours. We detected a total of eight species in the data set (Table 1) Camera trap days/100 Camera trap days/100 AUC (C)  weynsi, eastern blue duiker Philantomba monticola aequatorialis and bush duiker Sylvicapra grimmia. The eight duiker species varied greatly in their habitat preferences (Fig. 3, Supporting information) as inferred by Tomcat. As shown in Fig. 3, C. dorsalis and C. weynsi both have a strong preference for CCF over OSW habitat at the smallest buffers, in contrast to S. grimmia that shows a strong preference for OSW. At higher buffers the signal is less clear, probably owing to the heterogeneous nature of the study area, in which both CCF and OSW correlated negatively with WMG and DLG, habitats not well suited for any of these species. Interestingly, P. monticola and C. silvicultor seem to be true ecotone species preferring a mixture of the canonical habitats CCF and OSW (Supporting information).
As shown in Fig. 4 and in the Supporting information, the species also varied greatly in their daily activity patterns, with some being almost exclusively nocturnal (C. dorsalis and C. silvicultor), some almost exclusively diurnal (C. leucogaster, P. monticola, C. nigrifons, C. rufilatus, C. weynsi) and one crepuscular (S. grimmia).
The effect of covariates on detection probabilities followed general expectations: detection probabilities of camera traps placed in CCF or along rivers were estimated as generally lower and those at salt licks and other bonanza as generally higher than average (Supporting information). However, there was considerable variation among species in which covariates impacted detection probabilities, mostly as a result of habitat preferences: for species generally absent at camera traps with a particular covariate (e.g. CCF for savanna species), that covariate was not considered relevant in explaining variation in detection probabilities among observations.
To better understand how these closely related duiker species of similar size and nutritional needs can occur sympatrically, we estimated pairwise overlap coefficients in space and time (Fig. 4, Supporting information). Not surprisingly, most species pairs differed substantially either in A visualization using multidimensional scaling (MDS) of the pair-wise overlap coefficients of all six species with observations from at least 50 independent camera trap locations is shown in Fig. 5. For these species, 84.6% of variation in the temporal overlap can be explained by a single axis separating nocturnal from diurnal species. In contrast, only 44.5% of the variation in the spatial overlap is explained by the first axis distinguishing forest dwellers from savanna species. When using both temporal and spatial information, the two first axis explain 32.3% and 25.5%, respectively, suggesting that  a single axis is not sufficient to explain both temporal and spatial differences between species.

Discussion
Devices that continuously record animal observations now make it possible to survey biodiversity of larger or highly vocal animals in relatively short time and with a reasonable budget (O'Brien et al. 2010, Burton et al. 2015, Sollmann 2018. Thanks to increased battery life, larger media to store data and other technical advances, such data sets can now be produced with comparatively little manual labor, even under the demanding conditions of large and remote areas. In addition, the annotation of these data sets on the species level is now aided by machine learning algorithms that automatize the detection of common species and recordings without observations (Norouzzadeh et al. 2018). As a result, continuous recording devices have become an indispensable tool for wildlife monitoring.
Of particular interest is the inference of the spatial distribution of species. Traditionally, such distributions are inferred with occupancy models that account for the variability in detection rates between surveyed locations, addressing a key feature of data gathered through ecological surveys (MacKenzie et al. 2002, Burton et al. 2015, Sollmann 2018). However, the major draw-back of occupancy is that the species distribution is represented as a simple presenceabsence matrix not reflective of differences in abundance at occupied sites.
To address this concern, we decomposed the rate of records Λ j (t) of a species at site j and time t (e.g. the photographic rate) into three components: a spatial component l j reflecting the average rate of observations at location j, a temporal component  ( ) t reflecting the daily activity pattern, and the location-specific detection rate p j . To ensure identifiability, and similar to occupancy models, we further assume that the spatial component l j and the detection rates p j are functions of location-specific covariates (e.g. the environment).
The interpretation of the temporal component  is straight forward and matches that of other methods used to infer daily activity patterns from such data (Ridout and Linkie 2009). The interpretation of the spatial component l j and the detection probabilities p j warrant some discussion as the model has no intrinsic way to distinguish between them: it is their product l j j p that describes the rate of observations, which itself is affected by the local abundance and activity of the studied species.
In occupancy models, the variation in the rates of observations at occupied sites is attributed to differences in detection rates. Hence, sites at which a species is more abundant or more active will result in higher detection rates. To infer abundances, it is therefor usually assumed that activity does not vary between sites. Royle and Nichols (2003), for instance, introduced an important extension of occupancy models that infers local abundances from variation in detection rates. In their model, the detection probability is a function of the detection probability of a single individual r and the local abundance N j . To infer N j or its distribution, r is then assumed not to vary among sites, implying constant activity, or to be well characterized through covariates.
Apart from capture-recapture, most established survey methods that rely on direct or indirect observations to quantify abundances make similar assumptions. Inferring local abundances from responses at call-up stations, for instance, requires knowledge on local response rates (Webster et al. 2010). Similarly, inferring local abundances from observations on transects (Buckland et al. 2001) requires knowledge on local daily travels distances, rates of nest building, or similar quantities depending in the nature of the observation. In practice, estimates of such quantities are at best assessed for a handful of locations (Funston et al. 2010), but usually borrowed from other studies (Mathewson et al. 2008, Aebischer et al. 2020. In Tomcat, the variation in the rates of observations may be attributed to either l j or p j . By estimating p j jointly with l j , Tomcat explicitly accounts for the variation in detection between sites, also that caused by variation in activity. If this variation is captured well, l j may serve as a good surrogate for the variation in abundance between sites. But since l j and p j are confounded, their identification depends on the covariates X and Y used: if a covariate in Y explains part of the variation in the rate of observations, it is included in the model and contributes to p j . If that covariate was included in X instead, it would contribute to l j . The choice of covariates therefore determines which effects we wish to interpret as underlying l j (abundance) and which as underlying p j (detection or activity).
In the application to duikers, we chose to use the covariates describing the environment (e.g. the habitat or humidity) as relevant for l j and those describing local features (e.g. the presence of a salt lick or road) as relevant for p j . The motivation for this choice was our interest to learn about the environmental covariates explaining variation in abundance in terms of l j , while explaining high photographic rates due to increased activity through p j . However, and depending on the question, other choices may be equally interesting. If the surveyed area is small in relation to mobility, for instance, individuals are likely observed at many sites and l j will reflect habitat use of these individuals.

Underlying assumptions
As mentioned above, the Tomcat model does not make any assumption about site closure. However, it does assume that records are independent between surveyed sites. That does not imply that a single individual may not be recorded regularly at multiple sites, but it implies that the times at which an individual is recorded at different sites is independent given the general activity pattern  of that species. For instance, two camera traps in close proximity along a deer crossing will not provide independent information about the importance of environmental covariates as a record at one camera trap is mostly followed by a record at the other.
In its current implementation, Tomcat further assumes that while l j and p j vary spatially (as functions of environmental covariates), all temporal variation is captured by the daily activity patterns  . However, detection probabilities may also vary throughout the day, as it might be harder (or easier), for instance, to identify a certain species on black-and-white camera trap pictures taken at night or on infrared pictures taken during cooler times. In its current implementation, the model would explain such variation through  , which might lead to biases. An even stronger temporal assumption is that neither l j , p j nor  changes seasonally. However, the model could be readily extended to multiple seasons. In the occupancy framework, this is commonly done by modeling extinctions and colonizations explicitly (MacKenzie et al. 2003). In the framework proposed here, an analogue would be to model trends in l j over time. A simple extension would be to model l l g j j t t e ( )= to capture general population trends. To capture seasonal variation over the year, one could modulate l j with an additional function (e.g. piece-wise constant) similar to  spanning the relevant time period (e.g. an entire year or moon phases). Similar extensions can easily be envisioned for the detection probabilities p j or activity patterns  .

Species interactions
Records from camera traps or other continuous recording devices may also be used to study the interaction of species in space and time. To infer temporal interactions, two classes of methods exist (Niedballa et al. 2019): In a first class, temporal avoidance is quantified as the degree to which a first species influences subsequent visits of a second species (Harmsen et al. 2009, Karanth et al. 2017). These methods generally compare time intervals between observations of the first and the second species to determine statistical dependence. In a second class, daily activity patterns are inferred individually for each species and then compared using overlap coefficients (Ridout and Linkie 2009).
Tomcat implements this second class by estimating overlap coefficients between activity patterns  individually inferred for each species. We further extend the concept of overlap to space by comparing the spatial distributions of two species as captured by their respective l predictions in a specific region. Benefiting from a joint estimation of  and l , Tomcat also estimates overlap coefficients in the spatio-temporal distribution. Considering interactions in both space and time is particularly informative about niche partitioning between species (Farris et al. 2020), and capturing them via overlap coefficients, as done by Tomcat, allows for their visualization using multidimensional scaling. As we show with simulations, these overlap coefficients can be inferred rather accurately if several hundred observations per species have been recorded.
If fewer observations are present, they are biased towards their prior expectations. This is particularly apparent for Δ T that is biased away from 1.0 and may hence result in wrongly inferred differences between species. Niedballa et al. (2019) previously reported this issue and proposed a bootstrap approach to test if an estimate of Δ T is significantly different from 1.0.
Just as temporal overlaps, the spatial overlap estimated by Tomcat is estimated based on spatial distributions inferred for each species individually. A more powerful approach has been proposed for occupancy models in which spatial interactions are inferred through the probabilities of joint occupancy in multi-species models , Rota et al. 2016, Fidino et al. 2019. While not currently implemented, a similar extension can be envision for the model proposed here by adding an interaction term to the l j .

Co-existence of duikers in a savanna-rainforest ecotone
We here used Tomcat to infer the spatio-temporal distribution of eight species of duikers sympatrically occurring in the Aire de Conservation de Chinko (ACC) in the Central African Republic. These species varied greatly both in their daily activity patterns and habitat preferences, with some species being almost exclusively nocturnal, others diurnal or crepuscular. Similar, some species showed a strong preference for close canopy forest (CCF), some for open savanna woodland (OSW), and two appeared to be true ecotone species with a preference for mixed habitat. An interesting observation was that the two rarely studied species C. weynsi and C. leucogaster not only occur in large blocks of CCF as suggested in the literature (Kingdon et al. 2013), but also in narrow gallery forests within the forest-savanna ecotone several kilometers away from the next extensive forest block (Fig. 3, Supporting information).
When comparing these species using spatio-temporal overlap coefficients, we found that frequently observed and therefore evidently abundant species within this community tend to differ in their habitat preference and/or daily activity. In contrast, infrequently observed and therefore putatively rare taxa seem to have large overlap with co-occurring species. C. leucogaster, for instance, which is rather rare and was only observed at 11 distinct locations (Table 1), has similar habitat preferences and is active at the same time as C. weynsi, which is among the most common forest duikers within the ACC. In contrast, C. dorsalis, which is strictly nocturnal, seems to co-exist with the C. weynsi at higher densities (Fig. 4, Supporting information).

Conclusion
We here introduced Tomcat, a model that infers habitat preferences and daily activities from imperfect spatio-temporal observations. Similar to occupancy models, it allows to learn about the ecological requirements of animals, including rare, elusive and unmarked species. But unlike occupancy models, it does not estimate the presence or absence of a species, but rather a measure informative about species densities, jointly with site-specific detection probabilities and daily activity patterns. While estimating these quantities may require larger data sets than the inference under occupancy models, we believe they constitute a major step forward in understanding and monitoring species distributions.