The conservation of animal and plant species and their biological control require models to understand and predict population dynamics (Fieberg & Ellner 2001; Buongiorno & Gilless 2003; Demyanov, Wood & Kedwards 2006). Among population dynamics models, projection matrix models have been widely used to investigate the dynamics of age-, stage- or size-structured populations (Caswell 2001; Stott et al. 2010). They provide a simple way of integrating vital rate information such as recruitment, birth, growth or ageing, and mortality (Crone et al. 2011). Matrix models have been used to model population demography in the context of species invasion (Hooten et al. 2007; Sebert-Cuvillier et al. 2007), species extinction or conservation of endangered species (Cropper & Loudermilk 2006), and the sustainable management of exploited species (Hauser, Cooch & Lebreton 2006). Recent improvements in matrix models have targeted the estimation of demographic parameters, in particular for animal populations using capture–recapture methods (Besbeas et al. 2002).
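As an illustration of how a projection matrix integrates vital rates, the following minimal sketch builds a size-structured (Usher-type) matrix and projects a population one time step forward. The function names, the three-class structure and all numerical rates are hypothetical and chosen only for illustration; they are not taken from any data set or from the model developed in this study.

```python
import numpy as np

def usher_matrix(stay, grow, recruit):
    """Build a size-class projection matrix (illustrative assumption).

    stay[i]    : probability of surviving and remaining in class i
    grow[i]    : probability of surviving and moving from class i to i + 1
    recruit[i] : per-capita recruitment into the first class from class i
    """
    stay = np.asarray(stay, dtype=float)
    grow = np.asarray(grow, dtype=float)
    recruit = np.asarray(recruit, dtype=float)
    k = len(stay)
    A = np.diag(stay)                              # individuals staying in place
    A[0, :] += recruit                             # recruits enter the first class
    A[np.arange(1, k), np.arange(k - 1)] = grow    # upgrowth on the sub-diagonal
    return A

def project(A, n0, steps=1):
    """Apply n(t + 1) = A n(t) for the requested number of time steps."""
    n = np.asarray(n0, dtype=float)
    for _ in range(steps):
        n = A @ n
    return n

def asymptotic_growth_rate(A):
    """Dominant eigenvalue modulus (spectral radius) of the projection matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))

# Hypothetical vital rates for three size classes.
A = usher_matrix(stay=[0.5, 0.6, 0.9], grow=[0.3, 0.2], recruit=[0.0, 0.0, 0.1])
n1 = project(A, [100, 50, 10], steps=1)
lam = asymptotic_growth_rate(A)
```

Mortality is implicit in this sketch: for each class, one minus the sum of the staying and upgrowth probabilities is the fraction of individuals that die during the time step.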
In species-rich ecosystems such as tropical rain forests, tropical marine fish communities or coral reefs, high diversity implies that the sample size for most species is limited. Small sample sizes hinder a good fit of species-specific dynamics models, including matrix population models. To address this problem, modellers usually cluster species into groups. A variety of methods have been used to group species, favouring either ecological interpretation or the accuracy of predictions. Groups of species can be derived from functional characteristics (Steneck & Dethier 1994), ecomorphology (Bellwood & Wainwright 2001) or an ecological subjective strategy (Swaine & Whitmore 1988; Favrichon 1994; Gitay & Noble 1997). These methods do not rely on a strong statistical methodology, so they do not ensure that within-group similarity is maximal or that the number of groups is optimal. Gourlet-Fleury et al. (2005) described two other strategies applied in tropical rain forests: the ecological data-driven strategy (Phillips et al. 2002) and the dynamic process strategy, in which ‘process’ refers to the components of forest dynamics (recruitment, growth or mortality) (Gourlet-Fleury & Houllier 2000; Picard et al. 2010). These strategies rely on unsupervised statistical classification methods, such as hierarchical cluster analysis, to group species with similar traits. Moreover, species classification is most often disconnected from the matrix modelling and from the estimation of the matrix parameters, which yields species groups that may not be optimal with respect to the predicted community dynamics. To cluster the species while ensuring optimality for predicting community dynamics, we need to rely on the mixture model framework.
Mixture models are based on the assumption that observed data arise from several unobserved groups (McLachlan & Peel 2000). A model is associated with each group. Each observation contributes to the fitting of the model for a given group with a weight that represents its probability of belonging to this group. These weights can ultimately be used to classify observations among groups. Thus, mixture modelling simultaneously fits models and classifies observations, and the clustering step is closely linked to the calibration step. This favours the similarity of species response within groups rather than the similarity of species traits (Dunstan, Foster & Darnell 2011). Mixture modelling has mainly been developed for observations with a normal distribution (e.g. mixture regressions). The use of mixture models has recently been proposed to model the presence/absence of species (Dunstan, Foster & Darnell 2011), the species richness in a species assemblage (Mao, Colwell & Chang 2005) or the heterogeneity of capture and survival probabilities in free-ranging populations (Pledger, Pollock & Norris 2010).
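The interplay between fitting and classification described above can be sketched with a one-dimensional Gaussian mixture fitted by the expectation-maximisation (EM) algorithm. This is a generic textbook illustration of the weighting principle, not the Bayesian inference method used in this study; the function name, the quantile-based initialisation and the fixed number of iterations are assumptions made for the example.

```python
import numpy as np

def fit_gaussian_mixture(x, k, n_iter=100):
    """Fit a 1-D Gaussian mixture with k groups by EM (illustrative sketch).

    Returns the group proportions, means, standard deviations, and the
    membership weights computed in the last E-step.
    """
    x = np.asarray(x, dtype=float)
    # Assumed initialisation: spread the means over quantiles of the data.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    sigma = np.full(k, x.std() + 1e-9)
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: weight of observation i for group g is its posterior
        # probability of belonging to g under the current parameters.
        log_dens = (-0.5 * ((x[:, None] - mu) / sigma) ** 2
                    - np.log(sigma) - 0.5 * np.log(2.0 * np.pi))
        log_w = np.log(pi) + log_dens
        log_w -= log_w.max(axis=1, keepdims=True)   # numerical stability
        w = np.exp(log_w)
        w /= w.sum(axis=1, keepdims=True)
        # M-step: each group model is refitted with every observation,
        # weighted by its membership probability.
        nk = w.sum(axis=0) + 1e-12
        pi = nk / nk.sum()
        mu = (w * x[:, None]).sum(axis=0) / nk
        var = (w * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        sigma = np.sqrt(var) + 1e-6
    return pi, mu, sigma, w

def classify(w):
    """Assign each observation to the group with the largest weight."""
    return w.argmax(axis=1)
```

In this sketch the weights returned by the last E-step play both roles mentioned above: they drive the parameter estimates in the M-step and, through `classify`, they yield the final grouping.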
This study aims to extend mixture modelling to matrix population models. The mixture of matrix population models simultaneously addresses two issues: fitting matrix models for species-rich ecosystems with many rare species, and classifying species into groups. As proposed in population genetics (Pritchard, Stephens & Donnelly 2000; Corander, Waldmann & Sillanpaa 2003; Guillot et al. 2005), the strategy consists of a probabilistic model-based clustering method expressed in terms of matrix population mixture models with an unknown number of components (Richardson & Green 1997; Dunson 2000; Marin, Mengersen & Robert 2005). The number of groups and the parameters of the matrix population models associated with each group are the unknown quantities. We propose to use a Bayesian framework to infer these unknown quantities. The Bayesian framework has several advantages over frequentist methods. First, it provides credible intervals for finite population sizes, whereas frequentist methods provide asymptotic confidence intervals. Secondly, through the use of prior distributions, strong biological or ecological knowledge can be integrated into the model.
The mixture of matrix models is defined in the next section. An inference method is then outlined and tested using simulated data. Finally, the mixture matrix model is applied to a data set from the Paracou tropical rain forest in French Guiana. The tree species groups obtained showed consistent ecological behaviour and contrasting functional traits, and compared favourably with groups obtained by a standard classification technique.