Optimal transportation theory for species interaction networks

Abstract Observed biotic interactions between species, such as in pollination, predation, and competition, are determined by combinations of population densities, matching in functional traits and phenology among the organisms, and stochastic events (neutral effects). We propose optimal transportation theory as a unified view for modeling species interaction networks with different intensities of interactions. We pose the coupling of two distributions as a constrained optimization problem, maximizing both the system's average utility and its global entropy, that is, randomness. Our model follows naturally from applying the MaxEnt principle to this problem setting. This approach allows for simulating changes in species relative densities as well as to disentangle the impact of trait matching and neutral forces. We provide a framework for estimating the pairwise species utilities from data. Experimentally, we show how to use this framework to perform trait matching and predict the coupling in pollination and host–parasite networks.


Introduction
In this document, we provide some background to methods discussed in our work 'Optimal transportation theory for species interaction networks'. Rather than relying on pseudocode, we offer a basic implementation of the methods in the Julia language. The precise syntax should allow the reader to understand the technical details, while still allowing to run the examples. We have written this document in a tutorial-like style. For the code used for the experiments in our work, we refer to the associated Github repo. We recommend using this code for researchers wishing to use the proposed methodology for their purposes.

Example data
We use a small plant-seed dispenser network for illustration purposes (http://journals.cambridge.org/action/ We can turn this in the observed coupling P with the associated marginals a and b. We do not have a utility matrix M . Let us suppose that interactions that occur twice have a utility of 1 and 0 else wise. In practice, this matrix is known based on expert knowledge, or has to be fitted using the coupling.

Computing the optimal transportation
Given a utility matrix M , we can compute observed couplings Q, depending on whether we keep both a and b fixed, only a, only b or none.

No marginals fixed
When no marginals are fixed, the observed coupling is just the softmax: optimaltransport (generic function with 1 method)

One marginal fixed
When one of the two marginals are given, the rows or columns of Q have to be scaled accordingly.

Both marginals are fixed
When both marginals are fixed, we have to apply the Sinkhorn algorithm. Because this algorithm uses iterative updates, we have to specify a tolerance parameter to determine convergence. # check convergence iter += 1 Q = Q .* (a ./ sum(Q, dims=2)) # normalize the rows Q = Q .* (b ./ sum(Q, dims=1)) # normalize the columns if iter >= maxiter return Q end end return Q end optimaltransport ab (generic function with 1 method)

Synthesis
We can aggregate the different version of optimal transportation in a single function, using Julia's dispatch system.

Comparision
We can compare all versions.

Assessing the effects of changing species distributions
Suppose there is a function f : U (a, b) → R that depends on the coupling, then we can compute for normalized a and b. Here, there are necessary constraint that i (∇ a f (Q a,b )) i = 0 and j (∇ b f (Q a,b )) j = 0. These give the effect of an infinitesimal change in the species density on the functional f . Which is, in this case, proportional with the numer of ones in the corresponding row or column of M .

Parallel computation of the cross-entropy
In the case where one wants to fit a global utility matrix for several observed couplings, one has to compute the optimal transportation for every coupling to obtain the cross-entropy. Especially for the Sinkhorn algorithm, this might be a prohibitively large computational burden. For this reason, we have leveraged a parallel (and GPU friendly) version of the Sinkhorn algorithm, which can compute the sum of the cross-entropies for every observed and modelled coupling pairs, i.e.
Thus, one has to find two matrices U ∈ R n×o + and V ∈ R m×o + that ensure that the couplings are appropriately scaled. This can be done by the algorithm below. The gradients can be computed directly, allowing for efficient optimization.