Distributed optimisation and control of graph Laplacian eigenvalues for robust consensus via an adaptive multilayer strategy

Functions of eigenvalues of the graph Laplacian matrix L, especially the extremal non‐trivial eigenvalues, the algebraic connectivity λ2 and the spectral radius λn, have been shown to be important in determining the performance in a host of consensus and synchronisation applications. In this paper, we focus on formulating an entirely distributed control law for the control of edge weights in an undirected graph to solve a constrained optimisation problem involving these extremal eigenvalues.


INTRODUCTION
In the control of multi-agent systems, both the structure and strength of connections between individuals are instrumental in determining the performance of the system as a whole. The totality of connections and their strengths uniquely defines a graph Laplacian matrix L, and the spectrum of L, 0 D 1 6 2 6 6 n , concisely reveals a number of useful properties of the network system. Importantly, it has been shown that functions of these eigenvalues often define or can be used to bound bulk properties of the network, including the performance of tasks ranging from consensus [1][2][3][4] to synchronisation [5][6][7]. These tasks are particularly important in the applications of clock synchronisation in wireless networks [8,9], energy management in smart grids [10,11] and formation control in multi-agent systems [12,13]. To illustrate this fundamental concept, we can consider the classic example of a group of agents approaching average consensus in an undirected network under the simple linear weighted consensus protocol [1]: Each agent i in the system updates its state x i according to the weighted sum of the differences between its own state and those of its neighbours x j , defining the neighbourhood of i, N i , as the set of agents with which i can communicate. The strength of the connection between agents i and j is modelled through the weight w ¹i;j º . This control law is seen to be distributed as each agent only receives information from its neighbours. The collection of all agents' dynamics can be concisely represented by the weighted graph Laplacian matrix L , as defined in Section 2: It is well known that in this system, consensus is achieved if and only if the network is connected, which is equivalent to the second smallest eigenvalue of L, 2 the algebraic connectivity, being strictly greater than zero [14]. Furthermore, the exponential rate at which the consensus value is approached is governed exactly by 2 , which represents the convergence rate of the slowest mode.
Defining the disagreement vector y.t / D x.t / 1 n 11 > x.0/ as the disagreement between the state at time t and the arithmetic mean of x.0/ yields jjy.t /jj 2 6 jjy.0/jj 2 e 2 t In this simple system, both the sufficient condition for consensus to be achieved and the measure of the performance are governed by the algebraic connectivity 2 of the graph Laplacian. This quantity is a function of all weights in the network and is therefore a global property of the network.
When the simple consensus protocol is affected by a homogeneous time delay of , the largest eigenvalue of the graph Laplacian, n , becomes critically important to the stability of the system. In a result from [1], it is shown that for the consensus solution to be stable, an upper bound must hold on the spectral radius n of the graph Laplacian and consequently, that a network with lower spectral radius will be stable under longer time delays. Specifically, it is found that for consensus to be robust to the time delay, the spectral radius of the graph Laplacian must satisfy the following inequality: In general, most methods in the literature for controlling the spectral radius n of the graph Laplacian or the eigenratio n = 2 follow centralised approaches, with the notable exception of the algorithm presented in [25], which uses decentralised power iteration methods to estimate the eigenvectors associated with 2 and n , in order to then adapt edge weights. However, this method breaks down when 2 3 , or n n 1 , and so cannot further optimise the network once this point is reached. In [26] and [27], the eigenratio is minimised through simulated annealing. And in [28], mixed integer nonlinear optimisation is utilised to solve eigenvalue optimisation problems including maximising the algebraic connectivity subject to a constraint on the spectral radius. Edge rewiring based on the eigenvectors of the graph Laplacian is used in [29] to minimise the eigenratio n = 2 .
In this paper, we take a different approach and formulate a decentralised method for estimating the spectral radius of the graph Laplacian [23], on the basis of the application and extension of the algebraic connectivity estimator presented in [21]. We use these estimators in parallel to evaluate the extremal eigenvalues of the graph Laplacian and inform an adaptive strategy, driving the edge weights of a network to a state that improves the robustness to time delay of the simple consensus protocol. This is achieved by adapting the edge weights so as to minimise the spectral radius n of the graph Laplacian, whilst ensuring that the network remains connected through the enforcement of a lower bound on the algebraic connectivity, 2 . We find that this approach is more robust when estimating eigenvalues close to non-uniqueness compared with [25], allowing the graph Laplacian eigenvalues to be optimised past the first point when 2 3 or n n 1 , which can lead to significant improvements in performance.
The distributed optimisation and control strategy that we propose consists of multiple layers of distributed estimation. The edge weight dynamics rely on estimates of the graph Laplacian eigenvalues, and these estimators themselves require further estimated variables to function correctly. An outline of this three-layer structure is illustrated in Figure 1. In Sections 3 to 5 of the paper, we will first present each layer independently, and then in Section 6, we specify how the three layers are interconnected, and derive sufficient conditions for convergence of the distributed strategy, using the framework of multilevel singular perturbation theory [30]. In Section 7, we will illustrate the theoretical results on a set of representative numerical examples, confirming the effectiveness of the approach. Figure 1. General overview of the distributed estimator-optimiser system. Each subsystem is contained within a white box, and groups of subsystems that operate in the same timescale share a yellow box. In this diagram, the rightmost subsystem is the slow system, then, reading to the left, we have the fast systems and finally, the ultra-fast subsystems. Dependencies between subsystems are illustrated via arrows. The vectors a and b are estimates of the eigenvectors associated with 2 and n , respectively, and the proportionalintegral (PI) average consensus subsystems are used to estimate the mean of the components hai D 1 n P i a i , hbi D 1 n P i b i and the mean of the components squared:

PROBLEM FORMULATION
We define the weighted, undirected graph G.V; E; A/ on n nodes, such that the vertex set V D ¹1; : : : ; nº, and with m D jEj edges, where the edge set E V V contains no self loops ¹i; iº … E. An edge ¹i; j º 2 E if nodes i and j can communicate (pass local variables), which in turn defines the neighbourhood of the ith node, N i W j 2 N i " ¹i; j º 2 E. To each edge, we assign a non-negative real weight w ¹i;j º > 0 that defines the weighted adjacency matrix A D OEa i;j n n a i;j D

<
: and the weighted graph Laplacian L D OEl i;j n n with non-negative diagonal elements l i;i referred to as the weighted degree of node i. The graph Laplacian is symmetric and positive semidefinite, which can clearly be seen using Gersgorin discs [31]: L is diagonally dominant with non-negative elements on the diagonal, and so we can lower bound the spectrum of L by 0 and upper bound the spectrum by twice the maximum weighted degree, D max i ¹l i;i º [2]. The row sums of L all equal 0; therefore, the consensus mode 1 is an eigenvector of L with associated eigenvalue 1 D 0. Henceforth, we can order the eigenvalues of L, 0 D 1 6 2 6 : : : n 6 2.
It is well known that the second smallest eigenvalue 2 , known as the algebraic connectivity, is non-zero if and only if the network is connected. That is, for each pair of nodes in the network, there is a path between them that traverses only edges with non-zero weight. Moreover, the algebraic connectivity serves as a lower bound on the edge connectivity and vertex connectivity [14,32].
The graph Laplacian may also be defined another way, by constructing a weight vector w and oriented incidence matrix P [33], which will be a useful notation:

Definition 1
Let q W ¹1; : : : ; mº ! E be a bijective function from the set of integers from 1 to m, to the edge set, so that we may define an ordering on the vector of edge weights w D OEw i m 1 , where w i D w q.i/ . We may then define the unoriented incidence matrix M D OEm i;j m n Similarly, we may define the oriented incidence matrix, P D OEp i;j m n so that L.w/ D P > diag¹wºP.
In this paper, we consider, as a specific example, the problem of minimising the largest eigenvalue of the graph Laplacian n so that a consensus process taking place over the network will be more robust to time delays as noted in [1,34], whilst bounding the algebraic connectivity away from zero, 2 > Ä > 0, so that the network remains sufficiently well connected. Moreover, greater algebraic connectivity is associated with an increased convergence rate in consensus problems with time delay (see, e.g. theorem 3 and figure 5 in [34]). This optimisation problem can be formulated as minimise w n .L.w// (10) subject to which is a convex optimisation problem: the objective function, the spectral radius of the Laplacian matrix, is a convex function of edge weights; and the constraint that the algebraic connectivity be greater than a desired value is also convex (the algebraic connectivity is a concave function of edge weights). These facts can be observed by noting that, respectively, the maximum and the negative of the minimum are symmetric convex functions of the spectrum of the graph Laplacian matrix, and hence, the spectral radius and the negative of the algebraic connectivity are convex spectral functions [15,35]. Moreover, we wish to solve the problem (10,11) in an entirely distributed manner. By which, we mean that with each edge, ¹i; j º is associated a set of variables ¹i;j º , which includes the edge weight w ¹i;j º 2 ¹i;j º , and to each node, i is associated a set of variables i . The dynamics of these sets of variables may be dependent only on locally available variables, and all edges and all nodes behave identically so that no edge or node is a leader: Furthermore, we restrict the size of these sets reflecting a limit on the memory and communication capacity of individual agents, so that the number of variables at each edge and node remains small even as the network becomes arbitrarily large. This means that nodes cannot simply accumulate information about the entire network and solve the problem with perfect knowledge of the network. Thus, the distributed estimator system with states ¹i;j º and i uses one-hop neighbour information only. In this paper, this distributed optimisation problem is solved using a multilayer strategy, shown in Figure 1, with each layer performing successive stages of distributed estimation. Starting with the slowest subsystem, we formulate weight dynamics, which solve the optimisation problem in Section 3. This solution is initially centralised in the respect that it is dependent on global variables, specifically the eigenvalues of the graph Laplacian and their partial derivatives. To render this strategy completely distributed, we implement two eigenvalue estimators that converge at a faster rate: one for the algebraic connectivity [21], described in Section 4.1, and one for the spectral radius, expounded in Section 4.2. Each of these estimators requires access to the mean of the estimator vector and to the estimator vector's magnitude. To estimate these functions in a distributed manner, four proportional-integral (PI) average consensus subsystems [36], as given in Section 5, are utilised. After discussing each of the subsystems individually, we prove in Section 6, using a singular perturbation approach, that given sufficient timescale separation between layers, and under certain assumptions, the entire distributed system is stable and converges onto the solution to the optimisation problem. This result is demonstrated numerically in Section 7, and we also show the performance of the system when the assumptions we make for proving convergence are relaxed.

WEIGHT ADAPTATION
To begin, we propose a system governing the weight dynamics, where each node and edge perform the following updates P w ¹i;j º D @ n .w/ @w ¹i;j º C 1 2 P ¹i;j º D w ¹i;j º ¹i;j º ; ¹i;j º .0/ > 0 (15) P ¹i;j º D .Ä 2 .w// ¹i;j º ; ¹i;j º .0/ > 0 (16) and show that the equilibrium point OEw > opt ; > opt ; > opt > satisfies the first-order necessary Karush-Kuhn-Tucker (KKT) [37] conditions for the optimisation problem. We will show that the firstorder necessary KKT conditions are not only necessary but also sufficient for the equilibrium point OEw > opt ; > opt ; > opt > to be the optimal solution to our optimisation problem, (10) and (11). Following this, we will discuss conditions for the equilibrium to be exponentially stable. As these conditions are not necessarily always true, we will then suggest a minor modification to the algorithm in (14) to (16). An important condition for the system to work as intended is the following assumption:

Assumption 1
At the optimal edge weights w opt , which solve the optimisation problem stated in (10) and (11), we assume that the eigenvalue 2 .w opt / ¤ 3 .w opt / and that n .w opt / ¤ n 1 .w opt / implying that the spectral radius n is distinct. Furthermore, any feasible algebraic connectivity is bounded away from zero when Ä > 0, and the algebraic connectivity 2 is distinct.
We make this assumption to facilitate the analysis of the subsystems and argue that this assumption is required for exponential convergence of edge weights onto the solution in the entire distributed system. In Section 7.1, we discuss the ramifications when this assumption does not hold and demonstrate the performance of the system in this case.

Lemma 1 ([38])
Under Assumption 1, the algebraic connectivity 2 .w/, the spectral radius n .w/ and their associated unit eigenvectors, v 2 .w/ and v n .w/, are analytic functions of the edge weights in the vicinity of the solution, w opt , and hence, their partial derivatives are well defined in this region.
We may write the collection of (14) to (16) for all edges and nodes in the network in a compact form using Hadamard (elementwise) multiplication a D b ı c , a i D b i c i 8i. We also introduce Hadamard exponentiation so that a ı2 D a ı a.

Lemma 2
The equilibrium point of the system given by (17) to (19) satisfies the first-order necessary KKT conditions of the optimisation problem (10) with the feasible set (11).

Proof
The optimisation problem is of the form where x , w, the objective function f .x/ , n .w/ is convex, the inequality constraints g i .x/ , w q.i n/ ; 8i D 1 : : : m are affine and the final inequality constraint g mC1 , .Ä 2 .w// is convex. The vector of dual variables is simply y , 1 2 OE ı2> ; ı2> > It can be seen that the right-hand sides of (15) and (16) are equal to zero when and only when the complementary slackness condition holds: g.x/ ı y D 0. Feasibility of the dual variables is assured by the initial conditions ¹i;j º D 0, i .0/ D 0, so that y.t / > 0; 8t > 0. Setting the right-hand side of (17) to zero is identical to satisfying the stationarity condition rf .x/ D P i rg i .x/y i . Furthermore, if primal feasibility does not hold, then P y ¤ 0 so long as y ¤ 0, and we know that for all finite time y > 0 and may only approach zero exponentially fast (so long as g i .x/ remains bounded from below).

Lemma 3
For the optimisation problem defined by (10) and (11), the first-order necessary KKT conditions are also sufficient.

Proof
Proof follows from the fact that the primal problem is convex and strictly feasible, thus Slater's condition holds (see [39] for details).
This implies that the equilibrium OEw > opt ; > opt ; > opt > of (14) to (16) is also the solution to the optimisation problem (10), (11). It is now necessary to provide conditions for the system (14) to (16) to be exponentially stable. This will help in the later singular perturbation analysis. Consider the following result:

Proof
Consider the generic convex minimisation problem (20). If the inequality constraints are such that g i .x/ 6 0; 8i; x 2 X , and the primal feasible set X is convex and compact, f .x/ is strictly convex, there exists a unique optimal point x opt . At this point, some r 1 > 1 constraints will be tight (as f .x/ is monotonically increasing, we can be sure that at least one inequality will be tight), and some r 2 > 1 inequality constraints may be slack (there must be at least one slack inequality; by contradiction, if all were tight, then w opt D 0, and thus, 2 D 0, which is infeasible).
We can divide these tight and slack inequalities into strict inequalities p j .x/ < 0 and equality constraints h j .x/ D 0 by redefining 8i s.t. g i .x opt / D 0; h j .x/ , g i .x/; j D 1; : : : ; r 1 (21) 8i s.t. g i .x opt / < 0; p j .x/ , g i .x/; j D 1; : : : ; r 2 Now, the equivalent optimisation problem can be formulated: The solution to this optimisation problem can be obtained from the following set of ODEs: Linearisation of the system about the optimal point where Q x D x x opt , Q D˛ ˛o pt , Q Dˇ ˇo pt , is: 2 It is evident that this Jacobian matrix has a block diagonal structure, indicating the decoupling of the dual variables whose associated inequality constraints are slack at the optimal point. In the vicinity of the optimal point, the square root of the (redundant) slack dual variablesˇi will decay to zero exponentially fast with rate constants p i .x opt / < 0. That is, the 'more slack' the constraint is, the faster its dual variable will decay away.
The other block is more interesting and has the form In the vicinity of the optimal point, the primal variables x will converge to x opt exponentially fast provided R is negative definite. This follows from the following lemma.

Lemma 4
The matrix J 1 D Ä R S S > 0 has eigenvalues with negative real part provided the symmetric part of R is negative definite and S is full column rank. Proof It can be seen that det¹Rº ¤ 0 as R is nonsingular, and thus, R 1 exists. Furthermore, det¹S > R 1 Sº is only full rank if S is full column rank, that is, S > S is invertible. Therefore, det¹J 1 º ¤ 0; J 1 has no zero eigenvalues. Considering the eigenvector equation Looking solely at (34), we can see that if x i D 0, then either i D 0, which we have previously shown not to be true, or y i D 0, so that v i D 0, which is simply the trivial solution. Therefore, we can conclude that for all eigenvectors of J 1 , x i ¤ 0. Now, we drop the subscripts and assume that v is a unit eigenvector so that As x ¤ 0 and the symmetric part of R is negative definite, <.x Rx/ < 0. This is the only contribution to the real part of the eigenvalue; thus, J 1 is a stable matrix.
We can be sure that the lower bound on the algebraic connectivity will be tight h j .x/ , Ä 2 .w/ (if it was not, then all weights could be scaled by a common factor less than 1, so that n would also be scaled down further). In our case, any other tight inequality constraints are affine, and so the condition that f .x/ or any h i .x/ is strictly convex near x opt reduces to the sufficient condition that either r 2 n .w opt / > 0 or r 2 2 .w opt / < 0.
3.0.1. Strict convexity:. In note 1, it was discussed that we require either n .w/ to be strictly convex, or 2 .w/ to be strictly concave, in the vicinity of the optimal point for exponential convergence. However, neither n .w/ nor 2 .w/ is necessarily strictly convex. To account for this, we choose to minimise a modified function f .x/ D . n .w// 2 . It is apparent that minimising . n .w// 2 is equivalent to minimising n , so the two optimisation problems are equivalent.
In the following lemma, we show that as n .w/ is positive and convex (near w opt ), then its square will be strictly convex in the vicinity of the optimal point if that optimal point is distinct.
Lemma 5 If f .x/ is (not necessarily strictly) convex and positive, then f .x/ 2 is strictly convex over non-flat regions.

Proof
f .x/ is convex, so Jensen's inequality holds: Squaring both sides, so long as f .x/ is non-negative, we can be sure that But to prove that f .x/ 2 is strictly convex, we need to show that By considering the right-hand side of (43) and (44), strict convexity follows if the following inequality holds .
which clearly holds in the interval so long as f .
then there may be a manifold of optimal edge weights. In this case, . n .w// 2 will not be strictly convex in this region. Instead, the set of optimal points will be a neutrally stable manifold for the set of ODEs, but local convergence to this manifold will be exponentially fast. Substituting in the altered objective function f .x/ D n .w/ 2 in (20), we arrive at the modified weight adaptation law: where the collection of all ODEs for each node and edge may be written in the more compact form: Observing the set of ODEs, (49) to (51), it is clear to see that these are not fully distributed. The functions 2 .w/, n .w/ and their partial derivatives with respect to the edge weights are global functions of all the edge weights. Thus, to achieve a fully distributed strategy, we must make distributed estimates of these functions. This is achieved using the distributed algebraic connectivity estimator from [21] and a distributed spectral radius estimator. Next, we will recall some of the results from [21] in Section 4.1, before continuing in Section 4.2 to explain how the system can be modified for estimating the spectral radius and its partial derivatives with respect to their edge weights.

Algebraic connectivity estimator
As described in [21], the algebraic connectivity of a weighted undirected network can be estimated using the ODE : That is, every node follows with hai D 1 n P i a i being the arithmetic mean of the components of a and ha ı2 i D 1 n P i a 2 i being the mean of the sum of squared components. It can be shown that if the algebraic connectivity is distinct, the stationary point a is defined as where v 2 is the unit eigenvector associated with 2 .w/. This allows the following estimates to be performed [21]: The equilibrium point a is locally exponentially stable, provided that k 1 > k 2 2 and k 3 > k 2 2 , k 2 > 0. Moreover, the rate constant of this local exponential convergence is given by Again, observing (56), it is apparent that each node requires access to the global functions hai and ha ı2 i, and so the system is not fully distributed. As in [21], a solution is to use further subsystems to make distributed estimates of these quantities, using PI average consensus estimators [36]. These estimators will be discussed in Section 5.

Spectral radius estimator
The estimation approach used for the algebraic connectivity can be now extended to obtain an estimate of the spectral radius n and its partial derivatives with respect to each edge weight. However, instead of the action from L being used to drive the system towards consensus so that the slowest mode dominates (which associated with 2 ), L now needs to provide a diverging force so that the fastest mode (associated with n ) dominates. Again, the consensus mode is deflated through the action of the first term, and the third term provides a normalising force so that the vector b remains bounded and is driven away from the origin. Specifically, we propose to use the dynamics: Taking a local view of the system, every node follows, Each mode can be taken in turn, so that for the consensus mode b d;1 which has stationary points at b d; . Clearly, if k 1 > k 3 , then this second equilibrium point does not exist, and the first equilibrium point is globally exponentially stable; the consensus mode will deflate.
Other modes follow equations of the form Thus, there is one equilibrium point located at the origin b d D 0, and for each unique eigenvalue i , there is an associated equilibrium space such that In the case that i has unitary multiplicity, then the associated equilibrium is a set of two points equidistant from the origin (a 0-sphere) given by and in the general case that the eigenvalue is not distinct i D D i Cp , having algebraic multiplicity p C 1, then the associated equilibrium set is a p-sphere of radius q n.k 3 Ck 2 i / k 3 defined by the set of solutions: X k2¹i;::: b d;j D 0; 8j … ¹i; : : : ; i C pº (69)

Lemma 6
The system defined in (61) is bounded, contains no periodic orbits and there exists at least one stable equilibrium.

Proof
We notice that the diagonalised system, (63), is a gradient system P As such, we can be sure that the system contains no periodic orbits (see [40] for further details). Further to this, we can immediately see that the system is bounded, as for large b d , the positive quartic terms will dominate. The Lyapunov-like potential function V.b d / is quartic; thus, by continuity, there must exist a global minimum of the function, and this critical point will be a stable equilibrium.

Theorem 2
Under Assumption 1, with control parameters satisfying k 1 > k 3 > 0; k 2 > 0, the equilibrium set associated with n , which consists of two distinct points, b D˙v n s n.k 3 C k 2 n / k 3 is the sole stable equilibrium set. Moreover, any trajectory that does not originate at an equilibrium will locally exponentially tend to one of these two points with a constant rate of convergence given by Proof Taking a small perturbation e b d D b d b d , the system defined in (63) can be linearised about the equilibrium b d : In the simplest case where all eigenvalues of the graph Laplacian are distinct, and thus, the equilibria associated with each eigenvalue is a set of two distinct points (a 0-sphere), each ith equilibrium b d;i will have one non-zero element only at the ith position. Thus, the only non-diagonal term in (73), 2b d b > d , will simplify to a diagonal matrix with the value 2b 2 d;i located at the .i; i/ element and zeros elsewhere. Therefore, the linearised system is diagonal, and eigenvalues may be simply read off, noting that . At the 1st equilibrium point (at the origin, b d D 0), the eigenvalues are ¹ k 1 C k 3 ; k 3 C k 2 2 ; : : : ; k 3 C k 2 n º The eigenvalues at the ith equilbria (i 2 ¹2; : : : It can be seen that if k 1 > k 3 , with k 1 ; k 2 ; k 3 > 0, then all eigenvalues are negative only at the nth equilibrium. Again, we stress that this corresponds to an equilibrium set of two distinct points under Assumption 1 (when n is distinct), and thus, each of these points is going to be an exponentially stable equilibrium point. For all other equilibria, there is at least one positive eigenvalue, and the points will be unstable. Therefore, in the case that all eigenvalues are distinct, almost any trajectory will converge exponentially on one of the two points in the nth equilibrium set.
When eigenvalues are not distinct, the analysis becomes somewhat more complicated. Consider the non-trivial eigenvalue i D D i Cp ¤ 0 having multiplicity p C 1, then the matrix 2b d b > d in the linearised system, about any point on the p-sphere equilbrium, will not be diagonal. It will, however, be block diagonal, with all elements being zero except for the square block between the .i; i/ and .i Cp; i Cp/ elements. Therefore, it is still simple to determine the n .pC1/ eigenvalues, which are not affected by this block: ® k 3 k 1 ; k 2 . j i / 8j … ¹i; : : : ; i C pº¯ (76) Thus, if there is any eigenvalue j larger than i , the associated p-sphere equilibrium will be unstable in at least one direction. By the fact that the system is a bounded gradient system, there must be at least one stable equilibrium, and through elimination, we deduce that this is the equilibrium associated with the spectral radius n . We therefore conclude that for a Laplacian where n has multiplicity p C 1, from any initial condition that is not in the equilibrium set, the trajectory will converge to the p-sphere equilibrium defined by Reversing the diagonal transformation, we find the stable equilibrium set for the system in (61) In the special case where n is distinct, then b simplifies to b D˙v n s n.k 3 C k 2 n / k 3 (80)

Use as an estimator:
. From this stable equilibrium, we see that b is an estimate for the eigenvector associated with n . In the case that n is distinct (Assumption 1 holds), then b almost surely converges to the associated eigenvector, whilst when n is not distinct, then b almost surely converges to a vector in the associated eigenspace. Whether distinct or not, the magnitude of b is dependent on n , so by rearranging (79), we have and an estimate of n can be obtained as Estimates of the partial derivatives with repect to the edge weights @ n @w ¹i;j º may also be obtained using this estimator. Taking the eigenvalue equation for eigenvalue .w/, with associated unit eigenvector O v.w/, and taking the derivative with respect to the edge weight yield As L is symmetric, this ensures that, for the unit eigenvector O v, Using 88 and b ı jjbjj 2 as an estimate of the unit eigenvector v n , an estimate of the partial derivative can be obtained as just as for the algebraic connectivity sensitivities in 59.

Distributed estimation
Both the 2 and n estimators require node dynamics, (56) and (62), to be dictated by global variables, specifically through the quantities hai, ha ı2 i and hbi, hb ı2 i, respectively. These four functions are arithmetic means of the components of a and b and of the squares of the components. As such, they require information from every node and not just those in the neighbourhood of node i. As in [21], a solution is to use PI average consensus estimators [36] to obtain distributed estimates of these means and use these local estimates in the eigenvector estimators.

PROPORTIONAL-INTEGRAL AVERAGE CONSENSUS
In [36], it is shown that the arithmetic mean of a vector, say u, can be estimated in a distributed way on a network, using the linear system: Here, we use the notation u i to designate the ith node's proportional variable for estimating the mean of the vector u, with u i being the corresponding integrator variable. This notation is introduced as we will need to use four PI average consensus systems for estimating the means: As such, the superscripts for the four PI consensus systems will be: a; a ı2 , b, and b ı2 . However, for the moment, we will recall some results about the generic PI system (90). The collection of all agents proportional and integrator variable dynamics can be concisely written as the matrix differential equation: It can be shown that the stable equilibrium manifolds (provided k ; k P ; k I > 0) for these systems are where L .w/ is the Moore-Penrose inverse of the graph Laplacian: Considering as input u and as output u , there is an uncontrollable and unobservable mode in the integrator variables for each PI average consensus estimator associated with the consensus mode, 1 > u .0/, which does not have any effect on any other variables. Deflating the systems in this mode through simultaneous diagonalisation, and removal of the zero rows/columns, then results in a distinct equilibrium point. Let V.w/ be the matrix of right eigenvectors of L.w/ so that L D V > .w/ƒ.w/V.w/. Because we know that the unit eigenvector associated with the consensus mode is simply v 1 D 1 p n , we can define an n .n 1/ matrix of the unknown unit eigenvectors W.w/ and consequently perform the diagonalisation and truncation of the uncontrollable/unobservable mode with ƒ r .w/ being the truncated matrix, with the first row removed: By Lemma 4, the system matrix of (98) is Hurwitz. Thus, the distinct, stable equilibrium point in the transformed variables is u; where the square matrix ƒ c;r .w/ is the further truncated matrix (the first row and the first column are removed):

SINGULAR PERTURBATION FORMULATION
We are now ready to substitute the distributed estimates from the faster subsystems into the slower subsystems, formulating an entirely distributed system. Starting with the base subsystems, the four PI average consensus estimators, each node i maintains an estimate of the four means: where we indicate the node i's estimate of a function using a hat and superscript .i/. These estimates are substituted into the eigenvalue estimators, (56) and (62), to yield the distributed estimators: From (58), (59), (83) and (89), these distributed estimators allow the following local estimates to be made at each node i: Each edge ¹i; j º has access to both the local variables at node i and at node j ; thus, it can also make local estimates of these functions. So as not to bias one parent node over the other, we choose to use the mean of the consensus variables from i and j when either would do, as we expect consensus to be reached.

Note 2
Note that we are now locating the variables i , associated with the inequality constraint 2 .w/ > Ä, at each node, rather than at each edge in (51), to reduce the amount of duplication (n is typically less than m). It is also more common that computation be performed in each node, and to this end, for instance, the ¹i;j º variables could also be distributed over the nodes by duplication in the incident nodes. Let the dual variable located at the ith node and responsible for guaranteeing non-negativity for the edge ¹i; j º be denoted ¹i;j º i , then these variables can adapt according to .0/ D ¹i;j º .0/. It can be seen that any edge-based dynamics can be duplicated on the incident nodes in this fashion. However, for the sake of clarity and conciseness of notation, we continue to treat these variables as if they were located at edges to avoid the need for duplication.
The system is now entirely decentralised, and for clarity, the interactions between the various subsystems within an individual node are graphically represented in Figure 2. To clarify the separation Figure 2. A simplified flowchart of information flow within a node. Each block is labelled with a number pointing to which equation in the paper the block represents. There are three timescales for the ordinary differential equations indicated by the colours: yellow (slow), green (faster) and blue (fastest). White rounded blocks are simply algebraic equations, and white squares signify control constants. [Colour figure can be viewed at wileyonlinelibrary.com] in timescales between the different layers, we normalise the eigenvalue estimators by 1 D 1 k 2 and for the PI average consensus layers, by 2 D 1 k P . We are interested in the behaviour of this system as 1 ! 0 and 2 1 ! 0, when the control parameter ratios k 1 k 2 , k 3 k 2 , k k P and k I k P remain fixed. Again, the complete set of equations can be written in a more compact form, using the oriented and unoriented incidence matrices P and M: The system exhibits a three timescale structure of the form P x D f.x; y 1 ; y 2 / 1 P y 1 D g 1 .x; y 1 ; y 2 / 2 P y 2 D g 2 .x; y 1 ; y 2 / with column vectors x , OEw; ; , y 1 , OEa; b and y 2 , OE a ; a ; a ı2 ; a ı2 ; b ; b ; b ı2 ; b ı2 . Theorem 1 from Hoppensteadt [30] deals specifically with singularly perturbed systems of this type, and conditions are given under which the solution of system described by (134) converges to that of the reduced system P x D f.x; y 1 ; y 2 / 0 D g 1 .x; y 1 ; y 2 / 0 D g 2 .x; y 1 ; y 2 / (135) as 1 ! 0 and 2 1 ! 0. We now proceed to go through these conditions and discuss under which assumptions they hold.

Transformation to place the stable equilibrium point at the origin
Condition 1 from [30] requires that there exists an equilibrium point for each of the fast subsystems, that this equilibrium point is located at the origin of systems faster than it and also that this equilibrium point is isolated. That is, it is required that one equilibrium point of the four possible combinations of stable equilibria, in accordance with condition 1 of [30]. However, the same analysis could be performed on each of the distinct points, yielding the same result for each combination of equilibria. The equilibrium point of the reduced system is also known and is the constant point where q and r are, respectively, the optimal dual variables corresponding to the inequalities w > 0, and the optimal dual variable corresponding to the inequality constraint 2 .w/ > Ä, which solve the first-order necessary KKT conditions of the modified optimisation problem. As such, we can apply the transformations: So that and also that condition 2 in [30], which the equilibrium point of the slow system lies at the origin, is satisfied:

Analyticity of the right-hand sides
Condition 3 in [30] requires that the functions F.:/, G 1 .:/ and G 2 .:/ and their derivatives with respect to the components of Q x; Q y 1 ; Q y 2 are continuous in a ball of radius R around the origin. Condition 4 is a requirement that the elements of F.:/, G 1 .:/ and G 2 and the elements of @F @x , @G i @x , @G i @y j are bounded within the ball around the origin. Further to continuity and boundedness, conditions 5 and 6 specify uniform smoothness conditions on F.:/ and G 2 .:/. Specifically, it is required that there is a continuous non-negative

Proof
It can be seen that both g 1 .:/ and g 2 .:/ are quadratic functions of their variables and are thus analytic. Further to this, both stationary manifolds Y 1 .:/ and Y 2 .:/ are analytic provided Assumption 1 holds. Hence, the function G 1 .:/, which is a composition of the analytic functions G 1 .:/, Y 1 .:/, Y 2 .:/ and F.:/, is analytic. The same argument can be made for G 2 .:/, which is a composition of the aforementioned analytic functions and G 1 .:/.

Stability of the boundary layer systems and reduced system
The final two conditions in [30], conditions 7 and 8, concern the uniform asymptotic stability of the boundary layer systems and the reduced system. That is, it is required that the reduced system is uniformly asymptotically stable on the origin, as are and @ Q y 2 @ 2 D G 1 .x ; y 1 ; Q y 2 / when taking x and y 1 as any constant values (in a ball around the origin) and defining i , t i . The fully distributed system has been designed so that the reduced system (17) is locally exponentially stable (Theorem 1) implying uniform asymptotic stability. Likewise, each boundary layer system, found by taking slower variables as constant and faster variables as having already equilibrated, is simply the ideal estimators (55), (61) and (92) and is thus exponentially stable as proven in Theorem 2 and elsewhere [21,36].
All the conditions from Theorem 1 in [30] hold, and so we may draw the conclusion that there exists a sufficiently small D OE 1 ; 2 > such that 1 ! 0 and 2 1 ! 0 as j j ! 0, so that the equilibrium point corresponding to the solution of the optimisation problem is locally exponentially stable. A suitable choice for is OE 1 k 2 ; 1 k P > , where k P D k 2 2 . There exists a k P > 1 sufficiently large so that the equilibrium point of the entire distributed system is locally exponentially stable.

NUMERICAL VALIDATION
In this example, we start with a small network of n D 8 nodes and m D 11 edges (Figure 3). Each edge is assigned an initial weight of one, w ¹i;j º .0/ D 1, so that the initial algebraic connectivity 2 .w.0// 0:7611 and initial spectral radius of n .w.0// 5:8635. Edge weights are then controlled according to the distributed multilayer system, with the objective of minimising n , whilst holding the constraints that all weights are non-negative and a mininimum algebraic connectivity of 2 > 0:5 is maintained. As the simulation progresses, edge weights evolve over time and eventually settle to a stationary value (Figure 4(a)). The edge weights in the network directly affect the graph Laplacian, resulting in the eigenvalues changing over time (Figure 4(b)) and for this particular example, eigenvalues at the solution w D w opt are distinct, so that edge weights converge as expected. At the end of the simulation, t D 2000, the algebraic connectivity has settled to its lower bound as expected 2 .w.2000// 0:5000, and the spectral radius has decreased to n .w.2000// 3:2211, shrinking to approximately 55% of its initial value. This result agrees with the optimal value found using a centralised SDP solver, with the relative error in the minimal spectral radius being less than 0:001%.
The simple linear consensus protocol P x.t / D 0:01 L.w/x.t / with homogeneous time delay D 40 is run both on the initial and final states of the network ( Figure 5). From the bound given in [1], we know that the system is only stable on the consensus mode if n 6 100 2 40 3:9270. We see that in the intial network (at time t D 0), the spectral radius exceeds this bound and as a consequence trajectories in Figure 5(a), diverge. When the same system is realised on the near-optimal network ( Figure 5(b)), trajectories now converge as the consensus mode is stable.

Ramifications of Assumption 1 not holding
To illustrate the scenario where Assumption 1 does not hold, we use a larger network of n D 50, m D 118, so that there are n 1 D 49 non-trivial eigenvalues in the connected graph, see Figure 6.  [Colour figure can be viewed at wileyonlinelibrary.com] As n .w/ is decreased, the spectrum of the graph Laplacian is compressed into a smaller region, increasing the likelihood that at the optimal edge, weights 2 .w/ and n .w/ will not be distinct (Figure 7(b)).
In this example, the spectral radius of the graph Laplacian is again decreased over time, from an initial value of n .w.0// 11:1450 to n .w.4000// 5:2514, whilst the algebraic connectivity falls from 2 .w.0// 0:6589 to 2 .w.4000// 0:4988. Note that this time at t D 4000, the algebraic connectivity lies slightly outside of its constraint. This is a result of Assumption 1 failing. At the optimal edge weights w opt , the algebraic connectivity has a multiplicity greater than unity. Therefore, the algebraic connectivity estimator subsystem does not converge to a specific eigenvector associated with 2 but converges instead onto its associated eigenspace. The effect of this is that the algebraic connectivity derivative estimates, @ 2 @w , will be incorrect as the value is not well defined for non-distinct eigenvalues. This error forces the edge weights off of the optimal values, and a persistent oscillation in the neighbourhood of the optimum is set up. The magnitude and frequency of these oscillations can be controlled by the size of the separation in timescales between the weight dynamics and the eigenvalue estimation, as this controls how far the edge weights can overshoot before the eigenvalue estimator re-converges on the correct value for @ 2 @w . We would however like to compare the values of the extremal eigenvalues with those found using a centralised solver, specifically using an SDP formulation presented in [15]. The centralised solver finds optimal eigenvalues of 2;opt 0:5000 and n;opt 5:2178. Comparing these values with the extremal eigenvalues at t D 4000, of 2 .w.4000// 0:4988 and n .w.4000// 5:2514, we can see that the decentralised method performs admirably despite Assumption 1 not holding, finding edge weights that result in a cost just 0.64% greater than those found using a centralised solver. In contrast, the decentralised method presented in [25] can only adapt edge weights up to the point when the extremal eigenvalues become indistinct and the algorithm breaks down, resulting in a greater disparity between the solution to the optimisation problem and the result from the decentralised method. It is foreseen that this issue of non-distinct extremal eigenvalues will grow with increasing network size.

CONCLUSION
In this paper, we have formulated a multilayer continuous control law for optimising edge weights in an undirected network, so as to minimise the spectral radius of the weighted graph Laplacian matrix, whilst maintaining a minimum bound on the algebraic connectivity, and enforcing the constraint that no edge weight may be negative. Moreover, this control law is completely distributed over the network; both edges and nodes require only local variables that are stored in their neighbours local memory. A notable application of this optimisation is that the robustness of the network to time delays in the simple linear consensus protocol is increased, and the constraint on the lower bound of the algebraic connectivity maintains connectivity in the network, so that consensus can be achieved. Moreover, the algebraic connectivity is strongly related to the performance of consensus protocols on networks. It should also be noticed that minimising n whilst maintaining a minimum 2 is equivalent to minimising the synchronisability ratio n 2 whilst maintaining a desired algebraic connectivity, and so the method presented here may be readily adapted for a number of synchronisation applications.
Local exponential convergence has been proven for sufficiently large timescale separation between the layers, under the assumption that both 2 .w opt / and n .w opt / are distinct at the optimal edge weights. This assumption is relatively strong, and for large networks, with n 1 non-trivial eigenvalues, in which the extremal eigenvalues are being driven towards each other, this assumption will not necessarily hold. When the assumption does not hold, the system is still shown to perform well, but edge weights do not converge as in the distinct eigenvalue case. Instead, small oscillations about the optimal values persist, and a limit cycle is set up. This effect is due to the extremal eigenvalues crossing and discontinuities in the eigenvectors associated with the extremal eigenvalues: as the spectral radius is decreased, the trajectories of n and n 1 may cross over, and in the time it takes for the eigenvalue estimation to re-converge on the new eigenvector associated with n , the weight dynamics, using incorrect estimates, will overshoot, setting up the persistent oscillation.
Future work will focus on relaxing Assumption 1, so that convergence can be guaranteed when the extremal eigenvalues 2 .w opt / and n .w opt / are not distinct. Also, a constructive method shall be sought for determining a sufficient timescale separation between the layers.