Attitude optimization by extremum seeking for rigid bodies actuated by internal rotors only

In this paper, we investigate the stabilizing effect of an extremum seeking control law on a class of underactuated Lagrangian control systems with symmetry. Our study is motivated by the problem of attitude optimization for satellites with reaction wheels. The goal is to minimize the value of an analytically unknown configuration‐dependent objective function without being reliant on measurements of the current configuration and velocity. This work extends our previous work on extremum seeking control for fully actuated mechanical systems on Lie groups in the absence of dissipation. We show that the earlier method for fully actuated systems can also be successfully applied to a certain class of underactuated systems, which are controlled by internal momentum exchange devices (reaction wheels) so that the total momentum is conserved. Conservation of momentum allows us to reduce the control system to a system on a smaller state manifold. The reduced control system is not of Lagrangian form but fully actuated. For the reduced control system we prove that the extremum seeking method leads to practical uniform asymptotic stability. We illustrate our findings by the example of a rigid body with internal rotors.


INTRODUCTION
The problem of designing extremum seeking control for mechanical systems is of importance for numerous practical applications, such as source seeking control for autonomous vehicles.1-7 Such vehicles are usually driven by engines which generate forces or torques. It is therefore reasonable to choose a second-order dynamic control model (e.g., controlled Euler-Lagrange equations) to describe the motion of a source-seeking vehicle. Nevertheless, many studies on source seeking assume a first-order kinematic vehicle model, which means that the velocities can be controlled directly through the inputs. For instance, first-order kinematic vehicle models have been used to describe the motion of wheeled robots,8,9 drones,10,11 underwater vehicles,12,13 and satellites.14 For a mechanical system, instantaneous changes of the velocity to a prescribed value cannot be realized with bounded forces because of the system's inertia. It is therefore desirable to investigate extremum seeking methods that can be applied directly to second-order dynamic control models.
There are many different strategies to design extremum seeking control.15-18 Every approach typically involves very specific assumptions on the structure and the behavior of the underlying control system. For example, many extremum seeking methods are based on the assumption that the system displays a steady-state input-output behavior.19-22 That is, for every constant input, the system has a unique asymptotically stable equilibrium so that the output converges to a purely input-dependent value. In this case, the problem of extremum seeking reduces to an optimization problem for the steady-state input-output function. Such an assumption is certainly justified for some applications,23-25 but, in general, one cannot expect that a mechanical system has an asymptotically stable equilibrium if an arbitrary constant control force or torque is applied.
In recent years, research on extremum seeking control for open-loop unstable systems has made significant progress. In particular, the Lie bracket approach from Reference 26 and its various extensions, for example, in References 27-29, has led to a large class of novel extremum seeking methods. The approach in Reference 26 provides access to descent (or ascent) directions of an analytically unknown objective function through approximations of Lie brackets of suitably chosen vector fields. The approximations are induced by time-varying feedback of the measured objective function values. The time variations in the feedback law are generated by periodic perturbation signals with sufficiently large amplitudes and frequencies. This strategy works very well for first-order kinematic systems. However, if the approach is applied directly to a second-order dynamic system, then the approximated Lie brackets vanish. Extensions of the Lie bracket approach from Reference 26 are limited to a rather small class of second-order dynamic systems.3-5 Moreover, these extensions lead to an unbounded growth of velocities with increasing amplitudes and frequencies of the employed perturbation signals. It is therefore desirable to develop methods that can be applied to a larger class of open-loop unstable mechanical systems, and which lead to uniformly bounded velocities in the closed-loop system.
In the present paper, we follow a recently established approach to design extremum seeking control for mechanical systems. The approach was first introduced in Reference 30 for the particular case of an acceleration-controlled unicycle and, since then, has been extended to a larger class of mechanical systems.7,31-35 The strategy for second-order dynamic systems shares certain similarities with the approach for first-order kinematic systems as described in the previous paragraph. It also involves periodic perturbation signals with sufficiently large amplitudes and frequencies to induce an approximation of Lie brackets. However, in contrast to the kinematic approach, the velocities in the closed-loop system remain uniformly bounded with increasing amplitudes and frequencies of the employed perturbation signals. Moreover, the approximated Lie brackets are fundamentally different. While the kinematic approach involves Lie brackets of pairs of vector fields on the configuration manifold, the dynamic approach involves iterated Lie brackets of vector fields on the tangent bundle of the configuration manifold. The approximation property can be revealed through an application of the general averaging theory from Reference 36. One can identify the Lie brackets on the tangent bundle as so-called symmetric products of vector fields on the configuration manifold.37

The present paper complements and extends our recent work35 on extremum seeking control for fully actuated mechanical systems on Lie groups in the absence of dissipation. An implementation of the approach in Reference 35 requires neither measurements of the configuration nor measurements of the velocity. The only information about the current system state is given by real-time measurements of a scalar configuration-dependent objective function whose value shall be minimized. The method in Reference 35 is based on an approximation of symmetric products as indicated in the previous paragraph. The symmetric products steer the closed-loop system into the negative gradient direction of the objective function. Moreover, the method in Reference 35 can be successfully applied in environments with little or no naturally occurring dissipation. A damping effect is induced by adding phase to the signal that estimates the negative gradient direction. However, the analysis in Reference 35 is limited to fully actuated control systems; that is, it is assumed that the control vector fields span the tangent space at every point of the configuration manifold. An application of the method in Reference 35 to underactuated systems has not been investigated so far.
The specific class of underactuated control systems we consider here is motivated by the investigations in Reference 38 on stabilization of Euler-Poincaré mechanical systems. The configuration manifold is assumed to be the Cartesian product of a non-Abelian Lie group with an Abelian Lie group (a product of tori and lines), the Lagrangian is assumed to be left-invariant in the non-Abelian variables and cyclic in the Abelian variables, and the controls act only on the cyclic variables. This class of systems can be used to describe rigid bodies that are controlled by momentum exchange devices such as momentum wheels. In particular, the class contains simple models for satellites and underwater vehicles with internal rotors.38 Many different stabilizing control laws have been proposed and analyzed over the past decades. The problem of stabilizing relative equilibria (i.e., motions with constant velocity) is investigated, for example, in References 39-41, and the problem of stabilizing states with zero velocity is investigated, for example, in References 42-45. Implementations of the existing methods to stabilize relative equilibria require measurements of the velocity, and methods to stabilize states with zero velocity require measurements of the entire system state and knowledge of the desired configuration. In the present paper, we study a different stabilization problem. The desired configuration is the unknown minimum point of an analytically unknown scalar configuration-dependent objective function. We assume access to real-time measurements of the values of the objective function, but no other information about the current system state. For instance, in case of a satellite with internal rotors, the objective function could describe an analytically unknown signal that attains an extreme value for a certain unknown attitude of the satellite. We do not assume that the satellite can measure its own orientation (relative to a fixed inertial frame) or its own rotational velocity. Moreover, we allow the satellite to have an unknown nonzero spatial angular momentum. In this situation, we are interested in a feedback law that asymptotically stabilizes the carrier about the optimal configuration with zero velocity. We will show that our control law leads to practical asymptotic stability, where the word "practical" means that we can ensure convergence of solutions into an arbitrarily small neighborhood of the desired configuration, depending on the amplitudes and frequencies of the employed perturbation signals. More generally, we prove that the practical stability results from Reference 35 for fully actuated systems remain true for the above-described class of underactuated systems.
The paper is organized as follows. In Section 2, we introduce the technical notion of practical asymptotic stability that is used later to describe the behavior of the extremum seeking closed-loop system. A description of the control system and a reduction to a control system on a smaller state manifold (using symmetry and conservation of momentum) is given in Section 3. The extremum seeking control law as well as an averaging and stability analysis for the closed-loop system are presented in Section 4. At various points we illustrate our considerations by the above-mentioned example of a satellite with internal rotors and provide numerical simulation results.

Notation: L and l denote the left-invariant kinetic energy Lagrangian given by the metric G and its restriction to 𝔥 × TG, respectively.

PRACTICAL STABILITY AND APPROXIMATIONS OF TRAJECTORIES
We begin by introducing suitable notions of practical stability and attraction, which are tailored to the approach that we present in the subsequent sections. The closed-loop system under extremum seeking control will be a parameter- and time-dependent system on a certain state manifold. The state manifold M is assumed to be an embedded submanifold of some higher-dimensional Euclidean space. We use the Euclidean norm | ⋅ | of the ambient Euclidean space to measure distances of points of M. To represent the closed-loop system, for every positive real number ω, let f_ω be a time-dependent vector field on M. The parameter ω will later play the role of an amplitude and frequency parameter for the periodic dither signals. We suppose that, for every ω > 0, every t_0 ∈ R, and every p_0 ∈ M, there exists a unique maximal solution of the ordinary differential equation

ṗ(t) = f_ω(t, p(t)) (1)

with initial condition p(t_0) = p_0. In the subsequent sections, we want to stabilize a system of the form (1) by approximating the solutions of another system with desirable stability properties. To reveal the approximating solutions, we will perform a suitable change of variables of the form

p̃(t) = (φ_ω^t)^{-1}(p(t)), (2)

where φ_ω^t : M → M is a bijective map for each ω > 0 and each t ∈ R. We will use the following terminology to describe the behavior of a parameter- and time-dependent system of the form (1) in the variables (2).

Definition 1. We say that a point p* of M is practically uniformly stable for (1) in the variables (2) if, for every ε > 0, there exist ω_0 > 0 and δ > 0 such that, for every ω ≥ ω_0, every t_0 ∈ R, and every p̃_0 ∈ M with |p̃_0 − p*| ≤ δ, the maximal solution p of (1) with initial condition p(t_0) = φ_ω^{t_0}(p̃_0) exists on [t_0, ∞) and satisfies |(φ_ω^t)^{-1}(p(t)) − p*| ≤ ε for every t ≥ t_0.

Definition 2. Let p* be a point of M and let S be a neighborhood of p* in M. We say that p* is S-practically uniformly attractive for (1) in the variables (2) if, for all ε, r > 0, there exist ω_0 > 0 and Δ, R > 0 such that, for every ω ≥ ω_0, every t_0 ∈ R, and every p̃_0 ∈ S with |p̃_0 − p*| ≤ r, the maximal solution p of (1) with initial condition p(t_0) = φ_ω^{t_0}(p̃_0) exists on [t_0, ∞), satisfies |(φ_ω^t)^{-1}(p(t)) − p*| ≤ R for every t ≥ t_0, and satisfies |(φ_ω^t)^{-1}(p(t)) − p*| ≤ ε for every t ≥ t_0 + Δ.

Definition 3. Let p* be a point of M and let S be a neighborhood of p* in M. We say that p* is S-practically uniformly asymptotically stable for (1) in the variables (2) if p* is both practically uniformly stable and S-practically uniformly attractive for (1) in the variables (2).

We replace the prefix "S-" in Definitions 2 and 3 by the word "locally" if the subset S of M is an unknown small neighborhood of p*. If the map φ_ω^t is the identity map of M for every ω > 0 and every t ∈ R, then we omit the phrase "in the variables (2)" in Definitions 1 to 3. If the vector fields f_ω are independent of the parameter ω, then we omit the word "practically" in Definitions 1 to 3. If, in addition to f_ω being independent of ω, the vector field f_ω = f is also independent of time, then we also omit the word "uniformly" in Definitions 1 to 3.
Let f be a time-dependent vector field on M. Again, we suppose that, for every t_0 ∈ R and every p_0 ∈ M, there exists a unique maximal solution of the ordinary differential equation

ṗ(t) = f(t, p(t)) (3)

with initial condition p(t_0) = p_0. Now we relate the solutions of (1) and (3) in the limit ω → ∞.
Definition 4. We say that the solutions of (1) in the variables (2) approximate the solutions of (3) if, for all ε, Δ > 0 and every compact subset K of M, there exists ω_0 > 0 such that, for every t_0 ∈ R and every p_0 ∈ M, the following implication holds: If the maximal solution p̄ of (3) with initial condition p̄(t_0) = p_0 satisfies p̄(t) ∈ K for every t ∈ [t_0, t_0 + Δ], then, for every ω ≥ ω_0, the maximal solution p of (1) with initial condition p(t_0) = φ_ω^{t_0}(p_0) exists on [t_0, t_0 + Δ] and satisfies |(φ_ω^t)^{-1}(p(t)) − p̄(t)| ≤ ε for every t ∈ [t_0, t_0 + Δ].

If the solutions of (1) in the variables (2) approximate the solutions of (3), then stability properties of (3) carry over to (1) in the variables (2) as follows.
Proposition 1. Suppose that the embedded manifold M is a topologically closed subset of the ambient Euclidean space. Assume that the solutions of (1) in the variables (2) approximate the solutions of (3). Then, for every point p* of M and every neighborhood S of p* in M, the following implication holds: If p* is S-uniformly asymptotically stable for (3), then p* is S-practically uniformly asymptotically stable for (1) in the variables (2).

A proof of Proposition 1 can be found in Appendix A of Reference 35.
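The role of the parameter ω in these definitions can be illustrated with a standard scalar example from the Lie bracket extremum seeking literature (an illustration only, not this paper's Lie group setting): for the law ẋ = √ω cos(ωt) − √ω sin(ωt) J(x) with objective J(x) = (x − 1)², the averaged system is ẋ̄ = −(1/2) J′(x̄), and solutions converge into a neighborhood of the minimizer x* = 1 whose size shrinks as ω grows. All numerical values below are illustrative assumptions.

```python
import numpy as np

def simulate(omega, T=6.0, dt=2e-5, x0=0.0):
    """Euler simulation of the scalar extremum seeking law
    xdot = sqrt(omega)*cos(omega*t) - sqrt(omega)*sin(omega*t)*J(x),
    whose Lie bracket average is xbar_dot = -(1/2)*J'(xbar)."""
    J = lambda x: (x - 1.0) ** 2            # analytically "unknown" objective
    period = 2.0 * np.pi / omega
    x, tail = x0, []                        # tail: samples over the final dither period
    for k in range(int(T / dt)):
        t = k * dt
        x += dt * np.sqrt(omega) * (np.cos(omega * t) - np.sin(omega * t) * J(x))
        if t > T - period:
            tail.append(x)
    return np.mean(tail)                    # period average ~ averaged state

# Larger omega => convergence into a smaller neighborhood of the minimizer x* = 1.
print(simulate(omega=100.0))
```

The printed period average lands close to the minimizer, while the raw trajectory keeps oscillating with amplitude of order 1/√ω, which is exactly the "practical" part of practical uniform asymptotic stability.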

Configuration manifold and Lie algebras
We consider a simple mechanical system whose configuration manifold Q is the Cartesian product of two (finite-dimensional) Lie groups H and G. We assume that the manifolds H and G are properly embedded into higher-dimensional Euclidean spaces. Recall that the Lie algebra of a Lie group is the tangent space at the identity together with the Lie bracket [⋅, ⋅] that is given by the usual Lie bracket of left-invariant vector fields through the canonical identification. The Lie algebra of H is denoted by 𝔥, and the tangent bundle of G is denoted by TG. Throughout the paper, we make the following assumption, which is also common in the context of Routh reduction for Lagrangian systems (see, e.g., Section 8.9 in Reference 46).

Assumption 1.
The Lie algebra of G is Abelian.
Our goal will be to control the variables lying in the quotient manifold Q∕G ≅ H (also called shape space) using controls which act directly on the variables lying in G. For example, for a satellite with internal rotors (which is treated in detail later), H is the special orthogonal group and G is the configuration manifold of the internal rotors.
Note that, up to isomorphism, a connected Lie group with Abelian Lie algebra is always a Cartesian product of a torus and a Euclidean space. Let {b_i}_i be a basis for 𝔥 and let the structure constants c^j_{ki} of (𝔥, [⋅, ⋅]) be defined by [b_k, b_i] = c^j_{ki} b_j for all i, k ∈ {1, … , n}, where n is the dimension of H. Here and in the following we use the standard summation convention and drop the summation sign when there are repeated indices. Summation indices i, j, k run from 1 to n, and summation indices a, b run from 1 to m. Let H → GL(𝔥), x ↦ Ad_x denote the adjoint representation of H. We write

Ad_x b_i = A^j_i(x) b_j (5)

for every x ∈ H and every i ∈ {1, … , n}. For every x ∈ H, let L_x : H → H be the left translation map by x and let T_e L_x denote the tangent map of L_x at the identity e of H.
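For the group H = SO(3) used later in the paper, this machinery can be made concrete: with the basis b_i = ê_i of skew-symmetric matrices, the structure constants are the Levi-Civita symbols, and Ad_R acts by conjugation, so the coefficient matrix A(R) in (5) is R itself. A small numerical check (illustrative, with an arbitrary rotation and vector):

```python
import numpy as np

def hat(a):
    """Map a vector in R^3 to the skew-symmetric matrix with hat(a) @ b == a x b."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def bracket(X, Y):
    return X @ Y - Y @ X

# Basis b_i = hat(e_i) of so(3); the structure constants are the Levi-Civita
# symbols: [b_1, b_2] = b_3, [b_2, b_3] = b_1, [b_3, b_1] = b_2.
e = np.eye(3)
b = [hat(e[i]) for i in range(3)]
print(np.allclose(bracket(b[0], b[1]), b[2]))

# Adjoint representation of SO(3): Ad_R hat(v) = R hat(v) R^T = hat(R v),
# so A(R) = R in the coordinates of (5).
theta, axis = 0.7, np.array([0.0, 0.0, 1.0])
K = hat(axis)
R = np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)  # Rodrigues formula
v = np.array([0.3, -1.2, 0.5])
print(np.allclose(R @ hat(v) @ R.T, hat(R @ v)))
```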

Controlled Euler-Poincaré equations and problem statement
We assume that the Lagrangian L : TQ → R for the mechanical system under consideration is given by a left-invariant Riemannian metric on the product Lie group Q = H × G. In particular, invariance of L under the action of G ensures that we can represent L in terms of cyclic coordinates for TG; that is, natural coordinates (θ^1, … , θ^m, θ̇^1, … , θ̇^m) for TG such that the coordinate representation of L does not explicitly depend on θ^1, … , θ^m. Let l : 𝔥 × TG → R be the restriction of L to the Lie algebra of H. The Lagrangian L is assumed to be a purely kinetic energy Lagrangian (no potential energy) such that l takes the form

l(v, θ̇) = (1/2) g_ij v^i v^j + g_ia v^i θ̇^a + (1/2) g_ab θ̇^a θ̇^b, (6)

where the v^i are the components of v ∈ 𝔥 with respect to the basis {b_i}_i for 𝔥 and the real numbers g_ij, g_ia, g_ab are the entries of an (n + m)-dimensional symmetric and positive definite matrix. Recall that the summation indices i, j, k run from 1 to n, and that the summation indices a, b run from 1 to m. Note that using the same letter "g" for all entries g_ij, g_ia, g_ab is a slight (but common) abuse of notation. Recall that we use the notation c^j_{ki} for the structure constants of (𝔥, [⋅, ⋅]). The controls are assumed to act only on the cyclic coordinates θ^1, … , θ^m. One may interpret this particular type of actuation as control by momentum exchange devices such as momentum wheels. For instance, in case of a momentum wheel, a motor supplies a torque −u_a to the ath wheel. Consequently, an equal and opposite torque +u_a is exerted by the wheel on the system. This leads us to the controlled Euler-Poincaré equations

d/dt (∂l/∂v^i) = c^j_{ki} v^k (∂l/∂v^j), (7)
d/dt (∂l/∂θ̇^a) = u_a, (8)

where v = (T_e L_x)^{-1} ẋ is the 𝔥-valued body velocity of a differentiable curve x in H, θ is a curve in G, and the u_a are real-valued inputs for a control law. The papers38,47, which complement and extend the work in References 48,49, provide a constructive approach to the determination of stabilizing control laws for Lagrangian mechanical systems of the form (7)-(8). Under certain conditions, the method in References 38,47 can lead to asymptotic stability of relative equilibria, where the relative equilibria are motions with constant H-velocity v ≠ 0 and vanishing G-velocity θ̇. An implementation of the method in References 38,47 requires measurements of all velocities. In the present paper, we investigate a different stabilization problem for (7)-(8), which is motivated by the problem of designing extremum seeking control (cf. References 15,17,18). Suppose that the only information about the current system state of (7)-(8) is given by real-time measurements of an output channel

y = ψ(x), (9)

where ψ is a smooth real-valued function on the shape space H. The function ψ is analytically unknown, meaning that the rule according to which an element x ∈ H is assigned the value ψ(x) is unknown. We consider ψ as an objective (or cost) function. Our goal is to design an output feedback law that asymptotically stabilizes the shape variables in H about points where ψ attains a minimum value. To this end, we extend a recently established approach to extremum seeking control for fully actuated systems on Lie groups from Reference 35. Here, fully actuated means that the controls can instantaneously induce a force in any direction of the configuration manifold H × G. Note that (7)-(8) is an underactuated system since the controls u_a only act in the direction of the variables in G. Our stability analysis will show that, under suitable assumptions, the extremum seeking method from Reference 35 is not limited to fully actuated systems, but can be successfully applied to an underactuated system of the form (7)-(8).

Reduction of the control system
Next, we analyze and reduce the control system (7)-(8) with configuration manifold H × G to a control system with configuration manifold H by using symmetry and conservation of momentum. If the inputs u_a in (8) are zero, then the conjugate momenta ∂l/∂θ̇^a are conserved. Moreover, combining the body momentum ∂l/∂v with the coefficients A^j_i yields the H-momentum map 𝒥 in (11), expressed with respect to the basis dual to {b_i}_i (see, e.g., Definition 5.68 in Reference 52), where the A^j_i are given by (5) and l is the (reduced) Lagrangian in (6). As a consequence of Noether's conservation law (see, e.g., Theorem 5.69 in Reference 52), we get the following property.
Fact 1 (Noether's conservation law). For every choice of the inputs u_1, … , u_m, the H-momentum map 𝒥 is a conserved quantity along solutions of Equations (7) and (8).
Using (11) and the fact that the adjoint representation Ad : H → GL(𝔥) is a Lie group homomorphism, we can write the controlled Euler-Poincaré Equations (7) and (8) equivalently as Equations (12) and (13). Let g^{ab} denote the entries of the inverse of the (m × m)-matrix with entries g_ab. Solving (13) for θ̈^a gives (14). Substituting (14) into (12), we obtain the reduced control system* (15) on H × 𝔥, where the 𝒥_l(x, v, θ̇) are constant by Fact 1. We set

J_ij := g_ij − g_ia g^{ab} g_jb (16)

for all i, j ∈ {1, … , n}. Using a known formula for the inverse of a partitioned matrix (see, e.g., Theorem 8.2.1 in Reference 53) and the fact that g_ij, g_ia, and g_ab are the entries of an (n + m)-dimensional symmetric and positive definite matrix, one can verify the following property.
Fact 2. The (n × n)-matrix J with entries J_ij defined by (16) is symmetric and positive definite.
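Fact 2 says that J is the Schur complement of the block g_ab in the full kinetic energy matrix, and the partitioned-inverse formula identifies J^{-1} with the H-block of the inverse metric. A quick numerical check (illustrative only, with randomly generated data):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 4

# Random symmetric positive definite (n+m)-dimensional kinetic energy matrix
# with blocks g_ij (n x n), g_ia (n x m), g_ab (m x m).
A = rng.standard_normal((n + m, n + m))
G = A @ A.T + (n + m) * np.eye(n + m)
g_hh, g_ha, g_aa = G[:n, :n], G[:n, n:], G[n:, n:]

# J_ij = g_ij - g_ia g^{ab} g_jb  (Schur complement of the g_ab block).
J = g_hh - g_ha @ np.linalg.inv(g_aa) @ g_ha.T

print(np.allclose(J, J.T))               # symmetric
print(np.linalg.eigvalsh(J).min() > 0)   # positive definite
# Partitioned-inverse formula: the H-block of G^{-1} is exactly J^{-1}.
print(np.allclose(np.linalg.inv(G)[:n, :n], np.linalg.inv(J)))
```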
Because of Fact 2, the matrix J is the representation matrix of an inner product J on 𝔥 with respect to the basis {b_i}_i. Let {b^i}_i denote the basis for 𝔥* dual to {b_i}_i. For every a ∈ {1, … , m}, define the covector f_a ∈ 𝔥* by (17). Let J^♭ : 𝔥 → 𝔥* denote the canonical isomorphism induced by J. Now we can write the reduced control system (15a,b) equivalently as Equations (18a,b), where Ad : H → GL(𝔥) is the adjoint representation of H and ad : 𝔥 → 𝔤𝔩(𝔥) is the adjoint representation of 𝔥.
Example 1 (Rigid body with internal rotors). We consider a rigid body fixed at a point. The Lie group H = SO(3) of (3 × 3)-rotation matrices describes the set of all possible orientations. An element R of SO(3) maps the inertially fixed standard unit vectors e_1, e_2, e_3 in R^3 to the unit vectors Re_1, Re_2, Re_3 of the body frame.
The Lie algebra of SO(3) is the space 𝔥 = 𝔰𝔬(3) of skew-symmetric (3 × 3)-matrices. Define an isomorphism ⋅̂ : R^3 → 𝔰𝔬(3) by

Ω̂ = [[0, −Ω_3, Ω_2], [Ω_3, 0, −Ω_1], [−Ω_2, Ω_1, 0]]

for every vector Ω ∈ R^3 with components Ω_1, Ω_2, Ω_3. Let ⋅̌ : 𝔰𝔬(3) → R^3 denote the inverse of ⋅̂. The kinematic evolution of the orientation may now be expressed as Ṙ = R Ω̂, where the R^3-valued map Ω := (R^{-1} Ṙ)̌ describes the body angular velocity. Following the descriptions of a simple control model for a satellite in References 54-56, we suppose that the rigid body is carrying m momentum wheels (internal rotors) as illustrated in Figure 1. We assume that each rotor is spinning about an axis of symmetry passing through the center of mass of the body. The configuration manifold for the m rotors is the m-torus G = T^m; cf. Subsection 3.1. For each a ∈ {1, … , m}, let f_a be the unit vector in R^3 such that, for every orientation R ∈ SO(3) of the carrier, the spinning axis of the ath rotor is given by the unit vector Rf_a. Let F be the (3 × m)-matrix with columns f_1, … , f_m. Let J be a symmetric and positive definite (3 × 3)-matrix describing the inertia of the rigid body with locked wheels minus the inertias of the wheels about their spin axes. Let J_rotor be a diagonal and positive definite (m × m)-matrix describing the inertias of the wheels about their spin axes. Then J + F J_rotor F^T describes the inertia of the rigid body with locked wheels. Let θ̇^1, … , θ̇^m denote the spin velocities of the rotors relative to the carrier, combined into a column vector Ω_r ∈ R^m. The reduced Lagrangian l : 𝔰𝔬(3) × TT^m → R is assumed to be given by

l(Ω, Ω_r) = (1/2) Ω^T J Ω + (1/2) (Ω_r + F^T Ω)^T J_rotor (Ω_r + F^T Ω).

Then, the controlled Euler-Poincaré Equations (7) and (8) can be written as Equations (22a,b), where u is an m-component vector of real-valued inputs for a control law. The above matrix J is the same as the matrix J with entries J_ij defined in (16). A direct computation shows that the reduced control system (18a,b) can be written as the system (23a,b) on SO(3) × R^3, where the vector

π = R (J Ω + F J_rotor (Ω_r + F^T Ω))

represents the (unknown) constant spatial angular momentum of the entire system. It is known from Reference 56 that system (23a,b) is controllable if the spin axes f_1, … , f_m span R^3, but otherwise system (23a,b) fails to be controllable or even accessible.

FIGURE 1. Rigid body fixed at a point with three internal rotors (momentum wheels) for attitude control. Each rotor is assumed to spin about an axis of symmetry passing through the center of mass. The internal control torques on the rotors conserve the total spatial angular momentum.
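As a sanity check of the kinematics Ṙ = R Ω̂ and of the momentum conservation expressed by Fact 1, the following sketch simulates the special case of zero inputs and zero rotor momentum, in which the body momentum is J Ω and Euler's equations J Ω̇ = (J Ω) × Ω hold; the spatial momentum R J Ω should then stay constant. The inertia values and initial data are illustrative assumptions.

```python
import numpy as np

def hat(a):
    """hat(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def exp_so3(w):
    """Rodrigues formula for the matrix exponential of hat(w)."""
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    K = hat(w / th)
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

# Illustrative inertia and initial data (zero rotor momentum, zero inputs).
Jb = np.diag([3.0, 2.0, 1.0])
Jinv = np.linalg.inv(Jb)
R = np.eye(3)
Om = np.array([0.3, -0.2, 0.5])          # body angular velocity
pi0 = R @ Jb @ Om                        # spatial angular momentum at t = 0

dt, steps = 1e-4, 10000
for _ in range(steps):
    Om = Om + dt * (Jinv @ np.cross(Jb @ Om, Om))   # Euler's equations
    R = R @ exp_so3(dt * Om)                        # attitude update, stays in SO(3)

print(np.linalg.norm(R @ Jb @ Om - pi0))            # momentum drift (small)
print(np.linalg.norm(R.T @ R - np.eye(3)))          # orthogonality error (tiny)
```

Because each attitude step is an exact rotation, R remains orthogonal to machine precision, and the small momentum drift is due only to the explicit Euler step on Ω.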

Orthonormalization of the control directions
So far, we have reduced the controlled Euler-Poincaré Equations (7) and (8) to the controlled Equations (18a,b). Let J^♯ : 𝔥* → 𝔥 denote the inverse of J^♭ : 𝔥 → 𝔥*. Then we can write the reduced control system (18a,b) equivalently as Equations (25a,b), where μ_0 ∈ 𝔥* is the constant (but unknown) value of the H-momentum map 𝒥 in (11). The value μ_0 is determined by the initial state of the system. The output of (25a,b) is still given by (9). To steer (25a,b) into a descent direction of the objective function ψ, we need immediate access to every direction on H. For this reason, we assume that the reduced control system (25a,b) is fully actuated in the following sense.

Assumption 2. The control directions J^♯(f_1), … , J^♯(f_m) span 𝔥.
From now on, we suppose that Assumption 2 is satisfied. Then we can find real constants S^a_i, i = 1, … , n, a = 1, … , m, such that the vectors

e_i := S^a_i J^♯(f_a) (26)

form an orthonormal basis for 𝔥 with respect to J. A computation of suitable coefficients S^a_i requires knowledge of the control directions J^♯(f_1), … , J^♯(f_m) and the inner product J, but no information about the current system state or the objective function. Now we apply the linear input transformation

u_a = S^a_i u^i_o (27)

with new real-valued inputs u^1_o, … , u^n_o for the orthonormal control vectors e_1, … , e_n. Then, the reduced control system (25) becomes the system (28a,b), where the endomorphism G(μ_0, x) of 𝔥 is defined by (29). The map G(μ_0, x) describes a gyroscopic force since it is skew-symmetric with respect to J; that is,

J(G(μ_0, x)v, w) = −J(v, G(μ_0, x)w)

for all v, w ∈ 𝔥.

Example 2 (Example 1 cont'd).
For the rigid body with internal rotors in Example 1, Assumption 2 means that the spin axis directions f_1, … , f_m of the rotors span R^3. The assumption of three linearly independent rotor directions is also made in the papers43,45 on stabilization by full state feedback. If Assumption 2 is satisfied, then the matrix F with column vectors f_1, … , f_m has full rank, and therefore we can find an (m × 3)-matrix S such that

F S = √J,

where √J is the square root of the symmetric and positive definite matrix J. The linear input transformation (27) becomes

u = S u_o,

where u_o is a 3-component vector of inputs for the orthonormal control directions. The reduced control system (28a,b) with orthonormal control directions reads as (33a,b), where the gyroscopic term in (29) is represented by (34). Note that the columns of (√J)^{-1} in (33b) form an orthonormal basis for R^3 with respect to the inner product induced by J.
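One concrete way to obtain such a matrix S (a possible choice, not necessarily the paper's construction) is the pseudoinverse solution S = F⁺ √J, which satisfies F S = √J whenever F has full row rank. The axes and inertia values below are illustrative assumptions.

```python
import numpy as np

# Illustrative data: m = 4 rotor axes spanning R^3 (Assumption 2), inertia J.
F = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.577, 0.577, 0.577]]).T    # 3 x 4, columns are the axes f_a
J = np.array([[3.0, 0.2, 0.0],
              [0.2, 2.0, 0.1],
              [0.0, 0.1, 1.5]])

# Symmetric square root of J via eigendecomposition.
w, V = np.linalg.eigh(J)
sqrtJ = V @ np.diag(np.sqrt(w)) @ V.T

# Pseudoinverse solution of F S = sqrt(J); exact since rank(F) = 3.
S = np.linalg.pinv(F) @ sqrtJ
print(np.allclose(F @ S, sqrtJ))

# Columns of inv(sqrtJ) are orthonormal with respect to the inner product J.
E = np.linalg.inv(sqrtJ)
print(np.allclose(E.T @ J @ E, np.eye(3)))
```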

EXTREMUM SEEKING CONTROL
Throughout this section, we suppose that Assumptions 1 and 2 are satisfied. In the next subsection, we explain how we get access to the gradient of the analytically unknown objective function ψ in (9) and how we estimate the unknown velocity v in the reduced control system (28a,b).

Gradient and velocity estimates through approximations of symmetric products
For the moment, we return to the initially considered controlled Euler-Poincaré Equations (7) and (8). Recall that the kinetic energy Lagrangian is assumed to be given by a left-invariant Riemannian metric on the product Lie group Q = H × G. Following the notation in Reference 52, we denote this Riemannian metric by the symbol G (not to be confused with the Lie group G; the meaning will always be clear from the context). The assumption of a left-invariant Riemannian metric ensures that the controlled Euler-Lagrange equations can be simplified to controlled Euler-Poincaré equations of the form (7) and (8); see Theorem 5.45 in Reference 52. In this subsection, we will describe the control system by Euler-Lagrange equations. To this end, let ∇ denote the left-invariant Levi-Civita connection for the metric G.
For every differentiable curve q in Q, let ∇_q̇ q̇ denote the covariant derivative of q̇ (also known as the geometric acceleration of q). After performing the linear input transformation (27), the controlled Euler-Poincaré Equations (7) and (8) can be equivalently written as the controlled Euler-Lagrange equations

∇_q̇(t) q̇(t) = u^i_o(t) X_i(q(t)), (35)

where X_1, … , X_n are left-invariant vector fields on Q. Let G^♯ : T*Q → TQ be the sharp map (musical isomorphism) of G, where T*Q is the cotangent bundle of Q. Let pr_H and pr_G denote the projections of H × G onto H and G, respectively. We identify TQ with TH × TG and T*Q with T*H × T*G. Because of the particular structure of (7) and (8), each vector field X_i is of the form

X_i = G^♯(0*_H, C_i), (36)

where 0*_H is the identically zero covector field on H and C_i is some left-invariant covector field on G. On the other hand, because of the linear input transformation (27), each vector field X_i can also be written as

X_i = (E_i, D_i), (37)

where E_i is the left-invariant vector field on H whose value at the identity is the vector e_i given by (26) and D_i is some left-invariant vector field on G. The measured output (9) can be written as y = ψ(pr_H(q)).
Recall that we are interested in an output feedback law that steers the system towards a minimum of the objective function ψ. Therefore, we need access to descent directions of ψ. This can be achieved by means of the following perturbation-based extremum seeking approach from Reference 33. Let T be a positive real number and let u_1, … , u_n : R → R be measurable, bounded, T-periodic functions with zero mean. The functions u_1, … , u_n will act as perturbation signals. Let α : R → R be a smooth design function, which is specified later. Let h be a positive constant. For each i ∈ {1, … , n}, we consider the output feedback law (39), where η is the state of a high-pass filter to remove a possible offset from the sensed signal y and ω is a (sufficiently large) positive real control parameter. Control law (39) applied to (35) leads to the closed-loop system (40), where, for each i ∈ {1, … , n}, the (not necessarily left-invariant) vector field Y_i on Q is defined by (41). For the sake of simplicity, we suppress the dependence of Y_i on η in the notation.
There is a well-established averaging theory for oscillatory affine connection systems of the form (40): it is known from Reference 36 (see also Chapter 9 in Reference 52 for a textbook reference) that, for sufficiently large ω > 0, a system of the form (40) approximates the behavior of a certain averaged system. The averaged system is again an affine connection system and contains symmetric products of vector fields on the right-hand side. Recall (see, e.g., Reference 57 for the original reference or Definition 3.106 in Reference 52 for a textbook reference) that the symmetric product of two smooth vector fields X and Y on Q (with respect to ∇) is the vector field

⟨X : Y⟩ = ∇_X Y + ∇_Y X. (42)

The averaging analysis in Reference 36 leads to the result that the averaged system associated with (40) is the system (43), where

Λ_ij = (1/T) ∫_0^T U_i(τ) U_j(τ) dτ (44)

and U_1, … , U_n : R → R are the zero-mean antiderivatives of u_1, … , u_n. Now we take a closer look at the averaged system (43). Using known computation rules for the covariant derivative as well as (37), (41), and (42), we compute the symmetric products (45a,b) for all i, j ∈ {1, … , n} and every q ∈ Q with x = pr_H(q) ∈ H, where the averaged filter state η̄ is treated as a constant and E_i ψ : H → R denotes the Lie derivative of ψ with respect to E_i. Next, using that the Levi-Civita connection ∇ and the vector fields X_i are left-invariant as well as (36) and Assumption 1, one can verify that the symmetric products ⟨X_i : X_j⟩ in (45b) are identically zero. Moreover, we choose the perturbations u_1, … , u_n so that the coefficients Λ_ij in (44) are equal to 1∕2 for i = j and 0 otherwise. Then, the averaged system (43) becomes the system (46), where we write x = pr_H(q). Using (37) and that the vectors e_i in (26) form an orthonormal basis for 𝔥 with respect to J, we obtain that the H-component of the right-hand side of (46) is given by (47), where grad ψ is the gradient of ψ with respect to the left-invariant Riemannian metric on H induced by J.
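One simple choice of perturbations achieving Λ_ij = 1/2 for i = j and 0 otherwise (an illustrative assumption, not necessarily the paper's choice) is u_i(t) = (2π n_i/T) cos(2π n_i t/T) with pairwise distinct positive integers n_i, whose zero-mean antiderivatives are U_i(t) = sin(2π n_i t/T):

```python
import numpy as np

T = 1.0
freqs = [1, 2, 3]                        # pairwise distinct integers n_i
t = np.linspace(0.0, T, 200001)

# Zero-mean antiderivatives U_i of the dithers u_i(t) = (2*pi*n_i/T)*cos(2*pi*n_i*t/T).
U = [np.sin(2 * np.pi * n * t / T) for n in freqs]

# Lambda_ij = (1/T) * integral over one period of U_i * U_j,
# approximated here by the mean over a fine uniform grid.
Lam = np.array([[np.mean(U[i] * U[j]) for j in range(3)] for i in range(3)])
print(np.round(Lam, 4))                  # approximately 0.5 * identity
```

By orthogonality of sinusoids with distinct integer frequencies, the off-diagonal entries vanish and the diagonal entries equal the mean of sin², namely 1/2.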
Since we are interested in descent directions of the objective function, we choose the design function so that the product of the design function and its derivative attains only positive values. Note that this gradient is not the same as the H-component of the gradient of the composition of the objective function with pr H with respect to the left-invariant metric G.
For this reason, the right-hand side of (46) does not describe a gradient force with respect to the left-invariant Riemannian metric G. The situation changes, however, if we consider the reduced control system (28a,b) with respect to the inner product J. One can show (see Subsection 4.3) that the averaged system associated with the reduced control system (28a,b) under the feedback law (39) is the system (48a,b), where the map from H into the Lie algebra on its right-hand side is defined by (49). One can show that minimum points of the objective function lead to stable equilibria of the averaged system (48a,b). However, since there is no dissipative term in (48a,b), we cannot expect asymptotic stability. If the current velocity v in (48a,b) were known, then asymptotic stability could be induced by a suitable velocity feedback (e.g., linear damping). However, as explained in Subsection 3.2, we do not assume access to measurements of the velocity; only measurements of the output (9) are available. To generate a damping effect, we use the following approach from Reference 35. Suppose for the moment that we have access to the negative gradient term on the right-hand side of (48b). Then we can generate a damping effect by adding phase to the time-varying negative gradient "signal." To this end, we introduce the phase-lead compensator (50a,b) with input and output y c , where a, b, and two further compensator constants are positive and chosen such that (52) holds. Now, if we replace the negative gradient term on the right-hand side of (48b) by the output of the phase-lead compensator, then we obtain the system (53a,b). We will show in Subsections 4.2 and 4.3 that a suitable modification of the output feedback law (39) leads to a closed-loop system that approximates the behavior of (53a,b). Moreover, we will show that minimum points of the objective function lead to asymptotically stable equilibria of (53a,b). This in turn will ensure practical asymptotic stability for the closed-loop system.
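The damping mechanism rests on the frequency-domain property of a phase-lead element: its phase is positive at every frequency, so it advances the gradient "signal" in a way that mimics velocity feedback. The sketch below evaluates the standard textbook lead compensator C(s) = (s + z)/(s + p) with 0 < z < p, whose phase peaks at √(zp); the specific parametrization (50a,b) and condition (52) in the paper may differ from this generic form.

```python
import cmath
import math

def phase_deg(z, p, w):
    """Phase (degrees) of the lead compensator C(s) = (s+z)/(s+p) at s = jw."""
    C = (1j * w + z) / (1j * w + p)
    return math.degrees(cmath.phase(C))

z, p = 1.0, 10.0                 # illustrative zero and pole, z < p
w_peak = math.sqrt(z * p)        # frequency of maximum phase lead
for w in (0.1, w_peak, 100.0):
    print(f"w = {w:7.3f}  phase = {phase_deg(z, p, w):6.2f} deg")
```

The maximum phase lead for this choice is arcsin((p − z)/(p + z)), about 55 degrees here, which is the "added phase" exploited in (53a,b).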

Extremum seeking control law
Now we give a precise description of the control strategy. As indicated in Subsection 4.1, the proposed method consists of two components. The first component is the oscillatory output feedback law (39) for the purpose of gradient approximation. The second component is an additional term that approximates the behavior of the phase-lead compensator (50a,b) for the purpose of damping. This second component can be omitted if damping occurs naturally due to velocity-dependent dissipation (e.g., friction or air resistance). However, for applications in a dissipation-free environment, as described in Examples 1 and 2, it is necessary to induce damping artificially in order to stabilize the system. Let T be a positive real number. For the perturbation signals, we choose measurable, bounded, T-periodic functions u 1 , … , u n ∶ R → R with zero mean such that their zero-mean anti-derivatives satisfy (54) for all i, j ∈ {1, … , n}. For instance, if n = 2, then we can choose u 1 ∶= √2 cos and u 2 ∶= √2 sin. Next, for the purpose of output feedback as in control law (39), we choose a smooth design function from R to R such that the product of the function and its derivative is strictly increasing and attains only positive values. For example, we can define the design function by (55), which ensures that the product with its derivative equals (1 + tanh)∕2 and is therefore positive. As explained in Subsection 4.1, our method involves a high-pass filter and a phase-lead compensator, for which we choose positive real constants h, a, b, and two further compensator constants such that (52) is satisfied. The proposed output feedback law for (28a,b) is (56), where the gain is a positive real control parameter, y is the measured output (9), the real-valued variables w 1 , … , w n are given by (57), and the real-valued high-pass filter state is given by (58). The purpose of (58) is to remove a possible offset from the measured output y, and the control parameter is supposed to be large compared to the filter constant h. Note that (56) contains the inputs for the reduced control system (28a,b) with orthonormal control directions. To obtain the inputs for the
initially given control system (7) and (8), one has to perform the linear input transformation (27). The first term on the right-hand side of (56) is the same as in (39). As indicated in Subsection 4.1, this term gives access to the gradient of the objective function; more precisely, it gives access to the last term on the right-hand side of (53b). The second term on the right-hand side of (56) is intended to induce damping. We will see that a damping effect occurs since (57) approximates the behavior of the phase-lead compensator (50a,b). Next, we write the closed-loop system in a compact form. To this end, we introduce several Lie-algebra-valued maps. Recall that the vectors e 1 , … , e n in (26) form an orthonormal basis with respect to J. The perturbation functions u 1 , … , u n and their zero-mean anti-derivatives U 1 , … , U n define the maps u ∶= u 1 e 1 + ⋯ + u n e n and U ∶= U 1 e 1 + ⋯ + U n e n on R. Similarly, the real-valued states w 1 , … , w n in (57) define the state w ∶= w 1 e 1 + ⋯ + w n e n . Now the closed-loop system, which is obtained by applying (56)-(58) to the reduced control system (28a,b), can be written as the system (59a-d).
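Two of the choices above can be checked numerically. First, for u 1 = √2 cos and u 2 = √2 sin, the zero-mean anti-derivatives U 1 = √2 sin and U 2 = −√2 cos are orthonormal in the averaged sense: the period average of U i U j is 1 for i = j and 0 otherwise (the normalization in (54) may differ by a constant factor). Second, since (55) itself is not reproduced above, we take the consistent candidate α(y) = sqrt(y + ln cosh y + 1), an assumption on our part, for which the product α·α′ equals (1 + tanh y)/2 and is positive.

```python
import math

def period_average(f, steps=100000):
    """Average of a 2*pi-periodic function over one period (Riemann sum)."""
    T = 2.0 * math.pi
    return sum(f(k * T / steps) for k in range(steps)) / steps

# Zero-mean anti-derivatives of u1 = sqrt(2) cos and u2 = sqrt(2) sin
U1 = lambda t: math.sqrt(2.0) * math.sin(t)
U2 = lambda t: -math.sqrt(2.0) * math.cos(t)
print(period_average(lambda t: U1(t) ** 2))       # close to 1
print(period_average(lambda t: U1(t) * U2(t)))    # close to 0

def alpha(y):
    """Hypothetical design function consistent with alpha*alpha' = (1+tanh)/2."""
    return math.sqrt(y + math.log(math.cosh(y)) + 1.0)

h = 1e-6
y = 0.7
aap = alpha(y) * (alpha(y + h) - alpha(y - h)) / (2.0 * h)
print(aap, (1.0 + math.tanh(y)) / 2.0)            # the two values agree
```

The constant 1 inside the square root keeps the radicand positive for all y, since y + ln cosh y is bounded below by −ln 2.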

Averaging and stability results
In the next step, we extract the averaged system associated with the closed-loop system (59a-d). We perform the change of variables (60) and consider the candidate (61a-d) for the averaged system, where the map in (61b) is defined by (49). Indeed, as a direct consequence of the averaging result in App. B of Reference 35, we obtain the following approximation result on the state manifold M = H ×  ×  × R in the terminology of Definition 4.
The gradient terms in (61b) and (61c) originate from symmetric product approximations, as indicated in Subsection 4.1.
Because of Propositions 1 and 2, it suffices to investigate stability properties of the averaged system (61a-d). If the gyroscopic term G( 0 , x) in (61b) vanishes, then (61a-d) reduces to a system which was already investigated in Reference 35. In this case, it is possible to provide a simple formula for a Lyapunov function for (61a-d): if the objective function attains a minimum value at some point x * of H, then a Lyapunov function V for (61a-d) is given by (62), where || ⋅ || denotes the norm induced by the inner product J and the auxiliary function from R to R is defined by (63). Because of the assumptions on the design function in Subsection 4.2, this auxiliary function is positive definite about 0. One can check that the derivative V̇ of V along solutions of (61a-d) is nonpositive if the gyroscopic term G( 0 , x) in (61b) vanishes. We discuss the case of a nonvanishing gyroscopic term further below. For the moment, we make the following assumption.
Assumption 3. The constant value  0 ∈  * of the H-momentum map is such that G( 0 , x)v = 0 for every x ∈ H and every v ∈ .
For a Lie group H with a non-Abelian Lie algebra, Assumption 3 means that the system is initialized with zero H-momentum. Such an assumption can also be found in Reference 42 for the particular case of a rigid body with internal rotors as in Example 1.
If Assumption 3 is satisfied and if the sublevel sets of V are compact, then the solutions of (61a-d) converge to the set on which V̇ vanishes. To ensure that this invariant set only contains the state with the desired configuration x * and zero velocity, we also need that the gradient of the objective function is nonzero outside x * . To state these assumptions precisely, we introduce the following notation. For every smooth manifold X, every real-valued function f on X, every point x of X, and every real number y, we let f (≤ y, x) denote the connected component of the y-sublevel set of f containing x.
Assumption 4. There exist a point x * ∈ H and a real number y 0 > (x * ) such that (≤ y 0 , x * ) is compact and such that the gradient of the objective function is nonzero at every point x ∈ (≤ y 0 , x * ) with x ≠ x * .
If Assumptions 3 and 4 are satisfied, then one can use LaSalle's invariance principle (as is done in Reference 35) to conclude that the averaged system (61a-d) is asymptotically stable. By Propositions 1 and 2, this implies the following practical stability property for the closed-loop system (59a-d) in the terminology of Definition 3.
In the presence of an arbitrary gyroscopic term (i.e., without Assumption 3), a proof of asymptotic stability for (61a-d) is more challenging. In this case, we can only prove local asymptotic stability under the additional assumption that the Hessian of the objective function is positive definite at the minimum point x * . Such an assumption can also be found in Sec. 40 of the classical textbook 58 by Chetaev on the stability of gradient systems with dissipative and gyroscopic forces. Using the notation for the averaged system (61a-d), the system considered in Reference 58 is (roughly) of the form (64a,b), where R(x) is a symmetric and positive definite endomorphism (Rayleigh dissipation). The linear damping term −R(x)v in (64b) leads to a loss of energy for nonzero velocities. However, there is no such velocity-dependent damping term in Equation (61b) of the averaged system. Instead, the term −b w in (61b) is intended to induce a damping effect. It is known from Reference 58 that a local minimum x * of the objective function leads to a locally asymptotically stable equilibrium of (64a,b) if the Hessian at x * is positive definite. We show in Appendix A that the same statement is true for the averaged system (61a-d), which lacks the damping term −R(x)v. As in Reference 58, our proof of local asymptotic stability is based on linearization about an equilibrium and a suitable choice of a Lyapunov function.
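A Chetaev-type system of the form (64a,b) can be simulated directly to see the interplay of the three forces. The sketch below uses illustrative matrices, not quantities from the paper: a quadratic potential with gradient (x1, 2 x2), Rayleigh dissipation R = 0.5 I, and the gyroscopic (skew-symmetric) force G = [[0, 1], [−1, 0]]. The total energy, potential plus kinetic, decreases along solutions because the gyroscopic force does no work, and the trajectory settles at the minimum x* = 0.

```python
def rhs(s):
    """x' = v,  v' = -R v + G v - grad psi(x) with the illustrative choices above."""
    x1, x2, v1, v2 = s
    g1, g2 = x1, 2.0 * x2                    # grad psi for psi = (x1^2 + 2 x2^2)/2
    return [v1, v2,
            -0.5 * v1 + v2 - g1,             # -R v + G v - grad psi, first row
            -0.5 * v2 - v1 - g2]             # second row

def rk4_step(s, dt):
    k1 = rhs(s)
    k2 = rhs([si + 0.5 * dt * ki for si, ki in zip(s, k1)])
    k3 = rhs([si + 0.5 * dt * ki for si, ki in zip(s, k2)])
    k4 = rhs([si + dt * ki for si, ki in zip(s, k3)])
    return [si + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for si, a, b, c, d in zip(s, k1, k2, k3, k4)]

def energy(s):
    """Total energy psi(x) + |v|^2 / 2; nonincreasing along solutions."""
    x1, x2, v1, v2 = s
    return 0.5 * (x1 ** 2 + 2.0 * x2 ** 2) + 0.5 * (v1 ** 2 + v2 ** 2)

s = [1.0, -1.0, 0.2, 0.3]
E0 = energy(s)
for _ in range(10000):                        # integrate to t = 100
    s = rk4_step(s, 0.01)
print(s, energy(s))                           # near the origin with tiny energy
```

The same energy argument fails verbatim for (61a-d), where the compensator state w replaces −R(x)v, which is why the appendix resorts to linearization.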
Assumption 5. The objective function attains a local minimum value at some point x * of H, and its Hessian at x * is positive definite.
If Assumption 5 is satisfied, then we can prove local asymptotic stability for the averaged system (61a-d). By Propositions 1 and 2, this implies the following practical stability property for the closed-loop system (59a-d) in the terminology of Definition 3.
A proof of Theorem 2 is given in Appendix A.
Remark 1. A suitable lower bound for the control parameter in the control law (56)-(58) depends (among other things) on the value  0 ∈  * of the H-momentum map. Note that the momentum value  0 has a direct impact on the strength of the gyroscopic term G( 0 , ⋅) in (59b) and (61b). In the case of a strong gyroscopic force, a larger value of the control parameter is needed to ensure a sufficiently good approximation of the averaged system by the closed-loop system. On the other hand, since the right-hand sides of (59a-d) and (61a-d) depend smoothly on  0 , one can show that a suitable lower bound for the control parameter also depends smoothly on  0 .
Example 3 (Example 2 cont'd). We test the proposed method numerically for the rigid body with m = 3 internal rotors as described in Examples 1 and 2. That is, we apply the time-varying output feedback law (56)-(58) to the reduced control system (33a) with orthonormal control directions. The inertia matrix in (33b) is chosen as in (65). To generate measurements of the output (9), we define the (unknown) configuration-dependent real-valued objective function on SO(3) by (66), where I 3 is the 3 × 3 identity matrix. One can check that this objective function satisfies Assumption 5 at the minimum point R * = I 3 . The carrier body starts with the optimal attitude R(0) = R * but with nonzero initial body angular velocity [0.1, 0.2, 0.3] ⊤ . We consider the following two cases: zero total spatial angular momentum (left column in Figure 2) and nonzero total spatial angular momentum [−0.5, 0.0, 0.5] ⊤ (right column in Figure 2). In either case, without control action, the nonzero initial rotational velocity prevents convergence of the attitude to the optimal configuration; see the left column in Figure 2. The functions and parameters for the extremum seeking control law are chosen as follows. For each i ∈ {1, 2, 3}, the ith perturbation signal is defined by u i (t) ∶= √2 i cos(it). The design function is defined by (55). The gains for the high-pass filter and the phase-lead compensator are chosen as h ∶= 1, a ∶= 0.2, b ∶= 0.2, and the two remaining compensator constants as 0.4 and 0.3. The high-pass filter state and the phase-lead compensator state start from zero initial conditions. We know from the proof of Theorem 2 in Appendix A that the averaged system is locally asymptotically stable. This is confirmed by the simulation results in the right column of Figure 2.
The closed-loop system approximates the behavior of the averaged system as the control parameter increases. The middle column in Figure 2 shows the result for the parameter value 10. The speed of convergence toward the optimal configuration is much slower for a nonzero total spatial angular momentum.
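The objective (66) itself is not reproduced above; a standard attitude error that matches the stated minimizer R * = I 3 is the hypothetical choice psi(R) = trace(I 3 − R), which is zero at the identity and positive at every other rotation, since trace(R) = 1 + 2 cos θ ≤ 3 for a rotation by angle θ. The sketch below spot-checks this over random rotations built with Rodrigues' formula.

```python
import math
import random

def rodrigues(axis, angle):
    """Rotation matrix for a rotation by `angle` about the unit-normalized `axis`."""
    x, y, z = axis
    n = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / n, y / n, z / n
    c, s, C = math.cos(angle), math.sin(angle), 1.0 - math.cos(angle)
    return [[c + x * x * C, x * y * C - z * s, x * z * C + y * s],
            [y * x * C + z * s, c + y * y * C, y * z * C - x * s],
            [z * x * C - y * s, z * y * C + x * s, c + z * z * C]]

def psi(R):
    """Hypothetical attitude objective trace(I3 - R); zero only at R = I3."""
    return 3.0 - (R[0][0] + R[1][1] + R[2][2])

random.seed(0)
worst = float("inf")
for _ in range(1000):
    axis = [random.gauss(0.0, 1.0) for _ in range(3)]
    angle = random.uniform(0.1, math.pi)
    worst = min(worst, psi(rodrigues(axis, angle)))
print(worst)   # strictly positive: psi vanishes only at the identity
```

Any other smooth objective with an isolated nondegenerate minimum at R * would serve equally well for Assumption 5.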

F I G U R E 2
Attitude trajectories of a rigid body fixed at a point with three internal rotors as in Figure 1. In each plot, a trajectory t ↦ R(t) ∈ SO(3) is represented by the S 2 -valued trajectories t ↦ R(t)e 1 (red), t ↦ R(t)e 2 (green), t ↦ R(t)e 3 (blue), where S 2 denotes the unit sphere in R 3 and e 1 , e 2 , e 3 are the standard unit vectors in R 3 .

CONCLUSIONS AND OPEN PROBLEMS
We have shown that the extremum seeking control law from Reference 35 for fully actuated systems on Lie groups in the absence of dissipation can also be successfully applied to underactuated systems of the form (7) and (8), provided that Assumptions 1 and 2 are satisfied. Sufficient conditions for local and non-local practical uniform asymptotic stability of a reduced closed-loop system were given in Theorems 1 and 2. In particular, we have shown that the method can be successfully applied to the example of a rigid body with three internal rotors (see Examples 1 to 3), which is motivated by attitude control of satellites with reaction wheels. In this context, it is natural to ask what happens if the number of reaction wheels is reduced from three to two or even to one. Then we certainly cannot expect stabilization of a single configuration, but perhaps at least stability with respect to a certain set of configurations. For example, if the proposed method is applied to a satellite with only two reaction wheels, then, under suitable assumptions, it might be possible to induce practical stability about a certain axis of rotation with nonzero rotational velocity. Another question arises from the linear input transformation in (27). As explained in Subsection 3.4, an implementation of (27) does not require information about the unknown objective function, but it requires detailed knowledge of the underlying control system (7) and (8). If the underlying control system is not perfectly known, then it is natural to ask what happens if the proposed method (56)-(58) is applied directly to (7) and (8), that is, without the input transformation (27). In this case, we obtain a different averaged system, which makes the stability analysis more challenging.

APPENDIX A. PROOF OF THEOREM 2
Suppose that Assumptions 1, 2, and 5 are satisfied with x * ∈ H as therein. Set y * equal to the value of the objective function at x * . We prove that the linearization of (61a-d) about the equilibrium point p * ∶= (x * , 0, 0, y * ) ∈ M is asymptotically stable. It is known that asymptotic stability for the linearized system implies local asymptotic stability for the original system (see, e.g., Proposition 6.7 in Reference 52). Since (61a-d) is an autonomous system, we may then conclude that p * is locally uniformly asymptotically stable for (61a-d) in the sense of Definition 3. By Propositions 1 and 2, this implies that p * is locally practically uniformly asymptotically stable for (59a-d) in the variables (60), as claimed.
To linearize (61a-d) about p * , we use "exponential coordinates of the first kind" for the Lie group H (see, e.g., Remark 5.33-1 in Reference 52). It is known that the exponential map exp ∶  → H (see, e.g., Definition 5.30 in Reference 52) is a local diffeomorphism and that its tangent map at the origin is the identity map of  (see, e.g., Theorem 5.32 in Reference 52). In particular, there exists a sufficiently small open neighborhood U of the origin such that the restriction exp | U of exp to U is a diffeomorphism onto an open neighborhood of the identity of H. Then, locally about x * , we can perform the change of variables (A1) for the state x of (61a), where L x −1 * ∶ H → H is the left translation by the inverse of x * . This in turn allows us to write (61a-d), locally, as a system of the form ż = F(z) on an open neighborhood W of the point z * ∶= (0, 0, 0, y * ) in the vector space Z ∶=  ×  ×  × R, where F ∶ W → Z is a smooth map. The linearization of ż = F(z) about z * is the system ż = DF(z * )(z − z * ), where DF(z * ) ∶ Z → Z is the usual derivative of F at z * . A direct computation shows that the linearization of (61a-d) (in the variables (A1)) about z * is the linear system (A2) on Z, where id  is the identity map of , the endomorphism G * ∶= G( 0 , x * ) of  describes a gyroscopic force, and the endomorphism H * ∶= ( ′ )(0) (T e L x * ) −1 Hess (x * ) (T e L x * ) of  in (A3) is given by the Hessian operator of the objective function at x * . Here, the Hessian operator at x ∈ H is the endomorphism Hess (x) of the tangent space to H at x defined by Hess (x)v x ∶= ∇ v x grad , where ∇ is the Levi-Civita connection on H for the left-invariant Riemannian metric determined by the inner product J on  (not to be confused with the Levi-Civita connection on Q in Subsection 4.1). By Assumption 5, the endomorphism H * is positive definite with respect to J. As in Subsection 4.2, we subsequently use the notation ⟨⟨⋅, ⋅⟩⟩ for J.
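The property of exponential coordinates used here, that the tangent map of exp at the origin is the identity, can be checked concretely for H = SO(3), where exp of a skew-symmetric matrix hat(w) is given by Rodrigues' formula. The finite-difference quotient (exp(t hat(w)) − I)/t then converges to hat(w) as t → 0, which is what justifies linearizing in the coordinates (A1); the test vector w below is arbitrary.

```python
import math

def hat(w):
    """Skew-symmetric matrix hat(w) with hat(w) v = w x v."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def expm_so3(w):
    """Rodrigues' formula: exp(hat(w)) = I + A hat(w) + B hat(w)^2."""
    th = math.sqrt(sum(c * c for c in w))
    if th < 1e-12:
        return [[float(i == j) for j in range(3)] for i in range(3)]
    W = hat(w)
    A, B = math.sin(th) / th, (1.0 - math.cos(th)) / th ** 2
    W2 = [[sum(W[i][k] * W[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    return [[float(i == j) + A * W[i][j] + B * W2[i][j]
             for j in range(3)] for i in range(3)]

w, t = [0.3, -0.2, 0.5], 1e-5
E = expm_so3([t * c for c in w])
D = [[(E[i][j] - float(i == j)) / t for j in range(3)] for i in range(3)]
print(D)   # approximately hat(w), confirming the tangent map is the identity
```

The same computation with t of moderate size also illustrates that exp is only a local diffeomorphism, as stated above.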
Because of the block diagonal structure of (A2) and since the high-pass filter gain h is positive, it suffices to show that the linear system (A4) is asymptotically stable. To this end, we consider the Lyapunov function candidate V in (A5), where the first two terms on the right-hand side of (A5) may be interpreted as a sum of kinetic and potential energy. Since the constants a, b and the two remaining compensator constants are assumed to satisfy (52), the function V is positive definite about the origin. Moreover, a direct computation, using that G * is skew-symmetric with respect to ⟨⟨⋅, ⋅⟩⟩, shows that the derivative V̇ of V along solutions of (A4) is given by (A6). Since V̇ is nonpositive, it follows that the compact sublevel sets of V are positively invariant sets for (A4). Suppose that (ξ, v, w) is a solution of (A4) with V̇(ξ, v, w) ≡ 0. Then (A4) and (A6) imply that ẇ = −( H * ξ + a w) ≡ 0. Therefore w is identically equal to some constant w * ∈ . It follows that  H * ξ̇ =  H * ξ̇ + a ẇ = − ẅ ≡ 0. Since H * is positive definite, this implies that ξ is identically equal to some constant ξ * ∈ . Using again (A4) and  H * ξ * + a w * = 0, we conclude that v = ξ̇ ≡ 0 and therefore 0 ≡ v̇ ≡ −  H * ξ * − b  w * = (a  − b ) w * . Because of (52), this implies w * = 0 and therefore ξ * = 0. Thus, we have shown that (ξ, v, w) is identically equal to (0, 0, 0). By LaSalle's invariance principle (see, e.g., Theorem 6.19 in Reference 52), we conclude that the linear time-invariant system (A4) is asymptotically stable. As explained at the beginning of the appendix, the statement of Theorem 2 follows.
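The LaSalle argument above can be cross-checked with a Routh-Hurwitz computation on a scalar reconstruction of (A4). Its exact coefficients are not reproduced here, so the following system is only a plausible form consistent with the algebra of the proof, with H * = 1, G * = 0, and the two unnamed compensator constants written as alpha and beta: xi' = v, v' = −alpha·xi − b·w, w' = −beta·xi − a·w. Its characteristic polynomial is s³ + a s² + alpha s + (a·alpha − b·beta), and all roots lie in the open left half-plane exactly when 0 < b·beta < a·alpha, which is how a condition like (52) can guarantee asymptotic stability.

```python
def is_hurwitz_cubic(a2, a1, a0):
    """Routh-Hurwitz test for s^3 + a2 s^2 + a1 s + a0."""
    return a2 > 0 and a1 > 0 and a0 > 0 and a2 * a1 > a0

def check(a, alpha, b, beta):
    """Stability of the reconstructed scalar lead-compensated system."""
    return is_hurwitz_cubic(a, alpha, a * alpha - b * beta)

# Gains from the numerical example: a = 0.2, b = 0.2 and the two
# remaining compensator constants 0.4 and 0.3
print(check(0.2, 0.4, 0.2, 0.3))   # True:  0.2 * 0.4 = 0.08 > 0.06
print(check(0.2, 0.1, 0.2, 0.3))   # False: the inequality is violated
```

Notably, the example gains of Subsection 4.4 satisfy this inequality with little margin, consistent with the slow convergence observed in Figure 2.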