Discovering efficient periodic behaviors in mechanical systems via neural approximators

It is well known that conservative mechanical systems exhibit local oscillatory behaviors due to their elastic and gravitational potentials, which, together with the inertial properties of the system, completely characterize these periodic motions. The classification of these periodic behaviors and their geometric characterization have been the subject of a long-standing debate, which recently led to the so-called eigenmanifold theory. The eigenmanifold characterizes nonlinear oscillations as a generalization of linear eigenspaces. With the motivation of performing periodic tasks efficiently, we use tools from this theory to construct an optimization problem aimed at inducing desired closed-loop oscillations through a state feedback law. We solve the constructed optimization problem via gradient-descent methods involving neural networks. Extensive simulations show the validity of the approach.


Introduction
Mechanical systems, such as industrial robots or bio-inspired ones, often need to perform tasks exhibiting a periodic nature, e.g., pick and place or periodic locomotion. The ubiquity of these tasks, as well as the theoretical appeal of understanding and characterising periodic solutions of dynamical systems, has made the study of repetitive motions and their control an immensely important branch of the systems theory community.
Abstracting from the specific class of mechanical systems and assuming a more general control theoretic perspective, the problem of tracking periodic signals, sometimes referred to as periodic regulation, has been intensively tackled with different tools. Without claiming to be exhaustive, we refer to the surveys [1,2] for an overview, to [3,4] (and references therein) for more recent contributions, and to [5] for an application in robotics.
In contrast to what is pursued in this work, the mentioned approaches are mostly focused on the design of controllers which implement some steady-state cancellation of the plant dynamics to achieve tracking of specific periodic reference signals. As mentioned in [6], these approaches lack a biomimetic perspective, in the sense that the design of the periodic regulator is focused on versatility rather than efficiency. In other words, the focus of these approaches is designing a controller that works for a large class of reference signals rather than designing efficient controllers for a smaller class of efficiently stabilisable periodic trajectories. In [6], this efficiency objective is pursued by steering a mechanical system onto natural oscillations of the system itself, which are matched to the mechanical system's physics.
The existence of such periodic oscillations for nonlinear mechanical systems with conservative potentials (usually considered of elastic and gravitational type) is a well-known fact [7,8,9], and the recent theory of eigenmanifolds [10] attempts to give a geometric characterisation of these families of oscillations. These oscillations constitute an invariant of the system, i.e., when no dissipative effects or other disturbances are present, a system initialised on such a nonlinear mode stays there autonomously, with no need for additional inputs. The control theoretic appeal of such a structure is immediate once a nonlinear oscillation is understood as a desired periodic behaviour for the closed-loop system, ranging from energy-efficient forms of locomotion to industrial tasks such as pick and place.
In [6] the authors successfully stabilised these periodic oscillations defined by eigenmanifold theory, claiming an efficient control design. In fact, a controller able to stabilise a specific invariant oscillation of the system requires only minimal power consumption, as in principle only the energy compensating for dissipative effects needs to be injected by the controller. In conclusion, the underlying biomimetic rationale drives the designer to exploit the natural physics (elastic joints, gravity, inertial parameters) to understand and stabilise an efficiently stabilisable behaviour with minimal energy consumption. We refer to the recent paper [11] for further elaborations on the connection between efficiency in robotics and the exploitation of the natural physics (referred to as "intrinsic dynamics" in that work) present in mechanical systems.
In [6] the approach was limited to stabilising the open-loop nonlinear modes produced by the conservative elastic and gravitational potentials of the underlying mechanical system. Motivated by the fact that natural modes of the open-loop system might not correspond to desired task-specific oscillations, and that the mechanical design of a system achieving specific desired oscillations might be very difficult, we introduce a new scheme, which can be seen as an extension of the one in [6] accounting for a broader class of periodic oscillations. In particular, we aim at learning and stabilising a desired oscillation which achieves the fulfillment of some periodic task and which is close to a natural mode of the underlying system, but not necessarily coincident with it. The main contribution of this paper is a procedure aimed at finding a potential-based state-feedback law which generates desired efficient oscillations in the closed-loop system. In order to do so, we cast the control problem into an optimisation framework in which the decision variable is a control potential, approximated by a neural network and updated through gradient descent to minimise a task-dependent performance metric together with a metabolic cost. The learned potential uniquely defines a feedback law which generates a closed-loop system exhibiting the desired oscillations. These are then stabilised using an approach similar to [6,10], with non-trivial adaptations made to improve the energetic behavior of the control.
Extensive simulations performed on a double pendulum show the validity of the approach.

Structure of the paper
The structure of the paper is sketched in Fig. 1. In Sec. 2, we give some background material on the Hamiltonian formulation of controlled mechanical systems and on eigenmanifolds. In Sec. 3, we review related work on neural networks in dynamical systems. In Sec. 4.1, the main contribution of this work, the optimisation of the control potential, is presented and addressed through gradient descent methods involving neural networks as functional approximators. The section is concluded by defining the controller aimed at stabilising the mechanical system on the learned periodic mode and by addressing the energetic behavior (in particular passivity) of the resulting closed-loop system. Sec. 5 contains the simulations and discussions, while Sec. 6 concludes the paper. The extensive Appendices B to E show further results of the proposed optimization.

Hamiltonian formalism for controlled conservative mechanical systems
Figure 1: General architecture of the control scheme and synopsis of the paper.

In this work we deal with conservative mechanical systems, and we use the Hamiltonian formalism to describe their dynamics. Even if not standard in the eigenmanifold literature, this choice provides technical advantages in formally presenting some properties of interest. In order to keep the focus on the relevant contributions, we present all the equations in "standard" coordinates, with R^n as the configuration space of an n-dimensional mechanical system; however, all the concepts can be generalised to the manifold level. The Hamiltonian dynamics (with control) of an n-DoF conservative mechanical system with position q(t) ∈ R^n and momentum p(t) ∈ R^n is described (hiding time dependencies to lighten the notation) by

d/dt (q, p) = [0 I; −I 0] ∇H(q, p) + [0; I] u,   (1)

where H(q, p) = K(q, p) + V(q) is the Hamiltonian, i.e., the total mechanical energy of the system. The total mechanical energy H is given by the sum of the kinetic energy K(q, p) = (1/2) pᵀ M⁻¹(q) p (where M(q) ∈ R^{n×n} is the inertia tensor) and the potential energy V(q), storing the conservative gravitational and elastic effects. As standard in this formalism, the gradient operator applied to the Hamiltonian is ∇H(q, p) = (∂H/∂q, ∂H/∂p)ᵀ ∈ R^{2n}, and I and 0 are the n-dimensional identity and zero matrices, respectively. We consider an explicit control input u, representing the generalised forces on the mechanical system, collocated to the degrees of freedom defining the position coordinates q. The usual corollary that the Hamiltonian function is conserved along autonomous evolutions (Ḣ = 0 holds along solutions of (1) with u = 0) will be used in the rest of this work.
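As an illustrative sketch (not part of the paper's implementation), the dynamics (1) and the conservation of H can be checked numerically for a 1-DoF pendulum; the masses, lengths, and integrator below are our own choices:

```python
import numpy as np

# Minimal numerical sketch of the controlled Hamiltonian dynamics (1) for a
# 1-DoF pendulum: qdot = dH/dp, pdot = -dH/dq + u. With u = 0 the energy H
# is conserved along the flow.
m, l, g = 1.0, 1.0, 9.81

def H(q, p):
    return p**2 / (2 * m * l**2) + m * g * l * (1 - np.cos(q))

def f(x, u=0.0):
    q, p = x
    return np.array([p / (m * l**2),                #  dH/dp
                     -m * g * l * np.sin(q) + u])   # -dH/dq + u

def rk4_step(x, dt, u=0.0):
    k1 = f(x, u); k2 = f(x + dt/2*k1, u)
    k3 = f(x + dt/2*k2, u); k4 = f(x + dt*k3, u)
    return x + dt/6 * (k1 + 2*k2 + 2*k3 + k4)

x = np.array([0.5, 0.0])
E0 = H(*x)
for _ in range(2000):
    x = rk4_step(x, 1e-3)
drift = abs(H(*x) - E0)   # stays at integrator precision for u = 0
```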

Eigenmanifolds
Eigenmanifold theory [10] generalises the theory of oscillations present in linear mechanical systems to conservative, intrinsically nonlinear mechanical systems.Here, the essentials of this formalism are presented in its Hamiltonian form.
An eigenmode of the system (1) with u = 0 is a periodic trajectory x(t) = (q(t), p(t)) passing through a point with p = 0 and whose configuration curve {q(t) | t ∈ R} is line shaped, i.e., a 1-dimensional curve (Definition 2.0.1). An eigenmanifold E ⊆ R^{2n} is then a collection of such modes x, defined with respect to an isolated, stable equilibrium x_eq = (q̄, 0) of the system (1). Such an equilibrium, which represents the "trivial mode" in the eigenmanifold, exists at a minimum of the potential V(q). The additional demand is that the collection R = x_eq ∪ {x(t) | p(t) = 0}, called the generator of the eigenmanifold, is a connected, 1-dimensional submanifold, see also Fig. 2. The generator represents the collection of points which are the extrema of the oscillations of every mode in the eigenmanifold. These modes, for systems of the form (1), partially characterise the periodic oscillations that a frictionless mechanical system can have.
Figure 2: In a mechanical system with potential V(q) and equilibrium q̄, an eigenmanifold is a collection of eigenmodes x(t) = (q(t), p(t)) such that the 1-dimensional generator R (which collects particular initial conditions of different eigenmodes) contains the equilibrium x_eq = (q̄, 0) as a limit point. Here, these concepts are depicted after their projection onto a 2-dimensional configuration space.
It is instructive to think about these modes as the collection of trajectories of (1) after factoring out i) the bounded but non-periodic evolutions, whose behavior is commonly referred to as chaotic, and ii) the periodic evolutions for which no point with p(t) = 0 exists, i.e., those which do not qualify as oscillations. Eigenmanifolds are then of particular interest for factoring out such trajectories in nonlinear mechanical systems with n ≥ 2 DoFs (e.g., the double pendulum), where chaotic behavior is often present.
Remark: Similar to linear oscillations, the eigenmodes can often be ordered in the eigenmanifold by increasing levels of energy along a mode (starting with the zero energy level corresponding to the trivial mode, i.e., the equilibrium x_eq). Contrary to the linear case, the eigenmanifold can be bounded (it is not a linear space and cannot in general be extended indefinitely to high energy levels), and every mode has in general a different period T (while in the linear case the period is constant).
The problem of the existence and complete characterisation of eigenmanifolds for conservative mechanical systems is open and out of the scope of this paper. We refer to [10] for the latest developments in this direction. Nevertheless, as the main motivation of this work, both experimental and numerical evidence [12] confirms that such nonlinear oscillations are structurally present in mechanical systems of any dimension, and can be detected and stabilised, as shown in [13,6].
To formulate the proposed eigenmanifold optimization method in Section 4.1, we need three lemmas involving conservative mechanical systems, which can be verified using the Hamiltonian formulation (1). In fact, the latter system (with u = 0) is subject to the discrete symmetry (q, p, t) → (q, −p, −t), i.e., if (q(t), p(t)) is a forward-in-time solution of (1), then (q(−t), −p(−t)) is likewise a forward-in-time solution of (1). The following lemmas, which are proven in [12], act as corollaries.
Lemma 2.1 Any trajectory with p(0) = 0 satisfies q(t) = q(−t) and p(t) = −p(−t) for all t; in particular, a periodic trajectory with period T and p(0) = 0 satisfies q(t) = q(T − t) and p(t) = −p(T − t).
Lemma 2.2 Any periodic trajectory with p(0) = 0 and period T will have the property that p(T/2) = 0.

Lemma 2.3 Any trajectory with two distinct points q(t₁) ≠ q(t₂) such that p(t₁) = 0 and p(t₂) = 0 will be periodic, with period T = 2(t₂ − t₁).

Combining the definition of an eigenmode and the previous lemmas, valid for any conservative mechanical system, the following can be concluded. An eigenmode with initial conditions (q(0) = q₀, p(0) = 0) necessarily has period T = 2t̄, where t̄ is the time instant of the other extremum of the oscillation, i.e., p(t̄) = 0. Conversely, if a periodic trajectory with the properties defined in Lemma 2.3 presents a line-shaped set {q(t) | t ∈ R}, it is necessarily a (possibly isolated) eigenmode. It is worth noting that this condition of line-shapedness was rarely violated in practice.
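The half-period property above can be checked numerically. The following sketch (our own toy setup, a frictionless 1-DoF pendulum, not from the paper) detects the other extremum p(t̄) = 0 and verifies that the state returns after T = 2t̄:

```python
import numpy as np

# Numerical illustration of the lemmas: starting from an extremum p(0) = 0,
# the next momentum zero at time t_bar pins down the period T = 2 * t_bar.
g = 9.81

def f(x):
    q, p = x
    return np.array([p, -g * np.sin(q)])

def rk4_step(x, dt):
    k1 = f(x); k2 = f(x + dt/2*k1); k3 = f(x + dt/2*k2); k4 = f(x + dt*k3)
    return x + dt/6 * (k1 + 2*k2 + 2*k3 + k4)

dt = 1e-4
x0 = np.array([0.8, 0.0])          # extremum of the oscillation: p(0) = 0
x, p_prev, t_bar = x0.copy(), 0.0, None
for k in range(1, 100000):
    x = rk4_step(x, dt)
    if k > 10 and p_prev * x[1] <= 0:   # sign change: other extremum reached
        t_bar = k * dt
        break
    p_prev = x[1]

y = x.copy()                       # integrate a second half period
for _ in range(int(round(t_bar / dt))):
    y = rk4_step(y, dt)
period_err = np.abs(y - x0).max()  # x(2 * t_bar) is back at x(0)
```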

Related Work: Neural Networks in Dynamical Systems
The recent developments of artificial intelligence and machine learning have opened the door to new approaches for understanding and controlling dynamical systems by relying on data. In particular, data-driven methods, e.g., neural networks, have often been used as function approximators for learning the dynamics of systems [14,15,16,17] or for representing control strategies [18,19,20], even in high-dimensional optimal control problems [21]. Furthermore, neural networks can be used to approximate Lyapunov functions in the case of autonomous [22,23,24,25,26] and non-autonomous dynamical systems [27,28,29,30] for stability and control purposes.
However, purely data-driven methods often learn physically inconsistent models that do not respect physical conservation laws. Therefore, the most recent research trends have shifted towards encoding physical principles into neural networks. Examples are Hamiltonian, symplectic, and Lagrangian neural networks [31,32,33] and physics-informed neural networks [34], which aim at exploiting the best of both worlds, namely combining the expressive power of nonlinear function approximators with grounded physical knowledge.
Another important step in this direction has been the introduction of Neural Ordinary Differential Equations (Neural ODEs) [35]. The neural ODE framework allows the study of a neural network and its training phase as an ODE, opening many possibilities for the analysis and understanding of black-box methods.
An approach closely related to ours is the work of [36], where a neural ODE is used for learning an optimal passive controller in the port-Hamiltonian framework. The learned controller is composed of a learned potential energy term and a learned damping injection term. However, differently from [36], which solves the stabilization of an inverted pendulum, we focus on a more complex problem, namely the learning of energy-efficient eigenmodes for optimally solving pick and place tasks with a double pendulum. Additionally, instead of learning a damping injection term, we introduce a passive controller injecting only the energy lost by the system due to dissipative elements.

Discovering and stabilizing optimal eigenmodes
In this work, we want to control trajectories efficiently towards periodic signals that perform some task. To do so, we first need to find a periodic signal that 1) represents the execution of a task and 2) allows for efficient control towards it. We discuss this in Section 4.1. Once such an oscillation is found, we discuss a controller that steers trajectories onto this orbit in Section 4.2.

Discovering optimal eigenmodes via Neural Approximators
To find an oscillatory motion that allows for efficient control towards it, we consider eigenmodes (see Definition 2.0.1) of the system in (1), where we restrict the input to the gradient of a control potential, u = ∇_q V_θ(q), so that the closed-loop system has the form of an autonomous mechanical system (1) (u = 0) with Hamiltonian H + V_θ. The rationale behind this choice is to modify the system dynamics (1) as little as possible, avoiding potential cancellation approaches and exploiting the natural physics in the most efficient way.
The aim is to find the map V_θ : R^n → R such that, for a fixed initial condition, the resulting motion is an eigenmode of the closed-loop system and minimises a task-dependent cost term L_task. This yields a constrained optimisation problem whose decision variable is the map V_θ. To solve this problem with gradient descent methods, a finite-dimensional parametrisation of V_θ is necessary. We denote by θ the vector collecting the (finitely many) parameters characterising the map V_θ, which motivates the notation for the latter. In this work, θ collects the parameters of a neural network, which is used as a functional approximator for V_θ(q). In this perspective, the closed-loop system in the optimisation phase becomes a so-called neural ODE [35].
Summarising the above considerations, the optimisation that we aim to solve is represented as:

min_θ L_task(x)
s.t.  ẋ given by the closed loop (1) with u = ∇_q V_θ(q),
      x(0) = (q₀, 0),  L_eigen(x) = 0,   (2)

where L_task(x) is the loss function of the problem, L_eigen(x) = 0 represents the constraint forcing the closed-loop trajectory to be an eigenmode, and x = (q, p). Notice that the choice p₀ = 0 comes without loss of generality since we are dealing with periodic orbits, and by Def. 2.0.1 an eigenmode is always characterised by p(t) = 0 for some t.
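A minimal sketch of how the decision variable V_θ can be parametrised is the following; the MLP architecture and sizes are our illustrative choices, and the sign of the feedback is fixed so that the closed loop is Hamiltonian with H + V_θ:

```python
import numpy as np

# Sketch (ours, not the paper's code): a small MLP potential V_theta and the
# resulting closed-loop vector field, where the feedback is the gradient of
# the learned potential (sign chosen so the closed loop has energy H + V_theta).
rng = np.random.default_rng(0)
n, hidden = 2, 16
theta = {"W1": 0.5 * rng.standard_normal((hidden, n)),
         "b1": np.zeros(hidden),
         "w2": 0.5 * rng.standard_normal(hidden)}

def V_theta(q, th):
    return th["w2"] @ np.tanh(th["W1"] @ q + th["b1"])

def grad_V_theta(q, th):
    h = np.tanh(th["W1"] @ q + th["b1"])
    return th["W1"].T @ (th["w2"] * (1 - h**2))

def closed_loop(x, th, Minv, grad_V_open):
    q, p = x[:2], x[2:]
    dq = Minv @ p                                    #  dH/dp
    dp = -grad_V_open(q) - grad_V_theta(q, th)       # -d(V + V_theta)/dq
    return np.concatenate([dq, dp])

# finite-difference check of the analytic gradient of the MLP potential
q0 = np.array([0.2, -0.1]); eps = 1e-6
fd = np.array([(V_theta(q0 + eps*e, theta) - V_theta(q0 - eps*e, theta)) / (2*eps)
               for e in np.eye(2)])
err = np.abs(fd - grad_V_theta(q0, theta)).max()
```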
Unless specified otherwise, in this work we assume both the initial position q 0 of the eigenmode and its period T to be fixed.
In Section 5, we solve the optimization problem (2) for a pick and place experiment where we move from the initial task space position h(q₀) (with h(q) the forward kinematic map) to a desired position h*. In this case we design L_task(x) as:

L_task(x) = ‖h(q(T/2)) − h*‖₂ + α_eff ∫₀ᵀ ‖u(t)‖₂² dt,   (3)

where ‖·‖₂ is the 2-norm; the first term promotes the minimisation of the distance between the end-effector position at time t = T/2 and the target position h*, and the second term is of metabolic nature and penalises high control efforts u(t) = ∇V_θ(q(t)). Here, α_eff is a positive scalar balancing the contribution of the two terms, whose effect is analysed in Appendix B.1.
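The task loss can be sketched on a sampled rollout as follows; the placeholder forward kinematics h, the sampling, and the squared-effort quadrature are our assumptions:

```python
import numpy as np

# Sketch of the task loss: terminal end-effector error at t = T/2 plus an
# integrated ("metabolic") effort term, evaluated on trajectory samples.
def L_task(q_traj, u_traj, dt, h, h_star, alpha_eff):
    i_half = (q_traj.shape[0] - 1) // 2              # sample closest to T/2
    reach = np.linalg.norm(h(q_traj[i_half]) - h_star)
    effort = alpha_eff * dt * np.sum(u_traj**2)      # ~ integral of ||u||^2
    return reach + effort

# toy usage: h is the identity map on a 2-DoF configuration (a placeholder,
# not the pendulum's real forward kinematics)
q_traj = np.linspace([0.0, 0.0], [1.0, 1.0], 101)    # 101 samples on [0, T]
u_traj = np.zeros_like(q_traj)
val = L_task(q_traj, u_traj, dt=0.01, h=lambda q: q,
             h_star=np.array([0.5, 0.5]), alpha_eff=0.1)
```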
The constraint L_eigen(x) = 0 in (2) is designed to force the evolution of the closed-loop system to be an oscillation: the construction of the function L_eigen(x) is inspired by Lemmas 2.1, 2.2, and 2.3. In particular, given the initial conditions in (2), by Lemmas 2.2 and 2.3 it suffices to enforce p(T/2) = 0 to get a periodic trajectory of period T. Moreover, given a periodic trajectory with period T, Lemma 2.1 shows that q(t) = q(T − t) and p(t) = −p(T − t). Finally, by periodicity, the trajectory satisfies q(T) = q₀ and p(T) = 0. Combining all these observations, we chose the following form for L_eigen:

L_eigen(x) = λ₁ (‖q(·) − q(T − ·)‖_{∞,T} + ‖p(·) + p(T − ·)‖_{∞,T}) + λ₂ ‖p(T/2)‖₁,   (4)

where λ₁, λ₂ are positive weights, and where ‖·‖_{∞,T} is defined by ‖y(·)‖_{∞,T} := max_{t∈[0,T/2]} ‖y(t)‖₁, with y : [0, ∞) → R^n.
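A sketch of this penalty on a sampled trajectory is given below; the exact composition and weights λ₁, λ₂ are our reading of (4), not the paper's verbatim formula:

```python
import numpy as np

# Sketch of the eigenmode penalty: the ||.||_{inf,T} norm takes the max over
# [0, T/2] of the 1-norm, applied to the mirror-symmetry residuals
# q(t) - q(T - t) and p(t) + p(T - t), plus a term enforcing p(T/2) = 0.
def norm_inf_T(y_traj):                # y_traj sampled on [0, T/2]
    return np.abs(y_traj).sum(axis=1).max()

def L_eigen(q_traj, p_traj, lam1=1.0, lam2=1.0):
    N = q_traj.shape[0] - 1            # samples on [0, T]; index N//2 ~ T/2
    half = N // 2
    q_res = q_traj[:half+1] - q_traj[N::-1][:half+1]   # q(t) - q(T - t)
    p_res = p_traj[:half+1] + p_traj[N::-1][:half+1]   # p(t) + p(T - t)
    return (lam1 * (norm_inf_T(q_res) + norm_inf_T(p_res))
            + lam2 * np.abs(p_traj[half]).sum())

# a mirror-symmetric cosine trajectory yields a (numerically) zero penalty
t = np.linspace(0.0, 1.0, 101)
q_traj = np.cos(2 * np.pi * t)[:, None]
p_traj = -np.sin(2 * np.pi * t)[:, None]
val = L_eigen(q_traj, p_traj)
```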
As an alternative to (2), the eigenmode constraint can be relaxed into a soft one by solving the optimisation:

min_θ L_task(x) + β L_eigen(x)
s.t.  ẋ given by the closed loop (1) with u = ∇_q V_θ(q),  x(0) = (q₀, 0),   (5)

with β ∈ R₊ a positive constant.
We stress that, even though this version of the optimisation does not impose L_eigen = 0 as a hard constraint, whenever a line-shaped periodic trajectory results as a solution of the optimisation problem we are able to assess the learning of an eigenmode with the same confidence as for (2), by considering Definition 2.0.1. We can, however, in principle not ensure that the optimisation will result in a periodic orbit.
Remark: In the pick and place experiment, we want the end-effector to stop at a specific location h* ≠ h(q₀) at some arbitrary time t̄. By choosing t̄ = T/2 in (3), the constraint L_eigen(x) = 0 guarantees that the end-effector actually stops at h*.
Remark: With Lemmas 2.1, 2.2, 2.3 and the definition of an eigenmode in mind, it is easy to check that the trajectory x(t) corresponds to an eigenmode in the sense of eigenmanifold theory if and only if λ₂ > 0, L_eigen(x(t)) = 0, and {q(t) | t ∈ R} is line shaped. As a consequence, the λ₁ term is strictly speaking redundant, but it was found to improve convergence (together with the specific choice of norms in (4)) in the optimisation. In conclusion, if the solution of the optimisation problem above yields a line-shaped periodic trajectory, we can conclude that the orbit indeed corresponds to an eigenmode. We furthermore stress the practical scarcity of non-line-shaped periodic trajectories, which, to the knowledge and experience of the authors, have rarely been found in the previously studied cases.

Solving the optimisation
Given the finite-dimensional parametrisation of the map V_θ(q), in this work the optimisation is solved through gradient descent methods, i.e., the optimal parameters θ are found by iterating:

θ_{k+1} = θ_k − η_k ∂/∂θ L(x(θ_k)),

where L(x) = L_task(x) + β L_eigen(x) is the cost in (5). If η_k, a positive scalar referred to as the learning rate, is suitably chosen, and L(x) is convex, θ converges to the minimiser of L(x) as k → ∞. Although global convergence is no longer guaranteed in the nonconvex case (which is the case of this work), gradient descent techniques are widely used in practical applications, especially in the machine learning community, due to their scalability and computational efficiency.
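The update rule can be sketched on a toy convex quadratic standing in for the rollout loss (in the paper the sensitivity instead comes from differentiating through the neural ODE):

```python
import numpy as np

# Sketch of the iteration theta_{k+1} = theta_k - eta_k * dL/dtheta on a
# toy convex quadratic L(theta) = 0.5 * theta^T A theta, A > 0, for which
# gradient descent with a fixed, small-enough step converges to the minimiser.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

def L(theta):
    return 0.5 * theta @ A @ theta

def grad_L(theta):
    return A @ theta

theta = np.array([1.0, -2.0])
eta = 0.1                      # fixed learning rate, small enough for A
for k in range(200):
    theta = theta - eta * grad_L(theta)
# theta is now numerically at the unique minimiser theta* = 0
```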
In order to implement the gradient descent procedure, the sensitivity ∂/∂θ L(x(θ)) needs to be computed. This is where the so-called neural ODE framework, an extension of the continuous depth framework for recurrent neural networks, is used. In particular, the dynamic constraint in (2) has the structure of a neural ODE, i.e., an ordinary differential equation parametrised by a neural network V_θ(q) with parameters θ. The training of this continuous network corresponds to solving the optimisation problem (2). The sensitivities ∂/∂θ L(x(θ)) are calculated via the backpropagation method, in particular via automatic differentiation [37], which is commonly used for training neural networks. Utilising the adjoint method [35] for computing the exact sensitivities, rather than the approximate ones computed by backpropagation, is an option for future investigation.

Stabilising Controller and Analysis of the Closed-Loop System
We formally introduced the optimisation that aims at learning a closed-loop conservative mechanical system exhibiting desired oscillations. In real applications, where dissipative effects and parametric disturbances are present, it is important to design a controller able to robustly stabilise the closed-loop system onto the learned eigenmode. With the motivation of interpreting the learned oscillations as "efficient" (minimising a certain cost function), it would furthermore be desirable that the stabilising controller acts in an energetically convenient way (i.e., the control effort is equal to zero on the desired trajectory, and the controller is passive if no dissipation is present). In other words, the controller should inject into the system only the mechanical energy needed to stay on the eigenmode, compensating for unavoidable dissipative effects, resembling a clear biomimetic approach. In [6] such a controller was successfully implemented to stabilise the (open-loop) eigenmodes of a 7-DoF KUKA iiwa robot. Here we propose an alternative stabilising controller that is likewise split into an energy-injecting and an eigenmode-stabilising part. Contrary to [6], the latter part is not allowed to inject energy in this work. This splitting simplifies the analysis of the controller.
The system with stabilising feedback u_s : R^{2n} → R^n is of the form

d/dt (q, p) = [0 I; −I 0] ∇(H + V_θ)(q, p) + [0; I] u_s(q, p).   (7)

The purpose of this feedback is to stabilise an eigenmode x̄ : R → R^{2n} (x̄(t) = (q̄(t), p̄(t))), the latter being itself a solution of the learned autonomous system (5). To this end, the desired requirements are

lim_{t→∞} dist(q(t), q̄(t̄)) = 0,  with  t̄ = argmin_s dist(q(t), q̄(s)),   (10)
lim_{t→∞} ‖p(t) − σ(q, p) p̄(t̄)‖ = 0.   (11)

Here dist(a, b) returns the Euclidean distance of the points a, b ∈ R^n.
Intuitively speaking, t̄ in equation (10) is the parameter at which the desired trajectory q̄ is closest to the current position q(t). In practice, t̄ is implemented as a function t̄ : R^n → R that takes q ∈ R^n as input. Although q̄(t̄) is uniquely determined, p̄(t̄) is only determined up to a sign for an eigenmode, which is chosen according to the sign function σ : R^{2n} → {−1, 0, 1} in equation (11) to be aligned with the current system momentum p(t).
The choice is made to split the control into an energy-controlling feedback u_E (cf. [38]) and an eigenmode-stabilising feedback u_M. For an analogous control splitting, see [6,13].

Energy-controlling feedback
The energy-controlling feedback u_E steers the system's energy E = H + V_θ towards a desired energy Ē = E(q̄(0), p̄(0)). The form chosen is

u_E(q, p) = α_E (Ē − E(q, p)) p̂,

with α_E ∈ R₊ a positive control gain and the normalised momentum p̂ = p / (pᵀ M⁻¹(q) p)^{1/2} (to avoid numerical issues in practice, p̂ is chosen as 0 when p = 0). Since q̇ = M⁻¹(q)p, it holds that the mechanical power u_Eᵀ q̇ injected by the energy controller is given by

u_Eᵀ q̇ = α_E (Ē − E) (pᵀ M⁻¹(q) p)^{1/2}.
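A sketch of this feedback (with the momentum normalisation being our reconstruction) shows numerically that the injected power has the sign of Ē − E:

```python
import numpy as np

# Sketch of the energy-controlling feedback u_E: it pushes the total energy E
# towards E_bar along the metric-normalised momentum, so the injected power
# alpha_E * (E_bar - E) * sqrt(p^T Minv p) has the sign of E_bar - E.
def u_E(q, p, E, E_bar, Minv, alpha_E=1.0):
    s = p @ Minv @ p
    if s == 0.0:                       # p_hat := 0 when p = 0
        return np.zeros_like(p)
    return alpha_E * (E_bar - E) * p / np.sqrt(s)

Minv = np.eye(2)                       # stand-in for M^{-1}(q)
q = np.zeros(2); p = np.array([1.0, 2.0])
qdot = Minv @ p
power_in = u_E(q, p, E=1.0, E_bar=2.0, Minv=Minv) @ qdot   # E < E_bar: inject
power_out = u_E(q, p, E=3.0, E_bar=2.0, Minv=Minv) @ qdot  # E > E_bar: extract
```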

Eigenmode stabilizing feedback
The eigenmode-stabilising feedback u_M is defined as

u_M(q, p) = α_M σ(q, p) π_p(p̄(t̄(q))),   (16)

where α_M ∈ R₊ is the positive control gain. Furthermore, σ(q, p) ∈ {−1, 0, 1} and t̄(q) ∈ R are as defined in (11) and (10), respectively, and p̄ : R → R^n is the momentum component of the desired eigenmode. Last, π_p is the projection defined by

π_p(w) = w − ((wᵀ M⁻¹(q) p) / (pᵀ M⁻¹(q) p)) p,  p ≠ 0.

This projection is such that

π_p(w)ᵀ M⁻¹(q) p = 0  for all w ∈ R^n,  hence  u_Mᵀ q̇ = 0,

which means that u_M cannot change the energy content of the system, and thus cannot interfere with the control task of u_E.
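The power-neutrality of the projection can be verified numerically; the explicit formula below is our reconstruction, chosen to satisfy the stated property:

```python
import numpy as np

# Sketch of the projection pi_p used in u_M: it removes from w the component
# along p measured in the M^{-1}(q) metric, so that pi_p(w)^T Minv p = 0,
# i.e., the eigenmode feedback injects no mechanical power.
def pi_p(w, p, Minv):
    s = p @ Minv @ p
    if s == 0.0:
        return w
    return w - ((w @ Minv @ p) / s) * p

Minv = np.diag([1.0, 0.5])             # stand-in for M^{-1}(q)
p = np.array([1.0, 2.0])
w = np.array([3.0, -1.0])
residual_power = pi_p(w, p, Minv) @ (Minv @ p)   # = 0 by construction
annihilates_p = np.abs(pi_p(p, p, Minv)).max()   # pi_p(p) = 0
```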
Remark: This is a D-type controller analogous to [6,13], with the only adaptation being that the energy injection is restricted (compare e.g. [39]). The controller of the form (16) follows from the error-feedback form u_M = −α_M π_p(p − σ(q, p) p̄(t̄(q))) by using the property of the projection that π_p(p) = 0.

Stability
We first investigate the energetic behavior of the combined controller u_s = u_E + u_M, and investigate the stability of the trajectory afterwards. The energy injected by the controller is equal to the mechanical power u_sᵀ q̇:

Ė = ∇E(q, p)ᵀ ẋ = u_sᵀ M⁻¹(q) p = u_sᵀ q̇.

(A more general construction would compare p(t) with the parallel transport of the momentum p̄ along a geodesic ρ from q(t) to q̄(t̄); here, instead, the transport map is chosen to be the identity in the given coordinate system.)
Here, the second equality holds because the system without the feedback u_s conserves E, while the third equality follows from the definition of momentum, M⁻¹(q)p = q̇.
Combining the expressions shows that Ė = u_Eᵀ q̇, whose sign is that of Ē − E. Hence, the energy converges to the desired energy level Ē almost always, i.e., as long as u_E ≠ 0, and otherwise E is constant. Moreover, as the combined actions u_E and u_M vanish only on the desired mode, we get a highly efficient control behavior, as highlighted in [13,6].
However, the above does not prove either global or local stability. This work restricts itself to a guarantee of local stability, which can be obtained by evaluating the cycle multipliers of the stabilised periodic orbit. Let Ψ_t(x(0)) := x(t) define the flow of the dynamic system (12); the cycle multipliers are then obtained as the eigenvalues of the monodromy matrix ∂Ψ_T(x₀)/∂x₀, i.e., the linearisation of the time-T flow, where x₀ is a starting point of the stabilised periodic orbit and T its period. As will be shown in the results section, if these cycle multipliers have absolute values smaller than 1, the periodic orbit is stable.
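As a sketch of how such multipliers can be evaluated, the monodromy matrix can be finite-differenced from the flow; the toy system below is a conservative harmonic oscillator (our choice), whose multipliers lie on the unit circle, whereas a stabilised orbit would have |multiplier| < 1:

```python
import numpy as np

# Estimating cycle (Floquet) multipliers by finite-differencing the time-T
# flow Psi_T around a point x0 of the orbit. For the harmonic oscillator the
# monodromy matrix is (numerically) the identity, so both multipliers are 1.
def f(x):
    return np.array([x[1], -x[0]])

def flow(x0, T, dt=1e-3):
    x = x0.copy()
    for _ in range(int(round(T / dt))):
        k1 = f(x); k2 = f(x + dt/2*k1); k3 = f(x + dt/2*k2); k4 = f(x + dt*k3)
        x = x + dt/6 * (k1 + 2*k2 + 2*k3 + k4)
    return x

x0 = np.array([1.0, 0.0]); T = 2 * np.pi; eps = 1e-6
M = np.column_stack([(flow(x0 + eps*e, T) - flow(x0 - eps*e, T)) / (2*eps)
                     for e in np.eye(2)])           # monodromy matrix
multipliers = np.linalg.eigvals(M)
```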

Simulations
In this section, we perform numerical experiments for the case of a double pendulum. More precisely, we consider a pick and place experiment where we want the end-effector of the double pendulum to move between two points in an oscillatory fashion. To achieve this, an optimal eigenmode is learned via the optimization strategy in Section 4.1. For our numerical experiments, we solve the optimization problem in (5) with the loss functions given in Equations (3) and (4). Subsequently, we stabilize the eigenmode using the control strategy in Section 4.2.

Double Pendulum Model
The double pendulum is one of the simplest mechanical systems with non-trivial eigenmanifolds (see also [12]). The considered double pendulum is under the influence of gravity and has a linear spring at the second joint. The equations of motion correspond to the conventions shown in Figure 3. They are fully determined by (1) and the Hamiltonian H : R² × R² → R given in Equations (23) to (25).
V((q₁, q₂)) = V_θ((q₁, q₂)) − mdg(2 cos(q₁) + cos(q₁ + q₂)) + (1/2) k q₂²,

where the last term models the linear spring at the second joint. Here, V_θ : R² → R is a potential function that will be constructed as a neural network with parameters θ ∈ R^m. The actual equations of motion are reported for completeness in Appendix A.

Learning Eigenmodes
In Figure 4, we visualise the trajectory of the inner closed-loop conservative system (see Figure 1) at different time instants after training the learned potential, via the optimisation procedure described in Section 4.1, for 500 epochs and a given period T = 1.5 s. The results are obtained with the set of loss function hyperparameters reported in Table 1. The potential V_θ (see Figure 5b) is capable of shaping the system's potential (Figure 5a) such that the trajectory of the system is an energy-efficient eigenmode of the desired period T. Additionally, in Figure 6 we depict the control inputs u = ∇_q V_θ(q(t)) inducing the desired periodic behaviour, and the trajectory in the configuration space (Figure 6c), from which it is possible to notice the line-shaped property of the eigenmode described in Def. 2.0.1. In the example, we use the starting condition q = (0.2, 0.2), p = (5, 5). In particular, Figures 7d and 7e show the development of q(t) and p(t) over time, which approach the desired q̄(t̄) and p̄(t̄) (see Section 4.2 for their definitions).

Stabilization of the Learned Eigenmode
Figure 7a shows the energy H + V_θ of the closed-loop system, which approaches the constant energy level of the learned mode. Figures 7b and 7c show the distance of the trajectory from the desired trajectory in position and momentum space, respectively (i.e., ‖q(t) − q̄(t̄)‖₂ and ‖p(t) − p̄(t̄)‖₂), in both cases approaching 0. The cycle multipliers of the closed-loop system have absolute values less than 1: for this example, it was found that they are bounded by 0.5, which guarantees that the learned periodic orbit is locally stable.
To test the robustness of the controller in the presence of damping, viscous damping is introduced. With b the damping coefficient, the input u_s in (7) is adapted to read

u = u_s(q, p) − b M⁻¹(q) p,

which corresponds to velocity-dependent damping. The cases b = 0.1 and b = 1 are shown in Figures 8f and 9f, respectively. Notably, the damping causes the energy shown in Figures 8a and 9a to continue to fluctuate in the eventual periodic evolution, about a value lower than the desired energy. It is worth noting that the system remains close to the desired mode, even for such large values of damping. However, it should be considered to adapt the energy-controlling term u_E to compensate for damping more accurately, as was done e.g. in [6] for a particular case of damping that was, among others, linear in velocity.

Additional Results
To strengthen the numerical contribution, we include additional results and ablation studies in the appendices. In particular, in Appendix A, we show the equations of motion for the double pendulum used in our simulations, while in Appendix B, we study the effect of varying the regularisation coefficient α_eff and the period T on the resulting eigenmode and control inputs. In Appendix C, we apply the method with different initial and target positions, periods, and regularisation.

(Figure panels: (a) closed-loop energy error; (b) dist(q(t), q̄(t̄)); (c) ‖p(t) − p̄(t̄)‖₂; (d) (q₁(t), q₂(t)); (e) (p₁(t), p₂(t)); (f) trajectory in configuration space and level sets of dist(q, q̄).)

Conclusions
In this paper we present a procedure aiming at shaping desired periodic oscillations for mechanical systems. In particular, using tools from eigenmanifold theory and neural networks as function approximators, a state feedback law is learned in such a way as to produce a closed-loop system exhibiting a desired periodic motion. This is done by minimising the effort of the learned control law and exploiting at best the natural physical properties of the underlying open-loop system, characterised by its inertia and its conservative potentials. A stabilising controller able to steer the system along the learned oscillation in the presence of parametric disturbances is presented. Extensive simulations show the validity of the approach.

B.2 Learned Eigenmodes for Different Fixed Periods T
In Figure 12, we show the resulting trajectories, learned potentials V_θ, and squared control effort u for different period lengths T. Our approach is capable of finding eigenmodes for different periods T. It is noticed that the learned potential combines with the gravitational and elastic potentials in non-trivial ways to steer the system onto oscillatory modes with the desired period. In Figures 12q-12t, where the period of oscillation is close to that of the natural evolution of the system (i.e., when only the gravitational potential is active and no learned potential is present), the learned potential is such that the resulting control effort is extremely small.

Figure 16: The time behavior of the angles q₁ and q₂ over one period.
Figure 20: The time behavior of the angles q₁ and q₂ over one period.
Figure 24: The time behavior of the angles q₁ and q₂ over one period.
Figure 32: The time behavior of the angles q₁ and q₂ over one period.

Figure 4: Learned eigenmode at different time steps. The blue circles represent the initial position of the joints of the pendulum, while the red cross represents the end-effector target used for computing the first term of L_task(x) in (3).

Figure 7 shows the results of applying the control structure introduced in Section 4.2 to the learned trajectory shown in Figure 4, for coefficients α_M = 10 and α_E = 1.

Figure 6: Control inputs over time (Figures 6a and 6b), and trajectory in the configuration space (Figure 6c).

Figure 8: Various features of the stabilized system shown in Figure 7, but including damping linear in the system velocity, with damping coefficient b = 0.1.

Figure 9: Various features of the stabilized system shown in Figure 7, but including damping linear in the system velocity, with damping coefficient b = 1.

Figure 10: Control effort squared for different control effort penalty coefficients.

Figure 11: Learned potential for different effort penalty coefficients.

(a) q₁ over the period. (b) q₂ over the period. (c) Trajectory in configuration space.

Figure 18: Control inputs and control effort.

(a) Learned potential V_θ. (b) Gravitational potential V_gravity. (c) Spring potential V_spring. (d) Open-loop potential V_spring + V_gravity. (e) Overall potential V_θ + V_spring + V_gravity.

(a) q₁ over the period. (b) q₂ over the period. (c) Trajectory in configuration space.

Figure 22: Control inputs and control effort.

(a) Learned potential V_θ. (b) Gravitational potential V_gravity. (c) Spring potential V_spring. (d) Spring and gravitational potential V_spring + V_gravity. (e) Overall potential V_θ + V_spring + V_gravity.

(a) q₁ over one period. (b) q₂ over one period. (c) Trajectory in configuration space.

(a) First component of u(t). (b) Second component of u(t). (c) Squared control effort penalty.

Figure 26: Control inputs and control effort.

(a) Learned potential V_θ. (b) Gravitational potential V_gravity. (c) Spring potential V_spring. (d) Open-loop potential V_spring + V_gravity. (e) Overall potential V_θ + V_spring + V_gravity.

Figure 28: The time behavior of the angles q₁ and q₂ over one period.

Figure 30: Control inputs and control effort.

(a) Learned potential V_θ. (b) Gravitational potential V_gravity. (c) Spring potential V_spring. (d) Open-loop potential V_spring + V_gravity. (e) Overall potential V_θ + V_spring + V_gravity.

(a) q₁ over one period. (b) q₂ over one period. (c) Trajectory in configuration space.

Table 1: Loss function hyperparameters used in the experiments.