Physics-informed Neural Networks to Model and Control Robots: a Theoretical and Experimental Investigation

This work concerns the application of physics-informed neural networks (PINNs) to the modeling and control of complex robotic systems. Achieving this goal required extending PINNs to handle non-conservative effects. We propose to combine these learned models with model-based controllers originally developed with first-principles models in mind. By combining standard and novel techniques, we achieve precise control performance while proving theoretical stability bounds. We validate the approach in simulation and in real-world experiments, including motion prediction with a soft robot and trajectory tracking with a Franka Emika manipulator.


Introduction
Deep Learning (DL) has made significant strides across various fields, with robotics being a salient example. DL has excelled in tasks such as vision-guided navigation [1], grasp planning [2], human-robot interaction [3], and even design [4]. Despite this, the application of DL to generate motor intelligence in physical systems remains limited. Deep Reinforcement Learning, in particular, has shown the potential to outperform traditional approaches in simulations [5][6][7]. However, its transfer to physical applications has been primarily hampered by the prerequisite of pre-training in a simulated environment [8][9][10].
The central drawback of general-purpose DL lies in its sample inefficiency, stemming from the need to distill all aspects of a task from data [11,12]. In response to these challenges, there is a rising trend in robotics to specifically incorporate geometric priors into data-driven methods to optimize learning efficiency [13][14][15]. This approach proves especially advantageous for high-level tasks that need not engage with the system's physics. Physics-inspired neural networks [16][17][18], infusing fundamental physics knowledge into their architecture and training, have found success in various fields outside robotics, from earth science to materials science [19][20][21][22]. In robotics, the integration of Lagrangian or Hamiltonian mechanics with deep learning has yielded models like Deep Lagrangian Neural Networks (LNNs) [23] and Hamiltonian Neural Networks (HNNs) [24]. Several extensions have been proposed in the literature, for example, including contact models [25] or proposing graph formulations [26]. The potential of Lagrangian and Hamiltonian Neural Networks in learning the dynamics of basic physical systems has been demonstrated in various studies [18,[27][28][29]. However, the exploration of these techniques in modeling intricate robotic structures, especially with real-world data, is still in its early stages. Notably, [30] applied these methods to a position-controlled robot with four degrees of freedom, which represents a relatively less complex system in comparison to contemporary manipulators.
This work deals with the experimental application of PINNs to rigid and soft continuum robots [31]. Such an endeavor required modifying LNNs and HNNs to fix three issues that prevented their application to these systems: (i) the lack of an energy dissipation mechanism, (ii) the assumption that control actions are collocated on the measured configurations, (iii) the need for direct acceleration measurements, which are non-causal and require numerical differentiation. For issue (iii), we borrow a strategy proposed in [32,33], which relies on forward integrating the dynamics, while for (i) and (ii) we propose innovative solutions.
Furthermore, we exploit a central advantage of LNNs and HNNs compared to other learning techniques: the fact that the learned model has the mathematical structure that is usually assumed in the control of robots and mechanical systems. By forcing such a representation, we can use model-based strategies originally developed for first-principles models [34][35][36] to obtain provably stable performance with guarantees of robustness.
The use of PINNs in control has only recently started to be explored. Recent investigations [33,37,38] focused on combining PINNs with model predictive control (MPC), thus not exploiting the mathematical structure of the learned equations. Indeed, this strategy is part of an increasingly established trend seeking the combination of (non-PI and non-deep) learned models with MPC [39,40]. Applications to the control of PDEs are discussed in [41,42], while an application to robotics is investigated in simulation in [43]. Preliminary investigations of other model-based techniques are provided in [30,44], where, however, controllers are provided without any guarantee of stability or robustness and are formulated for specific cases.
To summarize, in this work we contribute to the state of the art in PINNs and robotics with the following: 1. An approach to include dissipation and allow for non-collocated control actions in Lagrangian and Hamiltonian neural networks, solving issues (i) and (ii).
2. Controllers for regulation and tracking, grounded in classic nonlinear control, that exploit the mathematical structure of the learned models. For the first time, we prove the stability and robustness of these strategies.
3. Simulations and experiments on articulated and soft continuum robotic systems. To the authors' best knowledge, these are the first validations of PINNs and of PINN-based control applied to complex mechanical systems.

Lagrangian and Hamiltonian Dynamics
Robots' dynamics can be represented using Lagrangian or Hamiltonian mechanics. In the former, the state is defined by the generalized coordinates q ∈ R^N and their velocities q̇ ∈ R^N, where N is the dimension of the configuration space. The Euler-Lagrange equation dictates the system's behavior:

d/dt (∂L(q, q̇)/∂q̇) − ∂L(q, q̇)/∂q = F_ext,

where L(q, q̇) = T(q, q̇) − V(q), with potential energy V and kinetic energy T(q, q̇) = (1/2) q̇ᵀ M(q) q̇, where M(q) ∈ R^{N×N} is the positive definite mass-inertia matrix. External forces, denoted F_ext ∈ R^N, include control inputs and dissipation forces.
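The Euler-Lagrange machinery above can be turned into forward dynamics entirely through automatic differentiation: the mass matrix is the Hessian of L in q̇, and the remaining terms are first and mixed derivatives. A minimal JAX sketch follows; the quadratic Lagrangian used here is a stand-in for a learned network, not a model from the paper.

```python
import jax
import jax.numpy as jnp

def lagrangian(q, qd):
    # Stand-in for a learned Lagrangian: unit point mass in a quadratic
    # potential, L = 1/2 |qd|^2 - 1/2 |q|^2.
    return 0.5 * jnp.dot(qd, qd) - 0.5 * jnp.dot(q, q)

def forward_dynamics(L, q, qd, f_ext):
    """Solve the Euler-Lagrange equation for the acceleration qdd:
    (d2L/dqd dqd) qdd = f_ext + dL/dq - (d2L/dq dqd) qd."""
    M = jax.hessian(L, argnums=1)(q, qd)          # mass matrix d2L/dqd^2
    dLdq = jax.grad(L, argnums=0)(q, qd)
    Cqd = jax.jacfwd(jax.grad(L, argnums=1), argnums=0)(q, qd) @ qd
    return jnp.linalg.solve(M, f_ext + dLdq - Cqd)

q = jnp.array([0.3, -0.1])
qd = jnp.zeros(2)
qdd = forward_dynamics(lagrangian, q, qd, jnp.zeros(2))
# For this Lagrangian, M = I and dL/dq = -q, so qdd = -q
```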
In Hamiltonian mechanics, momenta p ∈ R^N replace the velocities, with q̇ = M⁻¹(q) p. The dynamics follow Hamilton's equations

q̇ = ∂H(q, p)/∂p,  ṗ = −∂H(q, p)/∂q + F_ext,

where H(q, p) = T(q, p) + V(q) is the total energy. The kinetic energy in this case is T(q, p) = (1/2) pᵀ M⁻¹(q) p.
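Hamilton's equations map directly onto two gradient evaluations of H. As a sketch (again with a toy quadratic Hamiltonian standing in for a learned one, an assumption of this example):

```python
import jax
import jax.numpy as jnp

def hamiltonian(q, p):
    # Stand-in for a learned Hamiltonian: unit mass, quadratic potential,
    # H = 1/2 |p|^2 + 1/2 |q|^2.
    return 0.5 * jnp.dot(p, p) + 0.5 * jnp.dot(q, q)

def hamiltonian_dynamics(H, q, p, f_ext):
    """Hamilton's equations: qd = dH/dp, pd = -dH/dq + F_ext."""
    dHdq = jax.grad(H, argnums=0)(q, p)
    dHdp = jax.grad(H, argnums=1)(q, p)
    return dHdp, -dHdq + f_ext

q = jnp.array([1.0, 0.0])
p = jnp.array([0.0, 2.0])
qd, pd = hamiltonian_dynamics(hamiltonian, q, p, jnp.zeros(2))
# For this Hamiltonian, qd = p and pd = -q
```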

LNNs and HNNs
Lagrangian Neural Networks (LNNs) employ the principle of least action to learn a Lagrangian function L(q, q̇) from trajectory data, with the learned function generating dynamics via the standard Euler-Lagrange machinery [34]. The loss function for the LNN is the mean squared error (MSE) between the measured accelerations q̈ and the accelerations predicted by the learned model,

L_LNN = MSE(q̈, q̈̂). (1)

HNNs, conversely, are designed to learn the Hamiltonian function H(q, p). Once learned, this Hamiltonian provides dynamics through Hamilton's equations. The loss function for the HNN is similarly an MSE, but between the predicted and measured time derivatives of the generalized coordinates and momenta,

L_HNN = MSE([q̇, ṗ], [q̇̂, ṗ̂]). (2)

We use fully connected neural networks with multiple layers of neurons and associated weights to learn the Lagrangian or the Hamiltonian, as shown in Figure 1.

Limits of classic LNNs and HNNs
Note that both loss functions rely on measuring derivatives of the state, q̈ and ṗ, which, by definition of the state, cannot be directly measured. This issue is easily circumvented in simulation by the use of a non-causal sensor. Yet, this is not feasible in physical experiments. A non-robust alternative is to estimate these values numerically from measurements of positions and velocities. This relates to issue (iii) stated in the introduction. Moreover, existing LNNs and HNNs assume that F_ext ∈ R^N is directly measured. This is a reasonable hypothesis only if the system is conservative, fully actuated, and the actuation is collocated. The first characteristic is never fulfilled by real systems, while the second and third are very restrictive when dealing with innovative robotic solutions such as soft [31] or flexible robots [45]. Note that learning-based control is establishing itself as a central trend in these non-conventional robotic systems [46]. These considerations relate to issues (i) and (ii) stated in the introduction.

Proposed algorithms

A learnable model for non-conservative forces
In standard LNNs and HNNs theory, non-conservative forces are assumed to be fully known and to be equal to actuation forces directly acting on the Lagrangian coordinates q.This is very restrictive, as already discussed in the introduction.
In this work, we include external forces given by dissipation and actuation, i.e., F_ext = F_d(q, q̇) + F_a(q). We propose the following model for the dissipation forces:

F_d(q, q̇) = −D(q) q̇, (3)

where D(q) ∈ R^{N×N} is the positive semi-definite damping matrix. Besides, we model the actuator force as

F_a(q) = A(q) u, (4)

where u ∈ R^M is the control input signal to the system, and A(q) ∈ R^{N×M} is an input transformation matrix. For example, A could be the transpose of the Jacobian associated with the point of application of an actuation force on the structure. With this model, we take into account that in complex robotic systems actuators are, in general, not collocated on the measured configurations q. Note that, even if we accepted imposing an opportune change of coordinates, for some systems a representation without A is not even admissible [47]. With (4), we also seamlessly treat underactuated systems.
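The structured force model above is only a damping matrix and an input matrix away from the conservative case. A minimal sketch, with a hypothetical 3-DoF system driven by a single non-collocated actuator (all numbers here are illustrative, not from the paper):

```python
import jax.numpy as jnp

def external_forces(D, A, q, qd, u):
    """F_ext = F_d + F_a with F_d(q, qd) = -D(q) qd (dissipation, D PSD)
    and F_a(q) = A(q) u (possibly non-collocated, underactuated input)."""
    return -D(q) @ qd + A(q) @ u

# Hypothetical 3-DoF system with one actuator (M = 1 < N = 3):
D = lambda q: 0.1 * jnp.eye(3)                   # constant PSD damping
A = lambda q: jnp.array([[1.0], [0.5], [0.0]])   # N x M input matrix
f = external_forces(D, A, jnp.zeros(3), jnp.ones(3), jnp.array([2.0]))
# f = [-0.1 + 2.0, -0.1 + 1.0, -0.1 + 0.0]
```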
Note that [44] uses a dissipative model, but considers it in a white-box fashion. Hence, we rewrite the Lagrangian dynamics as

d/dt (∂L(q, q̇)/∂q̇) − ∂L(q, q̇)/∂q = −D(q) q̇ + A(q) u. (5)

Similarly, the Hamiltonian dynamics take the form

[q̇; ṗ] = [0, I; −I, −D(q)] [∂H(q, p)/∂q; ∂H(q, p)/∂p] + [0; A(q)] u, (6)

i.e., q̇ = ∂H(q, p)/∂p and ṗ = −∂H(q, p)/∂q − D(q) ∂H(q, p)/∂p + A(q) u.

Non-conservative non-collocated Lagrangian and Hamiltonian NNs with modified loss
Figure 2 reports the proposed network framework, which builds upon the Lagrangian and Hamiltonian NNs discussed in Sec. 2.2. Our work incorporates the damping matrix network, the input matrix network, and a modified loss function into the original framework. The damping matrix network accounts for the dissipation forces in the system via (3), while the input matrix network corresponds to A(q) in (4). We predict the next state by integrating (5) or (6) with a Runge-Kutta 4 integrator. Clearly, different integration strategies could be used in its place.

Figure 2: Overview of the Lagrangian and Hamiltonian neural networks: in red, the data and computation required for the Lagrangian dynamics; the green parts represent the corresponding data and computation associated with the Hamiltonian dynamics.
The dataset D = [D_k, T_k | k ∈ {0, ..., k_end}] contains information about the state transitions of the mechanical system. With this compact notation, we refer not necessarily to a single evolution; we include a concatenation of an arbitrary number of evolutions of the system. The input data D_k is composed of either [q_k, q̇_k, u_k, ∆t] for the Lagrangian dynamics, or [q_k, p_k, u_k, ∆t] in the Hamiltonian case. Similarly, the corresponding label T_k is either [q_{k+1}, q̇_{k+1}], for the Lagrangian case, or [q_{k+1}, p_{k+1}] for the Hamiltonian one.
The values of M(q), V(q), D(q), and A(q) are estimated by four sub-networks, namely the mass network (M-NN), the potential energy network (P-NN), the damping network (D-NN), and the input matrix network (A-NN), as shown in Figure 2. The kinetic energy can be calculated once q̇ or p is available. Then, the Lagrangian or Hamiltonian function can be derived from the kinetic and potential energies. The derivative of the state, q̈ or [q̇, ṗ], can be computed using (5) or (6), respectively. The predicted next state, [q, q̇] or [q, p], is then obtained with the Runge-Kutta 4 integrator.
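The Runge-Kutta 4 step used for the next-state prediction can be sketched generically over any state derivative function, under a zero-order-hold assumption on the input (the concrete dynamics f would come from (5) or (6)):

```python
import jax.numpy as jnp

def rk4_step(f, x, u, dt):
    """One Runge-Kutta 4 step of xd = f(x, u) with the input held constant
    over the step, as used to predict the next state from learned dynamics."""
    k1 = f(x, u)
    k2 = f(x + 0.5 * dt * k1, u)
    k3 = f(x + 0.5 * dt * k2, u)
    k4 = f(x + dt * k3, u)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# Sanity check on xd = -x, whose exact solution is x(t) = x0 * exp(-t):
x_next = rk4_step(lambda x, u: -x, jnp.array([1.0]), None, 0.01)
```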
We thus employ the modified losses of [32,33], (7) for LNNs, where #D is the cardinality of D, and (8) for HNNs. Thus, compared to (1) and (2), we compute the MSE of a future prediction of the state, simulated via the learned dynamics, rather than of the current accelerations, which cannot be measured. Note that we also include a measure of the prediction error at the configuration level in L_HNN, because the information on ∂H(q, p)/∂p appears disentangled from D and A (which are also learned) in the first N equations of (6).
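The exact loss expressions are elided in this extraction; the sketch below shows the idea of a prediction-based loss, with the MSE taken between the measured next state and the one obtained by integrating the learned dynamics. The function and batch names are illustrative, not the paper's API.

```python
import jax.numpy as jnp

def prediction_loss(predict_next, params, batch):
    """Prediction-based LNN-style loss: MSE between the measured next state
    and the state predicted by integrating the learned dynamics, so no
    acceleration measurements are needed. `batch` holds (q_k, qd_k, u_k, dt)
    inputs and (q_{k+1}, qd_{k+1}) labels."""
    (q, qd, u, dt), (q_next, qd_next) = batch
    q_pred, qd_pred = predict_next(params, q, qd, u, dt)
    return jnp.mean((q_pred - q_next) ** 2) + jnp.mean((qd_pred - qd_next) ** 2)

# Trivial check: an exact one-step model of qdd = 0 gives zero loss.
predict = lambda params, q, qd, u, dt: (q + qd * dt, qd)
batch = ((jnp.zeros(2), jnp.ones(2), None, 0.1),
         (0.1 * jnp.ones(2), jnp.ones(2)))
loss = prediction_loss(predict, None, batch)
```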

Sub-Network Structures
Constraints based on physical principles can be imposed on the parameters learned by the four sub-networks. Specifically, the mass and damping matrices must be positive definite and positive semi-definite, respectively. To this end, the network structure of the damping matrix can follow the prototype established for the mass matrix in [48]. The matrix is decomposed into a lower triangular matrix L_D with non-negative diagonal elements, and recomposed via the Cholesky factorization [49] as D = L_D L_Dᵀ. The output dimension of M-NN and D-NN is (N² + N)/2, with the first N values representing the diagonal entries of the lower triangular matrix. To ensure non-negativity, activation functions such as Softplus or ReLU are used in the last layer. Furthermore, a small positive shift +ε is introduced to guarantee that the mass matrix is positive definite. The remaining (N² − N)/2 values are placed in the strictly lower triangular part of the matrix.
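The steps above can be sketched as a mapping from the (N² + N)/2 network outputs to a positive definite matrix (the function name and the value of ε are illustrative):

```python
import jax
import jax.numpy as jnp

def vector_to_spd(v, n, eps=1e-4):
    """Map (n^2 + n)/2 raw network outputs to a positive definite matrix:
    the first n entries go through softplus(.) + eps onto the diagonal of a
    lower triangular L, the rest fill its strictly lower part; return L L^T."""
    diag = jax.nn.softplus(v[:n]) + eps
    L = jnp.zeros((n, n)).at[jnp.diag_indices(n)].set(diag)
    rows, cols = jnp.tril_indices(n, k=-1)
    L = L.at[rows, cols].set(v[n:])
    return L @ L.T

# n = 3 needs (9 + 3)/2 = 6 raw values; the result is symmetric and
# positive definite by construction.
M = vector_to_spd(jnp.array([0.3, -1.2, 0.5, 2.0, -0.7, 0.1]), n=3)
```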
The potential energy is computed by a simple fully connected network with a single output, represented as V(q; θ₂). Moreover, the A-NN, depicted in Figure 4, outputs A(q; θ₄) ∈ R^{N×M}.

PINN-based controllers
We provide in this section two provably stable controllers, combining the learned dynamics with classic model-based approaches. Before stating these results, it is important to remark that a globally optimal execution of the learning process described above will result in learning

L(q, q̇; M_L, G_L, A_L, D_L) = L(q, q̇; M, G, A, D)

for the proposed LNN, or

H(q, p; M_L, G_L, A_L, D_L) = H(q, p; M, G, A, D)

for the proposed HNN, where M, G, A, D are the real reference values. We highlight the components that have been learned by adding an L as a subscript. Also, by construction, M_L, G_L, A_L, D_L have all the usual properties that we expect from these terms, such as M_L being symmetric and positive definite, D_L symmetric and positive semi-definite, and G_L being a potential force. Yet, this does not imply that M = M_L, G = G_L, and so on. Indeed, there could exist a matrix P(q) such that P(q)M(q), P(q)G(q), P(q)A(q), P(q)D(q) all have the properties discussed above while simultaneously fulfilling

L(q, q̇; PM, PG, PA, PD) = L(q, q̇; M, G, A, D) or H(q, p; PM, PG, PA, PD) = H(q, p; M, G, A, D). (11)

Controllers must therefore be formulated, and proofs derived, under the assumption that the learned terms are close to the real ones only up to a multiplicative factor.

Regulation
The goal of the following controller (12) is to stabilize a given configuration q_ref, where we omit the arguments t and θ_i to ease readability. We highlight the components that have been learned by adding an L as a subscript. G_L(q_ref) is the potential force, which can be calculated by taking the partial derivative of the potential energy learned by the LNN; K_P and K_D are positive definite control gains.
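The controller equation itself is elided in this extraction; the sketch below implements one standard PD-plus-learned-gravity form consistent with the surrounding description, so the exact expression should be read as an assumption rather than as equation (12) verbatim.

```python
import jax.numpy as jnp

def regulation_control(A_L, G_L, K_P, K_D, q, qd, q_ref):
    """PD regulation with learned potential-force compensation (fully
    actuated case, assumed form):
    u = A_L(q)^{-1} [ G_L(q_ref) + K_P (q_ref - q) - K_D qd ].
    A_L and G_L would come from the trained A-NN and P-NN."""
    return jnp.linalg.solve(A_L(q), G_L(q_ref) + K_P @ (q_ref - q) - K_D @ qd)

# Illustrative call with identity input matrix and zero potential force:
A_L = lambda q: jnp.eye(2)
G_L = lambda q: jnp.zeros(2)
u = regulation_control(A_L, G_L, 10.0 * jnp.eye(2), 1.0 * jnp.eye(2),
                       q=jnp.zeros(2), qd=jnp.ones(2), q_ref=jnp.ones(2))
# per coordinate: u = 10 * (1 - 0) - 1 * 1 = 9
```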
For the sake of conciseness, we introduce the controller, and we prove its stability for the fully actuated case.However, the controller and the proof can be extended to the generic underactuated case using arguments in [36].This will be the focus of future work.
Proposition 1. Assume that M = N, with A and A_L both full rank. Then, given a maximum admitted error δ_q, the closed loop of (5) and (12) keeps the regulation error ultimately bounded by δ_q if K_P, K_D ≻ κI, with κ ∈ R high enough, and if there exists a matrix P(q) ∈ R^{N×N} such that ||G_L(q) − P(q)G(q)|| < δ_G, for some finite and positive δ_G. Also, we assume that

A(q)[A_L(q) − P(q)A(q)]ᵀ + A(q)Aᵀ(q)P(q) ≻ 0, (14)

and that (15) holds.

Remark 1. Note that if P(q) ≻ 0, then A(q)Aᵀ(q)P(q) ≻ 0, and (14) translates into another request of A_L(q) being close enough to A(q) up to a multiplicative factor P(q). The positive definiteness of P is, in turn, a request on the quality of the outcome of the learning process. Indeed, if ||M_L(q) − P(q)M(q)|| is small enough, then the positive definiteness of M_L and M implies that of P. Similarly, (15) is always verified for small enough ||A_L(q) − P(q)A(q)||.
Rearranging terms, we define the mismatch terms ∆_G(q) and ∆_I(q), the latter associated with the input matrix. We can then bound the norm of ∆_I(q) and rewrite the generalized forces produced by the controller, A(q)u, accordingly, where ∆_all(q) = P⁻¹(q)∆_G(q) + A(q)∆_I(q)P(q)G(q) + A(q)∆_I(q)∆_G(q) is a bounded term, as sum and product of bounded terms. The gains K̃_P and K̃_D are positive definite, being products of two positive definite matrices. The closed loop is then

M(q) q̈ + C(q, q̇) q̇ = ∆_all(q) + K̃_P (q_ref − q) − (D(q) + K̃_D) q̇. (20)

We establish our thesis by adopting the arguments provided in [51], which are, in turn, adapted from the seminal paper [52]. This direct application of an existing theorem is made possible by our rearrangement of the closed loop, which is now identical to the structure delineated in those papers.
Note that even though we provided the proof using a Lagrangian formalism, the Hamiltonian version can be derived following similar steps. Also, note that the bounds on the learned matrices can be met for any choice of δ_A, δ_G at the cost of training the model with a large enough training set. We conclude with a corollary that discusses the perfect-learning scenario.
Corollary 1. Assume that M = N and A is full rank. Then, the closed loop of (5) and (12) is asymptotically stable if K_P, K_D ≻ 0 and if there exists a matrix P(q) ∈ R^{N×N} such that M_L(q) = P(q)M(q), A_L(q) = P(q)A(q), G_L(q) = P(q)G(q).
Proof. Let us start from (14), which now becomes A(q)Aᵀ(q)P(q) ≻ 0. Furthermore, considering that A(q) is full rank by hypothesis yields the equivalent condition P(q) ≻ 0. As discussed in the remark above, this is implied by the fact that M_L(q) = P(q)M(q) with both M_L and M(q) positive definite. Thus, (14) is always verified. Similarly, (15) is trivially verified for A_L(q) = P(q)A(q). Moreover, note that ∆_all = 0, as the deltas are now all zero. So the closed loop (20) is equivalent to a mechanical system, without any potential force, controlled by a PD action. Note that the gains are positive because we just proved that P(q) ≻ 0, and because K_P, K_D ≻ 0 by hypothesis. The proof of stability follows standard Lyapunov arguments (see, for example, [34]) using the Lyapunov candidate V(q, q̇) = T(q, q̇) + (1/2) q̃ᵀ K̃_P q̃, with q̃ = q_ref − q.

Trajectory tracking
The goal of the following controller is to track a given trajectory in configuration space, q_ref: R → R^N. We assume q_ref to be bounded with bounded derivatives. We also assume the system to be fully actuated, i.e., M = N, det(A) ≠ 0, det(A_L) ≠ 0. Under these assumptions, we extend (12) with the controller (22) to follow the desired trajectory, where we omit the arguments t and θ_i to ease readability. We highlight the components that have been learned by adding an L as a subscript. We obtain the Coriolis matrix C_L(q_ref, q̇_ref) from the learned Lagrangian by taking the second partial derivative of the Lagrangian with respect to the desired joint position q_ref and velocity q̇_ref, i.e., ∂²L(q_ref, q̇_ref)/∂q_ref ∂q̇_ref.

Corollary 2. The closed loop of (5) and (22) keeps the tracking error ultimately bounded by some δ_q ≥ 0 if K_P, K_D ≻ κI, with κ ∈ R high enough, and if there exists a matrix P(q) ∈ R^{N×N} such that A_L(q) = P(q)A(q), M_L(q) = P(q)M(q), C_L(q) = P(q)C(q), G_L(q) = P(q)G(q), D_L(q) = P(q)D(q). We also assume that P is such that ||P⁻¹(q)P(q_ref) − I|| < δ_P for some δ_P > 0.
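Equation (22) is elided in this extraction; the sketch below implements a feedforward-plus-PD tracking law that is one plausible reading of it, with the learned model evaluated along the reference supplying the feedforward torque. Treat the exact composition as an assumption.

```python
import jax.numpy as jnp

def tracking_control(A_L, M_L, C_L, D_L, G_L, K_P, K_D,
                     q, qd, q_ref, qd_ref, qdd_ref):
    """Assumed tracking law: learned-model feedforward along the reference
    plus PD feedback on the tracking error, mapped through A_L^{-1}."""
    ff = (M_L(q_ref) @ qdd_ref + C_L(q_ref, qd_ref) @ qd_ref
          + D_L(q_ref) @ qd_ref + G_L(q_ref))
    fb = K_P @ (q_ref - q) + K_D @ (qd_ref - qd)
    return jnp.linalg.solve(A_L(q), ff + fb)

# Illustrative call with an identity learned model:
I = jnp.eye(2)
u = tracking_control(lambda q: I, lambda q: I,
                     lambda q, qd: jnp.zeros((2, 2)),
                     lambda q: jnp.zeros((2, 2)), lambda q: jnp.zeros(2),
                     2.0 * I, 1.0 * I,
                     q=jnp.zeros(2), qd=jnp.zeros(2),
                     q_ref=jnp.ones(2), qd_ref=jnp.zeros(2),
                     qdd_ref=jnp.ones(2))
# per joint: u = qdd_ref + K_P (q_ref - q) = 1 + 2 = 3
```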
Proof. We can rewrite (22) by substituting the values of the learned elements in terms of P. The resulting mismatch term involves ∆_P = P⁻¹(q)P(q_ref) − I. Thus, ∆_all is bounded by hypothesis, as a product and sum of bounded terms. Moreover, as discussed in the proof of Corollary 1, A Aᵀ P ≻ 0. Thus, since the closed loop is equivalent to the one discussed in [53], the same steps discussed there can be followed to yield the proof.
Note that even if we provided the proof using a Lagrangian formalism, the Hamiltonian version can be derived following similar steps.Also, the bound δ q can be made as small as we desire at the cost of making the control gains large enough.
Finally, note that we provided here only the proof of stability for the perfectly learned case. Hypotheses and arguments similar to those in Proposition 1, with bounds on terms of the form ||P(q)A_L(q) − A(q)|| in place of the exact equalities, would lead to similar results in the tracking case.

Methods: Simulation and experiment design
To evaluate the efficacy of the proposed PINNs and PINN-based control, we apply them in three distinct tasks: (T1) Learning the dynamic model of a one-segment spatial soft manipulator, (T2) Learning the dynamic model of a two-segment spatial soft manipulator, (T3) Learning the dynamic model of the Franka Emika Panda robot.We selected (T1) and (T2) because they have a nontrivial A(q), and (T3) because it has several degrees of freedom.Furthermore, we employ the learned dynamics to design and test model-based controllers for T2 and T3.
In hardware experiments, the LNN is used to learn the dynamic model of the tendon-driven soft manipulator reported in [54] and of the Panda robot. We show, for the first time, experimental closed-loop control of a robotic system (the Panda robot) with a PINN-based algorithm.

Data Generation
Training data for T1 and T2 are generated by simulating the dynamics of one-segment and two-segment soft manipulators in MATLAB. For T1, ten different initial states are combined with ten different input signals to generate data using the one-segment manipulator dynamics model. Each combination produces ten seconds of training data with a time step of 0.0002 seconds. For T2, we use a variable step size in Simulink to generate datasets from the mathematical model of a two-segment soft manipulator. With this approach, we create twelve different sixty-second trajectories, which are subsequently resampled at fixed frequencies of 50Hz, 100Hz, and 1000Hz. Concerning T3, the PyBullet simulation environment is used to generate training data for the Panda robot. Different input signals are applied to the joints to create 70 different trajectories at a frequency of 1000Hz.
Regarding experimental validation, we propose the following experiments. For the tendon-driven continuum robot, we provide sinusoidal inputs with different frequencies and amplitudes to the actuators (four motors) and record the movement of the robot. An IMU records the tip orientation data with a 10Hz sampling frequency. As a result, 122 trajectories are generated, and four more are collected as the test set. For the Panda robot, we provide 70 sets of sinusoidal desired joint angles with different amplitudes and frequencies. We collect the torque, joint angle, and angular velocity data using the integrated sensors, considering a sampling frequency of 500Hz.

Baseline Model and Model Training
In order to provide a basis for comparison, baseline models are established for all simulations and hardware experiments. These models, which serve as a control, are constructed using fully connected networks and trained on the same kind of data as the proposed models, but with a larger amount of data and a greater number of training epochs. These baseline models aim to demonstrate the benefits of incorporating physical knowledge into neural networks.
In this project, all the neural networks utilized are constructed using the JAX and dm-Haiku packages in Python.In particular, the JAX Autodiff system is used to calculate partial derivatives and the Hessian within the loss function.The optimization of the model parameters is carried out using AdamW in the Optax package.

One-segment 3D soft manipulator
To define the configuration space of the soft manipulator, we adopt the piecewise constant curvature (PCC) approximation [55] shown in Figure 5. Customarily, this approximation describes the configuration of each segment as q_i = [φ_i, θ_i, δ_i], where φ_i is the plane orientation, θ_i is the curvature in that plane, and δ_i is the change of arc length. The detailed information for this task is shown in Table 1. The prediction results of the two learned models are compared in Figure 6. The figure indicates that the model trained with LNNs exhibits a high degree of predictive accuracy, maintaining it over more than 50,000 consecutive prediction steps in this example. While some areas exhibit less precise fits, it is important to note that such errors do not accrue over time. These outcomes suggest that LNN-based models can effectively capture the underlying dynamics of the one-segment soft manipulator. By contrast, the black-box model converges during the training process, but its prediction performance shows that it gains no insight into the dynamic model. This system is also learned using HNNs by providing momentum data. Hamiltonian-based neural networks yield prediction results of similar quality to the Lagrangian-based ones, as shown in Figure 7. The matrices obtained from these two physics-based learning models are shown in Tables 3 and 4, where G(q) represents the potential forces, i.e., ∂V(q)/∂q. As Table 4 shows, HNNs can learn the physically meaningful matrices, while LNNs only learn one of the solutions satisfying the Euler-Lagrange equation. Comparing the corresponding matrices in Tables 2 and 3, we find that the matrices and vectors learned by the LNNs are related to the real parameters through a transformation P(q).

Table 3: Lagrangian-based learning model matrices of the one-segment soft manipulator (columns: q, M̂(q), D̂(q), Ĝ(q), Â(q), P(q)).

Two-segment 3D soft manipulator
The two-segment soft manipulator model is simulated in MATLAB, where the configuration space is defined as in the one-segment case. The training and testing information for this task is shown in Table 5. Figure 8 summarizes the prediction results of the models learned at 50Hz, 100Hz, and 1000Hz. From the simulations, we conclude that, within a certain range, the higher the sampling frequency, the more accurate the learned model. Based on the learned model trained at 1000Hz, we devise a PINN-based control loop as in (12). To demonstrate the performance of the designed controller, we employ it to control the two-segment soft manipulator in MATLAB. The proportional gains K_P and derivative gains K_D are set to 10 and 50, respectively, for all six configurations. The evolution of the states of the two-segment manipulator under control is depicted in Figure 9, while the performance of the controller is shown in Figure 10. Results indicate that the controller is capable of reaching a static setpoint within one second while keeping the root mean square error (RMSE) below 0.23%, with stable behavior and minimal overshoot. These results underscore the reliability and efficiency of the controller designed on the learned model.

Panda robot
Table 6 presents the training and testing data of the simulated Panda in PyBullet, while Figure 11 displays the prediction results obtained from the learned model. The model exhibits relatively accurate prediction performance within 1 second (i.e., continuous prediction for 1000 steps). Furthermore, the Lagrangian-based models can achieve long-term forecasting by updating the input values of the learned model with the real states at a fixed rate, typically ranging from 50 to 100 Hz. Based on this learned model, we build the tracking controller (22) discussed in the trajectory tracking section. The results are depicted in Figure 12, where we observe that our controller has a fast response time and can quickly adapt to changes in the reference signal. It maintains high accuracy and low phase lag, which makes it well-suited for tracking fast-changing signals.

We validate the proposed approach on the platform depicted in Figure 13, which is constructed based on [54,57]. We consider two different data preprocessing methods. (i) Moving average: this method reduces the noise and outliers in the data, generating a more stable representation of underlying trends. However, it may overlook intricate relationships between variables, resulting in some information loss. (ii) Polynomial fitting: this method captures non-linear patterns in the data. However, it is susceptible to the influence of outliers, resulting in spurious information that may compromise the quality of the trained model.
The training and testing information is shown in Table 7.
The moving average is implemented in MATLAB through the movmean function, with a prescribed window size of 50 points. The processed data are used for training the LNNs. In Figure 14, we compare the continuous prediction ability of the black-box and Lagrangian-based learning models. The prediction performance in this figure indicates that the Lagrangian-based learning model exhibits superior predictive accuracy in this sample. Furthermore, Figure 14 (c) shows that the learned model can realize long-term predictions with short-term state updates.
The polynomial fitting of the data is done in MATLAB using the function polyfit. The prediction results of the model are shown in Figure 15. The learned model exhibits decent performance when the window size is reduced, as shown in Figure 15(c). In contrast to the previous model, this model exhibits significant prediction errors, as shown in Table 7. This can be caused by the significant sensor noise and by misinformation introduced by the approximation used to fit the data.

Rigid Robot -Franka Emika Panda
The collected data are processed through a Butterworth filter in MATLAB to reduce noise.Further details are provided in Table 8.In the experiment, we observe small joint acceleration, which results in minimal velocity change.To prevent the network from focusing solely on learning a large mass matrix and neglecting other important factors, we utilize a scaling sigmoid function.This function ensures that the elements in the mass matrix are scaled within a specific range.For this particular case, we have set the scaling factor to 3.50.Figure 16 illustrates the predictive performance of our physics-based model, where Figure 16 (b) depicts the continuous prediction error within 2 seconds or 1000 prediction steps and (c) shows that updating the model's input with real-time state data can help us make a long prediction.
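The range-limiting sigmoid mentioned above can be sketched as follows; the function name is illustrative, while the scale value 3.50 is the one reported for the Panda data.

```python
import jax
import jax.numpy as jnp

def scaled_sigmoid(x, scale=3.50):
    """Sigmoid scaled to (0, scale): keeps mass-matrix entries bounded so the
    network cannot trade a huge mass matrix against the other learned terms
    when accelerations in the data are small."""
    return scale * jax.nn.sigmoid(x)

y = scaled_sigmoid(jnp.array([-10.0, 0.0, 10.0]))
# all outputs stay within (0, 3.5); the midpoint x = 0 maps to 1.75
```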
A controller based on the equation presented in (22) is proposed for the actual robot.The proportional gain matrix, K P , is set to a diagonal matrix with entries 600, 600, 600, 600, 250, 150, and 50, respectively.The derivative gain matrix, K D , is set to a diagonal matrix with entries 30, 30, 30, 30, 10, 10, and 5, respectively.Figure 17 illustrates a series of photographs depicting the periodic movement used to track a sinusoidal trajectory within a time frame of 10 seconds.The whole tracking performance is shown in Figure 18.
Furthermore, we have presented the trajectory of the end-effector, which is a helical motion shown in Figure 19, and its resultant control effect has been visually demonstrated in Figure 18.
In these figures, we can observe that the designed controller performs satisfactorily, as evidenced by its ability to track a desired trajectory. The tracking error, while present and somewhat variable across joints, remains within acceptable bounds and does not significantly impair the overall performance of the controller, suggesting its potential for effective use in real-world applications.

Conclusions

This paper presented an approach to consider damping and the interaction between robots and actuators in PINNs, specifically LNNs and HNNs, improving the applicability of these neural networks for learning dynamic models. Moreover, we used the Runge-Kutta 4 method to avoid acceleration measurements, which are often unavailable. The modified PINNs proved suitable for learning the dynamic models of rigid and soft manipulators. For the latter, we considered the PCC approximation to obtain a simplified model of the system. The modified PINN approach exploits knowledge of the underlying physics of the system, which results in largely improved accuracy of the learned models compared with the baseline models, which were trained using fully connected networks. The results show that PINNs exhibit a more instructive and directional learning process because of the prior knowledge embedded in the approach. Notably, physics-based learning models trained with fewer data are more general and robust than traditional black-box ones. Therefore, continuous long-term and variable-step-size predictions can be achieved. Furthermore, the learned model enables decent anticipatory control, where a naive PD can be integrated for good performance, as illustrated in the experiments performed with the Panda robot.

Figure 3 :
Figure 3: Diagram of the damping-matrix construction: a feed-forward neural network, a non-negative shift for the diagonal entries, and the Cholesky decomposition.
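The construction in Figure 3 can be sketched as below. This is an assumed minimal version, not the paper's code: the network's flat output fills a lower-triangular Cholesky factor, a softplus (one possible choice of non-negative shift) keeps the diagonal positive, and the product L Lᵀ guarantees a symmetric positive-definite damping matrix; `damping_matrix` and `eps` are names introduced here for illustration.

```python
import numpy as np

def damping_matrix(theta_out, n, eps=1e-4):
    """Assemble a positive-definite damping matrix from raw network outputs.

    theta_out is the flat output of a feed-forward network with
    n*(n+1)/2 entries, one per lower-triangular slot."""
    L = np.zeros((n, n))
    rows, cols = np.tril_indices(n)
    L[rows, cols] = theta_out
    # Softplus plus a small shift keeps the diagonal strictly positive,
    # so L is a valid Cholesky factor.
    idx = np.arange(n)
    L[idx, idx] = np.log1p(np.exp(L[idx, idx])) + eps
    return L @ L.T  # D = L L^T is symmetric positive definite
```

Positive definiteness holds for any network output, which is the point of routing the prediction through a Cholesky factor rather than predicting the matrix entries directly.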

Figure 4 :
Figure 4: Diagram for the actuator matrix: the fully connected network output is a vector in R^(NM), which is reshaped into a matrix in R^(N×M). A sigmoid activation function can be applied to the matrix elements to constrain their values.
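The reshaping described in Figure 4 can be sketched as follows; `actuator_matrix` and the `bounded` flag are illustrative names, and the sigmoid bound to (0, 1) is one possible value constraint.

```python
import numpy as np

def actuator_matrix(net_out, n, m, bounded=True):
    """Reshape a flat network output in R^(N*M) into the N x M actuator matrix.

    An optional element-wise sigmoid bounds each entry to (0, 1)."""
    A = np.asarray(net_out, dtype=float).reshape(n, m)
    if bounded:
        A = 1.0 / (1.0 + np.exp(-A))  # sigmoid value constraint
    return A
```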

Figure 5 :
Figure 5: PCC approach illustration: (a) a two-segment soft manipulator, where S_i is the end frame, the blue parts are the oriented planes, and ℓ_i is the original length of each segment; (b) the lengths of the four arcs whose ends connect to the frame S_i.
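Under the PCC approximation, each segment is a circular arc of fixed length, so its tip pose follows in closed form from a single bending coordinate. A minimal planar single-segment sketch (names and parametrization assumed here, not taken from the paper):

```python
import numpy as np

def pcc_tip_position(q, length):
    """Tip position of one planar constant-curvature segment.

    q is the bending angle subtended by the arc and length its arc
    length; the straight configuration q -> 0 uses the series limit."""
    if abs(q) < 1e-6:
        return np.array([0.0, length])  # straight segment
    r = length / q  # arc radius
    return np.array([r * (1.0 - np.cos(q)), r * np.sin(q)])
```

Chaining such maps over the segments gives the simplified kinematics on which the learned dynamic model operates.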

Figure 6:
Figure 6: One-segment soft manipulator learned-model comparison results: (a) predictions generated by the black-box model ( ), the Lagrangian-based learning model (• • •), and the ground truth (−) arising from the dynamic mathematical equations; (b) the prediction error of the two learned models.

Figure 7:

Figure 8 :
Figure 8: Two-segment soft manipulator prediction performances under different sampling frequencies

Figure 9 :
Figure 9: The sequence of movements at times 0.0 s, 0.1 s, 0.3 s, 0.6 s, and 1.0 s executed by the two-segment soft robot under the LNN-model-based controller. The red line represents the tip's position.

Figure 11 :
Figure 11: Franka Emika Panda learned-model prediction results: (a) shows 1500 prediction steps in a row; (b) the joint-angle errors of the prediction with respect to the ground truth; (c) the long-horizon prediction results with a 50-step window size.

Figure 12 :
Figure 12: Performance of the model-based controller designed using the model learned by the LNNs.The desired trajectories are plotted with dotted lines.

Figure 14 :
Figure 14: Continuous prediction results of the black-box model ( ) and the physics-based learning model (--), both trained on smoothed data: (a) and (b) show 43 prediction steps in a row; (c) depicts the prediction results with a 5-step window size.

Figure 15 :
Figure 15: Continuous prediction results of the black-box model ( ) and the physics-based learning model (• • •), both trained on fitted data: (a) and (b) show 25 prediction steps in a row; (c) shows the prediction results with a 5-step window size.

Figure 16 :
Figure 16: Panda physics-based learning-model prediction results: (a) and (b) show a prediction of about 800 steps in a row; (c) depicts the prediction results with a 5-step window size.

Figure 17 :
Figure 17: Photo sequence of one periodic movement resulting from the application of the LNN-model-based trajectory-tracking controller

Figure 18 :
Figure 18: Performance of the model-based controller that is designed using the learned model.

Figure 19:
Figure 19: Photo sequence of the helical motion of the end-effector under the LNN-model-based controller

Figure 20:

Table 1 :
Detailed information on the one-segment soft manipulator simulation

Table 2 :
Mathematical model matrices of one-segment soft manipulator

Table 4 :
Hamiltonian-based learning model matrices of one-segment soft manipulator