Interactive Human–Robot Skill Transfer: A Review of Learning Methods and User Experience

Generalizing the operation of robots in dynamical environments regardless of the task complexity is one of the ultimate goals of robotics researchers. Learning from demonstration approaches supported by transfer learning and user feedback offer a remarkable solution to achieve generalization. The main idea behind such approaches is teaching robots new skills with human instructors and training parametric models with data from demonstrations to achieve and update the desired skills under changing conditions. Herein, the applications of skill transfer with reinforcement learning algorithms and the effect of user experience (UX) on learning from demonstration approaches are reviewed. This review outlines the importance of considering and evaluating UX during human–robot interaction and, especially, robot teaching. A detailed view on the relations between robot learning and UX is provided and approaches for future improvements are derived. Finally, adaptive autonomy sharing between the robot and the user during teaching is presented as a promising approach to enhance the interaction by exploiting user feedback. In the long run, interactive and user‐centered human–robot skill transfer is expected to reduce cognitive and physical load of the user. Discussion on future research questions aiming to improve learning process and semiautonomous behavior concludes the review.


Introduction
Autonomous robots have become irreplaceable in a variety of industrial domains in the last decades, especially with the rapid developments in Industry 4.0 practices. [1][2][3] In addition to industrial applications, robots are becoming increasingly active in daily life activities. To tackle different tasks in complex environments such as delivering parcels in natural urban environments, [4] assisting in elder care, [5] or managing the household, [6] autonomous robots must interact with humans. All these application domains require that robots process large amounts of data from noisy sensor observations during the execution of hundreds of different motor and manipulation skills depending on the complexity of the desired task. Additional objectives such as obstacle avoidance or disturbance compensation in dynamic environments make the tasks even more complex. Under such circumstances, programming skills manually may not be sufficient to develop intelligent autonomous systems that can assist humans in tasks of everyday life. Another challenge stems from the nature of interaction: user experience (UX) should be taken into consideration because the main purpose of integrating robots in daily life activities is to improve the quality of life. [7] The user should feel safe and comfortable while a robot is operating nearby. At this point, user feedback about controllability, pleasantness, and other human-related factors should help to improve human-robot interaction (HRI) based on shared autonomy. [7,8] Among several robot teaching techniques such as using a teach pendant or kinesthetic teaching, [9] learning from demonstration (LfD) approaches are a promising route toward initializing self-improving autonomous systems.
In LfD methods, human instructors teach the robot new skills through demonstrating the task, which is captured by sensors measuring joint
angles, [10][11][12] visual tracking systems, [13,14] or other input devices such as sensor gloves. [15,16] The demonstrated high-dimensional data are then used to train parametric models in dynamical attractor systems [10,17] or closed-loop feedback controllers in Gaussian mixture models. [18,19] However, these LfD approaches have two major limitations, which are investigated thoroughly in this review. First, most robot skill learning approaches require processing of high-dimensional data. Moreover, dimensionality increases when teaching more complex robots, such as humanoids, which makes the teaching process more complicated. [20] Consequently, processing of high-dimensional data, e.g., visual or tactile data, is a limiting factor for learning approaches, which typically operate on sensor signals of low or moderate dimensionality. Second, the existing LfD approaches are mostly limited to laboratory environments with robotic experts, who are well trained in teaching their robots, as human instructors, although there are few exceptions. [21,22] Thus, generalizing the teaching process to different tasks with nonexperts as human instructors appears insufficiently examined, although it bears the potential of improving the feasibility and acceptance of assistive robots in everyday tasks. We address both issues in this review by surveying and discussing learning algorithms and user feedback during semiautonomous operation.
To realize autonomous learning from high-dimensional noisy data, there are probabilistic implementations [23,24] of deep learning approaches with neural networks [25][26][27] that can be trained from high-dimensional stochastic inputs and provide closed-loop feedback controllers. [28,29] As dozens of neural network parameters need to be learned for each motor skill, speeding up the learning process is crucial. Following that purpose, different transfer learning strategies can be implemented to initialize the network parameters in new environments. Neural networks are well suited for this task as it is possible to gradually increase the number of transferred parameters and even switch between the transfer of task-specific knowledge to abstract features encoded in deeper layers.
On the human side of interaction, different transfer learning strategies might be rated through behavioral and physiological measurements as well as psychometric questionnaires to create interactive learning methods that are accessible not only to robotic experts but also to nonexpert users. We envision that considering UX can lead to more intuitive and natural interaction as suggested by some previous research. [30,31] This review outlines how modified predictions of the deep network could be evaluated by analyzing the UX in teaching the robot a series of novel manipulation skills.
Moreover, the evaluation of human feedback on different robot adaptation strategies will demand a very detailed selection of the amount and content of transferred knowledge. Closed-loop feedback control approaches are utilized in artificial neural networks, where a large number of parameters, i.e., the synaptic weights at different layers, can be selected for transfer learning. [28] Apart from widely used psychometric questionnaires, probabilistic models of the human's feedback and cognitive models of multisensory integration [32] can be used to improve the autonomous adaptation strategies of the robot where the goal is to learn and teach related motor skills from a few samples.
The rest of the article is organized as follows: the fundamentals of transfer learning techniques, namely, movement primitive representations, skill adaptation strategies, and transfer of manipulation skills, are analyzed in Section 2; Section 3 discusses the assessment and evaluation of UX in HRI and shared autonomy; and finally, Section 4 concludes the article providing a critical discussion and derives future research directions.

Learning Methods
Transfer learning approaches are the main focus of this review for the learning process of robots with nonexpert instructors. However, it should be noted that transfer learning has some major challenges in robotics, which can be summarized as follows: modeling complex manipulation skills; adapting to dynamically changing environments through integrating high-dimensional stochastic visual and tactile feedback; and continuously improving the robot's internal representation of the world and the skills through transfer learning in nonparametric neural networks.
In this section, possible solutions to the aforementioned challenges are discussed. Movement primitives and their use in LfD approaches are explained with examples. The importance of skill adaptation from multisensory data processing is emphasized. Finally, a broad comparison of reinforcement learning algorithms for transfer learning purposes is provided.

Movement Primitive Representations
For learning manipulation tasks, a multitude of operations has to be performed. In a kitchen environment, for instance, such operations include the grasping of beakers, bottles, and flasks; transporting; pouring; and all forms of assembling and detaching of objects in bimanual operations. For manipulating hundreds of objects of different shapes, sizes, fill levels, and material properties, e.g., compliance and friction coefficients, manually programming these operations will not be feasible and autonomous learning methods are needed.
All of these operations can be treated as an abstraction of time-dependent elementary actions, also called movement primitives. Movement primitives are a parametric description of elementary movements. Complex behaviors can be generated by sequencing and coactivating these movement primitives. [11,17,34–36] Figure 1 shows an example of the procedure. First, the task is demonstrated by the human instructor multiple times. Then, movement primitives are generated with the help of measurements captured by sensors. Finally, the humanoid robot reproduces the task even if it is not identical to the initial task. In robot skill learning, the dynamic systems movement primitive framework, suggested by Schaal et al., [11] has yielded success in a variety of different applications, including table tennis, drumming, playing ball-in-a-cup, and legged locomotion.
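The dynamic-systems view of movement primitives can be made concrete with a small numerical sketch. The following minimal discrete DMP, written under simplifying assumptions (a single degree of freedom, Euler integration, hypothetical gain values), integrates a spring-damper transformation system modulated by a learned forcing term; with all weights set to zero it simply converges to the goal.

```python
import numpy as np

def dmp_rollout(y0, goal, weights, dt=0.01, tau=1.0,
                alpha=25.0, beta=6.25, alpha_x=8.0):
    """Integrate a minimal discrete dynamic movement primitive (DMP).

    The transformation system is a damped spring pulled toward the goal,
    modulated by a forcing term encoded as Gaussian basis functions with
    the given weights (one weight per basis function).
    """
    n_basis = len(weights)
    centers = np.exp(-alpha_x * np.linspace(0, 1, n_basis))  # basis centers in phase space
    widths = 1.0 / (np.diff(centers, append=centers[-1]) ** 2 + 1e-8)
    y, yd, x = y0, 0.0, 1.0                                  # position, velocity, phase
    trajectory = []
    for _ in range(int(1.0 / dt)):
        psi = np.exp(-widths * (x - centers) ** 2)           # basis activations
        forcing = (psi @ weights) / (psi.sum() + 1e-8) * x * (goal - y0)
        ydd = alpha * (beta * (goal - y) - yd) + forcing     # spring-damper + forcing
        yd += ydd * dt / tau
        y += yd * dt / tau
        x += -alpha_x * x * dt / tau                         # canonical system decays the phase
        trajectory.append(y)
    return np.array(trajectory)
```

Learning a demonstrated trajectory then amounts to regressing the `weights` so that the forcing term reproduces the demonstrated accelerations; changing `goal` or `y0` adapts the reproduction, which is the generalization property exploited in LfD.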

Skill Adaptation from Visual and Tactile Data
Skill adaptation from visual or tactile data is key for efficient transfer learning. In robot vision, different approaches including deep learning techniques have been used to extract features for grasping tasks or for representing the robot configuration. For example, a method for hand-eye coordination in grasping tasks based on the evaluation of monocular images using large convolutional neural networks has been introduced by Levine et al. [37] Here, two large-scale experiments have been conducted with different robotic grippers, demonstrating efficient real-time control and grasping of novel objects with the evaluated method. In addition, transfer experiments showed that data from different robots can enhance learning to achieve more reliable and effective grasping results. Pinto and Gupta demonstrated in their study [38] that large data sets in combination with convolutional neural networks can extensively improve the performance for grasping tasks including unseen objects. Recent deep reinforcement learning techniques even allow for training directly with the real robot, as shown by Gu et al. [39] Moreover, prior works based on representing the robot configuration directly from the raw pixels [40,41] showed efficient learning behavior on different simulated tasks, such as swimming, walking, or hopping.
When interacting with the environment, tactile sensors have become of great value in robot control. [42] Therefore, hundreds of small sensor elements might be spread over the artificial skin of the robot to mimic the sensory abilities of humans. These sensors in combination with others, e.g., heat flux sensors or acoustic slip sensors, are used to reason about contact forces, material properties, and further information while grasping or touching objects. [43][44][45][46][47] Based on collected sensory data, dynamical models can be learned as demonstrated by Paraschos et al. [36] In combination with visual perception by using sensor fusion, objects can be efficiently tracked. This complements vision information, which is versatile and low cost but has low sample rates and high delays, with tactile sensors, such as force sensors, which are accurate and fast but limited as they require contact. [48,49] Tactile skins have also been used in humanoid robots for manipulation and learning, [50] where, in addition, methods have been shown to improve the tactile skin for humanoid robots by introducing, for example, new materials. [51,52]
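The complementarity of slow-but-rich vision and fast-but-local tactile sensing can be illustrated with the standard inverse-variance fusion rule for two independent Gaussian estimates. This is only an illustrative one-shot fusion step, not the full tracking filters used in the cited works.

```python
def fuse_estimates(mu_vision, var_vision, mu_tactile, var_tactile):
    """Fuse two noisy estimates of the same quantity by inverse-variance
    weighting, the optimal linear combination for independent Gaussian noise.

    The less certain estimate (larger variance) receives the smaller weight,
    and the fused variance is smaller than either input variance.
    """
    w = var_tactile / (var_vision + var_tactile)   # weight on the vision estimate
    mu = w * mu_vision + (1.0 - w) * mu_tactile
    var = var_vision * var_tactile / (var_vision + var_tactile)
    return mu, var
```

For example, fusing an uncertain vision estimate (variance 4.0) with an accurate tactile contact estimate (variance 1.0) pulls the fused value strongly toward the tactile reading while still reducing the overall uncertainty.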

Model-Based Reinforcement Learning for Transfer of Manipulation Skills
Model-based reinforcement learning algorithms learn a transition model in parallel to optimizing the policy π. The learned transition model M̂ can be further extended to new goals in the same environment, [53,54] i.e., transfer learning. Compared with model-free reinforcement learning, model-based reinforcement learning algorithms are known to be sample-efficient. [55] Figure 2 shows a classification of reinforcement learning algorithms with focus on model-based methods as discussed in this section.
In the model-based Dyna-style algorithms, the model is learned online from the agent's interaction with the environment. Then imaginary rollouts, which are far less expensive than real interaction data, can be generated by the learned dynamics model and used to improve the policy. Model-Ensemble Trust-Region Policy Optimization [57] proposes an ensemble of neural network models that fit the transition dynamics. Unlike the vanilla Dyna-Q algorithm, where the policy improvement on imaginary rollouts is performed a fixed number of times, Model-Ensemble Trust-Region Policy Optimization imposes a criterion on when to stop policy improvement on imaginary samples. [56] Ensemble learning alleviates the problem of model bias. Luo et al. [64] showed that the value estimate computed from simulated samples of the learned model M̂, minus a term depending on the model error, constitutes a lower bound for the exact value estimate in the true environment model M*, which improves efficiency when the model is learned correctly. Model-Based Value Expansion [58] replaces the one-step temporal difference (TD) target by an H-step imaginary TD-target value so that the target error can decrease by a factor of γ^(2H), where γ denotes the discount factor. They also applied the TD-k trick to mitigate the distribution mismatch problem that the bias in the Q-function is lower for replay buffer samples than for imaginary rollouts. However, the hyperparameter H is task-dependent. As an improvement, the STEVE algorithm [59] replaced the fixed H-step imaginary TD target by a weighted sum of n-step imaginary TD targets, for n = 1, …, H. The weights are computed as the inverse variance of the imaginary TD-target values over model ensembles.
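The H-step imaginary TD target of Model-Based Value Expansion can be sketched as follows; `model`, `reward_fn`, `value_fn`, and `policy` are placeholders for the learned components, and the rollout is deterministic for simplicity.

```python
def h_step_td_target(model, reward_fn, value_fn, policy, s, gamma=0.99, H=5):
    """Model-Based Value Expansion style target: roll the learned model
    forward H steps under the policy, accumulate discounted imaginary
    rewards, and bootstrap with the value estimate at the final state.

    `model(s, a)` returns the predicted next state; all four callables
    stand in for learned components.
    """
    target, discount = 0.0, 1.0
    state = s
    for _ in range(H):
        a = policy(state)
        target += discount * reward_fn(state, a)   # imaginary reward
        state = model(state, a)                    # imaginary transition
        discount *= gamma
    return target + discount * value_fn(state)     # bootstrap after H steps
```

STEVE-style weighting would compute such targets for n = 1, …, H and average them with inverse-variance weights estimated over a model ensemble.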
Figure 1. LfD for a cup stacking task. Top row: Demonstration through teleoperation. Bottom row: Versatile autonomous reproduction that can adapt to task deviations. Reproduced with permission. [33] Copyright 2015, The Authors.

The second category of model-based approaches is Shooting Algorithms, which try to optimize the action in each step. Nagabandi et al. [55] combined model learning and planning in a fashion similar to Model Predictive Control. Initially, the model M̂ is trained on random trajectories, and the policy network is then initialized to mimic the expert behavior derived from the model using Data Aggregation. [65] Subsequently, a model-free reinforcement learning algorithm is initialized on the policy parameters. In each of its decision steps, a number of candidate action sequences are generated and propagated through M̂; the best sequence is chosen according to the simulated reward, i.e., planning. After transiting to the successor state, planning is redone, and the model is periodically retrained on new transition data. By replanning at each decision step, these algorithms alleviate the burden of an accurate model, and the shooting method decreases the computational complexity. Another shooting algorithm, Probabilistic Ensembles with Trajectory Sampling, [60] models both environment stochasticity (aleatoric uncertainty) and model confidence (epistemic uncertainty) with an ensemble of probabilistic neural networks that output distribution parameters to account for stochasticity. Propagation by random shooting in the learned models makes these two uncertainties distinguishable, allowing for efficient exploration guided by epistemic uncertainty and more accurate planning considering environment stochasticity. The Probabilistic Ensembles with Trajectory Sampling algorithm reaches performance comparable to state-of-the-art model-free baselines in Cartpole, 7-DOF Reacher/Pusher, and Half-Cheetah with significantly fewer samples.
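The random-shooting planning loop described above can be condensed into a few lines; the dynamics `model` and `reward_fn` stand in for learned networks, and uniform action sampling is the simplest possible proposal distribution (cross-entropy-method or trajectory-sampling variants refine it).

```python
import numpy as np

def random_shooting_mpc(model, reward_fn, state, horizon=10,
                        n_candidates=100, action_dim=1, rng=None):
    """Pick the first action of the best of n random action sequences,
    each evaluated by propagating it through the learned model.
    """
    rng = np.random.default_rng(rng)
    best_action, best_return = None, -np.inf
    for _ in range(n_candidates):
        seq = rng.uniform(-1.0, 1.0, size=(horizon, action_dim))  # candidate sequence
        s, total = state, 0.0
        for a in seq:
            total += reward_fn(s, a)
            s = model(s, a)                                       # simulated transition
        if total > best_return:
            best_return, best_action = total, seq[0]
    return best_action  # execute only the first action, then replan
```

Because only the first action is executed before replanning, model errors beyond the immediate horizon have limited influence, which is exactly why shooting methods tolerate imperfect learned models.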
The last category of model-based reinforcement learning algorithms is Backward Propagation Through Time, where the gradient of the reinforcement learning objective with respect to the policy can be computed analytically given M̂. In Iterative Linear Quadratic Gaussian, [61] the model dynamics are assumed to be unknown. A local linear model and a quadratic cost function are fit so that the whole problem can be solved in the way a linear quadratic regulator does. As the local model is only valid in local regions, the improved trajectory should not deviate much from the previous trajectory. Extending Iterative Linear Quadratic Gaussian, Guided Policy Search [62] characterizes the policy π externally with network parameters θ and makes the policy network mimic the behavior of the controller.
Model-Agnostic Meta-Learning [63] proposes a meta-optimization scheme for reinforcement learning problems, where the agent is trained on a variety of tasks and can be quickly adapted to an unseen task with only a few data points. A metamodel is trained so that a gradient step along the direction of a task-specific loss from the metamodel maximizes its performance on this new task. Based on the concept of Model-Agnostic Meta-Learning, Clavera et al. [66] propose learning an ensemble of dynamics models, where each model acts as a different task. This shifts the burden from model accuracy to the adaptation to the task. Such a formulation encourages exploration in regions where the models disagree. As the models become accurate, higher sample efficiency can be reached. This algorithm achieves the same level of performance as model-free reinforcement learning in Mujoco [67] while requiring significantly less data.
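A first-order variant of the meta-update behind Model-Agnostic Meta-Learning can be sketched on toy tasks; each task is represented here simply by its loss-gradient function, and the second-derivative term of full MAML is dropped for brevity.

```python
import numpy as np

def maml_step(theta, tasks, inner_lr=0.1, meta_lr=0.05):
    """One first-order MAML-style meta-update over a set of tasks.

    Each task is a callable grad(theta) returning the task loss gradient.
    The meta-parameters move along the gradient evaluated *after* a
    task-specific adaptation step, so a single inner gradient step adapts
    theta well to any of the tasks.
    """
    meta_grad = np.zeros_like(theta)
    for grad in tasks:
        adapted = theta - inner_lr * grad(theta)   # inner (task-specific) step
        meta_grad += grad(adapted)                 # outer gradient at adapted params
    return theta - meta_lr * meta_grad / len(tasks)
```

On quadratic task losses (theta - c_i)^2 this iteration converges to a point from which each task optimum is reachable in one inner step, which is the intuition behind fast adaptation to unseen but related tasks.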
There are other model-based approaches for high-dimensional inputs such as images, [68][69][70] whereas models for combined exteroceptive and proprioceptive sensory inputs still remain unexplored. A natural next step would be to equip deep networks with learned state transition models. In the deep learning literature on robot control, however, mostly online model learning has been studied, [71,72] and skill transfer in model-based deep networks is an unexplored research field. The ultimate challenge for model-based deep learning approaches for robot control is how to combine perceptual and proprioceptive feedback information with a state transition model that generalizes over multiple skills. We started to explore how networks of spiking neurons can implement this model fusion task in a Bayes-optimal manner. [24] Rueckert et al. [24] show that populations of sensory or context neurons modulate the recurrent dynamics in a population of state neurons which encode the state transition dynamics. This work was extended to factorized state representations that generalize to whole-body control systems like humanoids. [73]

User Experience
Along with technology and marketing, UX is one of the three fundamental components of a successful product. [74] The increased number and level of participation of socially interactive robots in everyday activities motivate designers to systematically evaluate the quality of the interaction from a human-centered perspective. In addition to pragmatic qualities, such as usability or functionality, hedonic qualities should be assessed for UX evaluation. [75,76] Therefore, after discussing functional aspects of possible algorithms and techniques to transfer manipulation skills for LfD approaches, effects of human-related factors in establishing an improved and more intuitive interaction are explained and exemplified in this section. UX during interaction with technology has been investigated in extensive research. Hassenzahl et al. [77] focus on human needs, such as competence, relatedness, popularity, stimulation, meaning, security, or autonomy, which should be satisfied to have a positive experience with interactive products and technologies. Similarly, Thüring and Mahlke [78] emphasize the importance of emotional experiences during interaction between human and technology by measuring subjective feelings, motor expressions, physiological reactions, cognitive appraisals, and behavior. Subsequently, we posit ideas of how to consider UX in interactive learning, how to improve outcomes via coadaptation, and how to exploit semiautonomous strategies.

User Experience in HRI
Human-related factors such as trust [79] or body representations [8,80] can have a significant impact on how users experience the interaction with robotic systems. While Hancock et al. [79] conclude robot-related factors to be most important, modulators of UX are observed to be highly relevant if human and robot cooperate tightly and, especially, if the robot is learning or exhibits autonomous behavior. [7,8] Considering HRI in general, the extensive meta-analysis in the study by Hancock et al. [79] identifies various factors such as engagement, expertise, workload, and situation awareness. In addition, Alenljung et al. [81] point out the importance of the evaluation of different aspects such as acceptance, usability, learnability, and credibility, some of which are only scarcely considered in previous research. While taking into account human-related factors might be beneficial regardless of the application domain, the users' perspectives to be considered might vary distinctly. Current approaches to explore tight interaction in wearable robots and teleoperation rely on human-in-the-loop experiments to investigate aspects such as comfort, presence, or body experience in real interaction scenarios. [8,80,82,83] Taking the example of robotic prostheses, embodiment appears to be a paramount metric in the assessment of UX. [7,8,84] According to Longo et al., [85] embodiment comprises ownership, location, and agency as submetrics, which are defined as the sensation that an object belongs to the body, the feeling of the object and the limb being in the same place, and the feeling of being in control of an object's movements, respectively. However, UX research focused on collaborative robots shows that safety and trust are the crucial factors to be considered. [86,87] In short, tasks to be achieved directly influence the UX evaluation. 
Sun and Sundar [88] report that users tend to evaluate the robot and the interaction more positively when they expect the robot to be task-oriented rather than interaction-oriented; an interaction-oriented robot's anticipated task scope is less specific, and its actions are harder for users to grasp at first glance. In other words, when the goals of interaction are too general, it is hard to evaluate whether a positive UX is achieved. Therefore, generalizing interaction for multiple tasks without informing the user about them can be considered a challenge to be overcome in HRI.
Beyond identifying and defining human-related factors, appropriate assessment metrics and measurement approaches are required to gain insight into UX during interaction. As computers and smart devices became prevalent in our daily lives, the human-computer interaction community has already investigated and developed sophisticated UX measurement methods, [89] which are also applicable or adjustable to HRI research. Psychophysical methods, [90] such as the method of constant stimuli, the method of limits, and the method of adjustment, or psychophysiological methods, [91] such as electroencephalography (EEG), electromyography (EMG), and electrodermal activity (EDA), are widely used both in human-computer interaction [91,92] and in HRI [93,94] for the assessment of UX. Yet, psychometric evaluation techniques account for a remarkable share of UX evaluation in HRI. The information obtained in psychometric questionnaires is evaluated with statistical methods. [95][96][97][98][99][100] From a process-driven perspective, as Hartson and Pyla [101] define UX as the totality of effects felt by a user before, during, and after an interaction, UX measurement methods should be selected such that they can be conducted in preinteraction and postinteraction phases as well as in the course of the interaction. There are several case studies that use preinteraction and postinteraction questionnaires [102] or observation and questionnaires during interaction combined with postinteraction questionnaires. [103,104] Moreover, the combination of psychophysical, psychophysiological, and psychometric assessment methods is used in some research to obtain reliable measurements of UX during HRI. [93,105] The USUS evaluation framework allows researchers to assess four main factors, usability, social acceptance, social impact, and UX, by utilizing both physical and psychological measurement techniques. [106,107] The framework has been exploited in various scenarios of HRI, such as a commercial social robot serving at lounges [108] or gates in airports or a humanoid robot controlled by inexperienced users via speech commands for pick-and-place tasks. [109] Beyond the discussed quantitative approaches that might be appealing to technically versed persons, qualitative and mixed-method approaches are important to reveal particular demands of individual users or subgroups of users, which might also be noninstrumental. [78,110] Lindblom and Alenljung [111] introduce an action and intention recognition methodology in HRI, ANEMONE, by combining UX evaluation methodology with activity theory [112,113] and the seven stages of action model. [114] Although cognitive psychology is not closely related to UX, we would like to draw attention to cognitive models for the incorporation of an assistive device. Investigating cognitive models, e.g., of human body experience, [32] is a promising approach to achieve a more natural and intuitive interaction. According to Phillips et al., [115] a computational cognitive model lets a robot comprehend the noncomputational knowledge of a human by transforming it into a computational representation which can be understood by robots. Cognitive models can be used to predict human body experience by modeling multisensory integration, [32,116] to analyze the spatial referencing humans use during HRI, [117] or to provide robots with computational models of the decision-making process and to predict the expectations of humans regarding robots' actions. [118] Bourgin et al. [119] pretrain neural networks with synthetic data generated by cognitive models to construct cognitive model priors, overcoming the challenge that predicting decisions from noisy human behavior requires massive sample sizes.
Using personalized models, such pretraining approaches could even be used to adjust interaction strategies to individual demands. [120] Among the multitude of cognitive modeling approaches, Bayesian causal inference and predictive processing have been applied to multisensory integration and body experience [116,121,122] and connectionist models have been used to capture user engagement. [123] Focusing on higher cognition, cognitive architectures such as ACT-R, Soar, and others help to model complex cognitive processes such as decision-making. [124,125] Providing autonomous robots with learning capabilities enables human users and learning algorithms to adapt to each other during interaction. Exploiting such coadaptation has the potential to improve the quality and efficiency of interaction, i.e., training quality and speed. [7] Yet, it is still a formidable challenge to shape such mutual interaction in a way that reduces the user effort during training and improves UX during interaction. [7,8] However, the design of more intuitive and efficient learning algorithms might be facilitated by considering mutual adaptation to achieve a comprehensive interaction. [126,127] For example, Beckerle et al. [8] state that devices might perform self-tuning through adaptive calibration and/or machine learning while users improve their own motor skills and even their perception for better control. In the same work, it is also suggested that the robot could request the user to perform a certain task repeatedly to improve its own model of the world via incremental learning. Moreover, coadaptive interaction strategies might predict user behavior based on data-driven models and deep learning. [128,129] Consequently, UX appears to be crucial for the development of transfer learning methods because human teachers tend to perform future-directed guidance and change their behavior over time based on their expectations regarding the capabilities of the autonomous learning system. [130]

User Experience of Shared Autonomy
While fully autonomous systems with skills superior to humans might be the first idea one has about robots, semiautonomously operated robotic systems might offer a better execution of complex, interactive tasks. Automatic control systems play an important role in low-level tasks, whereas the user only intervenes for guidance or correction of high-level tasks when needed. Semiautonomous operations can offload cognitive and physical load from users [131,132] without affecting their experience. [7,8] In the field of human-computer interaction, research shows that basic psychological needs are important factors that influence the users' experience. [30] Especially, feelings of autonomy and competence have been identified to be among the most important basic needs [133,134] and proven to play a role in influencing the degree of satisfaction after the human-computer interaction. [77] A user wants to feel independent and selfdetermining as well as capable and proficient. To put it another way, the user should not feel as being controlled while sharing control with a robot or any other agent. How to satisfy the users' needs of autonomy and competence in a shared-autonomy setting to ensure the positive UX is a challenge for future research. [7,8] Although sharing autonomy considering UX is a promising approach to improve HRI, it introduces some hard challenges as well. Specifically, the design of semiautonomy in HRI raises new questions because there is no general solution of how to appropriately share control between the robot and the human. [7,8] While several recent works analyze adaptive shared autonomy and corresponding technical requirements, [135,136] it is not yet solved how to adapt control sharing considering UX. [7] In addition, autonomous behavior should be predictable and provide assistance when appropriate and might thus need to be customized. 
[7,137] Therefore, it might be necessary to adapt control sharing with respect to temporal changes in the users' influences, i.e., their abilities and preferences. [8] To illustrate with an example, consider learning the cup-stacking task from Figure 1 via manipulation skills, as explained in Section 2. In fully autonomous mode, the software selects the most related movement-primitive parameters for transfer when a new skill needs to be trained. Self-improvement is implemented through a reward-modulated learning rule based on model-based reinforcement learning algorithms or other alternatives. In a semiautonomous mode, however, the user could correct the robot motion during execution. It should be kept in mind that the user's influence may be proportional to the estimated uncertainty of the (semi-)autonomous system. Under uncertain or even critical circumstances, the user's interference triggers large weight updates, which lead to rapid movement adaptation. For such an adaptive, user-triggered reinforcement learning rule, multitask eligibility traces can be a suitable solution. Another point that should not be overlooked is that the control signals of human users and of the robotic autonomy might have different structures. [138] To appropriately share autonomy in this case, recognition of user intent based on sensing multiple modalities, e.g., tactile sensing and EMG, is suggested in the literature. [50,139-141] For the assessment of the users' (cognitive) load during interaction with a semiautonomous robot, psychometric questionnaires such as the NASA Task Load Index [142] or measures based on interaction data, such as contribution from autonomy, similarity of commands, fluency of user commands, and number of interface interactions, can be used. [143]
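The uncertainty-proportional arbitration and user-triggered learning rule sketched in the example above can be illustrated in a few lines of code. This is a minimal sketch under our own assumptions, not an implementation from the reviewed literature: the function names, the linear blending of commands, and all gains are illustrative choices.

```python
import numpy as np

def blend_commands(u_user, u_auto, uncertainty, gain=1.0):
    """Blend user and autonomous commands (illustrative arbitration).

    The user's authority alpha grows with the autonomy's estimated
    uncertainty (assumed to lie in [0, 1]); under high uncertainty
    the user effectively takes over.
    """
    alpha = float(np.clip(gain * uncertainty, 0.0, 1.0))  # user authority
    return alpha * u_user + (1.0 - alpha) * u_auto, alpha

def update_weights(w, trace, features, correction, alpha,
                   lr=0.1, decay=0.9):
    """User-triggered weight update with an eligibility trace (sketch).

    `trace` accumulates recently active features; a large user
    correction under high authority `alpha` produces a large update,
    i.e., rapid movement adaptation in uncertain situations.
    """
    trace = decay * trace + features          # accumulating trace
    w = w + lr * alpha * correction * trace   # correction-modulated update
    return w, trace
```

In this sketch, a confident autonomy (uncertainty near 0) ignores the user and learns slowly, whereas an uncertain autonomy hands over control and lets the user's corrections drive large parameter updates, mirroring the adaptive behavior described above.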

Conclusion
Recent developments in LfD methods can enable people to collaborate with robots for a higher quality of living by offering a more intuitive and natural interaction. Examining LfD approaches, we devise transfer learning strategies for HRI; the adaptation and transfer of manipulation skills appear to be promising tools of such approaches. As model-based reinforcement learning algorithms are suitable for transfer learning, we present an extensive catalog of these algorithms. Beyond robot-centered improvements, considering UX during the interaction, and thereby fostering coadaptation of the user and the learning system, holds great potential. We envision coadaptation as the self-modification of the behavioral approaches of both human and robot during LfD, responding to the demands of each other in a process of bidirectional learning. As discussed, the assessment and evaluation of UX is a promising way to improve learning during interaction. While applications of psychometric, psychophysical, and psychophysiological assessment methods in interactive learning exist, we advocate their systematic and continuous consideration, extended by online data, to improve UX. Moreover, shared autonomy could help to create intuitive assistance and convenient operation when human experience is taken into account. Apart from outlining the potential of integrating UX measures into control-sharing algorithms, our review identifies two challenges that have to be overcome in future work. First, a reliable and consistent decision algorithm is needed to share control appropriately and at the right time so that the intended task can be carried out faultlessly. The second challenge stems from the different structures of the control signals of a person and a robot. Figure 3 shows an overview diagram of the learning strategy reviewed in this article. 
The direct interaction between the user and the robot is just the tip of the iceberg. The red arrows indicate the interaction that can be controlled and/or perceived by the user, who actively takes part in the teaching process. In contrast, the blue arrows show the information flow running in the background of the interaction. The user indirectly contributes to model training as the source of UX. We envision that this contribution offers the user a more intuitive interaction by improving transfer learning. As the ultimate goal, this review points to removing the expert from the loop, as the expert is still required for the assessment and integration of the data for training purposes in the background operation.
Further research might consequently focus on the adaptive online adjustment of shared control. We believe that a UX evaluation framework for assessing shared autonomy is needed. A mapping of UX to the control domain, e.g., as an adaptation measure or for intent detection, could be a major step toward intuitively assisting robots in daily life activities. An important question is whether we can shape the interaction to enrich tactile data for learning purposes, which is essential for skill adaptation. A holistic approach that exploits both spiking neural networks and cognitive multisensory-integration models of humans could fuse proprioceptive and exteroceptive feedback with state-transition models that generalize over multiple skills. Cognitive models might also offer a promising approach to improve UX during the interaction by enhancing the capabilities of robots and enabling more human-like interaction. [32] We believe that blending UX information, learning approaches, and cognitive models has the potential to take HRI research to a new level by enabling more intuitive control and higher acceptance of assistive and collaborative robots.