Predictive Learning of Error Recovery with a Sensorized Passivity-Based Soft Anthropomorphic Hand

Additional requirements for prostheses are low cost, simplicity, and sensory feedback during use. [19,88,89

more recently preferred. [38] Even then, most of these grasp strategies rely on quasi-static assumptions. Human grasp strategies rely heavily on the passive properties of the hand. They are characterized by dynamic, interaction-based strategies that require little visual feedback. [5] These morphological principles have been applied to develop soft robotic hands that require minimal grasp planning and visual processing to solve various tasks. [39,40] Recent works have shown that these soft robotic designs can also leverage human knowledge to generate one-step grasp strategies, owing to their anthropomorphic nature. [6] With internal models and contact information, these open-loop strategies can be converted to sequential closed-loop strategies, making the controller more robust to varying objects. [41] Actions affect our cognition, so high-level decision making should not be decoupled from low-level control. [42] Error detection and recovery provides some of this coupling, which has the potential to improve manipulation capabilities. For grasp adaptation, there has been promising investigation into detecting "anomalies" during manipulations, [34,43] training error detection using sequential neural networks with visual inputs, [44] and frameworks for error recovery. [20,45] Though there is a notable lack of investigation in this area with soft hands, slip detection and prevention is one of the most common grasp adaptation strategies. [46][47][48][49][50] However, slip is just one of the many failure modes that occur after grasping, and adapting to it requires dense tactile information and high control bandwidth. For passive, soft, interaction-based designs, as in the case of passive prosthetic devices, such high sensory and control bandwidth is rarely available. Moreover, there are other modes of failure that can arise even before full closure with the object.
Recent works have focused on identifying grasp quality using dense vision-based tactile sensors on static grasps. [51,52] In this work, we present predictive grasp quality estimation in a soft passive anthropomorphic hand with sparse sensory data, as shown in Figure 1. We trained a soft, passive, anthropomorphic hand with embedded soft sensors to detect and recover from errors even before they cause failures. Interaction-based wrist control [6,53] is augmented with a form of active perception by a soft skin with 32 soft barometric sensors. Our framework demonstrates adaptive grasping capabilities while being able to reproduce multiple grasp classifications on a wide variety of objects when trained on a single object with a single trajectory. We study the grasping performance of the hand in a self-resetting environment, which allows large-scale experimentation. We observe emergent grasping failure and success modes under real and artificial perturbations. A long short-term memory (LSTM) network is used to predict failure and success in real time from few trials, using exteroceptive and proprioceptive data from our soft, modular sensors. Real-time error predictions and a heuristic error recovery routine are implemented and compared to grasping with no feedback, resulting in a 144% improvement in success rate. In addition, we demonstrate the system's generalizability when grasping and predicting errors with unknown objects and interference (Movie 1).
The main contributions of this work are 1) the anthropomorphic hand and sensor design for robust and adaptable object grasping with tactile feedback; 2) an early grasp error prediction algorithm using an LSTM network and the sparse tactile data for the described open-loop grasping strategy; and 3) a method of error recovery that exploits early error predictions and passive hand behaviors to improve grasping success rate.

Passive Hand Design
The anthropomorphic hand improves upon the design seen in ref. [6], primarily with the addition of sensing capabilities. The hand was adapted from a commercial 3D model purchased from TurboSquid (www.turbosquid.com). [7] Muscles, tendons, and ligaments were removed, leaving only the skeleton. The bones were then modified for the manufacturing of ligaments, tendon arrangement, and skin molding. Collateral ligaments stabilized each joint. [54] For all joints except the thumb carpometacarpal (CMC), the collateral ligaments formed an "S" shape over each side, [55] which allowed rolling and prevented sliding. These ligaments allowed some abduction/adduction for the metacarpophalangeal (MCP) joints. The thumb CMC, however, required a greater range of motion, [56,57] so an additional ligament was added for stability at larger deflections. Ligament mounting holes were modeled on each bone at the limits of the rolling motion. Volar plates, which prevent hyperextension, [58] were omitted from this design for reduced complexity and ease of manufacturing (Figure 2B).
Tendon pulleys were modeled as rigid loops embedded into the bones for a five-tendon arrangement. Each finger had four degrees of freedom (DoF) and therefore required a suitable arrangement of five antagonistic tendons for control. [59] The chosen arrangement was two flexor tendons, one to each of the intermediate and distal phalanges, and three extensor/abductor tendons: one to the distal phalanx while avoiding the proximal-interphalangeal (PIP) joint, and the other two connecting to the intermediate phalanx while running laterally on the MCP for abduction/adduction (Figure 2B). This arrangement approximated the flexor tendons and extensor hood of a real finger. [60,61] Mounting holes for these pulleys were directly modeled into the bones.

Demonstrated with a wrist-driven soft hand, which achieves grasping through sequential hand-environment interactions rather than any internal actuation, prediction of future errors in an open-loop grasp can be learned using exteroceptive and proprioceptive information from a barometric sensing skin. With a self-resetting environment, large-scale experiments can generate training data and evaluate error recovery performance.
A 3D scan of a hand with similar proportions to the skeleton was added, and the bones were aligned within it. By subtracting the skeleton and a cavity around each joint, [53] the scan became a model of the soft tissues of a real hand. [62] The sensing receptors were modeled within this soft tissue model. Each receptor was an "L"-shaped tube with a curve in the stem to run around the bones. Four shorter receptors (2 mm) were placed in each fingertip, where space was more constricted; these were aligned at a skin depth midway between the outer surface and the bone to avoid thin walls, and were more sensitive to tactile pressure. Four longer receptors (4 mm) were arranged around each joint and placed deeper under the surface, close to the joint cavity, and were therefore sensitive to both tactile pressure and skin deformation from joint bending. Two additional receptors were placed in the palm close to the skin surface, one between the thumb and index finger and the other beneath the thumb. This gave a total of fifty receptor locations over the palm, thumb, index, and middle finger (Figure 2).
One final modification to the skeleton was made: modeling fingernails. These served a dual purpose, one being improved grasping capabilities with fingernail pinching and the other being stabilization during the skin molding process. For the molding process, the inner mold was modeled from the skeleton without ligament and tendon modifications, with only joint cavities and fingernails. The outer mold was primarily a negative of the 3D hand scan, though individual sections for each of the thumb, index, and middle finger, containing higher-detail receptor positives, were generated to allow mold assembly.
The final hand component was the wrist-mounting hardware (Figure 2A), consisting of a hand-mounting plate with tendon routing holes, a UR5 robot arm mounting plate with tendon spring anchors, and a sensor mounting shield on the rear.

Hand Manufacturing
The first manufacturing step was 3D-printing components. Fused deposition modeling with PLA was used for the wrist-mounting hardware and the lower-detail inner and outer molds. For the more detailed parts, namely the bones and receptor molds, inkjet printing was used with a Stratasys Objet500. The bones were printed with Stratasys Durus for strength and toughness, and the receptor molds were printed with a blend of Stratasys Rigur (high stiffness) and Agilus (low stiffness) for stability during molding and large elongations before breaking during mold assembly.
For the skeleton assembly process, the bones were printed with a scaffold keeping them aligned during ligament fabrication, which was snapped off once assembly was complete. [53] Ligaments were formed from Festo 2 mm flexible tube with Shore hardness D52, cut to individual lengths for each joint, then bonded into the modeled ligament mounting holes using Araldite two-part epoxy. Flexible tubing provided a robust flexure joint with low rolling resistance and limited extensibility to reduce joint dislocation. Tendon pulleys were "U" shapes with 2 mm diameter and 0.6 mm thickness, formed of single-core, stripped copper 23 AWG wire. These were inserted in the modeled mounting holes and bonded with Araldite epoxy. To resist higher forces, pulleys for the flexor tendons penetrated the thickness of the bone and hooked onto the rear side. The final skeleton assembly step was attaching the wrist-mounting hardware and routing the tendons. Tendons were tied to anchors at predefined termination points, then routed through pulleys toward the wrist-mounting hardware. Each tendon was looped through a spring and clamped to the hand-mounting plate with a screw.

Figure 2. Receptors are air chambers embedded throughout the soft skin tissue, where density, placement, and geometry are highly customizable. Receptors are coupled to surface-mount barometric sensors through pneumatic channels. The hand is tendon-driven, with each tendon connected to reconfigurable springs with tuneable pretension and stiffness. B) Hand skeleton and tendon routing. Anatomical joints constrained by ligament pairs and 5 tendons per finger. C) Modular receptor design. Modeled air channels connect receptor air pockets to ports on the rear of the skin for interfacing with pneumatic tubing. Sensitivity to exteroceptive force or joint deformation can be controlled by receptor placement and geometry. D) Receptor distribution. 32 total receptors are placed in the thumb, index, middle finger, and palm.
Spring stiffnesses can be limited individually using hooks. With this mounting method, pretension and stiffness of each tendon can be tuned for different starting postures and interaction force behaviors. [6]

The sensorized soft skin was cast into the 3D-printed, assembled mold using Smooth-On Ecoflex 00-30 two-part silicone. After mixing, the silicone was vacuumed to remove any air bubbles, then poured into the mold and allowed to cure for four hours. The cast skin was carefully removed from the mold, then placed over the preassembled skeleton like a glove. The skin was bonded to the distal phalanx of each finger, underneath the fingernail, using Smooth-On Sil-Poxy. The skin over the other joints was left unbonded and held in place via friction. For sensor connections, silicone tubing (BS2848, Shore hardness: A40, inner diameter: 0.5 mm, outer diameter: 1.5 mm) was inserted into the chosen receptors: ten receptors on each of the thumb, index, and middle finger and two in the palm. Tubing was routed on the rear of the finger and bonded to the skin with Sil-Poxy. As small a tube as possible was preferred, to reduce any added elasticity to the finger motions, to allow for a higher density of sensor placement, and to increase sensitivity by maximizing the change in receptor chamber volume over total sensor volume.
The final step of hand assembly was the sensor readout. Pressure within each sensing receptor was transduced with NXP MPXH6300AC6U absolute pressure sensors. Analog voltages from each pressure sensor were measured with 16-bit ADS1115 analog-to-digital converters (ADCs) capable of 860 samples s⁻¹. These were I2C devices with four ADC channels and four possible addresses, hence a single I2C bus can support 16 sensor channels. Custom PCBs were manufactured for mounting and connecting the pressure sensors to the ADCs. Eight of these PCBs were mounted on the sensor shield on the wrist-mounting hardware, where sensor tubing can be safely connected (Figure 2A and S1, Supporting Information). The total of 32 available sensing channels was divided onto two independent I2C buses of an FRDM-KL25Z microcontroller. This controller configured each ADC and collated sensor readouts ready for a master device. With this configuration, the maximum sampling frequency of all channels together was 26.9 Hz, though to account for delays in communications and to stabilize timings for processing, we sampled all channels at 10 Hz.
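The channel arithmetic above (two buses, four ADC addresses per bus, four inputs per ADC) can be sketched as a simple index mapping. This is a minimal illustration only: the ordering of channels across buses and devices, and the name `locate_channel`, are assumptions and are not taken from the source. The four addresses listed are the standard ADS1115 options.

```python
# Assumed layout: 2 I2C buses x 4 ADS1115 devices x 4 inputs = 32 channels.
ADS1115_ADDRESSES = (0x48, 0x49, 0x4A, 0x4B)  # the four selectable ADS1115 addresses
CHANNELS_PER_ADC = 4
ADCS_PER_BUS = len(ADS1115_ADDRESSES)
NUM_BUSES = 2
TOTAL_CHANNELS = NUM_BUSES * ADCS_PER_BUS * CHANNELS_PER_ADC


def locate_channel(index):
    """Map a flat receptor index (0..31) to (bus, I2C address, ADC input).

    The ordering used here is hypothetical; the real wiring may differ.
    """
    if not 0 <= index < TOTAL_CHANNELS:
        raise ValueError(f"channel index {index} outside 0..{TOTAL_CHANNELS - 1}")
    bus, within_bus = divmod(index, ADCS_PER_BUS * CHANNELS_PER_ADC)
    adc, adc_input = divmod(within_bus, CHANNELS_PER_ADC)
    return bus, ADS1115_ADDRESSES[adc], adc_input
```

For example, under this assumed ordering, `locate_channel(16)` refers to the first input of the first ADC on the second bus.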

Wrist-Control Framework
Grasping with a passive hand relies on wrist-driven interactions with the environment. Wrist control is essential in grasping and manipulation, especially in more constrained environments. Edge grasps are an example of wrist control being used to overcome limitations. [5,63] Techniques for generating similar exploitation of environmental constraints are generally bespoke algorithmic solutions. [64] These approaches require significant teaching effort, which the wrist-control framework [6] attempts to solve. This is a method for training and adapting grasps based on sequential interaction classification from only a single demonstration trajectory. The following process generated template trajectories for distinct grasping strategies on familiar objects:
1) Kinaesthetic teaching was used to generate a trajectory for a particular grasping strategy on a training object.
2) The trajectory D was subsampled into k keypoints at interaction "inflection" points, resulting in a simplified trajectory R with a discrete set of interactions.
3) Each interaction was given a classification that defines how it adapts to changing object geometry.
Therefore, by inputting a familiar object with changes in its primitive geometry, this template trajectory T can allow grasping adaptation (Figure 4A).
Figure 3B shows an example trajectory with the wrist position, orientation, and sensor readout. The sensor readout enhanced key point extraction and classification, which were previously processed manually and, with the additional sensor information, had the potential to be automated. Grasping and manipulation via sequential interactions can not only improve robustness and simplify control with a soft hand [41] but can also add redundancy [6] and extend behavioral diversity. [7] For successful robotic grasping, a robot requires a minimum amount of knowledge about itself and the environment. With a passive adaptive hand, this information requirement was offset by the information gained through environmental interactions; success was therefore tolerant to significant deviations in self and environment state. [6] However, large uncertainty in the initial states can cause failures in later interactions, and this is where additional information gain can keep the robot within the range of tolerated states.

Experimental Setup
To facilitate large-scale grasping experimentation, the hand was mounted on a UR5 arm and the grasping environment was self-resetting. Figure 1 shows the robot and environment. The sphere was grasped and then released onto a plate with a shallow recess, so that it always returned to the center ready for the next trial. Since the hand was passive, releasing the object was achieved by pushing into a fixed dowel as part of the grasping trajectory. A grasp was considered a success if the object remained in the hand up until the point where it was knocked out by the dowel.
Many grasping strategies can be chosen for the same object. [6] Different strategies have strengths and weaknesses under different environmental constraints and uncertainties. The primary grasping strategy in this article is chosen such that failures are relatively common (Figure 5C) but occur with little dependence on injected noise. The trajectory was recorded using the wrist-control framework, where kinaesthetic teaching gave a six-DoF time series D sampled at 10 Hz on a central PC, which was then subsampled into seven keypoints (Figure 4B). The simplified trajectory can be replayed as a set of waypoints. Artificial noise was added in the form of a constant shift to the trajectory in the plane of the table ($r \sim U(0, 5\,\mathrm{mm})$, $\theta \sim U(-\pi, \pi)$, $R_n = R + [r\cos(\theta), r\sin(\theta), 0, 0, 0, 0]$), essentially adding uncertainty to the starting location of the object. This noise was distributed radially, with uniform probability between 0 and 5 mm distance from the object origin.
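The perturbation described above can be sketched in a few lines. This is a minimal illustration, assuming waypoints are stored as six-element lists with positions in millimeters; the function name `perturb_trajectory` and the data layout are hypothetical. Note that, following the text, the radius is drawn uniformly, so the offsets are uniform in distance rather than uniform in area.

```python
import math
import random


def perturb_trajectory(waypoints, max_radius_mm=5.0, rng=random):
    """Shift every 6-DoF waypoint by one constant in-plane offset.

    Draws r ~ U(0, max_radius_mm) and theta ~ U(-pi, pi) once, then adds
    [r cos(theta), r sin(theta), 0, 0, 0, 0] to each waypoint.
    Returns the shifted waypoints and the (dx, dy) offset that was applied.
    """
    r = rng.uniform(0.0, max_radius_mm)
    theta = rng.uniform(-math.pi, math.pi)
    dx, dy = r * math.cos(theta), r * math.sin(theta)
    shifted = [[x + dx, y + dy, z, rx, ry, rz]
               for x, y, z, rx, ry, rz in waypoints]
    return shifted, (dx, dy)
```

Because a single offset is applied to the whole trajectory, the relative geometry of the keypoints is preserved and only the assumed object location changes.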
Simultaneously with recording and replaying trajectories, sensor data S were streamed to the central PC and video was recorded from a webcam. To ensure a reliable sampling rate and data synchronization, a central PC process collated all data. The result was a synchronized AVI file and a tabulated data file (CSV) with timestamp, robot pose, and sensor readout.
$$S = [s_0, s_1, \ldots, s_{31}]_{i = [0, \ldots, n-1]}$$

Grasping experiments can now be performed in rapid succession with minimal oversight. Six hundred fifty grasping trials were performed over 5 days, under different environmental conditions such as temperature, which offsets the barometric sensor readings. Footage of each trial was manually reviewed, and the final outcome was labeled. The label corresponded to a qualitatively evaluated outcome class, e.g., failure due to missing the object or failure due to the object getting trapped (Figure 5B). This data set was used to train the network. With the network trained, sensor readout at 10 Hz was forwarded to the recurrent network, which returned a live prediction of grasping success. Trials can be run with or without error recovery enabled. The network and error recovery were tested with 250 trials (Movie S4, Supporting Information): half had error recovery disabled as a control, and the other half had it enabled.

Figure 3. Experimental setup and passive grasping/error recovery control architecture. A) System diagram for data collection. Learning experiment takes a recorded trajectory for wrist-driven grasping, adds perturbation, then records sensor data and grasp outcome. B) Example recorded trajectory and sensor time-series data through phases of reaching, interacting, and releasing the object. C) Modified system diagram for error recovery experiment, network predictions lead to error recovery actions once the prediction is "certain". D) Prediction network architecture trained on data from the learning experiment and outputting outcome predictions in real time.

www.advancedsciencenews.com www.advintellsyst.com

Figure 4. Passive hand motions and adaptive behaviors. A) Skeleton range of motion in the Kapandji thumb test. [74] Hand posture can be preset by tuning tendon springs (Figure 2A). High test score excluding ring and little finger contributions. B) Passive shape adaptation. Similar to the "finray effect", [75] forces toward the base of a finger cause negative bending. As the MCP joint is extended, flexor tendon tension is increased, causing bending in the PIP and DIP joint. C) Complex and anisotropic stiffness for diverse interaction, e.g., abduction/adduction stiffness varies with finger extension due to joint geometry. [68] Lower stiffness when extended allows the finger to deflect, higher stiffness when flexed provides higher passive grasping force.

Network Architecture and Training
The network for predicting failure was designed to perform a sequence-to-one regression. The architecture of the deep LSTM network is shown in Figure 3D. The network was made of two layers of LSTM units with sizes of 60 and 15, with a dropout unit in between. The first LSTM layer performed a "sequence-to-sequence" transformation and the second LSTM layer performed a "sequence-to-one" transformation. The dropout unit had a dropout probability of 0.2. The network creation and training were done in the MATLAB Deep Learning Toolbox.
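To make the data flow through the two LSTM layers concrete, the following is a minimal pure-Python forward pass with random, untrained weights. It is a sketch and not the authors' MATLAB implementation: the 32-channel input width, the final linear read-out to a scalar, the weight initialization, and all names (`LSTMLayer`, `predict_outcome`) are assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMLayer:
    """Minimal LSTM layer with random (untrained) weights, for illustration only."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One row per hidden unit and gate: [input weights | recurrent weights | bias].
        self.w = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(n_in + n_hidden + 1)]
                   for _ in range(n_hidden)]
            for gate in ("i", "f", "g", "o")
        }

    def run(self, sequence):
        """Return the hidden state at every time step (sequence-to-sequence)."""
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        outputs = []
        for x in sequence:
            z = list(x) + h + [1.0]  # input, previous hidden state, bias term
            pre = {gate: [sum(w * v for w, v in zip(row, z)) for row in rows]
                   for gate, rows in self.w.items()}
            i_gate = [sigmoid(a) for a in pre["i"]]
            f_gate = [sigmoid(a) for a in pre["f"]]
            g_cand = [math.tanh(a) for a in pre["g"]]
            o_gate = [sigmoid(a) for a in pre["o"]]
            c = [f * ck + i * g for f, ck, i, g in zip(f_gate, c, i_gate, g_cand)]
            h = [o * math.tanh(ck) for o, ck in zip(o_gate, c)]
            outputs.append(h)
        return outputs


def predict_outcome(sequence, layer1, layer2, w_out, b_out):
    """Sequence-to-one regression: 1 is taken to mean failure, 0 success."""
    hidden_seq = layer1.run(sequence)         # first layer: sequence-to-sequence
    # Dropout (p = 0.2) is active only during training, so it is the identity here.
    last_hidden = layer2.run(hidden_seq)[-1]  # second layer: keep only the final state
    return sum(w * h for w, h in zip(w_out, last_hidden)) + b_out
```

A usage sketch under the same assumptions: build `LSTMLayer(32, 60, rng)` and `LSTMLayer(60, 15, rng)`, then call `predict_outcome` on a list of 32-element sensor vectors. Re-running the call on a growing prefix of the sensor stream mimics the live prediction described in the text.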
Six hundred and fifty sequences were collected for training. Each sequence had a corresponding scalar outcome variable. Here, we assigned a value of 1 for failure and 0 for success, in order to convert the learning into a regression problem. The data set was augmented to a larger set, eighty times its size, by clipping the sequences randomly within a range of 5-15 s. Noise that decreased exponentially with time was also added to the sensor data. This data augmentation forced the network to predict the grasp outcome as early as possible without overfitting to the small data set. Six hundred and ten samples were used for learning and the rest for testing.
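One possible reading of this augmentation is sketched below. Several details are assumptions rather than facts from the source: each clip is taken to keep the start of the trial and end at a random time between 5 and 15 s, the noise is taken to be Gaussian, and the values of `noise_std` and `decay_s` are placeholders, since no magnitudes are given.

```python
import math
import random

SAMPLE_RATE_HZ = 10  # sensor sampling rate reported in the text


def augment(sequence, label, copies=80, min_s=5.0, max_s=15.0,
            noise_std=0.01, decay_s=2.0, rng=random):
    """Return `copies` randomly clipped, noise-injected variants of one trial.

    sequence: list of per-time-step sensor vectors sampled at SAMPLE_RATE_HZ.
    noise_std and decay_s are hypothetical; the noise amplitude decays as
    exp(-t / decay_s), so early samples are perturbed the most.
    """
    variants = []
    for _ in range(copies):
        end_s = rng.uniform(min_s, max_s)
        n = min(len(sequence), max(1, int(round(end_s * SAMPLE_RATE_HZ))))
        clipped = []
        for k, sample in enumerate(sequence[:n]):
            sigma = noise_std * math.exp(-(k / SAMPLE_RATE_HZ) / decay_s)
            clipped.append([v + rng.gauss(0.0, sigma) for v in sample])
        variants.append((clipped, label))  # every clip keeps the trial's label
    return variants
```

Because every clipped copy carries the final outcome label of the full trial, short clips push the network toward committing to a prediction from early sensor data.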

Error Recovery
The error recovery process allowed us to test the real-world implications of error detection. Figure 3C shows the modified wrist-control framework for error recovery. During an activity, the sensor readout was fed to the network, which generated live outcome predictions. When the network is certain of failure, the output regresses to a steady "1"; if the network is certain of success, the output regresses to a steady "0". If there is uncertainty, the value may oscillate. The "certain" states generally occur late in the trajectory, often after the failure or the potential for failure has passed.
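The three output regimes described above (steady near 1, steady near 0, or oscillating) can be illustrated with a simple windowed check. This is a hypothetical sketch and not the analysis used in this work; the window length and tolerance are placeholder tuning values.

```python
def classify_prediction(outputs, window=10, tolerance=0.1):
    """Label the most recent network outputs as 'failure', 'success', or 'uncertain'.

    outputs: live regression outputs, where 1 means failure and 0 means success.
    window and tolerance are hypothetical tuning values, not from the source.
    """
    recent = outputs[-window:]
    if len(recent) < window:
        return "uncertain"  # not enough history yet
    if all(abs(z - 1.0) <= tolerance for z in recent):
        return "failure"
    if all(abs(z) <= tolerance for z in recent):
        return "success"
    return "uncertain"  # drifting or oscillating output
```

At the 10 Hz sampling rate, a window of 10 samples corresponds to one second of steady output, which shows why waiting for full certainty delays any intervention.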
To perform an error recovery process, a decision had to be made whether to act or not during the trajectory. In order to obtain a prediction as soon as possible, analysis was performed to extract a prediction to act upon when network certainty was lower. The network output Z was passed through a first-order

For the error recovery process, a heuristic strategy was chosen since, from the training data, one mode of failure dominated, with ~50% of total outcomes; the recovery strategy therefore targeted this mode. In this failure mode, "F1" (Figure 5B), the sphere was trapped under the index finger and ejected before the grasp could be modulated by thumb contact. A simple hypothesis for recovery is to relieve pressure and bring the thumb into contact with the sphere before the elastic force within the index finger builds up too high. This can be achieved with a translation (away from the camera in Figure 6A). If this intuitive recovery technique is successful, then error detection is demonstrated to have great potential in robust grasping and manipulation when used in conjunction with human intuition or with more advanced recovery processes that can account for different types of failure, or even when used as part of a training process itself.

Statistical Analysis
Statistical analysis was performed on the results to test the significance of the findings. A chi-squared test was performed on the outcomes of the error recovery trials compared to the control trials (Figure 6B). The significance of the change in rates of the success and failure modes was tested at a significance level of 0.00001 with a one-tailed p-value.
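For readers unfamiliar with the test, a chi-squared comparison of two outcome rates can be sketched as follows for a 2x2 table (control versus recovery, success versus failure). This is a generic Pearson test without continuity correction, offered as an illustration; it may differ from the exact procedure and table layout used in this work, and the function name is hypothetical.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) on [[a, b], [c, d]].

    Rows could be control and recovery trials, columns success and failure.
    Returns (statistic, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / denom
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

A call such as `chi_squared_2x2(40, 60, 70, 30)` uses made-up placeholder counts, not results from this study; the returned p-value would then be compared against the chosen significance level.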

The Soft Anthropomorphic Hand
Embedding mechanical intelligence within a robotic hand increases behavioral diversity through mechanical complexity and redundancy. [7,65,66] In addition, intelligent behaviors are readily exploited by cheap control, [67] especially in common repetitive tasks such as grasping. [68] This is key for developing intelligent and general-purpose hands for use in industrial, social, and prosthetic devices. [69,70] The human hand provides the ideal starting point for the design of such intelligent systems. Not only is it highly generalized and adaptable, but much of the modern world is designed around the specific ergonomics of human-hand interactions. [4] For this investigation into passive adaptive grasping and error prediction, we have developed a sensorized soft hand with an anatomically accurate structure. The replica synovial joints (Figure 2) ground the hand design in a biological basis. [6,71,72] These are the most common and most mobile type of joint seen in mammals; however, they are exceedingly rare in robotics, in part due to the inherent instability of floating bones versus pin or socket joints. While synovial joints may have arisen due to biological constraints, they demonstrate significant advantages in terms of efficiency, customizability, and resilience. [73] Figure 2B shows the underlying skeletal structure of our passive hand. This design allows significant ranges of motion, including thumb abduction, and resilience to impacts and dislocation, while retaining controllability and passive-adaptive behaviors (Figure S3, Supporting Information, adaptive grasping).
The sensorized skin receptors act according to the ideal gas law. Air chambers located within the soft skin are deformed under tactile loads or joint bending (Figure 2C). The change in volume corresponds to a change in pressure measured by barometric sensors mounted on the wrist (Figure 2A). The skin has a resolution of 0.69 mN with a 16-bit ADC, a force range of 23-5800 mN, and a response time of up to 1 ms (Table S1, Supporting Information).
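As a rough illustration of this principle, assume an isothermal process with a fixed amount of trapped air (assumptions not stated in the source). Boyle's law then relates a volume reduction $\Delta V$ of a sealed receptor to the pressure rise $\Delta p$ read by the sensor. Here $p_0$ and $V_0$ are hypothetical symbols for the resting pressure and the total enclosed volume (chamber, channel, tubing, and sensor port):

$$p_0 V_0 = (p_0 + \Delta p)(V_0 - \Delta V) \quad\Rightarrow\quad \Delta p = p_0 \frac{\Delta V}{V_0 - \Delta V} \approx p_0 \frac{\Delta V}{V_0} \quad (\Delta V \ll V_0)$$

Under this reading, a smaller enclosed volume $V_0$ gives a larger pressure change for the same deformation, which is consistent with the stated preference for tubing that is as small as possible.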

Passivity and Anthropomorphic Design Provide One-Step Stable Adaptive Grasps
A previous study demonstrated adaptive grasping with the skeletal hand under a framework for passive grasping with sequential interactions through control of the wrist. [6] The skeletal hand is able to grasp spheres with diameters of 25-75 mm from a single demonstration with an overall success rate of 57.7%, with the wrist adaptations contributing to an 86% increase in success rate over simple open-loop control. Figure 4 outlines some of the capabilities of our passive system. For use in a diversity of tasks, a high degree of posture control is desired. Using the Kapandji thumb test as an evaluation of thumb range of motion, [74] Figure 4A shows successful posture setting to extreme positions in the test (some positions untestable). Additionally, intelligent behaviors in the hand can be exploited for more robust grasping or for enabling new interactions. Figure 4B shows the curling behavior of the index finger. Due to the tendon layout and joint design, when pressing a finger toward the base, the tip deflects toward the force. This negative bending, similar to the finray effect, [75] enhances passive shape adaptation. In Figure 4C, posture-dependent stiffness is demonstrated. Due to bone geometry at joint interfaces, abduction/adduction in the flexed position is restricted relative to the extended position, similar to a human hand. [56] This can be exploited during grasping: lower stiffness is useful when forming a grasp, whereas higher stiffness is preferred when holding.
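If the 86% figure is read as a relative increase (an interpretation, since a difference in percentage points would exceed the reported success rate), the implied open-loop success rate follows from simple arithmetic. The symbols $s_{\text{open}}$ and $s_{\text{adapted}}$ are introduced here for illustration, and the resulting value is derived rather than reported in the source:

$$s_{\text{adapted}} = (1 + 0.86)\, s_{\text{open}} \quad\Rightarrow\quad s_{\text{open}} = \frac{57.7\%}{1.86} \approx 31.0\%$$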
To evaluate adaptive grasping augmented by the soft sensorized skin, learned trajectories can be tested on everyday objects of similar size to the training object used during kinaesthetic teaching. The hand, in a partially closed position, is driven through a series of interactions with a 60 mm sphere (Movie S1, Supporting Information). This trajectory has been simplified by interaction-based key point extraction. [6] One limitation of the wrist-driven framework is the subjective manual trajectory processing; the presence of sensory information can enable automation and objectification of manual steps such as keypoint extraction and interaction classification. [6] Table S2, Supporting Information shows the grasping success rates of each test object. Three of fourteen objects have no successes, being either too heavy and small (battery), too small and deformable (grapes), or too large and concave (spool). These grasps may not be possible with this particular hand starting posture and trajectory, as grasping force in this case depends on elastic forces from index finger and thumb deformations. The remaining objects have variable success rates. Note that no information about the object and its pose is provided to the controller, showcasing the adaptability of the hand design. Success rates can be improved by visual feedback, wrist adaptations, and error predictions, especially as some of the worst-performing objects are orientation dependent, unlike the training object. The wooden block, ground coffee, and computer mouse geometries are all anisotropic, and successes are only seen in favorable starting orientations; this is a problem that can be solved at the grasp planning stage through vision or exploration of the object with tactile feedback. Notable objects that showcase the passive hand's capabilities include the highly deformable bubble wrap, which has a high success rate when aligned with the hand; the smaller sphere, with a 100% success rate; the ground coffee bag, which is much heavier (550%) than the teaching spheres; and the wooden block, whose lack of smooth, curved surfaces does not prevent successful grasping in all cases (grasping success: Movie S2, Supporting Information; grasping failure: Movie S3, Supporting Information).

Figure 6. Error recovery experiment results. A) Heuristic error recovery procedure. Prediction is monitored; as soon as error is "certain," the recovery intervention is implemented. The routine corrects for the most common error (F1: Figure 5), relieving the trapped object with a shift, primarily in the x direction. B) Change in outcomes when implementing error recovery: successes increase and failure mode 1 decreases significantly (to 0.001%). Other failures are not significantly affected. C) Real grasping outcomes broken down by time of prediction, whether success was predicted (upper) or not (lower), and intervention disabled (left) as a control or enabled (right). Change in outcome distribution when failure is predicted and intervention enabled (lower right) demonstrates the recovery; early decisions are critical for successful recovery.

Recurrent Neural Networks Can Predict Grasping Outcome Before Lift-Off
The trajectory in Figure 4B is used for the majority of experiments. We expect the hand to be able to adapt and grasp from different starting states, such as under object position uncertainty. This recorded grasp policy is perturbed to generate training data and can be used for large-scale grasping experimentation (Figure 4A,C).
As the grasp policies are open-loop trajectories, the sensor data history from the onset of contact is indicative of future outcomes. We use an LSTM network (Figure 3D) to predict the grasp outcome in real time, assuming future actions are fixed. Even though the labeling of the data set for training can only be done after the end of each grasp episode (Figure 3A), our training framework drives the network to predict grasp outcomes as early as possible (see Materials and Methods). The speed of detection and its certainty depend on the nature of the failure. An example of the network prediction for different kinds of grasp outcomes is shown in Figure 3B.
The prediction accuracy with respect to the length of sensory data on the test set provides an insight into the highest accuracy achievable with the current setup and training data (Figure 5A (lower)). Given the whole sequence of sensory data after liftoff (time > 10 s), the learned network can predict the final outcome with very high accuracy. Note that this is not a trivial problem because the sensors are significantly affected by nonlinearities like hysteresis and drift in the material and by the temperature surrounding the skin. The prediction accuracy reduces as the length of the sensory data is reduced, as expected. However, most of the failure cases are still predicted early and with higher confidence, which is vital for error recovery and robustness. Figure 5A (middle) shows the average predictions for the test set, labeled with the actual outcome, where it can be observed that failure predictions are more accurate and robust. Even though convergence of the predictions happens at a later stage, by observing the trends of the prediction we can make reasonable predictions of the outcome early enough for them to be used for error recovery with low-bandwidth controllers.

Large-Scale Grasping Experimentation Reveals Multiple Failure Modes with Simulated Noise
The passive anthropomorphic hand has a vast number of possible grasps. Not only are there multiple grasp types possible, [76,77] but there are also multiple possible trajectories to achieve the same grasp type. Limiting the hand to a single starting posture (Figure 2A) narrows down the grasping possibilities; however, there is still a multitude to choose from.
An exploration of grasping trajectories is performed, with the criterion that the chosen trajectory shows a significant proportion of failures, so that less training and test data are needed to observe failures and recovery. Figure 3B shows this trajectory. Figure 5C and Table S3, Supporting Information show the results from a trial of 200 grasps; the first 100 trials follow the same trajectory without artificially added perturbation, and the last 100 have a random perturbation applied in the plane of the table to simulate uncertainty in grasping location. The first 100 trials have a success rate of 69%, which drops to 51% with perturbation. Failures are distributed over four distinct modes (Figure 5B and S5, Supporting Information left), with the first failure mode being the most common.
During exploration, a second trajectory of note was found. This trajectory (Figure S5, Supporting Information right) demonstrated a much higher success rate than the trajectory used for the remaining experiments (Figure S5, Supporting Information left). Table S4, Supporting Information shows the outcomes of 200 trials, 100 with added noise. With no noise, this trajectory achieved a 100% success rate, signaling that it is robust to uncertainties in the starting hand posture and that trajectories can be optimized for particular environments (Movie S5, Supporting Information). With an appropriate trajectory, the passive hand framework can achieve highly robust grasp performance under slight environmental uncertainties. As we are interested in error prediction and recovery, this grasp trajectory is not investigated in detail. The trajectory did see a significant drop in success rate in the presence of added noise, down to 67%. Of the 67 successes, three were observed as distinct new grasps, where the teaching sphere ended in a power grasp against the palm, rather than a pinch between the thumb and index finger.
This gives a passive boost in performance which can be exploited in more novel and challenging tasks. In addition, this demonstrates the behavioral diversity enabled by passive design which can be exploited by minor changes in control (as this mode is observed to be starting position dependent, Figure S4A, Supporting Information right).

Error Recovery is Possible via Heuristics-Based Wrist Trajectory Adaptation
Passive grasping relies upon successful completion of a series of interactions. There is tolerance in these interactions. Passivity in design aims to increase these tolerances, meaning less information is required about the self and environment (Figure 4). In this way, the task of information gathering is exported from perception systems to the physical dynamical system. This means that for many simple or familiar tasks, a much smaller burden is placed on the perception system, though for more complex or unfamiliar tasks, perception is still required. The addition of a perception system can allow reaction to surprise during grasping. [43,78] This way, expensive attention is required less frequently. [51] If this reaction comes too late, the grasp may have already failed, generally requiring an expensive rerun of the grasp. [79] Therefore, if potential failures can be detected early, they can be accounted for without either expensive attention or regrasping (Movie S6, Supporting Information).
From Figure 5C, we see that failure mode one is the most common. In this mode, the teaching sphere is trapped between the index finger and the table, rather than in a more stable contact with the thumb (Figure 5B). This leads to a build-up of pressure, which when released knocks the sphere out of the grip. A heuristic recovery method is introduced, which intervenes with a shift to the trajectory when the failure prediction is "certain". This shift relieves pressure by rolling the sphere from under the finger and bringing it into contact with the thumb. Figure 6A illustrates the error recovery process. Of note, the recovery can intervene at any moment in time, so it can react immediately when the failure prediction exceeds a threshold (Figure 3C) (Movie S7, Supporting Information).
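The threshold-triggered intervention described above can be sketched as follows. This is a minimal illustration rather than the controller used in the experiments: the function names, the failure threshold of 0.8, the example prediction trace, and the size of the planar shift are all assumptions made here, and outputs near 1 are read as predicted failure.

```python
from typing import List, Optional, Sequence, Tuple

FAILURE_THRESHOLD = 0.8  # assumed value; the failure threshold is not stated here


def first_intervention_step(
    predictions: Sequence[float],
    threshold: float = FAILURE_THRESHOLD,
) -> Optional[int]:
    """Return the index of the first prediction above the threshold, else None.

    Outputs near 1 are read as predicted failure, outputs near 0 as success.
    """
    for step, score in enumerate(predictions):
        if score > threshold:
            return step
    return None


def apply_recovery(
    trajectory: Sequence[Tuple[float, float]],
    step: int,
    shift: Tuple[float, float],
) -> List[Tuple[float, float]]:
    """Add a fixed planar shift to every waypoint from `step` onward."""
    dx, dy = shift
    return [
        (x + dx, y + dy) if i >= step else (x, y)
        for i, (x, y) in enumerate(trajectory)
    ]


# Hypothetical network outputs and planar wrist waypoints (metres).
trace = [0.45, 0.55, 0.70, 0.85, 0.90]
waypoints = [(0.00, 0.0), (0.01, 0.0), (0.02, 0.0), (0.03, 0.0), (0.04, 0.0)]

step = first_intervention_step(trace)
if step is not None:
    # Shift the remainder of the open-loop trajectory once failure is "certain".
    waypoints = apply_recovery(waypoints, step, shift=(0.0, 0.01))
```

Because the check runs on every new prediction, the shift is applied at the first step where the threshold is crossed, which mirrors the point that the recovery can intervene at any moment in the trajectory.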
The heuristic recovery intervention only operates as intended for a single failure mode, though it acts as a proof of concept for improving manipulation robustness with error prediction. With a cheap control technique and exploitation of the anthropomorphic hand's passive dynamics, intuitive and natural skills can be observed.
With both the error predictions and recovery functioning, an experiment is run to evaluate the combined performance. Two hundred fifty grasping trials are run, alternating between open-loop control as a controlled baseline and control with intervention enabled (Figure 3C). Table S5, Supporting Information shows all real outcomes and predictions for these trials. Observing the baseline results, the true positive rate of successes is 36% and for failures is 95%. Figure 6B shows the change in success and failure modes between the baseline and trials with intervention enabled. With intervention enabled, the number of successful outcomes increases from 25 to 60 and the number of failures decreases from 72 to 39. As the distribution of predictions is approximately equal between the two cases, this suggests the intervention is impactful (with statistical significance p < 0.00001). The remaining failure modes are grouped and do not change significantly, which is expected given the targeted nature of the error recovery.
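As a rough plausibility check of the reported significance, the change in success counts can be run through a pooled two-proportion z-test. The test used for the reported p-value is not stated in this section, so the choice of test is an assumption. The split of the 250 trials into 126 baseline and 124 intervention trials is also an assumption, chosen because it reproduces the 19.8% and 48.4% success rates quoted in the Discussion.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided pooled two-proportion z-test; returns (z, p_value)."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# 25 and 60 successes are taken from the text; the 126/124 split of the
# 250 trials is an assumption (see the lead-in).
z, p = two_proportion_z_test(25, 126, 60, 124)
print(f"z = {z:.2f}, p = {p:.1e}")
```

Under these assumptions the p-value falls below 0.00001, which is consistent with the significance level reported above.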
Average predictions over time show patterns in the different outcomes. Figure S6, Supporting Information shows the mean and standard deviation of the predictions for the baseline trials (left), split by whether a success (top) or failure (bottom) outcome was chosen. Incorrect decisions happen early on in the trajectories, when the predictions are less distinguishable. The more common error is a failure outcome that looks similar to a successful outcome before 8 s. By waiting to make a decision, more information can be gathered, and higher accuracy can be achieved (with an upper limit from the regression accuracy, Figure 5A). However, this results in a delayed reaction to recover from errors. There is high prediction variance within the different outcome modes, especially in failure mode two. This is partly because of our coarse classification of outcomes. Within each success and failure mode, there is some variance in the point of catastrophic failure, especially in failure mode two, in which the sphere is dropped at any time after lifting and before being reset. Therefore, significant variance in sensor information is expected, potentially causing overlaps in sensory information, which can be resolved by gathering more information (either from further hand-environment interactions or with placement of additional sensing receptors).
Observing the decision times for the different predictions and real outcomes confirms the effect of the recovery intervention (Figure 6C). First, the shapes of the total cumulative prediction counts are near identical between the baseline and intervention modes. Second, for the success predictions, where no recovery happens in either the baseline or the intervention case, the distributions of real outcomes are on average within 13% of each other. Therefore, the difference in real outcomes when failure is predicted is evident.
There are patterns in both the success predictions and failure predictions based on the time the decisions are made. For the cases where failure outcomes were misidentified as successes (Figure 6C top, modes F1 and F2), 87% are decided before 9 s into the trajectory. Failure prediction accuracy increases if failures can be identified early (before 8.0 s) or later (after 9.5 s) (Figure 6C bottom left). The first peak is critical for the recovery routine. Among the predicted failures in the intervention case, the successful outcomes significantly increase over the baseline case. However, after 8.1 s into the trajectory, there are no additional success outcomes. The failure decision has to come before this time for a chance of recovery. At 7.8 s, 35 failures are identified, of which 34 are successful outcomes (assuming 3 misclassified successful outcomes from the baseline failure predictions), giving a recovery success rate of ≈97%. At 8.4 s, the recovery success rate drops to ≈87%, and at 9.5 s it drops to ≈66%.

Grasp Predictions are also Generalizable
While the error predictions are only trained on the grasping of spheres with position uncertainty, grasping success is potentially indicated by underlying patterns in the sensor readout which are not unique to spheres. Patterns such as net forces, enclosure of the object, or stability of each contact (e.g., lack of slip) have the potential to be learned and transferred by the network to wider situations. Table S2, Supporting Information shows the prediction accuracy on the set of everyday objects. The accuracy compares the decision point of the prediction to the real outcome. The most accurate predictions are seen with the 50 mm sphere (100%), the battery (90%), and the spool (80%). For the sphere, the grasp is near identical to the grasp of the training object (60 mm sphere); the battery and spool both have no successes and fail early in the trajectory, giving more information for the predictions. The lowest accuracy is seen with the 70 mm sphere (50%), bubble wrap (20%), and bottle (20%). The 70 mm sphere grasp is successful only in a different grasping mode which utilizes the middle finger; hence, the predictions are less accurate for the successes. Additionally, both the bubble wrap and bottle succeed with slightly different emergent grasps (Movie S2, Supporting Information). This suggests the network is overfitting on the success outcomes, which is not unexpected given the lower diversity of grasping successes in the training data compared with the diversity of failures.
The predictions demonstrate some intuition for more general cases. This intuition is observed in predictions for familiar objects, in addition to predictions with unfamiliar environmental interference. For example, when the object is knocked out of the grasp during lift, the prediction rapidly updates to a failure (Figure S7 and Movie S8, Supporting Information). This exact interference case is not present in the training data, though failure mode two can look similar. Other live prediction updates back up the intuition and generalized performance, particularly the ability of the prediction to update after the heuristic error recovery. The final prediction of the network (at 15 s) is correct in 100% (50 of 50) of the baseline cases where failure was correctly predicted and in 100% (21 of 21) of the intervention cases where failure was predicted but not recovered. For cases where the intervention succeeded, the prediction correctly updated to success 97% (35 of 36) of the time. The failure predictions were made on average 7.403 s into these trajectories; by 8.2 s, the average prediction became uncertain (regression output 0.5), and by 8.5 s, the prediction average (0.143) passed the threshold for a success (<0.2). Therefore, there is potential for closed-loop error recovery and self-supervised learning of recovery behaviors.
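The way the regression output moves from failure through uncertainty to success after a recovery can be illustrated with a small labelling sketch. The success threshold (<0.2) and the 0.5 and 0.143 values follow the averages quoted above; the mirrored failure threshold of 0.8, the 0.9 starting value, and the function names are assumptions introduced only for illustration.

```python
SUCCESS_THRESHOLD = 0.2  # from the text: outputs below 0.2 count as success
FAILURE_THRESHOLD = 0.8  # assumed, mirrored around 0.5; not given in the text


def label(score):
    """Map a regression output to a coarse outcome label."""
    if score < SUCCESS_THRESHOLD:
        return "success"
    if score > FAILURE_THRESHOLD:
        return "failure"
    return "uncertain"


def label_changes(times, scores):
    """Return (time, label) pairs for each time the label changes."""
    changes = []
    previous = None
    for t, s in zip(times, scores):
        current = label(s)
        if current != previous:
            changes.append((t, current))
            previous = current
    return changes


# Times and the 0.5 / 0.143 values follow the averages quoted in the text;
# 0.9 is a placeholder for a confident failure prediction.
times = [7.4, 8.2, 8.5]
scores = [0.9, 0.5, 0.143]
changes = label_changes(times, scores)
```

With these inputs, `changes` records a failure label at 7.4 s, an uncertain label at 8.2 s, and a success label at 8.5 s, matching the sequence described for recovered grasps.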

Discussion
Performance of robotic manipulators in general-purpose tasks is lacking when compared to human manipulation. Two significant problems we see in robotic manipulation are the development of appropriate designs that can exhibit adaptive dynamic behaviors and the development of control strategies for adaptation and learning in novel and niche situations. The passive dynamics of the hand have been shown to be essential for generating diverse adaptive behaviors through interactions, [5,7] ensuring low control effort through strategies including passive shape adaptation [9] and stiffness control. [7] In passive soft grasping, emergent behaviors allow robustness to uncertainties in object and environment properties, though these behaviors can result in complex sensory responses which make perception difficult. The described soft anthropomorphic hand is the ideal platform for testing passive dynamic behaviors and investigating sensing strategies for intuitive manipulation. In total, 1240 grasping trials were performed (650 training, 250 heuristic recovery testing, 140 everyday object trials, 200 alternate trajectory grasps), providing diverse manipulation data.
The passive anthropomorphic hand demonstrates significant adaptive grasping behaviors. Trained in one shot on a regular sphere, the hand is able to successfully grasp 11 of 14 everyday objects chosen at random with similar size to the sphere. These objects include highly irregular objects such as a computer mouse and highly deformable objects such as a roll of bubble wrap. The soft sensing skin demonstrates a low-cost, highly sensitive, and modular design (sensor response is customizable and receptors can be placed in any geometry on the hand). These sensors provide tactile and proprioceptive information, though the current solution has limited receptor density, and scalability is challenging with individual barometric pressure gauges for each sensing channel. The sparse data are informative for studying the hand-environment interactions: Figure 3B shows sensor channel peaks corresponding to the extracted keypoints, which mark changing interactions. [6] The anatomically correct design approach coupled with the soft sensing capabilities allows intuitive training by nonexpert users.
Our error prediction approach is able to overcome traditional soft sensing problems including nonlinearities, drift, and temperature dependence by using recurrent learning architectures. Low-level, high-bandwidth control is performed in a purely mechanical manner by the passive-adaptive behaviors (Figure 4) and emergent grasps (Figure S3, Supporting Information). Hence, sensory feedback is only required for higher level monitoring and predictive reactions. When incentivizing early predictions, prediction accuracy can be as high as 79% with a data length of 80 (8 s), improving to 98% with a data length of 150. By extracting the trends of the prediction, we are able to make a decision about the outcome in advance of catastrophic grasp failure (a transition from more than one stable contact to one or fewer, e.g., the object is ejected from the grasp), therefore allowing us to correct for any error and recover the grasp. Implementation of a heuristic recovery routine targeted at the most common failure mode improves the success rate from 19.8% to 48.4%. Of the 35 earliest failure predictions, 34 were successfully recovered. The recovery strategy is a simple translation added based on intuition, showing that this predictive approach requires only simple interventions, which is promising for scalability and for developing recovery routines for other manipulations and failure modes, potentially through self-supervised learning. [80,81] To go beyond just improving grasping robustness and adaptation, we hope to see the emergence of intuitive behaviors. The first of these is being able to recognize grasping failures on unfamiliar objects. When only trained on grasping of a regular sphere, the network is able to detect failures with 68% accuracy (successes with 41%). This is promising and highly likely to be improved with a more diverse training set. The second is being able to react to external sources of interference.
Even when the object is knocked out of the hand post-lift, the network output reacts within 0.2 s to update the prediction and smaller sources of interference. Additionally, by being similar to a human hand, the achievable grasp types are well understood, [76,77,82] the hand matches ergonomically with the environment, and human input during teaching is intuitive due to familiarity. [83] This has great implications for making robust and intuitive manipulation systems, for example, in flexible industrial or logistic robotic applications where the environment can constantly change but highly robust and adaptable systems are required, or for prostheses, where the anthropomorphic design is highly desirable, both for esthetics [84,85] and for the ease of exploitation. [86,87]