A Braitenberg Vehicle Based on Memristive Neuromorphic Circuits

The Braitenberg vehicle, a simple conceptual model that characterizes the response behaviors of animals or insects under a stimulus, is widely used to develop autonomous vehicles able to adapt to varying environments. Considerable effort has been devoted to building neuromorphic processors with the software in the vehicles; however, there has been no demonstration of a Braitenberg vehicle with neuromorphic hardware so far. Herein, a Braitenberg vehicle with simple memristive neuromorphic circuits is built for the first time. This vehicle exhibits adaptive behaviors in the supervised learning process and is eventually trained to perform a path-tracking task. Moreover, the memristive circuit in the vehicle demonstrates a very short response latency (≈56 ns) to input sensory information. This work offers a promising alternative route to building self-adaptive robots and paves the way for the realization of autonomous robots based on memristive neuromorphic circuits.


Introduction
Self-adaptive behaviors are of crucial importance for living organisms to survive in varying environments; [1] for example, they help animals or insects avoid obstacles. To understand these behaviors in a simple way, Valentino Braitenberg conducted a thought experiment. The results show that vehicles with different connections between front sensors and rear wheels exhibit entirely different behaviors [2] under the same stimulus. This famous thought experiment has inspired the development of autonomous robots that are capable of obstacle avoidance, path tracking, and navigation. [3][4][5][6] Despite much success in achieving autonomous robots based on this vehicle model, the processors of these smart robots are still based on von Neumann architectures. [7,8] So far, the well-known Braitenberg vehicle model has never been implemented on neuromorphic hardware.
Neuromorphic architecture allows for implementation of dynamic neural control through physical computation. [9][10][11][12] On this architecture, information is transmitted and processed in the analog domain, enabling low power consumption and low latency. Previous works have shown that various neural networks can be mapped onto memristive crossbar arrays for self-adaptive learning. [13][14][15][16] For the memristive architecture, the nonvolatile property and processing-in-memory improve the energy efficiency, [13] with the nonvolatility arising from ion migration. [17,18] The advantage of memristive architectures is not limited to energy efficiency. Avoiding the conversion between analog and digital signals considerably reduces the time delay in information transmission and processing, [19] and the memristive crossbar array additionally processes information in parallel. This is important for real-time interactive applications such as autonomous vehicles, where sensory information must be processed and then sent to the motor controllers with minimal delay. [20,21] Considering the aforementioned advantages, it is interesting to build the Braitenberg vehicle with memristive neuromorphic circuits. Here, we demonstrate for the first time a Braitenberg vehicle based on simple memristive neuromorphic circuits. The vehicle has been built with two grayscale sensors, a memristive neuromorphic circuit, and motors. With a supervised learning rule, the vehicle can be trained to track a path. Furthermore, a very short response latency (about 56 ns) is achievable for the memristive circuit of the vehicle. This work may open up a new avenue for the development of memristive crossbar array-based autonomous robotics with excellent adaptive capability to varying environments.

Braitenberg Vehicle
For insects such as ants, obstacle avoidance is one of the self-adaptive capabilities they rely on in their living environments (see Figure 1a). This behavior involves sensory receptors, a neural network, and effectors. Receptors (e.g., the eyes of ants) act as afferent neurons and respond to environmental stimuli. The sensed signals from the receptors are transmitted through afferent nerve fibers to the special brain compartment where neurons are interconnected in a complex way. [22] After processing the information, the ant brain sends an instruction to the locomotor system to generate obstacle avoidance behavior (indicated by the red dashed line in Figure 1a). This behavior is determined by the neural network of the ant brain. The variation of the connections between neurons (i.e., synapses) corresponds to the learning process. It is well known that memristive devices can emulate synaptic plasticity. [15,23,24,25] We have fabricated memristive crossbar arrays (see Figure 1b). The switching layer (Ta/Ta₂O₅) in the memristive devices (see Figure 1c) is responsible for the emulation of synaptic plasticity. The fabrication details are provided in the Experimental Section. We characterized the conductance variation of the fabricated devices by applying electrical pulses and present the results in Figure 1d. The conductance increases or decreases under consecutive electrical pulses, which emulates the plasticity of biosynapses. Using the memristive crossbar array as a single-layer artificial neural network, we are able to build a path-tracking vehicle (see Figure 1e). The key parts of this vehicle are two grayscale sensors, a bioinspired processor, and a motor system. The sensors, symmetrically installed at the front of the chassis, are used to detect grayscale changes of the pathway.
The bioinspired processor responsible for recognizing special patterns and generating motor signals is a memristive neuromorphic circuit integrated on a printed circuit board. The 2 × 2 memristive crossbar array implemented in this circuit serves as the interconnected neural synapses. A steering engine included in the motor system generates different turning behaviors in response to the motor signals. In the sensorimotor loop of the designed system, the real-time signals input through the sensors are relayed to the single-layer neural network mapped onto the memristive crossbar array. The signals are then processed through a series of linear computation operations including multiplication and summation. The analog motor signals from the processor are modulated into a pulse-width modulation signal to control the vehicle behaviors. Figure 1f shows the top view of the bioinspired vehicle based on memristive neuromorphic circuits. Different from traditional autonomous vehicles, [26] the grayscale sensors and motor systems are wired by the memristive circuits, and the detected information is processed by the memristive crossbar array. In the model of the Braitenberg vehicle, the sensors and the wheels are connected by a special wire called Mnemotrix. This wire was proposed to exhibit stimulus-tunable resistance. The unique feature of the Mnemotrix allows for the generation of self-adaptive behavior in response to environmental stimuli. [2] Interestingly, the vehicle based on memristive neuromorphic circuits demonstrates a similar ability. It is capable of avoiding obstacles after training the memristive crossbar array, as indicated by the red dashed line in Figure 1g.
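The multiplication and summation performed by a crossbar array can be sketched numerically: each device multiplies its input voltage by its conductance (Ohm's law), and each column sums the resulting currents (Kirchhoff's current law). The following Python sketch is illustrative only; the function names, the voltage and conductance values, and the linear mapping to a PWM duty cycle are assumptions rather than details of the circuit described here.

```python
import numpy as np

def crossbar_output(voltages, conductances):
    """Column currents of a crossbar: I_j = sum_i V_i * G_ij."""
    v = np.asarray(voltages, dtype=float)
    g = np.asarray(conductances, dtype=float)
    return v @ g

def to_duty_cycle(signal, lo=-1.0, hi=1.0):
    """Map an analog signal in [lo, hi] linearly onto a duty cycle in [0, 1].

    The linear mapping and its range are assumptions made for illustration.
    """
    clipped = np.clip(signal, lo, hi)
    return (clipped - lo) / (hi - lo)

# Hypothetical values: two sensor voltages (V) and a 2 x 2 conductance matrix (S).
sensor_v = [0.2, 0.1]
g = [[1e-4, 2e-4],
     [3e-4, 4e-4]]
currents = crossbar_output(sensor_v, g)  # one summed current per column (A)
```

Because every column current is formed simultaneously, the whole vector-matrix product takes a single analog step regardless of the array size.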

Training of the Braitenberg Vehicle for Path Tracking
The locomotion mechanism of the Braitenberg vehicle is determined by the neuron interconnections. By changing the connections, the Braitenberg vehicle exhibits different and complex behaviors under the same stimulus. Therefore, the capability of learning and adapting to a varying environment is crucial for the practical application of the vehicle based on memristive neuromorphic architecture. Figure 2a shows a flowchart illustrating how a supervised learning process is performed in the memristive neuromorphic circuits. The weights in the artificial neural network are updated according to the punishment or reward feedback given by the supervisor, which is based on the response behaviors of the vehicle to the input signals. After a few iterations of the feedback loop, a new mapping between the input signals and output behaviors is established, which indicates that the vehicle has acquired a new skill. Traditionally, the weight-update process of a memristive neural network is controlled by software in a digital controller, such as a personal computer or microcontroller unit (MCU). [13,15] Note that we have implemented neuromorphic neural networks in the vehicle using memristive crossbar arrays. On this neuromorphic hardware, we update the weights in the artificial neural network through a simplified method, without using the traditional digital controller.
In this work, the weight-update method for a single-layer neural network can be described mathematically by Equation (1), where ΔW_{i,j} is the desired change for the weight connecting neurons i and j; a is the constant learning rate; AI_i is the input from the afferent neuron i; EO_j is the output from the efferent neuron j; and TS_{j,k} is the target output from neuron j at the kth sample of the training set. We have used the MNIST dataset within MATLAB to benchmark the update method. The result shows a recognition rate similar to that of the gradient descent algorithm. Different from the gradient descent algorithm, this simplified update method only involves the operations of multiplication and subtraction, which are readily implemented with a simple circuit. To map the neural network onto the memristive crossbar array, both positive and negative weights are required. However, the conductance of the memristive devices cannot be negative. To solve this issue, we use a differential pair of memristors, W⁺ and W⁻, to represent each weight W in the neural network. W⁺ and W⁻ are updated with ΔW⁺ = ΔW/2 and ΔW⁻ = −ΔW/2. In the experiment, each ΔW was translated into an electrical pulse applied to the positive terminal of the memristive devices. Note that the learning scheme relies on a negative feedback mechanism in which the weights in the neural network are always adjusted by the supervisor. High precision of ΔW in each iteration is therefore not required, which allows us to run the learning process without any conductance-verification operation. For some specific scenarios, Equation (1) can be simplified further. A positive or negative ΔW_{i,j} indicates a reward or punishment feedback, respectively. This allows us to implement the feedback process in the analog circuit.
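The displayed form of Equation (1) does not appear in this text. One form consistent with the listed symbols, and with the statement that only multiplication and subtraction are needed, is the delta rule ΔW_{i,j} = a · AI_i · (TS_{j,k} − EO_j); this is an assumption and not a quotation of the original equation. The Python sketch below applies that assumed rule to a differential pair using the split ΔW⁺ = ΔW/2 and ΔW⁻ = −ΔW/2 stated above. The initial weights, the learning rate, and the input and target values are hypothetical.

```python
import numpy as np

def delta_w(a, ai, eo, ts):
    """Assumed update rule: dW[i, j] = a * AI[i] * (TS[j] - EO[j])."""
    ai = np.asarray(ai, dtype=float)
    err = np.asarray(ts, dtype=float) - np.asarray(eo, dtype=float)
    return a * np.outer(ai, err)

def update_pair(w_pos, w_neg, dw):
    """Split dW across the differential pair: dW+ = dW/2, dW- = -dW/2."""
    return w_pos + dw / 2.0, w_neg - dw / 2.0

# Hypothetical setup: two inputs, one output, effective weight W = W+ - W-.
w_pos = np.full((2, 1), 0.5)
w_neg = np.full((2, 1), 0.5)
ai = [1.0, 0.0]   # only the first input is active
ts = [1.0]        # target output set by the supervisor
for _ in range(20):
    eo = np.asarray(ai) @ (w_pos - w_neg)          # current network output
    w_pos, w_neg = update_pair(w_pos, w_neg, delta_w(0.2, ai, eo, ts))

final_output = float((np.asarray(ai) @ (w_pos - w_neg))[0])
```

In this sketch the output error shrinks by a constant factor at every iteration, which illustrates why a negative-feedback scheme tolerates imprecise individual updates: any residual error is corrected in the following iterations.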
Based on the simplified weight-update rule, we implemented a supervised learning algorithm in the memristive analog circuits of the vehicle (circuit details are shown in Figure 2b). With the assembled circuit, the vehicle is trained to track along a black path. Figure 3a shows the typical dynamics of the weight update in the memristive neural network during the learning process. In training iterations 0 to 16, the vehicle is trained to turn left in an environment where the left sensor is located in the black region and the right sensor in the white region. As the number of training iterations increases, W₁⁺ increases steadily whereas W₁⁻ decreases. From iterations 17 to 37, the supervisor teaches the vehicle to turn right in surroundings with the left sensor in the white region and the right sensor in the black region. Similarly, W₂⁺ and W₂⁻ are updated in this iterative process. Figure 3b,c shows mappings of the weights after different numbers of iterations (0, 24, 37) and the corresponding vehicle behaviors, respectively. As shown in the left panel of Figure 3c, the vehicle initially moves forward without exhibiting any turning behavior. After 24 rounds of the iterative learning process, the vehicle acquires the skill of turning left. Once the whole learning process is done, the vehicle is able to track along the black path, as shown in the right panel of Figure 3c.
To understand the turning behaviors of the vehicle in detail, we present the mappings between the turning angle and the grayscale values on the pathway detected by the dual grayscale sensors of the vehicle at different learning stages. The mappings are obtained by simulating the memristive circuits connecting the sensors and motor systems, with results shown in Figure 3d. These simulation results show that the moving behaviors of the vehicle after training are not limited to the two types of behavior (turning left or right) that correspond to the two environmental conditions used during training. Instead, the vehicle exhibits continuous behaviors in response to varying environments, including conditions beyond the training set. This enables the vehicle to react reasonably to unknown environments. Such good adaptive behavior is mainly attributed to the generalization ability of the neural network, [26] as well as the continuous nature of analog signals. The generalization ability may be further enhanced by using features of memristors (e.g., randomness in the write operation). [27] Note that the Braitenberg vehicle exhibits distinct adaptive behaviors to the same stimulus, depending on the specific interconnections between sensors and wheels. The behaviors of the Braitenberg vehicle resemble those of our vehicle in response to varying surroundings. In the memristive neural network of our vehicle, the variation of synaptic weights is equivalent to a change in neuron interconnections. To some extent, we may claim that the change of neuron interconnections gives rise to the learning behaviors of our vehicle.
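The continuous response to inputs outside the training set can be illustrated with a toy linear model. The sketch below trains a two-input, one-output network on only the two training conditions described above and then sweeps intermediate grayscale readings. The encoding (1.0 for black, 0.0 for white, −1.0 for turning left, +1.0 for turning right), the learning rate, and the delta-rule update are assumptions made for illustration, and the numbers are not simulation results from this work.

```python
import numpy as np

# Hypothetical encoding: sensor reading 1.0 = black, 0.0 = white;
# steering target -1.0 = turn left, +1.0 = turn right.
train_x = np.array([[1.0, 0.0],   # left sensor on black  -> turn left
                    [0.0, 1.0]])  # right sensor on black -> turn right
train_y = np.array([-1.0, 1.0])

w = np.zeros(2)   # effective weights, W = W+ - W-
rate = 0.3        # assumed learning rate
for _ in range(50):
    for x, y in zip(train_x, train_y):
        w += rate * x * (y - w @ x)   # assumed delta-rule update

def steering(left, right):
    """Steering command of the trained linear network for two sensor readings."""
    return float(w @ np.array([left, right]))

# Sweep readings that were never part of the training set.
sweep = [(g, 1.0 - g) for g in np.linspace(0.0, 1.0, 5)]
angles = [steering(left, right) for left, right in sweep]
```

Although only two input patterns are used in training, the steering command varies smoothly between the two extremes across the sweep, which is the kind of graded behavior the linear analog network provides.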

Response Speed of the Memristive Neuromorphic Circuit
With neuromorphic circuits as the processor of the vehicle, all the computation operations can be performed in parallel in the analog domain. As a result, the vehicle is expected to show a fast response. To investigate the signal kinetics of the memristive circuit module, we measured the response speed by replacing the analog sensors with square-wave pulses as input. The response speed of the memristive neuromorphic circuit can be characterized by the delay time, defined as the time lag between the input signal rising (or falling) to 50% and the output signal of the memristive neuromorphic circuit falling (or rising) to 50%. The measured results show that the response latency of the memristive neuromorphic circuit is very short, ranging from 36 to 76 ns (see Figure 4a) for different input patterns. For this vehicle, the response speed is fundamentally determined by a three-stage amplifier circuit and can be further accelerated using faster amplifiers and more advanced integration technology. [28] To highlight the advantage of the memristive neuromorphic circuit in response speed, we have characterized the response latency of an MCU of the kind widely used in smart vehicles or robots (see Figure 4b), where it is responsible for computing and information processing. The MCU exhibits a latency on the order of tens of microseconds (38 μs on average; see Experimental Section). The result demonstrates that the memristive neuromorphic circuit is superior to the MCU-based solution in terms of response speed. More details about the MCU are provided in the Experimental Section. This advantage mainly stems from the following reasons. For the neuromorphic architecture, expanding the scale of a neural network only increases the space complexity (memory size) and not the time complexity, which cannot be achieved in a von Neumann architecture. Moreover, in an MCU, environmental information must be converted into digital information for processing.
The processed digital information is then converted back into an output in the analog domain (see Figure 4c). These conversions would add further delay.
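The 50% delay-time definition used above can be sketched as a small waveform-analysis routine. The Python example below finds the first 50% crossing of an input and an output trace by linear interpolation and returns their time difference. The function names and the synthetic waveforms are hypothetical; the traces are constructed for illustration and are not measured data.

```python
import numpy as np

def crossing_time(t, v, level, rising):
    """First time v crosses `level` in the given direction (linear interpolation)."""
    t = np.asarray(t, dtype=float)
    v = np.asarray(v, dtype=float)
    if rising:
        idx = np.where((v[:-1] < level) & (v[1:] >= level))[0]
    else:
        idx = np.where((v[:-1] > level) & (v[1:] <= level))[0]
    if idx.size == 0:
        raise ValueError("no crossing found")
    i = idx[0]
    frac = (level - v[i]) / (v[i + 1] - v[i])
    return t[i] + frac * (t[i + 1] - t[i])

def delay_50(t, v_in, v_out, in_rising=True, out_rising=False):
    """Time lag between the 50% points of the input and output edges."""
    mid_in = 0.5 * (np.min(v_in) + np.max(v_in))
    mid_out = 0.5 * (np.min(v_out) + np.max(v_out))
    t_in = crossing_time(t, v_in, mid_in, in_rising)
    t_out = crossing_time(t, v_out, mid_out, out_rising)
    return t_out - t_in

# Synthetic traces (time in ns): input rises around 55 ns, output falls around 95 ns.
t = np.arange(0.0, 200.0, 1.0)
v_in = np.clip((t - 50.0) / 10.0, 0.0, 1.0)
v_out = 1.0 - np.clip((t - 90.0) / 10.0, 0.0, 1.0)
delay_ns = delay_50(t, v_in, v_out)
```

Measuring at the 50% points makes the result insensitive to the exact rise and fall times of the two edges, which is why this definition is common for propagation delay.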

Conclusion
In conclusion, inspired by the model of the Braitenberg vehicle, we built a neuromorphic vehicle with memristive neuromorphic circuits. The developed vehicle exhibits a good self-adaptive capability to varying environments after a supervised learning process. Furthermore, we demonstrate that the neuromorphic vehicle shows a short latency in response to input signals. This work realizes the earlier thought experiment of Valentino Braitenberg and demonstrates the well-known Braitenberg vehicle model with a simple memristive hardware architecture. This work may provide a roadmap for developing smart vehicles or robots based on memristive crossbar arrays.

Experimental Section
Memristor Fabrication: Memristive devices are defined by the crossing region of the top and bottom electrodes with a metal-insulator-metal sandwich structure. Both the top and bottom electrodes of the crossbar devices were patterned by standard photolithography and then lifted off in N-methyl pyrrolidone. Electron-beam evaporation was used to deposit the top/bottom electrodes, in which Pd (≈40 nm) as an inert metal was used to protect the switching layer. For the switching layer, the 10 nm Ta₂O₅ and 80 nm Ta were deposited through a standard radio frequency sputtering process.
Fabrication of the Braitenberg Vehicle: The memristive neuromorphic circuit-based Braitenberg vehicle comprises a memristive circuit, a battery, a motor system (including a steering engine and a main engine), analog grayscale sensors, and a vehicle body. The core memristive circuit was integrated on customized printed circuit boards.
Weight Update: The learning processes shown in Figure 3 were performed on the vehicle platform. To monitor the changing weights during the learning process, a read operation was implemented after each iteration. For the read operation, constant input signals generated by an Agilent B1500A parameter analyzer replaced the signals from the sensors, and then output signals were measured with the analyzer. Thus, the weights in the memristive neural network were calculated according to the inputs and outputs.
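The read operation amounts to recovering the weights of a linear network from known inputs and measured outputs. A minimal Python sketch of this calculation is given below, assuming a linear relation outputs = inputs · W and using a least-squares fit; the function name, the read voltages, and the weight values are hypothetical and do not reproduce the analyzer procedure.

```python
import numpy as np

def estimate_weights(inputs, outputs):
    """Least-squares estimate of W in the linear model outputs = inputs @ W."""
    x = np.asarray(inputs, dtype=float)
    y = np.asarray(outputs, dtype=float)
    w, *_ = np.linalg.lstsq(x, y, rcond=None)
    return w

# Hypothetical effective weights of a two-input, one-output network.
true_w = np.array([[-0.8], [0.6]])

# Constant read inputs (V): each sensor channel alone, then both together.
reads = np.array([[0.1, 0.0],
                  [0.0, 0.1],
                  [0.1, 0.1]])
measured = reads @ true_w          # stands in for the measured outputs
w_hat = estimate_weights(reads, measured)
```

Driving one input at a time would already determine each weight; adding further read patterns and fitting by least squares averages out measurement noise.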
Latency Test of the MCU: In the control experiment, an MCU (STM32F103ZET6) was used, which is a widely used embedded processor with a 72 MHz clock frequency, fast I/O ports, and integrated ADCs as well as DACs. The MCU was programmed to implement the same path-tracking task as the memristive neuromorphic circuit. To characterize the latency, the MCU was operated to perform the continuous mapping from input analog signals to output analog signals. The measured result revealed an average response latency of 38 μs, which is much longer than that of the memristive neuromorphic circuit presented in the main text.

Supporting Information
Supporting Information is available from the Wiley Online Library or from the author.