Learning outdoor mobile robot behaviors by example
Version of Record online: 12 JAN 2009
Copyright © 2009 Wiley Periodicals, Inc.
Journal of Field Robotics
Special Issue on LAGR Program, Part II
Volume 26, Issue 2, pages 176–195, February 2009
How to Cite
Roberts, R., Pippin, C. and Balch, T. (2009), Learning outdoor mobile robot behaviors by example. J. Field Robotics, 26: 176–195. doi: 10.1002/rob.20278
- Issue online: 16 JAN 2009
- Manuscript Accepted: 16 DEC 2008
- Manuscript Received: 8 APR 2008
We present an implementation and analysis of a real-time, online, supervised learning system for nonparametrically learning behaviors from a human trainer on a mobile robot in outdoor environments. This approach enables a human operator to train and tune robot behaviors simply by driving the robot with a remote control. Hand-designed behaviors for outdoor environments often require many parameters, and complicated behaviors can be difficult or impossible to specify with a manageable number of parameters. Furthermore, their design requires knowledge of the robot's internal models and knowledge of the environment in which the behaviors will be used. In real-world scenarios, we can design new behaviors using our learning system much more quickly than we can write hand-crafted behaviors. We present the results of training the robot to execute several specialized and general-purpose behaviors, including traversing a slalom, staying near “cover,” navigating on paths, navigating in an obstacle field, and general-purpose navigation. Our system learns and executes most of these behaviors well after 1–4 h of operator training time. In quantitative tests, the learned behavior is not as robust as a hand-crafted behavior but often completes obstacle courses more quickly. Additionally, we identify the factors that influence the effectiveness of this approach and investigate the properties of the training data provided by the human trainer. On the basis of our analyses, we suggest future work to ensure sufficient training, handle conflicting training examples, model robot dynamics, and further investigate dimensionality reduction of perception features.
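As an illustration of what nonparametric learning from a human trainer can look like, the following is a minimal sketch and not the authors' implementation. The abstract does not name the learner, so a distance-weighted k-nearest-neighbor regressor is assumed here as a simple stand-in. The class name `DemonstrationLearner`, the feature layout, and the command layout (forward speed, turn rate) are all hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class DemonstrationLearner:
    """Hypothetical k-NN learner: stores (features, command) pairs online."""
    k: int = 3
    examples: list = field(default_factory=list)

    def record(self, features, command):
        # Called while the operator drives: store what the robot perceived
        # together with the command the operator issued at that moment.
        self.examples.append((tuple(features), tuple(command)))

    def predict(self, features):
        # Called in autonomous mode: blend the commands of the k most
        # similar stored situations, weighted by inverse distance.
        if not self.examples:
            raise ValueError("no training examples recorded yet")
        query = tuple(features)
        nearest = sorted(
            self.examples, key=lambda ex: math.dist(ex[0], query)
        )[: self.k]
        # Small constant avoids division by zero on an exact match.
        weights = [1.0 / (math.dist(f, query) + 1e-9) for f, _ in nearest]
        total = sum(weights)
        dims = len(nearest[0][1])
        return tuple(
            sum(w * cmd[i] for w, (_, cmd) in zip(weights, nearest)) / total
            for i in range(dims)
        )


# Usage: assumed features are (left clearance, right clearance) in meters;
# assumed commands are (forward speed, turn rate).
learner = DemonstrationLearner(k=2)
learner.record((5.0, 5.0), (1.0, 0.0))   # open space: drive straight
learner.record((1.0, 5.0), (0.5, -0.4))  # obstacle on the left: turn right
learner.record((5.0, 1.0), (0.5, 0.4))   # obstacle on the right: turn left
command = learner.predict((1.2, 4.8))
```

Because every example is stored rather than compressed into a fixed set of parameters, new demonstrations take effect immediately, which fits the online training by remote control described above. The same property means that conflicting examples from the trainer get averaged together, one of the issues the abstract lists for future work.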