Abstract
In this paper, we address learning from demonstration (LfD) when demonstrations of differing proficiency are provided without labels. To this end, we model multiple policies of different quality as correlated Gaussian processes and present a leverage optimization method that estimates a leverage for each policy, where the difference between two leverages defines the correlation between the corresponding policies. To recover the single policy function of an expert, we impose a sparsity constraint on the leverage parameters. We first show that the proposed leverage optimization method can recover the correlations between sensory fields when the fields are realized from correlated Gaussian processes and sensor measurements are collected from them. Furthermore, we apply the proposed method to autonomous driving experiments in which demonstrations are collected from three different driving modes. Although the driving policies are not realized from correlated processes, the proposed method assigns reasonable leverages to the driving demonstrations. The estimated expert driving policy, which incorporates the optimized leverages, outperforms previous LfD methods in terms of both safety and driving quality.
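The idea that a leverage difference sets the correlation between two policies can be illustrated with a small sketch. The abstract does not state the kernel, so the cosine form below, the squared-exponential base kernel, and all names (`se_kernel`, `leveraged_kernel`, `leveraged_gp_mean`) are assumptions chosen for illustration; the leverages are fixed by hand here, whereas the paper estimates them by optimization under a sparsity constraint, which this sketch does not attempt.

```python
import numpy as np

def se_kernel(X1, X2, length_scale=1.0):
    # Squared-exponential kernel between two sets of row-vector inputs.
    d2 = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def leveraged_kernel(X1, g1, X2, g2, length_scale=1.0):
    # ASSUMED form: correlation depends only on the leverage difference.
    # Difference 0 gives +1, difference 1 gives ~0, difference 2 gives -1.
    corr = np.cos(0.5 * np.pi * (g1[:, None] - g2[None, :]))
    return corr * se_kernel(X1, X2, length_scale)

def leveraged_gp_mean(X, y, gamma, X_query, noise=1e-2, length_scale=1.0):
    # GP regression mean for a query policy with leverage +1 (the "expert").
    K = leveraged_kernel(X, gamma, X, gamma, length_scale)
    g_query = np.ones(len(X_query))
    K_star = leveraged_kernel(X_query, g_query, X, gamma, length_scale)
    alpha = np.linalg.solve(K + noise * np.eye(len(X)), y)
    return K_star @ alpha

# Toy data: an "expert" demonstrates sin(x); a second demonstrator does the
# opposite. Hand-picked leverages +1 and -1 mark them as anti-correlated.
x = np.linspace(0.0, 3.0, 10)[:, None]
X = np.vstack([x, x])
y = np.concatenate([np.sin(x[:, 0]), -np.sin(x[:, 0])])
gamma = np.concatenate([np.ones(10), -np.ones(10)])

x_query = np.array([[1.5]])
prediction = leveraged_gp_mean(X, y, gamma, x_query)
```

In this toy setting the opposite demonstration, once given leverage -1, reinforces the expert estimate instead of cancelling it, which is the intuition behind assigning leverages to unlabeled demonstrations of mixed quality.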
Field | Value |
---|---|
Original language | English |
Article number | 8626460 |
Pages (from-to) | 564-576 |
Number of pages | 13 |
Journal | IEEE Transactions on Robotics |
Volume | 35 |
Issue number | 3 |
Publication status | Published - 2019 Jun |
Externally published | Yes |
Bibliographical note
Funding Information: Manuscript received February 17, 2018; accepted November 28, 2018. Date of publication January 25, 2019; date of current version May 31, 2019. This paper was recommended for publication by Associate Editor D. Kulic and Editor A. Billard upon evaluation of the reviewers’ comments. This work was supported in part by the Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Science and ICT under Grant NRF-2017R1A2B2006136. (Corresponding author: Songhwai Oh.) The authors are with the Department of Electrical and Computer Engineering and the Automation and Systems Research Institute, Seoul National University, Seoul 08826, South Korea.
Publisher Copyright:
© 2019 IEEE.
Keywords
- Autonomous navigation
- learning from demonstration (LfD)
- leveraged Gaussian processes (LGPs)
- robust estimation
ASJC Scopus subject areas
- Control and Systems Engineering
- Computer Science Applications
- Electrical and Electronic Engineering