Abstract
In recent years, several control policies for multi-degree-of-freedom (DOF) manipulators based on deep reinforcement learning have been proposed. To reduce complexity, previous studies have imposed a number of constraints on the high-dimensional state-action space, which hinders the learning of a generalized policy function. In this study, the control problem is addressed by introducing a hierarchical reinforcement learning method that can learn the end-to-end control policy of a multi-DOF manipulator without any constraints on the state-action space. The proposed method learns a hierarchical policy using two off-policy methods. With human demonstration data and a newly proposed data-correction method, the proposed approach controls the multi-DOF manipulator in an end-to-end manner and is shown to outperform non-hierarchical deep reinforcement learning methods.
Original language | English |
---|---|
Pages (from-to) | 3296-3311 |
Number of pages | 16 |
Journal | International Journal of Control, Automation and Systems |
Volume | 20 |
Issue number | 10 |
DOIs | |
Publication status | Published - 2022 Oct |
Bibliographical note
Funding Information: This research was supported by the MOTIE under the Industrial Foundation Technology Development Program supervised by the KEIT (No. 20008613).
Publisher Copyright:
© 2022, ICROS, KIEE and Springer.
Keywords
- Deep reinforcement learning
- Demonstration-based learning
- End-to-end robot control
- Hierarchical reinforcement learning
ASJC Scopus subject areas
- Control and Systems Engineering
- Computer Science Applications