Reinforcement learning based on movement primitives for contact tasks

Young Loul Kim, Kuk Hyun Ahn, Jae Bok Song

Research output: Contribution to journal › Article › peer-review

33 Citations (Scopus)

Abstract

Recently, robot learning through deep reinforcement learning has been applied to a variety of robot tasks using deep neural networks, without relying on task-specific control or recognition algorithms. However, this learning method is difficult to apply to robot contact tasks, because the random search process of reinforcement learning can exert excessive force. Therefore, when applying reinforcement learning to contact tasks, the contact problem must be handled by an existing force controller. This study proposes a neural-network-based movement primitive (NNMP) that generates a continuous trajectory which can be transmitted to the force controller and which is learned through the deep deterministic policy gradient (DDPG) algorithm. In addition, an imitation learning algorithm suited to the NNMP is proposed so that trajectories similar to the demonstration trajectory are generated stably. The performance of the proposed algorithms was verified on a square peg-in-hole assembly task with a tolerance of 0.1 mm. The results confirm that a complicated assembly trajectory can be learned stably by the NNMP through the proposed imitation learning algorithm, and that the assembly trajectory is improved by training the NNMP with the DDPG algorithm.
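
The idea of fitting a network-based movement primitive to a demonstration can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' NNMP: the one-hidden-layer network, the phase input `s`, the synthetic two-dimensional demonstration, and the function names `init_params`, `forward`, and `imitation_step` are all hypothetical. The sketch only shows the imitation step (regressing a continuous trajectory onto a demonstration); the DDPG refinement and the force controller are not modelled.

```python
import numpy as np

def init_params(rng, hidden=16, out_dim=2):
    """Random weights for a one-hidden-layer network: phase -> position."""
    return {
        "W1": rng.normal(0.0, 1.0, (1, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, out_dim)),
        "b2": np.zeros(out_dim),
    }

def forward(params, s):
    """Map phase values s (shape [N, 1]) to positions (shape [N, out_dim])."""
    h = np.tanh(s @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"], h

def imitation_step(params, s, demo, lr=0.02):
    """One gradient-descent step on the mean squared error to the demonstration.

    Returns the loss measured before the parameter update.
    """
    pred, h = forward(params, s)
    err = pred - demo
    n = s.shape[0]
    loss = float(np.mean(np.sum(err ** 2, axis=1)))
    # Backpropagate the loss through the two layers by hand.
    d_pred = 2.0 * err / n
    grads = {
        "W2": h.T @ d_pred,
        "b2": d_pred.sum(axis=0),
    }
    d_h = (d_pred @ params["W2"].T) * (1.0 - h ** 2)
    grads["W1"] = s.T @ d_h
    grads["b1"] = d_h.sum(axis=0)
    for name, g in grads.items():
        params[name] -= lr * g
    return loss

# Synthetic demonstration (assumed): a smooth 2-D path indexed by phase s in [0, 1].
rng = np.random.default_rng(0)
s = np.linspace(0.0, 1.0, 50)[:, None]
demo = np.hstack([s, 0.5 * np.sin(np.pi * s)])

params = init_params(rng)
losses = [imitation_step(params, s, demo) for _ in range(500)]

# Continuous trajectory that would be handed to a force controller as a reference.
trajectory, _ = forward(params, s)
```

In a setup like the one the abstract describes, the fitted network would serve as the starting point that a reinforcement learning algorithm such as DDPG then adjusts, while the force controller tracks the generated trajectory and limits contact forces.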

Original language: English
Article number: 101863
Journal: Robotics and Computer-Integrated Manufacturing
Volume: 62
Publication status: Published - April 2020

Keywords

  • AI-based methods
  • Deep Learning in robotics and automation
  • Force control

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Mathematics (all)
  • Computer Science Applications
  • Industrial and Manufacturing Engineering
