Interactive Video Object Segmentation Using Global and Local Transfer Modules

Yuk Heo, Yeong Jun Koh, Chang Su Kim

Research output: Chapter in Book/Report/Conference proceedingConference contribution

10 Citations (Scopus)

Abstract

An interactive video object segmentation algorithm, which takes scribble annotations on query objects as input, is proposed in this paper. We develop a deep neural network, which consists of the annotation network (A-Net) and the transfer network (T-Net). First, given user scribbles on a frame, A-Net yields a segmentation result based on the encoder-decoder architecture. Second, T-Net transfers the segmentation result bidirectionally to the other frames, by employing the global and local transfer modules. The global transfer module conveys the segmentation information in an annotated frame to a target frame, while the local transfer module propagates the segmentation information in a temporally adjacent frame to the target frame. By applying A-Net and T-Net alternately, a user can obtain desired segmentation results with minimal efforts. We train the entire network in two stages, by emulating user scribbles and employing an auxiliary loss. Experimental results demonstrate that the proposed interactive video object segmentation algorithm outperforms the state-of-the-art conventional algorithms. Codes and models are available at https://github.com/yuk6heo/IVOS-ATNet.

Original languageEnglish
Title of host publicationComputer Vision – ECCV 2020 - 16th European Conference, 2020, Proceedings
EditorsAndrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm
PublisherSpringer Science and Business Media Deutschland GmbH
Pages297-313
Number of pages17
ISBN (Print)9783030585198
DOIs
Publication statusPublished - 2020
Event16th European Conference on Computer Vision, ECCV 2020 - Glasgow, United Kingdom
Duration: 2020 Aug 232020 Aug 28

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume12362 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference16th European Conference on Computer Vision, ECCV 2020
Country/TerritoryUnited Kingdom
CityGlasgow
Period20/8/2320/8/28

Bibliographical note

Funding Information:
Acknowledgements. This work was supported in part by ‘The Cross-Ministry Giga KOREA Project’ grant funded by the Korea government (MSIT) (No. GK20P0200, Development of 4D reconstruction and dynamic deformable action model based hyper-realistic service technology), in part by Institute of Information & communications Technology Planning & evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-01441, Artificial Intelligence Convergence Research Center (Chungnam National University)) and in part by the National Research Foundation of Korea (NRF) through the Korea Government (MSIP) under Grant NRF-2018R1A2B3003896.

Publisher Copyright:
© 2020, Springer Nature Switzerland AG.

Keywords

  • Deep learning
  • Interactive segmentation
  • Video object segmentation

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'Interactive Video Object Segmentation Using Global and Local Transfer Modules'. Together they form a unique fingerprint.

Cite this