Tracking-by-segmentation using superpixel-wise neural network

Se Ho Lee, Won Dong Jang, Chang-Su Kim

Research output: Contribution to journal › Article › peer-review

11 Citations (Scopus)


A tracking-by-segmentation algorithm, which tracks and segments a target object in a video sequence, is proposed in this paper. In the first frame, we segment out the target object in a user-annotated bounding box. Then, we divide subsequent frames into superpixels. We develop a superpixel-wise neural network for tracking-by-segmentation, called TBSNet, which extracts multi-level convolutional features of each superpixel and yields the foreground probability of the superpixel as the output. We train TBSNet in two stages. First, we perform offline training to enable TBSNet to discriminate general objects from the background. Second, during the tracking, we fine-tune TBSNet to distinguish the target object from non-targets and adapt to color change and shape variation of the target object. Finally, we perform conditional random field optimization to improve the segmentation quality further. Experimental results demonstrate that the proposed algorithm outperforms the state-of-the-art trackers on four challenging data sets.
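As a rough illustration of the superpixel-wise output described in the abstract, the sketch below turns one foreground probability per superpixel into a pixel-level mask. It is not TBSNet: the label map, the probability values, the function name `superpixel_probs_to_mask`, and the 0.5 threshold are all assumptions made for the example, and the conditional random field refinement is omitted.

```python
from typing import Dict, List


def superpixel_probs_to_mask(
    labels: List[List[int]],
    fg_prob: Dict[int, float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[List[bool]]:
    """Give every pixel the foreground decision of the superpixel it belongs to."""
    return [[fg_prob[sp] >= threshold for sp in row] for row in labels]


# Toy 3x4 frame divided into three superpixels with ids 0, 1, and 2.
labels = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [2, 2, 2, 1],
]

# Hypothetical foreground probabilities, standing in for the network output.
fg_prob = {0: 0.1, 1: 0.3, 2: 0.9}

mask = superpixel_probs_to_mask(labels, fg_prob)
for row in mask:
    print("".join("#" if is_fg else "." for is_fg in row))
```

In a full pipeline the probabilities would come from a trained classifier evaluated on each superpixel, and the thresholded mask would be an input to a later refinement step rather than the final result.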

Original language: English
Article number: 8476565
Pages (from-to): 54982-54993
Number of pages: 12
Journal: IEEE Access
Publication status: Published - 2018

Bibliographical note

Funding Information:
This work was supported in part by the Agency for Defense Development and Defense Acquisition Program Administration of Korea under Grant UC160016FD and in part by the National Research Foundation of Korea Grant through the Korean Government (MSIP) under Grant NRF-2015R1A2A1A10055037 and Grant NRF-2018R1A2B3003896.

Publisher Copyright:
© 2013 IEEE.


Keywords

  • Tracking-by-segmentation
  • object segmentation
  • object tracking
  • visual tracking

ASJC Scopus subject areas

  • General Computer Science
  • General Materials Science
  • General Engineering

