ORA3D: Overlap Region Aware Multi-view 3D Object Detection

Wonseok Roh, Gyusam Chang, Seokha Moon, Giljoo Nam, Chanyoung Kim, Younghyun Kim, Sangpil Kim, Jinkyu Kim

Research output: Contribution to conference › Paper › peer-review


Current multi-view 3D object detection methods often fail to detect objects in the overlap region properly, and the networks' understanding of the scene is often limited to that of a monocular detection network. Moreover, objects in the overlap region are often largely occluded or suffer from deformation due to camera distortion, causing a domain shift. To mitigate these issues, we propose using the following two main modules: (1) Stereo Disparity Estimation for Weak Depth Supervision and (2) Adversarial Overlap Region Discriminator. The former utilizes a traditional stereo disparity estimation method to obtain reliable disparity information from the overlap region. Given the disparity estimates as supervision, we propose regularizing the network to fully utilize the geometric potential of binocular images and improve the overall detection accuracy accordingly. Further, the latter module minimizes the representational gap between non-overlapping and overlapping regions. We demonstrate the effectiveness of the proposed method on the nuScenes large-scale multi-view 3D object detection dataset. Our experiments show that our proposed method outperforms current state-of-the-art models, i.e., DETR3D and BEVDet.
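As an illustration of how disparity estimates could act as weak depth supervision, the sketch below converts disparities to depth with the standard pinhole stereo relation Z = f * B / d and computes a masked L1 penalty against a predicted depth. It is a minimal sketch under stated assumptions, not the paper's implementation: it assumes a rectified image pair, and the function names, the focal length, the baseline, and the choice of an L1 loss are all hypothetical.

```python
from typing import List, Optional


def disparity_to_depth(
    disparity: List[float],
    focal_px: float,
    baseline_m: float,
    min_disp: float = 1e-6,
) -> List[Optional[float]]:
    """Convert disparities (pixels) to depths (metres) via Z = f * B / d.

    Assumes a rectified stereo pair. Pixels whose disparity is not
    above min_disp are treated as invalid and mapped to None.
    """
    return [
        focal_px * baseline_m / d if d > min_disp else None
        for d in disparity
    ]


def masked_l1(
    pred_depth: List[float],
    pseudo_depth: List[Optional[float]],
) -> float:
    """Mean absolute error over pixels that have a valid pseudo-label.

    Hypothetical loss choice for illustration; returns 0.0 when no
    pixel carries a valid pseudo-label.
    """
    pairs = [(p, t) for p, t in zip(pred_depth, pseudo_depth) if t is not None]
    if not pairs:
        return 0.0
    return sum(abs(p - t) for p, t in pairs) / len(pairs)


if __name__ == "__main__":
    # Illustrative values only: 1000 px focal length, 0.5 m baseline.
    pseudo = disparity_to_depth([10.0, 0.0, 20.0], focal_px=1000.0, baseline_m=0.5)
    print(pseudo)
    print(masked_l1([48.0, 3.0, 26.0], pseudo))
```

Masking invalid disparities keeps pixels without a reliable stereo match from contributing to the penalty, which is one plausible reading of "weak" supervision restricted to the overlap region.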

Original language: English
Publication status: Published - 2022
Event: 33rd British Machine Vision Conference Proceedings, BMVC 2022 - London, United Kingdom
Duration: 2022 Nov 21 → 2022 Nov 24


Conference: 33rd British Machine Vision Conference Proceedings, BMVC 2022
Country/Territory: United Kingdom

Bibliographical note

Publisher Copyright:
© 2022. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms.

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition

