TY - JOUR
T1 - Module of Axis-based Nexus Attention for weakly supervised object localization
AU - Sohn, Junghyo
AU - Jeon, Eunjin
AU - Jung, Wonsik
AU - Kang, Eunsong
AU - Suk, Heung Il
N1 - Publisher Copyright:
© 2023, The Author(s).
PY - 2023/12
Y1 - 2023/12
N2 - Weakly supervised object localization tasks remain challenging to identify and segment an entire object rather than only discriminative parts of the object. To tackle this problem, corruption-based approaches have been devised, which involve the training of non-discriminative regions by corrupting (e.g., erasing) the input images or intermediate feature maps. However, this approach requires an additional hyperparameter, the corrupting threshold, to determine the degree of corruption and can unfavorably disrupt training. It also tends to localize object regions coarsely. In this paper, we propose a novel approach, Module of Axis-based Nexus Attention (MoANA), which helps to adaptively activate less discriminative regions along with the class-discriminative regions without an additional hyperparameter, and elaborately localizes an entire object. Specifically, MoANA consists of three mechanisms (1) triple-view attentions representation, (2) attentions expansion, and (3) features calibration mechanism. Unlike other attention-based methods that train a coarse attention map with the same values across elements in feature maps, MoANA trains fine-grained values in an attention map by assigning different attention values to each element. We validated MoANA by comparing it with various methods. We also analyzed the effect of each component in MoANA and visualized attention maps to provide insights into the calibration.
AB - Weakly supervised object localization tasks remain challenging to identify and segment an entire object rather than only discriminative parts of the object. To tackle this problem, corruption-based approaches have been devised, which involve the training of non-discriminative regions by corrupting (e.g., erasing) the input images or intermediate feature maps. However, this approach requires an additional hyperparameter, the corrupting threshold, to determine the degree of corruption and can unfavorably disrupt training. It also tends to localize object regions coarsely. In this paper, we propose a novel approach, Module of Axis-based Nexus Attention (MoANA), which helps to adaptively activate less discriminative regions along with the class-discriminative regions without an additional hyperparameter, and elaborately localizes an entire object. Specifically, MoANA consists of three mechanisms (1) triple-view attentions representation, (2) attentions expansion, and (3) features calibration mechanism. Unlike other attention-based methods that train a coarse attention map with the same values across elements in feature maps, MoANA trains fine-grained values in an attention map by assigning different attention values to each element. We validated MoANA by comparing it with various methods. We also analyzed the effect of each component in MoANA and visualized attention maps to provide insights into the calibration.
UR - http://www.scopus.com/inward/record.url?scp=85175649878&partnerID=8YFLogxK
U2 - 10.1038/s41598-023-45796-8
DO - 10.1038/s41598-023-45796-8
M3 - Article
C2 - 37903879
AN - SCOPUS:85175649878
SN - 2045-2322
VL - 13
JO - Scientific reports
JF - Scientific reports
IS - 1
M1 - 18588
ER -