Abstract
Prior work in multi-task learning has mainly focused on predictions on a single image. In this work, we present a new approach for multi-task learning from videos via efficient inter-frame local attention (MILA). Our approach contains a novel inter-frame attention module that allows learning of task-specific attention across frames. We embed the attention module in a "slow-fast" architecture, where the slow network runs on sparsely sampled keyframes and the fast, shallow network runs on non-keyframes at a high frame rate. We also propose an effective adversarial learning strategy that encourages the slow and fast networks to learn similar features, so that keyframe and non-keyframe features are well aligned. Our approach ensures low-latency multi-task learning while maintaining high-quality predictions. MILA obtains competitive accuracy compared to the state of the art on two multi-task learning benchmarks while reducing the number of floating point operations (FLOPs) by up to 70%. In addition, our attention-based feature propagation method (ILA) outperforms prior work in terms of task accuracy while also reducing FLOPs by up to 90%.
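To make the idea of inter-frame local attention concrete, the following is a minimal illustrative sketch, not the module described in the paper. It assumes plain scaled dot-product attention in which each position of a non-keyframe feature map attends only to keyframe features inside a small spatial window; the function name `local_attention`, the `radius` parameter, and the use of keyframe features as both keys and values are all assumptions made for illustration.

```python
import math


def local_attention(query_feats, key_feats, radius=1):
    """Propagate keyframe features to a non-keyframe via windowed attention.

    query_feats, key_feats: H x W grids (lists of lists) of equal-length
    feature vectors (lists of floats). Returns an H x W grid in which each
    cell is a softmax-weighted average of the keyframe features inside a
    (2 * radius + 1)^2 window centred on that cell.

    Hypothetical helper for illustration only; not the paper's module.
    """
    height, width = len(query_feats), len(query_feats[0])
    dim = len(query_feats[0][0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            q = query_feats[y][x]
            # Collect keyframe vectors in the local window, clipped at borders.
            window = [
                key_feats[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # Scaled dot-product score between the query and each neighbour.
            scores = [scale * sum(a * b for a, b in zip(q, k)) for k in window]
            # Numerically stable softmax over the window.
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            weights = [e / total for e in exps]
            # Weighted sum of keyframe features (values equal keys here).
            row.append([
                sum(w * k[c] for w, k in zip(weights, window))
                for c in range(dim)
            ])
        out.append(row)
    return out
```

In this sketch the per-position cost grows with the window size, (2 * radius + 1)^2, rather than with the full H x W feature map, which is the general motivation for restricting attention to a local neighbourhood.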
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2021 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2021 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 2219-2229 |
| Number of pages | 11 |
| ISBN (Electronic) | 9781665401913 |
| Publication status | Published - 2021 |
| Externally published | Yes |
| Event | 18th IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2021 - Virtual, Online, Canada; Duration: 2021 Oct 11 → 2021 Oct 17 |
Publication series
| Name | Proceedings of the IEEE International Conference on Computer Vision |
|---|---|
| Volume | 2021-October |
| ISSN (Print) | 1550-5499 |
| ISSN (Electronic) | 2380-7504 |
Conference
| Conference | 18th IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2021 |
|---|---|
| Country/Territory | Canada |
| City | Virtual, Online |
| Period | 2021-10-11 → 2021-10-17 |
Bibliographical note
Publisher Copyright: © 2021 IEEE.
ASJC Scopus subject areas
- Software
- Computer Vision and Pattern Recognition
Fingerprint
Dive into the research topics of 'MILA: Multi-Task Learning from Videos via Efficient Inter-Frame Attention'. Together they form a unique fingerprint.