Abstract
In this paper, we present our solutions to a range of automation tasks for life-saving intervention procedures in the Trauma THOMPSON (T3) Challenge: action recognition, action anticipation, and Visual Question Answering (VQA). For action recognition and anticipation, we propose a pre-processing strategy that samples multiple inputs and stitches them into a single image, and we incorporate momentum- and attention-based knowledge distillation to improve performance on both tasks. For training, we present an action dictionary-guided design, which consistently yields the best results across our experiments. For VQA, we leverage object-level features and deploy co-attention networks to jointly model object and question features. Notably, we introduce a novel frame-question cross-attention mechanism at the network’s core for enhanced performance. Our solutions achieve the rank in action recognition and anticipation tasks and rank in the VQA task. The source code is available at https://github.com/QuIIL/QuIIL_thompson_solution.
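The sample-and-stitch pre-processing mentioned in the abstract might look roughly like the following minimal sketch. The abstract does not state how many inputs are sampled, how they are sampled, or how they are laid out, so uniform temporal sampling, four frames, and a 2x2 grid are assumptions made here for illustration, and `sample_and_stitch` is a hypothetical name rather than a function from the authors' repository.

```python
import numpy as np


def sample_and_stitch(frames, num_samples=4, grid_cols=2):
    """Uniformly sample frames from a clip and tile them into one image.

    frames: array of shape (T, H, W, C).
    Returns an array of shape (rows * H, grid_cols * W, C).
    All defaults are illustrative assumptions, not values from the paper.
    """
    frames = np.asarray(frames)
    if frames.ndim != 4 or frames.shape[0] == 0:
        raise ValueError("expected a non-empty clip of shape (T, H, W, C)")
    if num_samples % grid_cols != 0:
        raise ValueError("num_samples must be a multiple of grid_cols")

    # Evenly spaced frame indices covering the whole clip.
    last = frames.shape[0] - 1
    indices = np.linspace(0, last, num_samples).round().astype(int)
    picked = frames[indices]

    # Concatenate each group of grid_cols frames side by side,
    # then stack the resulting rows vertically.
    rows = [
        np.concatenate(list(picked[start:start + grid_cols]), axis=1)
        for start in range(0, num_samples, grid_cols)
    ]
    return np.concatenate(rows, axis=0)
```

With these defaults, a clip of ten 8x6 RGB frames yields a single 16x12 RGB image built from frames 0, 3, 6, and 9, which can then be passed to an image-based model as one input.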
| Original language | English |
|---|---|
| Title of host publication | AI for Brain Lesion Detection and Trauma Video Action Recognition - 1st BONBID-HIE Lesion Segmentation Challenge and 1st Trauma Thompson Challenge, Held in Conjunction with MICCAI 2023, Proceedings |
| Editors | Rina Bao, Ellen Grant, Yangming Ou, Andrew Kirkpatrick, Juan Wachs |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 82-93 |
| Number of pages | 12 |
| ISBN (Print) | 9783031716256 |
| DOIs | |
| Publication status | Published - 2025 |
| Event | 1st BONBID-HIE Lesion Segmentation Challenge and 1st Trauma Thompson Challenge Held in Conjunction with 26th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2023 - Vancouver, Canada. Duration: 2023 Oct 12 → 2023 Oct 16 |
Publication series
| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|---|---|
| Volume | 14567 LNCS |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 1st BONBID-HIE Lesion Segmentation Challenge and 1st Trauma Thompson Challenge Held in Conjunction with 26th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2023 |
|---|---|
| Country/Territory | Canada |
| City | Vancouver |
| Period | 2023 Oct 12 → 2023 Oct 16 |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
Keywords
- VQA
- Video classification
- co-attention
- contrastive learning
ASJC Scopus subject areas
- Theoretical Computer Science
- General Computer Science
Title
QuIIL at T3 Challenge: Towards Automation in Life-Saving Intervention Procedures from First-Person View