QuIIL at T3 Challenge: Towards Automation in Life-Saving Intervention Procedures from First-Person View

  • Trinh T. L. Vuong
  • Doanh C. Bui
  • Jin Tae Kwak*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we present our solutions for a spectrum of automation tasks in life-saving intervention procedures within the Trauma THOMPSON (T3) Challenge, encompassing action recognition, action anticipation, and Visual Question Answering (VQA). For action recognition and anticipation, we propose a pre-processing strategy that samples and stitches multiple inputs into a single image and then incorporates momentum- and attention-based knowledge distillation to improve the performance of the two tasks. For training, we present an action dictionary-guided design, which consistently yields the most favorable results across our experiments. In the realm of VQA, we leverage object-level features and deploy co-attention networks to train both object and question features. Notably, we introduce a novel frame-question cross-attention mechanism at the network’s core for enhanced performance. Our solutions achieve the rank in action recognition and anticipation tasks and rank in the VQA task. The source code is available at https://github.com/QuIIL/QuIIL_thompson_solution.

Original language: English
Title of host publication: AI for Brain Lesion Detection and Trauma Video Action Recognition - 1st BONBID-HIE Lesion Segmentation Challenge and 1st Trauma Thompson Challenge, Held in Conjunction with MICCAI 2023, Proceedings
Editors: Rina Bao, Ellen Grant, Yangming Ou, Andrew Kirkpatrick, Juan Wachs
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 82-93
Number of pages: 12
ISBN (Print): 9783031716256
Publication status: Published - 2025
Event: 1st BONBID-HIE Lesion Segmentation Challenge and 1st Trauma Thompson Challenge Held in Conjunction with 26th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2023 - Vancouver, Canada
Duration: 2023 Oct 12 to 2023 Oct 16

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14567 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 1st BONBID-HIE Lesion Segmentation Challenge and 1st Trauma Thompson Challenge Held in Conjunction with 26th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2023
Country/Territory: Canada
City: Vancouver
Period: 23/10/12 to 23/10/16

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords

  • VQA
  • Video classification
  • co-attention
  • contrastive learning

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
