KU-DMIS-MSRA at RadSum23: Pre-trained Vision-Language Model for Radiology Report Summarization

Gangwoo Kim, Hajung Kim, Lei Ji, Seongsu Bae, Chanhwi Kim, Mujeen Sung, Hyunjae Kim, Kun Yan, Eric Chang, Jaewoo Kang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we introduce CheXOFA, a new pre-trained vision-language model (VLM) for the chest X-ray domain. Our model is initially pre-trained on various multimodal datasets within the general domain before being transferred to the chest X-ray domain. Following a prominent VLM, we unify various domain-specific tasks into a simple sequence-to-sequence schema. This unified schema enables the model to effectively learn the required knowledge and skills from the limited resources available in the domain. Benefiting from training across multiple tasks and domains, our model demonstrates superior performance on the benchmark datasets provided by the BioNLP shared task (Delbrouck et al., 2023). With subtle techniques including ensemble and factual calibration, our system achieves first place on the RadSum23 leaderboard for the hidden test set.
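
For illustration only, the sketch below shows how radiology report summarization can be cast as plain sequence-to-sequence text generation, in the spirit of the unified schema described in the abstract. It is not the authors' CheXOFA code: the checkpoint name ("t5-small"), the prompt wording, and the decoding settings are placeholder assumptions.

# Minimal sketch: findings-to-impression summarization as a seq2seq task.
# NOTE: this is NOT the authors' CheXOFA implementation; the model name and
# prompt format below are illustrative assumptions only.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "t5-small"  # placeholder checkpoint; CheXOFA itself is an OFA-style VLM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Findings section of a chest X-ray report (toy example).
findings = (
    "The cardiomediastinal silhouette is within normal limits. "
    "No focal consolidation, pleural effusion, or pneumothorax."
)

# Frame the task as an instruction followed by the input text, so that
# summarization shares the same text-generation interface as other tasks.
prompt = "summarize the findings into an impression: " + findings
inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, num_beams=4, max_new_tokens=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))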

Original language: English
Title of host publication: BioNLP 2023 - BioNLP and BioNLP-ST, Proceedings of the Workshop
Editors: Dina Demner-Fushman, Sophia Ananiadou, Kevin Cohen
Publisher: Association for Computational Linguistics (ACL)
Pages: 567-573
Number of pages: 7
ISBN (Electronic): 9781959429852
Publication status: Published - 2023
Event: 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, BioNLP 2023 - Toronto, Canada
Duration: 2023 Jul 13 → …

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, BioNLP 2023
Country/Territory: Canada
City: Toronto
Period: 23/7/13 → …

Bibliographical note

Publisher Copyright:
© 2023 Association for Computational Linguistics.

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
