Pre-trained Language Model for Biomedical Question Answering

Wonjin Yoon, Jinhyuk Lee, Donghyeon Kim, Minbyul Jeong, Jaewoo Kang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

40 Citations (Scopus)

Abstract

The recent success of question answering systems is largely attributed to pre-trained language models. However, as language models are mostly pre-trained on general domain corpora such as Wikipedia, they often have difficulty in understanding biomedical questions. In this paper, we investigate the performance of BioBERT, a pre-trained biomedical language model, in answering biomedical questions including factoid, list, and yes/no type questions. BioBERT uses almost the same structure across various question types and achieved the best performance in the 7th BioASQ Challenge (Task 7b, Phase B). BioBERT pre-trained on SQuAD or SQuAD 2.0 easily outperformed previous state-of-the-art models. BioBERT obtains the best performance when it uses the appropriate pre-/post-processing strategies for questions, passages, and answers.
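The abstract mentions post-processing strategies for answers. For factoid (extractive) questions with a BERT-family model such as BioBERT, the usual post-processing step is decoding the best answer span from the model's start/end logits. The sketch below shows that standard SQuAD-style decoding on toy logits; no model is loaded, and the function name and the `max_answer_len` parameter are illustrative assumptions, not details taken from the paper.

```python
# SQuAD-style answer-span decoding, sketched on plain lists of logits.
# Assumption: a fine-tuned QA model has already produced one start logit
# and one end logit per passage token.

def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end, score) of the highest-scoring valid span.

    A span is valid if start <= end and it covers at most
    max_answer_len tokens; its score is start_logit + end_logit.
    """
    best = (0, 0, float("-inf"))
    for s, s_logit in enumerate(start_logits):
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = s_logit + end_logits[e]
            if score > best[2]:
                best = (s, e, score)
    return best

# Toy logits: the model is most confident the answer starts at token 2
# and ends at token 4.
start = [0.1, 0.2, 5.0, 0.3, 0.1]
end = [0.1, 0.2, 0.3, 1.0, 4.0]
print(best_span(start, end))  # -> (2, 4, 9.0)
```

List-type questions are typically handled by keeping the top-k spans instead of only the best one, and yes/no questions by a separate classification head; both variations reuse the same logit outputs.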

Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases - International Workshops of ECML PKDD 2019, Proceedings
Editors: Peggy Cellier, Kurt Driessens
Publisher: Springer
Pages: 727-740
Number of pages: 14
ISBN (Print): 9783030438869
Publication status: Published - 2020
Event: 19th Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2019 - Würzburg, Germany
Duration: 16 Sept 2019 – 20 Sept 2019

Publication series

Name: Communications in Computer and Information Science
Volume: 1168 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 19th Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2019
Country/Territory: Germany
City: Würzburg
Period: 19/9/16 – 19/9/20

Bibliographical note

Publisher Copyright:
© 2020, Springer Nature Switzerland AG.

Keywords

  • Biomedical question answering
  • Pre-trained language model
  • Transfer learning

ASJC Scopus subject areas

  • General Computer Science
  • General Mathematics
