Can Machines Learn to Comprehend Scientific Literature?

Donghyeon Park, Yonghwa Choi, Daehan Kim, Minhwan Yu, Seongsoon Kim, Jaewoo Kang

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)


To measure the ability of a machine to understand professional-level scientific articles, we construct a scientific question answering task called PaperQA. The PaperQA task is based on more than 80 000 'fill-in-the-blank' type questions on articles from reputable scientific journals such as Nature and Science. We perform fine-grained linguistic analysis and evaluation to compare PaperQA with other conventional question answering (QA) tasks on general literature (e.g., books, news articles, and Wikipedia texts). The results indicate that the PaperQA task is the most difficult QA task for both humans (lay people) and machines (deep-learning models). Moreover, although humans generally outperform machines in conventional QA tasks, we found that advanced deep-learning models outperform humans by 3%-13% on average in the PaperQA task. The PaperQA dataset used in this paper is publicly available.
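The abstract describes building the dataset from 'fill-in-the-blank' (cloze-style) questions. As a rough illustration only, and not the authors' actual pipeline, a cloze item can be formed by masking a key term in a source sentence; the function name `make_cloze` and the example sentence below are hypothetical:

```python
def make_cloze(sentence, answer, blank="_____"):
    """Turn a sentence into a fill-in-the-blank question.

    Masks the first occurrence of the answer span with a blank,
    yielding a question/answer pair (hypothetical sketch, not the
    paper's dataset-construction method).
    """
    if answer not in sentence:
        raise ValueError("answer must appear in the sentence")
    question = sentence.replace(answer, blank, 1)
    return {"question": question, "answer": answer}

# Example usage with a made-up sentence:
item = make_cloze(
    "CRISPR-Cas9 enables precise editing of the genome.",
    "CRISPR-Cas9",
)
print(item["question"])  # _____ enables precise editing of the genome.
```

A real pipeline at this scale would select answer spans automatically (e.g., named entities or salient terms), which is what makes collecting 80 000+ questions feasible without manual authoring.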

Original language: English
Article number: 8606080
Pages (from-to): 16246-16256
Number of pages: 11
Journal: IEEE Access
Publication status: Published - 2019

Bibliographical note

Funding Information:
This work was supported by the National Research Foundation of Korea (NRF) under Grant 2016M3A9A7916996 and Grant 2017R1A2A1A17069645.

Publisher Copyright:
© 2018 IEEE.


Keywords

  • Artificial intelligence
  • crowdsourcing
  • data acquisition
  • data analysis
  • data collection
  • data mining
  • data preprocessing
  • knowledge discovery
  • machine intelligence
  • natural language processing
  • social computing
  • text analysis
  • text mining

ASJC Scopus subject areas

  • General Computer Science
  • General Materials Science
  • General Engineering


