Most electronic medical records, such as free-text radiological reports, are unstructured; however, methodological approaches for analyzing these accumulating unstructured records remain limited. This article proposes a deep-transfer-learning-based natural language processing model that analyzes serial magnetic resonance imaging reports of rectal cancer patients and predicts their overall survival. To evaluate the model, a retrospective cohort study of 4,338 rectal cancer patients was conducted. The experimental results showed that the proposed model, which uses pre-trained clinical linguistic knowledge, could predict the overall survival of patients without any structured information and outperformed carcinoembryonic antigen in predicting survival. The deep-transfer-learning model using free-text radiological reports can thus predict the survival of patients with rectal cancer, thereby increasing the utility of unstructured medical big data.
Bibliographical note
Funding Information:
This study was supported by a Severance Hospital Research fund for Clinical excellence (SHRC) (C-2020-0030), the Big Data Center at the National Cancer Center of Korea (2020-datawe08), the Ministry of Health & Welfare, Republic of Korea (grant number: HR20C0021(3)), and Korea University Grant.
Copyright © 2021 Kim, Lee, Choi, Baek, Choi, Lim, Kang and Shin.
Keywords
- deep learning
- natural language processing (NLP)
- rectal cancer
- survival prediction
ASJC Scopus subject areas
- Cancer Research