Improved survival analysis by learning shared genomic information from pan-cancer data

Sunkyu Kim, Keonwoo Kim, Junseok Choe, Inggeol Lee, Jaewoo Kang

    Research output: Contribution to journal › Article › peer-review

    44 Citations (Scopus)

    Abstract

    Motivation: Recent advances in deep learning have offered solutions to many biomedical tasks. However, applying deep learning to survival analysis using human cancer transcriptome data remains a challenge. Because the number of genes, which are the input variables of the survival model, is larger than the number of available cancer patient samples, deep-learning models are prone to overfitting. To address this issue, we introduce a new deep-learning architecture called VAECox, which uses transfer learning and fine-tuning.

    Results: We pre-trained a variational autoencoder on all RNA-seq data in 20 TCGA datasets and transferred the trained weights to our survival prediction model. We then fine-tuned the transferred weights while training the survival model on each dataset. Our model outperformed previous models such as Cox proportional hazards with LASSO and ridge penalties and Cox-nnet on 7 of 10 TCGA datasets in terms of C-index. These results indicate that the information transferred from the entire cancer transcriptome data helped our survival prediction model reduce overfitting and perform robustly on unseen cancer patient samples.
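
The abstract reports results in terms of C-index without defining it. As a rough illustration only, the sketch below computes Harrell's concordance index in plain Python, assuming that this is the variant meant. The function name `concordance_index` and its arguments are hypothetical and are not taken from the VAECox code; pairs with tied observed times are skipped for simplicity.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable patient pairs ranked correctly.

    times  -- observed follow-up time per patient
    events -- 1 if the event (death) was observed, 0 if censored
    risks  -- predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the earlier
        # time is an observed event rather than a censoring time.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier event has the higher risk: correct
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ordered correctly, 0.5 corresponds to random ranking, and 0.0 means every pair is reversed.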

    Original language: English
    Journal: Bioinformatics
    Volume: 36
    Publication status: Published - 2020

    Bibliographical note

    Funding Information:
    This research was supported by the National Research Foundation of Korea (NRF-2017R1A2A1A17069645, NRF-2016M3A9A7916996 and NRF-2014M3C9A3063541).

    Publisher Copyright:
    © The Author(s) 2020. Published by Oxford University Press.

    ASJC Scopus subject areas

    • Statistics and Probability
    • Biochemistry
    • Molecular Biology
    • Computer Science Applications
    • Computational Theory and Mathematics
    • Computational Mathematics
