Improving cancer classification accuracy using gene pairs

Pankaj Chopra, Jinseung Lee, Jaewoo Kang, Sunwon Lee

    Research output: Contribution to journal › Article › peer-review

    50 Citations (Scopus)

    Abstract

    Recent studies suggest that the deregulation of pathways, rather than of individual genes, may be critical in triggering carcinogenesis. Pathway deregulation is often caused by the simultaneous deregulation of more than one gene in the pathway. This suggests that robust gene pair combinations may exploit the underlying biomolecular reactions relevant to pathway deregulation and could therefore provide better cancer biomarkers than individual genes. To test this hypothesis, we used gene pair combinations, called doublets, as input to cancer classification algorithms in place of the original expression values, and we showed that classification accuracy improved consistently across datasets and classification algorithms. We validated the approach on nine cancer datasets with five classification algorithms: Prediction Analysis for Microarrays (PAM), C4.5 Decision Trees (DT), Naive Bayesian (NB), Support Vector Machine (SVM), and k-Nearest Neighbor (k-NN).
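
The idea of feeding gene pairs to a classifier instead of raw expression values can be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: the abstract does not say how a doublet is computed, so the sketch assumes a doublet is the difference between the expression values of two genes. The function names (`make_doublets`, `leave_one_out_accuracy`) and the toy data are hypothetical, and the data is contrived so that the class signal lives in the relationship between genes rather than in any single gene.

```python
from itertools import combinations
import math


def make_doublets(samples):
    """Convert per-gene expression vectors into per-gene-pair features.

    ASSUMPTION: a doublet is modelled as expr[i] - expr[j] for each
    gene pair (i, j); the abstract does not define the combination.
    """
    n_genes = len(samples[0])
    pairs = list(combinations(range(n_genes), 2))
    features = [[s[i] - s[j] for i, j in pairs] for s in samples]
    return pairs, features


def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the closest training sample (1-NN, Euclidean)."""
    best = min(range(len(train_x)), key=lambda k: math.dist(train_x[k], query))
    return train_y[best]


def leave_one_out_accuracy(x, y):
    """Fraction of samples classified correctly when each is held out in turn."""
    hits = 0
    for k in range(len(x)):
        train_x = x[:k] + x[k + 1:]
        train_y = y[:k] + y[k + 1:]
        if nearest_neighbor_predict(train_x, train_y, x[k]) == y[k]:
            hits += 1
    return hits / len(x)


# Contrived toy data (not from the paper): rows are samples, columns are genes.
# Each sample has a different overall expression level, but gene 0 exceeds
# gene 1 in "tumor" samples and falls below it in "normal" samples.
expression = [
    [5.0, 3.0, 4.0],
    [9.0, 7.0, 8.0],
    [3.0, 5.0, 4.0],
    [7.0, 9.0, 8.0],
]
labels = ["tumor", "tumor", "normal", "normal"]

pairs, doublets = make_doublets(expression)
raw_accuracy = leave_one_out_accuracy(expression, labels)
doublet_accuracy = leave_one_out_accuracy(doublets, labels)

print("gene pairs:", pairs)
print("raw expression accuracy:", raw_accuracy)
print("doublet accuracy:", doublet_accuracy)
```

On this toy data the raw values misclassify every held-out sample, because the per-sample expression level dominates the distance, while the pairwise differences cancel that level and classify every sample correctly. That outcome is built into the example and says nothing about the size of the effect on real datasets.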

    Original language: English
    Article number: e14305
    Journal: PLoS ONE
    Volume: 5
    Issue number: 12
    Publication status: Published - 2010

    ASJC Scopus subject areas

    • General
