Improving cancer classification accuracy using gene pairs

Pankaj Chopra, Jinseung Lee, Jaewoo Kang, Sunwon Lee

Research output: Contribution to journal › Article › peer-review

44 Citations (Scopus)


Recent studies suggest that the deregulation of pathways, rather than of individual genes, may be critical in triggering carcinogenesis. Pathway deregulation is often caused by the simultaneous deregulation of more than one gene in the pathway. This suggests that robust gene pair combinations may exploit the underlying biomolecular reactions relevant to pathway deregulation and could therefore provide better cancer biomarkers than individual genes. To validate this hypothesis, we used gene pair combinations, called doublets, as input to cancer classification algorithms in place of the original expression values, and showed that classification accuracy improved consistently across datasets and classification algorithms. We validated the proposed approach using nine cancer datasets and five classification algorithms: Prediction Analysis for Microarrays (PAM), C4.5 Decision Trees (DT), Naive Bayesian (NB), Support Vector Machine (SVM), and k-Nearest Neighbor (k-NN).
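The idea of feeding gene pairs rather than single expression values to a classifier can be illustrated with a minimal sketch. The abstract does not say how a doublet is computed, so the sum and difference used below are assumptions chosen only for illustration. The function names, the toy expression values, and the class labels are likewise hypothetical, and the 1-nearest-neighbour rule stands in for the k-NN classifier named above.

```python
from itertools import combinations
import math


def make_doublets(samples):
    """Expand each sample (a list of gene expression values) into pair features.

    Assumption: every gene pair (i, j) contributes its sum and its difference.
    The abstract does not specify the combination, so this is illustrative only.
    """
    n_genes = len(samples[0])
    pairs = list(combinations(range(n_genes), 2))
    features = []
    for sample in samples:
        row = []
        for i, j in pairs:
            row.append(sample[i] + sample[j])  # hypothetical "sum" doublet
            row.append(sample[i] - sample[j])  # hypothetical "difference" doublet
        features.append(row)
    return pairs, features


def predict_1nn(train_x, train_y, query):
    """Return the label of the training sample closest to the query (Euclidean)."""
    distances = [math.dist(x, query) for x in train_x]
    return train_y[distances.index(min(distances))]


# Toy data, invented for this sketch: 4 samples x 3 genes.
train = [
    [1.0, 5.0, 2.0],
    [1.2, 4.8, 2.1],
    [5.0, 1.0, 2.0],
    [4.9, 1.1, 1.9],
]
labels = ["tumor", "tumor", "normal", "normal"]

pairs, train_doublets = make_doublets(train)
_, query_doublets = make_doublets([[1.1, 5.1, 2.0]])
prediction = predict_1nn(train_doublets, labels, query_doublets[0])
```

With three genes this yields three pairs and six features per sample. The number of pairs grows quadratically with the number of genes, so real microarray data would call for some form of gene or pair selection before classification.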

Original language: English
Article number: e14305
Journal: PLoS ONE
Issue number: 12
Publication status: Published - 2010

ASJC Scopus subject areas

  • General

