Classification of multiple cancer types by multicategory support vector machines using gene expression data

Yoonkyung Lee, Cheol Koo Lee

Research output: Contribution to journal › Article › peer-review

255 Citations (Scopus)


Motivation: High-density DNA microarrays measure the activities of several thousand genes simultaneously, and gene expression profiles have recently been used for cancer classification. This new approach promises to give better therapeutic measurements to cancer patients by diagnosing cancer types with improved accuracy. The Support Vector Machine (SVM) is one of the classification methods successfully applied to cancer diagnosis problems. However, its optimal extension to more than two classes was not obvious, which might limit its application to multiple tumor types. We briefly introduce the Multicategory SVM (MSVM), a recently proposed extension of the binary SVM, and apply it to multiclass cancer diagnosis problems.

Results: The applicability of the MSVM is demonstrated on the leukemia data (Golub et al., 1999) and the small round blue cell tumors of childhood data (Khan et al., 2001). The comparable classification accuracy shown in the applications, together with its flexibility, renders the MSVM a viable alternative to other classification methods.
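To make the multicategory extension mentioned in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the class coding and hinge-type loss usually given for the MSVM: class j is coded as a k-vector with 1 in position j and -1/(k-1) elsewhere, and only the decision values of the wrong classes are penalized. The function names (`msvm_class_codes`, `msvm_hinge_loss`, `predict_class`) are hypothetical, and the regularization term and the optimization solver are omitted.

```python
import numpy as np


def msvm_class_codes(labels, k):
    """Code class j as a k-vector: 1 in position j, -1/(k-1) elsewhere.

    Each code vector sums to zero (assumed MSVM coding).
    """
    labels = np.asarray(labels)
    codes = np.full((labels.size, k), -1.0 / (k - 1))
    codes[np.arange(labels.size), labels] = 1.0
    return codes


def msvm_hinge_loss(scores, labels):
    """Mean over samples of sum_{j != y_i} max(0, f_j(x_i) + 1/(k-1)).

    `scores` is an (n, k) array of decision values f_j(x_i);
    `labels` holds integer classes in 0..k-1.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    n, k = scores.shape
    penalties = np.maximum(0.0, scores + 1.0 / (k - 1))
    penalties[np.arange(n), labels] = 0.0  # the true class is not penalized
    return penalties.sum() / n


def predict_class(scores):
    """Assign each sample to the class with the largest decision value."""
    return np.argmax(np.asarray(scores), axis=1)


# Toy usage with k = 3 classes (e.g. three tumor types); values are illustrative only.
labels = np.array([0, 1, 2])
codes = msvm_class_codes(labels, k=3)
perfect_loss = msvm_hinge_loss(codes, labels)          # scores equal to the class codes
zero_loss = msvm_hinge_loss(np.zeros((2, 3)), [0, 1])  # uninformative scores
```

Under this coding, decision values that equal the class codes incur no loss, while uninformative all-zero scores are penalized once for each wrong class.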

Original language: English
Pages (from-to): 1132-1139
Number of pages: 8
Issue number: 9
Publication status: Published - 2003 Jun 12
Externally published: Yes

Bibliographical note

Funding Information:
Y. L. would like to thank Grace Wahba and Yi Lin for their helpful suggestions and discussions, and Michael Ferris for his comments and help with computation. The anonymous referees provided many helpful suggestions. This research was partly supported by NSF Grant DMS0072292 and NIH Grant EY09946.

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

