Abstract
Array comparative genomic hybridization (aCGH) is a newly introduced method for detecting copy number abnormalities associated with human diseases, with a special focus on cancer. Specific patterns in DNA copy number variations (CNVs) can be associated with certain disease types and can facilitate prognosis and monitoring of disease progression. Machine learning techniques have been used to model tissue typing as a classification problem. Feature selection is an important part of the classification process, because many biological features are unrelated to the disease and confound the classification task. Multiple feature selection methods have been proposed in the different domains where classification has been applied. In this work, we present a new feature selection method based on structured sparsity-inducing norms to identify the informative aCGH biomarkers that help classify different disease subtypes. To validate the performance of the proposed method, we experimentally compare it with existing feature selection methods on four publicly available aCGH data sets. Across all experiments, the proposed sparse learning based feature selection method consistently outperforms other related approaches. More importantly, we carefully examine the aCGH biomarkers selected by our method, and biological evidence in the literature strongly supports our results.
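To make the idea of feature selection with a sparsity-inducing norm concrete, the following is a minimal sketch, not the method of the paper. The abstract does not specify which structured norm is used, so the sketch assumes the widely used ℓ2,1 norm on a multi-class regression weight matrix, solved with a simple iteratively reweighted scheme. The function name `l21_feature_scores`, the parameter `gamma`, and the synthetic data are all hypothetical and stand in for real aCGH probes and disease subtypes.

```python
import numpy as np

def l21_feature_scores(X, Y, gamma=1.0, n_iter=50, eps=1e-8):
    """Approximately minimize ||XW - Y||_F^2 + gamma * ||W||_{2,1}.

    ASSUMPTION: the l2,1 norm is used here only as one example of a
    structured sparsity-inducing norm; the source does not name its norm.
    X: (n_samples, n_features), Y: (n_samples, n_classes).
    Returns one score per feature (the l2 norm of its row in W).
    """
    d = np.ones(X.shape[1])            # diagonal reweighting terms
    XtX = X.T @ X
    XtY = X.T @ Y
    for _ in range(n_iter):
        # Closed-form update of W for the current reweighting.
        W = np.linalg.solve(XtX + gamma * np.diag(d), XtY)
        row_norms = np.sqrt((W ** 2).sum(axis=1))
        # Rows with small norms get a larger penalty in the next pass.
        d = 1.0 / (2.0 * np.maximum(row_norms, eps))
    return row_norms

# Synthetic stand-in data (hypothetical, not aCGH measurements):
# 120 samples, 30 features, 3 classes; features 0, 1, 2 are informative.
rng = np.random.default_rng(0)
n_samples, n_features, n_classes = 120, 30, 3
labels = np.arange(n_samples) % n_classes
X = rng.normal(size=(n_samples, n_features))
for k in range(n_classes):
    X[:, k] += 3.0 * (labels == k)     # shift feature k for class k
Y = np.eye(n_classes)[labels]          # one-hot class indicators

# Center both matrices so that no intercept term is needed.
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)

scores = l21_feature_scores(Xc, Yc, gamma=1.0)
top = np.argsort(scores)[::-1][:n_classes]   # indices of top-ranked features
print(sorted(top.tolist()))
```

Because the penalty acts on whole rows of the weight matrix, a feature is kept or suppressed jointly across all classes, which is what makes the row norms usable as a single per-feature ranking.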
Original language | English |
---|---|
Article number | 6654128 |
Pages (from-to) | 168-181 |
Number of pages | 14 |
Journal | IEEE/ACM Transactions on Computational Biology and Bioinformatics |
Volume | 11 |
Issue number | 1 |
Publication status | Published - 2014 |
Externally published | Yes |
Keywords
- DNA copy number variations
- Feature evaluation and selection
- aCGH
- biomarker detection
- cancer classification
ASJC Scopus subject areas
- Biotechnology
- Genetics
- Applied Mathematics