Abstract
This work addresses kernel-based learning algorithms for sequence data. We present a probabilistic approach that automatically extracts, from the output of such string-kernel-based learning algorithms, the subsequences (motifs) that truly underlie the machine's predictions. The proposed framework treats motifs as free parameters of a probabilistic model, which is solved by global optimization. In contrast to prevalent approaches, the proposed method can discover even difficult, long motifs, and it could be combined with any kernel-based learning algorithm that is based on an adequate sequence kernel. We show that, when used with a discriminative kernel machine such as a support vector machine, the approach can reveal the discriminative motifs underlying the kernel predictor. We demonstrate the efficacy of our approach in a series of experiments on synthetic and real data, including handwritten digit recognition and a large-scale human splice site data set from computational biology.
Original language | English |
---|---|
Title of host publication | Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2015 |
Editors | Vitor Santos Costa, Carlos Soares, Annalisa Appice, Pedro Pereira Rodrigues, João Gama, Alípio Jorge |
Publisher | Springer Verlag |
Pages | 137-153 |
Number of pages | 17 |
ISBN (Print) | 9783319235240 |
Publication status | Published - 2015 |
Event | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2015 - Porto, Portugal. Duration: 2015 Sept 7 → 2015 Sept 11 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 9285 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Other
Other | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2015 |
---|---|
Country/Territory | Portugal |
City | Porto |
Period | 2015 Sept 7 → 2015 Sept 11 |
Bibliographical note
Publisher Copyright: © Springer International Publishing Switzerland 2015.
ASJC Scopus subject areas
- Theoretical Computer Science
- Computer Science(all)