A heuristic method for selecting support features from large datasets

Hong Seo Ryoo, In Yong Jang

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    4 Citations (Scopus)

    Abstract

    For feature selection in machine learning, set covering (SC) is most suited, because it selects support features for the data under analysis based on both the individual and the collective roles of the candidate features. However, SC-based feature selection requires complete pairwise comparisons of the members of the different classes in a dataset, and this renders the meritorious SC principle impracticable for selecting support features from large datasets. Introducing the notion of implicit SC-based feature selection, this paper presents a feature selection procedure that is equivalent to the standard SC-based feature selection procedure in supervised learning but with a memory requirement that is multiple orders of magnitude smaller. With experiments on six large machine learning datasets, we demonstrate the usefulness of the proposed implicit SC-based feature selection scheme in large-scale supervised data analysis.
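To make the memory issue in the abstract concrete, the following is a minimal sketch of the standard (explicit) SC formulation, not the paper's implicit procedure. It assumes binary feature vectors split into two classes and uses a plain greedy covering heuristic; the function name `greedy_sc_feature_selection` and the toy data are illustrative assumptions, not taken from the paper.

```python
from itertools import product


def greedy_sc_feature_selection(pos, neg):
    """Select features so that every (positive, negative) pair of
    observations differs on at least one selected feature.

    Illustrative sketch only: explicit pairwise rows plus a greedy
    heuristic, assuming equal-length binary tuples in `pos` and `neg`.
    """
    n_features = len(pos[0])

    # Explicit SC rows: one per cross-class pair, holding the set of
    # features on which the two observations differ. There are
    # len(pos) * len(neg) rows, which is the memory bottleneck.
    rows = [
        {j for j in range(n_features) if p[j] != q[j]}
        for p, q in product(pos, neg)
    ]
    if any(not row for row in rows):
        raise ValueError("identical observations appear in both classes")

    uncovered = list(range(len(rows)))
    selected = []
    while uncovered:
        # Greedy step: pick the feature separating the most uncovered pairs.
        best = max(
            range(n_features),
            key=lambda j: sum(j in rows[i] for i in uncovered),
        )
        selected.append(best)
        uncovered = [i for i in uncovered if best not in rows[i]]
    return selected


# Toy data (hypothetical): feature 0 alone separates the two classes.
positives = [(1, 0, 1), (1, 1, 0)]
negatives = [(0, 0, 1), (0, 1, 0)]
print(greedy_sc_feature_selection(positives, negatives))
```

In this sketch the `rows` list grows with the product of the two class sizes, so a dataset with 10^5 observations per class would need on the order of 10^10 rows. That quadratic growth is the cost the abstract says the implicit formulation avoids.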

    Original language: English
    Title of host publication: Algorithmic Aspects in Information and Management - Third International Conference, AAIM 2007, Proceedings
    Publisher: Springer Verlag
    Pages: 411-423
    Number of pages: 13
    ISBN (Print): 9783540728689
    Publication status: Published - 2007
    Event: 3rd International Conference on Algorithmic Aspects in Information and Management, AAIM 2007 - Portland, OR, United States
    Duration: 2007 Jun 6 to 2007 Jun 8

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 4508 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Other

    Other: 3rd International Conference on Algorithmic Aspects in Information and Management, AAIM 2007
    Country/Territory: United States
    City: Portland, OR
    Period: 07/6/6 to 07/6/8

    Keywords

    • Combinatorial optimization
    • Feature selection
    • Large datasets
    • Supervised learning

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • General Computer Science
