Biomedical named entity recognition using two-phase model based on SVMs

Ki Joong Lee, Young Sook Hwang, Seonho Kim, Hae Chang Rim

    Research output: Contribution to journal › Article › peer-review

    99 Citations (Scopus)

    Abstract

    Named entity (NE) recognition has become one of the most fundamental tasks in biomedical knowledge acquisition. In this paper, we present a two-phase named entity recognizer based on SVMs, which consists of a boundary identification phase and a semantic classification phase of named entities. When adapting SVMs to named entity recognition, the multi-class problem and the unbalanced class distribution problem become very serious in terms of training cost and performance. We try to solve these problems by separating the NE recognition task into two subtasks, where we use appropriate SVM classifiers and relevant features for each subtask. In addition, by employing a hierarchical classification method based on ontology, we effectively solve the multi-class problem concerning semantic classification. The experimental results on the GENIA corpus show that the proposed method is effective not only in reducing computational cost but also in improving performance. The F-score (β = 1) for the boundary identification is 74.8 and the F-score for the semantic classification is 66.7.
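The two-phase decomposition described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the B/I/O tag scheme, the function names (`spans_from_bio`, `recognize`), and the toy lexicon-based stand-ins are all assumptions made for the example. In the paper both phases are SVM classifiers, and the second phase uses an ontology-based class hierarchy that is not modeled here.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # half-open token range [start, end)


def spans_from_bio(tags: List[str]) -> List[Span]:
    """Collect entity spans from a sequence of B/I/O boundary tags."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            # A new entity begins; close any entity that is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def recognize(
    tokens: List[str],
    boundary_tagger: Callable[[List[str]], List[str]],
    semantic_classifier: Callable[[List[str]], str],
) -> List[Tuple[int, int, str]]:
    """Phase 1 finds entity boundaries; phase 2 labels each found span."""
    tags = boundary_tagger(tokens)
    return [
        (start, end, semantic_classifier(tokens[start:end]))
        for start, end in spans_from_bio(tags)
    ]


# Hypothetical stand-ins for the two trained classifiers (assumptions).
ENTITY_WORDS = {"IL-2", "gene", "NF-kappa", "B"}


def toy_boundary_tagger(tokens: List[str]) -> List[str]:
    tags, prev_inside = [], False
    for tok in tokens:
        inside = tok in ENTITY_WORDS
        tags.append("O" if not inside else ("I" if prev_inside else "B"))
        prev_inside = inside
    return tags


def toy_semantic_classifier(span_tokens: List[str]) -> str:
    return "DNA" if span_tokens[-1] == "gene" else "protein"


tokens = ["IL-2", "gene", "expression", "requires", "NF-kappa", "B"]
entities = recognize(tokens, toy_boundary_tagger, toy_semantic_classifier)
```

Separating the phases this way means the boundary tagger only has to distinguish a small number of tags over all tokens, while the semantic classifier sees only the identified spans, which is the intuition behind the reduced training cost the abstract reports.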

    Original language: English
    Pages (from-to): 436-447
    Number of pages: 12
    Journal: Journal of Biomedical Informatics
    Volume: 37
    Issue number: 6
    Publication status: Published - 2004 Dec

    Keywords

    • Bioinformatics
    • Hierarchical multi-class SVM
    • Named entity recognition
    • SVM
    • Two-phase model
    • Unbalanced class distribution

    ASJC Scopus subject areas

    • Computer Science Applications
    • Health Informatics
