Abstract
Background: Since the Human Genome Project was completed in 2003, there have been numerous reports on cancer and related markers. This study aimed to develop a system that automatically extracts information on the relationship between cancers and tumor markers from the biomedical literature. Methods: Named entities of tumor markers were recognized by both a dictionary-based method and a machine learning method, the support vector machine. Named entities of cancers were recognized using the MeSH dictionary. Results: Relational and filtering keywords were selected after annotating 160 abstracts from PubMed. Relational information was extracted only when one of the relational keywords was in an appropriate position along the parse tree of a sentence containing both tumor marker and disease entities. The performance of the system was evaluated with another set of 77 abstracts. With the relational and filtering keywords used in the system, precision was 94.38% and recall was 66.14%, whereas without this expert knowledge precision was 49.16% and recall was 69.29%. Conclusions: We developed a system that can extract relational information between a tumor and its markers by incorporating expert knowledge into the system. A system that exploits expert knowledge in this way could serve as a reference for developing information extraction systems in other medical fields.
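
The keyword-gated extraction and its evaluation can be illustrated with a minimal Python sketch. It is not the system described in the abstract: the term lists and keywords are hypothetical placeholders, entity recognition is reduced to plain dictionary lookup (no support vector machine, no MeSH), and the parse-tree position check is replaced by a simple test for keyword presence in the sentence. Treating filtering keywords as terms that suppress a candidate relation is likewise an assumption.

```python
import re

# Hypothetical mini-dictionaries for illustration only; the study used a
# tumor marker dictionary plus an SVM, and the MeSH dictionary for cancers.
TUMOR_MARKERS = {"CEA", "PSA", "CA 125"}
CANCERS = {"colorectal cancer", "prostate cancer", "ovarian cancer"}

# Hypothetical keywords; the study selected its keywords from 160 annotated abstracts.
RELATIONAL_KEYWORDS = {"elevated", "associated", "marker"}
FILTERING_KEYWORDS = {"not", "no"}  # assumed here to suppress a candidate relation


def find_terms(sentence, dictionary):
    """Return the dictionary terms that occur as whole words in the sentence."""
    lowered = sentence.lower()
    return [
        term
        for term in sorted(dictionary)
        if re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered)
    ]


def extract_relations(sentence, use_keywords=True):
    """Return (marker, cancer) pairs found in one sentence.

    With use_keywords=False every co-occurring pair is returned; with
    use_keywords=True a relational keyword must be present and no
    filtering keyword may be present (a stand-in for the parse-tree check).
    """
    markers = find_terms(sentence, TUMOR_MARKERS)
    cancers = find_terms(sentence, CANCERS)
    if not markers or not cancers:
        return []
    if use_keywords:
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if not (words & RELATIONAL_KEYWORDS) or (words & FILTERING_KEYWORDS):
            return []
    return [(m, c) for m in markers for c in cancers]


def precision_recall(predicted, gold):
    """Compute precision and recall of predicted pairs against gold pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

Calling `extract_relations` with `use_keywords=False` accepts every sentence in which a marker and a cancer co-occur, while the default setting rejects sentences that lack a relational keyword or contain a filtering keyword. This is the kind of trade-off the reported figures reflect: much higher precision at a small cost in recall.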
Original language | English |
---|---|
Pages (from-to) | 79-87 |
Number of pages | 9 |
Journal | Korean Journal of Laboratory Medicine |
Volume | 28 |
Issue number | 1 |
Publication status | Published - 2008 |
Keywords
- Information extraction
- Tumor
- Tumor marker
ASJC Scopus subject areas
- Clinical Biochemistry
- Biochemistry, medical