Simple weighting techniques for query expansion in biomedical document retrieval

Young In Song, Kyoung Soo Han, So Young Park, Sang Bum Kim, Hae Chang Rim

    Research output: Contribution to journal › Article › peer-review

    1 Citation (Scopus)

    Abstract

    In this paper, we propose two weighting techniques to improve the performance of query expansion in biomedical document retrieval, especially when a short biomedical term in a query is expanded with its synonymous multi-word terms. When a query contains synonymous terms of different lengths, a traditional IR model ranks documents containing the longer term more highly, because a longer term has more chances to match the query. However, such a preference is clearly inappropriate and often yields unsatisfactory results. To alleviate this biased weighting problem, we devise a method for normalizing the weights of the query terms in a long multi-word biomedical term, and a method for discriminating between terms using inverse terminology frequency, a novel statistic estimated in a query domain. Experimental results on the MEDLINE corpus show that our two simple techniques improve retrieval performance by correcting the inappropriate preference for long multi-word terms in an expanded query.
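The abstract does not give the formulas, so the following is only a rough sketch of how the two ideas might look, under stated assumptions. It assumes that length normalization splits a unit weight evenly across the words of a multi-word term, and that inverse terminology frequency is computed by analogy with IDF over a list of domain terms (with add-one smoothing). The function names, the smoothing, and the default weight for unseen words are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def length_normalized_weights(term):
    """Split a total weight of 1.0 evenly over the words of one term.

    Assumption: a long synonym should contribute the same total weight
    as a short one, so each word gets count / number_of_words.
    """
    words = term.lower().split()
    counts = Counter(words)
    return {word: count / len(words) for word, count in counts.items()}


def inverse_terminology_frequency(terminologies):
    """IDF-like statistic over a terminology list (hypothetical definition).

    A word that occurs in many terminologies (e.g. "disease") receives a
    lower score than a word that occurs in few of them.
    """
    n_terms = len(terminologies)
    containing = Counter()
    for term in terminologies:
        # Count each word once per terminology, as IDF does per document.
        containing.update(set(term.lower().split()))
    return {
        word: math.log((n_terms + 1) / (n + 1))
        for word, n in containing.items()
    }


def weight_expanded_query(synonyms, itf, default_itf=1.0):
    """Combine both adjustments for every synonym in an expanded query.

    default_itf is an arbitrary fallback for words missing from the
    terminology list; it is an assumption of this sketch.
    """
    weighted = {}
    for term in synonyms:
        base = length_normalized_weights(term)
        weighted[term] = {
            word: weight * itf.get(word, default_itf)
            for word, weight in base.items()
        }
    return weighted
```

With these assumptions, a one-word abbreviation and its three-word expansion each start from the same total weight before the terminology-frequency factor is applied, which is one way to counteract the preference for longer terms that the abstract describes.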

    Original language: English
    Pages (from-to): 1873-1876
    Number of pages: 4
    Journal: IEICE Transactions on Information and Systems
    Volume: E90-D
    Issue number: 11
    Publication status: Published - 2007 Nov

    Keywords

    • Biomedical document retrieval
    • Biomedical terminology
    • Biomedical terminology weighting
    • Query expansion

    ASJC Scopus subject areas

    • Software
    • Hardware and Architecture
    • Computer Vision and Pattern Recognition
    • Electrical and Electronic Engineering
    • Artificial Intelligence
