Abstract
In this paper, we propose two weighting techniques to improve the performance of query expansion in biomedical document retrieval, especially when a short biomedical term in a query is expanded with its synonymous multi-word terms. When a query contains synonymous terms of different lengths, a traditional IR model ranks documents containing the longer term more highly, because a longer term has more chances of being matched against the query. Such a preference is inappropriate and often yields unsatisfactory results. To alleviate this weighting bias, we devise a method of normalizing the weights of the query terms within a long multi-word biomedical term, and a method of discriminating among terms using inverse terminology frequency, a novel statistic estimated in the query domain. Experimental results on the MEDLINE corpus show that these two simple techniques improve retrieval performance by correcting the undue preference for long multi-word terms in an expanded query.
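The two ideas named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption: length normalization is modeled as sharing one unit of weight equally among the words of a term, and inverse terminology frequency is modeled by analogy with inverse document frequency, counted over a list of terminologies instead of documents. The function names and the example terms are hypothetical.

```python
import math
from collections import Counter


def length_normalized_weights(term, total=1.0):
    """Share `total` weight equally among the words of one term.

    Assumption: this is an illustrative normalization, not the paper's
    formula. A one-word term keeps the full weight; each word of an
    n-word term gets total / n, so every synonym contributes the same
    total weight to the expanded query.
    """
    words = term.lower().split()
    weights = Counter()
    for word in words:
        weights[word] += total / len(words)
    return dict(weights)


def inverse_terminology_frequency(terminologies):
    """ITF(w) = log(N / n_w), modeled on IDF (an assumption).

    N is the number of terminologies in the query domain and n_w is the
    number of terminologies that contain the word w.
    """
    n = len(terminologies)
    containing = Counter()
    for term in terminologies:
        containing.update(set(term.lower().split()))
    return {word: math.log(n / count) for word, count in containing.items()}


# Hypothetical query-domain terminologies, used only for illustration.
domain = [
    "acquired immunodeficiency syndrome",
    "severe combined immunodeficiency",
    "down syndrome",
    "aids",
]

short_weights = length_normalized_weights("AIDS")
long_weights = length_normalized_weights("acquired immunodeficiency syndrome")
itf = inverse_terminology_frequency(domain)
```

Under these assumptions, the short term and its three-word synonym carry the same total weight, and a word that occurs in many terminologies (such as "syndrome" here) receives a lower inverse terminology frequency than a word that occurs in only one.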
| Original language | English |
|---|---|
| Pages (from-to) | 1873-1876 |
| Number of pages | 4 |
| Journal | IEICE Transactions on Information and Systems |
| Volume | E90-D |
| Issue number | 11 |
| Publication status | Published - 2007 Nov |
Keywords
- Biomedical document retrieval
- Biomedical terminology
- Biomedical terminology weighting
- Query expansion
ASJC Scopus subject areas
- Software
- Hardware and Architecture
- Computer Vision and Pattern Recognition
- Electrical and Electronic Engineering
- Artificial Intelligence
Title
Simple weighting techniques for query expansion in biomedical document retrieval