Comparative analysis of term distributions in a sentence and in a document for sentence retrieval

Kyoung Soo Han, Hae Chang Rim

Research output: Chapter in Book/Report/Conference proceeding > Chapter

Abstract

Most previous work on finding relevant sentences has applied document retrieval models to sentence retrieval, but the resulting performance has been very poor. This paper analyzes the reason for this poor performance by comparing term statistics in documents with those in sentences. The analysis shows that the distributions of within-document and within-sentence term frequency are not similar, whereas the distribution of document frequency is similar to that of sentence frequency. Given this discrepancy in term statistics, document retrieval models should not be applied to sentence retrieval as they stand.
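The kind of comparison the abstract describes can be sketched roughly as follows. This is only an illustrative sketch, not the authors' procedure: the tokenizer, the sentence splitter, the helper names (`tokenize`, `split_sentences`, `tf_histogram`, `unit_frequency`) and the two-document toy corpus are all assumptions made up for this example, and the numbers it produces say nothing about the paper's data.

```python
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase alphanumeric runs.
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tf_histogram(units):
    # Share of (term, unit) pairs at each within-unit term frequency.
    hist = Counter()
    for unit in units:
        hist.update(Counter(tokenize(unit)).values())
    total = sum(hist.values())
    return {tf: n / total for tf, n in sorted(hist.items())}


def unit_frequency(units):
    # Number of units (documents or sentences) that contain each term.
    freq = Counter()
    for unit in units:
        freq.update(set(tokenize(unit)))
    return freq


# Toy corpus, invented purely for illustration.
docs = [
    "Sentence retrieval finds relevant sentences. "
    "Retrieval models rank sentences by terms.",
    "Document retrieval ranks documents. "
    "Terms repeat often in long documents.",
]
sentences = [s for d in docs for s in split_sentences(d)]

doc_tf = tf_histogram(docs)        # within-document term frequency
sent_tf = tf_histogram(sentences)  # within-sentence term frequency
df = unit_frequency(docs)          # document frequency
sf = unit_frequency(sentences)     # sentence frequency

print("within-document tf distribution:", doc_tf)
print("within-sentence tf distribution:", sent_tf)
print("df / sf for 'retrieval':", df["retrieval"], sf["retrieval"])
```

On a real collection, the two term frequency histograms and the two frequency tables could then be compared directly, for example by plotting them side by side.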

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Editors: Alexander Gelbukh
Publisher: Springer Verlag
Pages: 484-487
Number of pages: 4
ISBN (Print): 3540210067, 9783540210061
Publication status: Published - 2004

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2945
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
