Bridging lexical gaps between queries and questions on large online Q&A collections with compact translation models

Jung Tae Lee, Sang Bum Kim, Young In Song, Hae Chang Rim

    Research output: Contribution to conference > Paper > peer-review

    51 Citations (Scopus)

    Abstract

    Lexical gaps between queries and questions (documents) have been a major issue in question retrieval on large online question and answer (Q&A) collections. Previous studies address the issue by implicitly expanding queries with the help of translation models pre-constructed using statistical techniques. However, since it is possible for unimportant words (e.g., non-topical words, common words) to be included in the translation models, a lack of noise control on the models can cause degradation of retrieval performance. This paper investigates a number of empirical methods for eliminating unimportant words in order to construct compact translation models for retrieval purposes. Experiments conducted on a real world Q&A collection show that substantial improvements in retrieval performance can be achieved by using compact translation models.

    Original language: English
    Pages: 410-418
    Number of pages: 9
    Publication status: Published - 2008
    Event: 2008 Conference on Empirical Methods in Natural Language Processing, EMNLP 2008, Co-located with AMTA 2008 and the International Workshop on Spoken Language Translation - Honolulu, HI, United States
    Duration: 2008 Oct 25 → 2008 Oct 27

    ASJC Scopus subject areas

    • Computational Theory and Mathematics
    • Computer Science Applications
    • Information Systems
