Bridging lexical gaps between queries and questions on large online Q&A collections with compact translation models

Jung Tae Lee, Sang Bum Kim, Young In Song, Hae Chang Rim

Research output: Contribution to conference, Paper, peer-review

49 Citations (Scopus)

Abstract

Lexical gaps between queries and questions (documents) have been a major issue in question retrieval on large online question and answer (Q&A) collections. Previous studies address the issue by implicitly expanding queries with the help of translation models pre-constructed using statistical techniques. However, since it is possible for unimportant words (e.g., non-topical words, common words) to be included in the translation models, a lack of noise control on the models can cause degradation of retrieval performance. This paper investigates a number of empirical methods for eliminating unimportant words in order to construct compact translation models for retrieval purposes. Experiments conducted on a real-world Q&A collection show that substantial improvements in retrieval performance can be achieved by using compact translation models.

Original language: English
Pages: 410-418
Number of pages: 9
Publication status: Published - 2008
Event: 2008 Conference on Empirical Methods in Natural Language Processing, EMNLP 2008, Co-located with AMTA 2008 and the International Workshop on Spoken Language Translation - Honolulu, HI, United States
Duration: 2008 Oct 25 - 2008 Oct 27

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems
