Identifying idiomatic expressions using phrase alignments in bilingual parallel corpus

Hyoung Gyu Lee, Min Jeong Kim, Gumwon Hong, Sang Bum Kim, Young Sook Hwang, Hae Chang Rim

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Previous efforts to identify idiomatic expressions using a bilingual parallel corpus have focused on using word alignments to capture the sense of individual words. In this paper, we propose a method that uses phrase alignments rather than word alignments in a parallel corpus to recognize the sense of phrases as well as words. Our proposed scoring functions are based on the difference in translation tendency between a phrase and its individual words. They help identify idiomatic expressions through an entropy variation and a translation difference between a phrase and its individual words. Experimental results show that our proposed method is more effective than previous approaches for the identification of idiomatic expressions. In addition, we proved that linguistic constraints can be integrated into our method to improve the performance.

Original language: English
Title of host publication: PRICAI 2010
Subtitle of host publication: Trends in Artificial Intelligence - 11th Pacific Rim International Conference on Artificial Intelligence, Proceedings
Pages: 123-133
Number of pages: 11
Publication status: Published - 2010
Event: 11th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2010 - Daegu, Korea, Republic of
Duration: 2010 Aug 30 - 2010 Sept 2

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6230 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 11th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2010
Country/Territory: Korea, Republic of
City: Daegu
Period: 10/8/30 - 10/9/2

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
