TY - GEN
T1 - Acquiring Korean lexical entry from a raw corpus
AU - Yu, Wonhee
AU - Park, Kinam
AU - Jung, Soonyoung
AU - Lim, Heuiseok
PY - 2010
Y1 - 2010
N2 - This paper proposes a computational lexical entry acquisition model based on a representation model of the mental lexicon. The proposed model acquires lexical entries from a raw corpus by unsupervised learning, as humans do. The model is composed of full-form and morpheme acquisition modules. In the full-form acquisition module, core full-forms are automatically acquired according to frequency and recency thresholds. In the morpheme acquisition module, a substring that occurs repeatedly in different full-forms is chosen as a candidate morpheme. The candidate is then confirmed as a morpheme using the entropy measure of syllables in the string. Experimental results on a Korean corpus of about 16 million full-forms show that the model successfully acquires major full-forms and morphemes with precision of 100% and 99.04%, respectively.
AB - This paper proposes a computational lexical entry acquisition model based on a representation model of the mental lexicon. The proposed model acquires lexical entries from a raw corpus by unsupervised learning, as humans do. The model is composed of full-form and morpheme acquisition modules. In the full-form acquisition module, core full-forms are automatically acquired according to frequency and recency thresholds. In the morpheme acquisition module, a substring that occurs repeatedly in different full-forms is chosen as a candidate morpheme. The candidate is then confirmed as a morpheme using the entropy measure of syllables in the string. Experimental results on a Korean corpus of about 16 million full-forms show that the model successfully acquires major full-forms and morphemes with precision of 100% and 99.04%, respectively.
KW - Language learning
KW - Lexical acquisition
KW - Machine readable dictionary
KW - Mental lexicon
UR - http://www.scopus.com/inward/record.url?scp=78049497049&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78049497049&partnerID=8YFLogxK
U2 - 10.1109/ITCS.2010.5581289
DO - 10.1109/ITCS.2010.5581289
M3 - Conference contribution
AN - SCOPUS:78049497049
SN - 9781424475858
T3 - 2010 2nd International Conference on Information Technology Convergence and Services, ITCS 2010
BT - 2010 2nd International Conference on Information Technology Convergence and Services, ITCS 2010
T2 - 2010 2nd International Conference on Information Technology Convergence and Services, ITCS 2010
Y2 - 11 August 2010 through 13 August 2010
ER -