K-HALU: MULTIPLE ANSWER KOREAN HALLUCINATION BENCHMARK FOR LARGE LANGUAGE MODELS

  • Jaehyung Seo
  • Heuiseok Lim*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Researchers and companies have recently been developing large language models (LLMs) tailored to particular purposes and have achieved significant advancements in various natural language processing tasks. However, LLMs are still prone to generating hallucinations: outputs that are unfaithful to or inconsistent with the given input. As a result, the need for datasets that evaluate and demonstrate the hallucination detection capabilities of LLMs is increasingly recognized. Nonetheless, the Korean NLP community lacks publicly available benchmark datasets for assessing the faithfulness of knowledge-based information. Furthermore, the few existing datasets that evaluate hallucination restrict access to the full dataset, preventing detailed analysis beyond simple scoring, and are based on knowledge translated from English. To address these challenges, we introduce K-HALU, a Korean benchmark designed to evaluate LLMs' hallucination detection in Korean. The benchmark covers seven domains and assesses the faithfulness of statements against knowledge documents compiled from Korean news, magazines, and books. For stricter evaluation, 40% of the dataset consists of multiple-answer questions that require models to select all correct answers from the given options. Our empirical results show that open-source LLMs still struggle with hallucination detection over Korean knowledge, underscoring the need for a more detailed analysis of their limitations.
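The abstract does not specify the scoring rule, but the multiple-answer setup it describes (select all correct options, with no partial credit implied) is commonly evaluated with exact set match. The sketch below illustrates that assumed metric; the function names and the exact-match choice are illustrative assumptions, not the paper's published evaluation code.

```python
def exact_set_match(predicted, gold):
    """Return 1 if the predicted option set exactly equals the gold set, else 0.

    Strict scoring for multiple-answer questions: the model gets credit only
    when it selects every correct option and no incorrect ones.
    (Illustrative assumption; the paper's actual metric may differ.)
    """
    return int(set(predicted) == set(gold))


def score_benchmark(examples):
    """Average exact-set-match accuracy over (predicted, gold) pairs."""
    if not examples:
        return 0.0
    return sum(exact_set_match(p, g) for p, g in examples) / len(examples)
```

For instance, predicting {A} when the gold answers are {A, C} scores 0 under this rule, so `score_benchmark([(["A", "C"], ["C", "A"]), (["A"], ["A", "C"])])` yields 0.5.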

Original language: English
Title of host publication: 13th International Conference on Learning Representations, ICLR 2025
Publisher: International Conference on Learning Representations, ICLR
Pages: 89557-89587
Number of pages: 31
ISBN (Electronic): 9798331320850
Publication status: Published - 2025
Event: 13th International Conference on Learning Representations, ICLR 2025 - Singapore, Singapore
Duration: 2025 Apr 24 - 2025 Apr 28

Publication series

Name: 13th International Conference on Learning Representations, ICLR 2025

Conference

Conference: 13th International Conference on Learning Representations, ICLR 2025
Country/Territory: Singapore
City: Singapore
Period: 25/4/24 - 25/4/28

Bibliographical note

Publisher Copyright:
© 2025 13th International Conference on Learning Representations, ICLR 2025. All rights reserved.

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science Applications
  • Education
  • Linguistics and Language
