Boosting Prompt-Based Self-Training With Mapping-Free Automatic Verbalizer for Multi-Class Classification

Yookyung Kho, Jaehee Kim, Pilsung Kang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Recently, prompt-based fine-tuning has garnered considerable interest as a core technique for few-shot text classification task. This approach reformulates the fine-tuning objective to align with the Masked Language Modeling (MLM) objective. Leveraging unlabeled data, prompt-based self-training has shown greater effectiveness in binary and three-class classification. However, prompt-based self-training for multi-class classification has not been adequately investigated, despite its significant applicability to real-world scenarios. Moreover, extending current methods to multi-class classification suffers from the verbalizer that extracts the predicted value of manually pre-defined single label word for each class from MLM predictions. Consequently, we introduce a novel, efficient verbalizer structure, named Mapping-free Automatic Verbalizer (MAV). Comprising two fully connected layers, MAV serves as a trainable verbalizer that automatically extracts the requisite word features for classification by capitalizing on all available information from MLM predictions. Experimental results on five multi-class classification datasets indicate MAV's superior self-training efficacy.

Original languageEnglish
Title of host publicationFindings of the Association for Computational Linguistics
Subtitle of host publicationEMNLP 2023
PublisherAssociation for Computational Linguistics (ACL)
Pages13786-13800
Number of pages15
ISBN (Electronic)9798891760615
Publication statusPublished - 2023
Event2023 Findings of the Association for Computational Linguistics: EMNLP 2023 - Singapore, Singapore
Duration: 2023 Dec 62023 Dec 10

Publication series

NameFindings of the Association for Computational Linguistics: EMNLP 2023

Conference

Conference2023 Findings of the Association for Computational Linguistics: EMNLP 2023
Country/TerritorySingapore
CitySingapore
Period23/12/623/12/10

Bibliographical note

Publisher Copyright:
© 2023 Association for Computational Linguistics.

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems
  • Language and Linguistics
  • Linguistics and Language

Fingerprint

Dive into the research topics of 'Boosting Prompt-Based Self-Training With Mapping-Free Automatic Verbalizer for Multi-Class Classification'. Together they form a unique fingerprint.

Cite this