Boosting Prompt-Based Self-Training With Mapping-Free Automatic Verbalizer for Multi-Class Classification

Yookyung Kho, Jaehee Kim, Pilsung Kang

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    1 Citation (Scopus)

    Abstract

    Recently, prompt-based fine-tuning has garnered considerable interest as a core technique for few-shot text classification tasks. This approach reformulates the fine-tuning objective to align with the Masked Language Modeling (MLM) objective. By leveraging unlabeled data, prompt-based self-training has shown greater effectiveness in binary and three-class classification. However, prompt-based self-training for multi-class classification has not been adequately investigated, despite its significant applicability to real-world scenarios. Moreover, extending current methods to multi-class classification suffers from the verbalizer, which extracts the predicted value of a single, manually pre-defined label word for each class from the MLM predictions. Consequently, we introduce a novel, efficient verbalizer structure named Mapping-free Automatic Verbalizer (MAV). Comprising two fully connected layers, MAV serves as a trainable verbalizer that automatically extracts the word features required for classification by capitalizing on all available information in the MLM predictions. Experimental results on five multi-class classification datasets demonstrate MAV's superior self-training efficacy.
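    The abstract describes MAV as two fully connected layers that map the full MLM prediction vector at the mask position to class logits, with no hand-crafted label-word mapping. A minimal sketch of that idea follows; the layer sizes, ReLU activation, and random weights are illustrative assumptions, not details taken from the paper:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Assumed sizes: BERT-style vocabulary, a small bottleneck, five classes.
    vocab_size, hidden_dim, num_classes = 30522, 256, 5

    # Two trainable fully connected layers standing in for the verbalizer.
    W1 = rng.standard_normal((vocab_size, hidden_dim)) * 0.01
    b1 = np.zeros(hidden_dim)
    W2 = rng.standard_normal((hidden_dim, num_classes)) * 0.01
    b2 = np.zeros(num_classes)

    def mav(mlm_logits: np.ndarray) -> np.ndarray:
        """Map MLM logits over the whole vocabulary to class logits,
        using every vocabulary dimension rather than pre-chosen label words."""
        h = np.maximum(mlm_logits @ W1 + b1, 0.0)  # first FC layer + ReLU (assumed)
        return h @ W2 + b2                         # second FC layer -> class logits

    # Batch of 4 mask-position MLM prediction vectors.
    batch_mlm_logits = rng.standard_normal((4, vocab_size))
    class_logits = mav(batch_mlm_logits)
    print(class_logits.shape)  # (4, 5)
    ```

    In a real prompt-based setup these two layers would sit on top of the frozen or fine-tuned MLM head and be trained jointly with the classification loss, replacing the manual word-to-class mapping.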

    Original language: English
    Title of host publication: Findings of the Association for Computational Linguistics
    Subtitle of host publication: EMNLP 2023
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 13786-13800
    Number of pages: 15
    ISBN (Electronic): 9798891760615
    Publication status: Published - 2023
    Event: 2023 Findings of the Association for Computational Linguistics: EMNLP 2023 - Singapore, Singapore
    Duration: 2023 Dec 6 – 2023 Dec 10

    Publication series

    Name: Findings of the Association for Computational Linguistics: EMNLP 2023

    Conference

    Conference: 2023 Findings of the Association for Computational Linguistics: EMNLP 2023
    Country/Territory: Singapore
    City: Singapore
    Period: 23/12/6 – 23/12/10

    Bibliographical note

    Publisher Copyright:
    © 2023 Association for Computational Linguistics.

    ASJC Scopus subject areas

    • Computational Theory and Mathematics
    • Computer Science Applications
    • Information Systems
    • Language and Linguistics
    • Linguistics and Language
