Optimizing for Measure of Performance in Max-Margin Parsing

Alexander Bauer, Shinichi Nakajima, Nico Görnitz, Klaus-Robert Müller

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Many learning tasks in natural language processing, including sequence tagging, sequence segmentation, and syntactic parsing, have been successfully approached by means of structured prediction methods. An appealing property of the corresponding training algorithms is their ability to integrate the loss function of interest into the optimization process, improving the final results according to the chosen measure of performance. Here, we focus on the task of constituency parsing and show how to optimize the model for the $F_{1}$-score in the max-margin framework of a structural support vector machine (SVM). For reasons of computational efficiency, it is common to binarize the corresponding grammar before training. Unfortunately, this introduces a bias during the training procedure, as the corresponding loss function is evaluated on the binary representation, while the resulting performance is measured on the original unbinarized trees. We address this problem by extending the inference procedure presented by Bauer et al. Specifically, we propose an algorithmic modification that allows evaluating the loss on the unbinarized trees. The new approach properly models the loss function of interest, resulting in better prediction accuracy, and still benefits from the computational efficiency due to the binarized representation. The presented idea can be easily transferred to other structured loss functions.
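The mismatch described in the abstract can be illustrated with a small sketch. It is not the inference procedure of the paper; it only shows how an $F_{1}$-based loss over labeled constituent spans changes depending on whether the artificial nodes introduced by binarization are counted. The tuple-based tree encoding, the `@` prefix marking artificial labels, and the helper names `spans`, `f1_loss`, and `is_original` are assumptions made for this example.

```python
from collections import Counter


def spans(tree, start=0, keep=lambda label: True):
    """Collect labeled spans (label, start, end) of a tree.

    A tree is a tuple (label, child, child, ...); a child is either a
    nested tuple or a plain string (a word). Returns the multiset of
    spans and the position just past the last word.
    """
    label, children = tree[0], tree[1:]
    found = Counter()
    pos = start
    for child in children:
        if isinstance(child, str):
            pos += 1  # a word covers exactly one position
        else:
            child_spans, pos = spans(child, pos, keep)
            found += child_spans
    if keep(label):
        found[(label, start, pos)] += 1
    return found, pos


def f1_loss(gold, pred, keep=lambda label: True):
    """Return 1 - F1 over the labeled spans selected by `keep`."""
    gold_spans, _ = spans(gold, keep=keep)
    pred_spans, _ = spans(pred, keep=keep)
    matched = sum((gold_spans & pred_spans).values())
    if matched == 0:
        return 1.0
    precision = matched / sum(pred_spans.values())
    recall = matched / sum(gold_spans.values())
    return 1.0 - 2 * precision * recall / (precision + recall)


def is_original(label):
    """Assumed convention: binarization marks artificial labels with '@'."""
    return not label.startswith("@")


# Binarized gold tree for "the big red dog barks"; the flat NP over four
# words has been split into a right-branching chain of '@NP' nodes.
gold_bin = ("S",
            ("NP", "the", ("@NP", "big", ("@NP", "red", "dog"))),
            ("VP", "barks"))
# Binarized prediction that places the NP/VP boundary one word too early.
pred_bin = ("S",
            ("NP", "the", ("@NP", "big", "red")),
            ("VP", "dog", "barks"))

loss_binarized = f1_loss(gold_bin, pred_bin)                      # counts '@NP' nodes
loss_unbinarized = f1_loss(gold_bin, pred_bin, keep=is_original)  # ignores them
print(loss_binarized, loss_unbinarized)
```

In this toy case, only the `S` span matches. Counting the artificial nodes gives a loss of 7/9, while restricting the count to the original labels gives 2/3, so the same prediction is penalized differently under the two representations.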

Original language: English
Article number: 8825553
Pages (from-to): 2680-2684
Number of pages: 5
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 31
Issue number: 7
Publication status: Published - 2020 Jul

Bibliographical note

Funding Information:
This work was supported in part by the Federal Ministry of Education and Research (BMBF) through the Berlin Big Data Center Project under Grant FKZ 01IS18025A, in part by the German Research Foundation (DFG) through Grant Math+ (EXC 2046/1) under Project 390685689, and in part by the Institute for Information & Communications Technology Planning & Evaluation (IITP) Grant funded by the Korea Government under Grant 2017-0-00451 and Grant 2017-0-01779. The work of A. Bauer was supported in part by Technische Universität Berlin under Project 10032745. The work of K.-R. Müller was supported by the German Ministry for Education and Research (BMBF) under Grant 01IS14013A-E, Grant 01GQ1115, and Grant 01GQ0850.

Manuscript received October 24, 2018; revised March 20, 2019 and July 3, 2019; accepted August 2, 2019. Date of publication September 5, 2019; date of current version July 7, 2020. (Corresponding author: Klaus-Robert Müller.) A. Bauer and N. Görnitz are with the Machine Learning Group, Technische Universität Berlin, 10623 Berlin, Germany.

Publisher Copyright:
© 2012 IEEE.

Keywords

  • Dynamic programming
  • graphical models
  • high-order potentials
  • inference
  • margin scaling
  • slack scaling
  • structural support vector machines (SVMs)
  • structured output

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Computer Networks and Communications
  • Artificial Intelligence
