Parrot: Pareto-Optimal Multi-reward Reinforcement Learning Framework for Text-to-Image Generation

  • Seung Hyun Lee*
  • Yinxiao Li
  • Junjie Ke
  • Innfarn Yoo
  • Han Zhang
  • Jiahui Yu
  • Qifei Wang
  • Fei Deng
  • Glenn Entis
  • Junfeng He
  • Gang Li
  • Sangpil Kim
  • Irfan Essa
  • Feng Yang

  *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Recent works have demonstrated that using reinforcement learning (RL) with multiple quality rewards can improve the quality of generated images in text-to-image (T2I) generation. However, manually adjusting reward weights poses challenges and may cause over-optimization in certain metrics. To solve this, we propose Parrot, which addresses the issue through multi-objective optimization and introduces an effective multi-reward optimization strategy to approximate Pareto optimality. Utilizing batch-wise Pareto-optimal selection, Parrot automatically identifies the optimal trade-off among different rewards. We use the novel multi-reward optimization algorithm to jointly optimize the T2I model and a prompt expansion network, which significantly improves image quality and also allows the trade-off among different rewards to be controlled using a reward-related prompt during inference. Furthermore, we introduce original prompt-centered guidance at inference time, ensuring fidelity to user input after prompt expansion. Extensive experiments and a user study validate the superiority of Parrot over several baselines across various quality criteria, including aesthetics, human preference, text-image alignment, and image sentiment.
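
As a rough illustration of the batch-wise Pareto-optimal selection mentioned in the abstract, the sketch below keeps only the non-dominated samples in a batch of per-image reward vectors. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the "higher is better" convention for every reward, and the example reward values are all hypothetical, and how the selected samples feed into the RL update is not shown.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` on every reward and strictly better on one.

    Assumes higher values are better for all rewards (an assumption of this sketch).
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_indices(rewards: Sequence[Sequence[float]]) -> List[int]:
    """Return indices of batch samples that no other sample in the batch dominates."""
    return [
        i
        for i, r in enumerate(rewards)
        if not any(dominates(other, r) for j, other in enumerate(rewards) if j != i)
    ]


# Hypothetical batch: each row is (aesthetics, text-image alignment) for one image.
batch = [
    (0.9, 0.2),
    (0.5, 0.5),
    (0.4, 0.4),  # dominated by (0.5, 0.5)
    (0.2, 0.9),
]
selected = non_dominated_indices(batch)  # [0, 1, 3]
```

Because the selection compares samples only against one another, no manual reward weights are needed to decide which samples are kept; this is the property the abstract attributes to the batch-wise approach.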

Original language: English
Title of host publication: Computer Vision – ECCV 2024 - 18th European Conference, Proceedings
Editors: Aleš Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 462-478
Number of pages: 17
ISBN (Print): 9783031729195
DOIs
Publication status: Published - 2025
Event: 18th European Conference on Computer Vision, ECCV 2024 - Milan, Italy
Duration: 2024 Sept 29 → 2024 Oct 4

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15096 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th European Conference on Computer Vision, ECCV 2024
Country/Territory: Italy
City: Milan
Period: 24/9/29 → 24/10/4

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
