Localization and Manipulation of Immoral Visual Cues for Safe Text-to-Image Generation

Seongbeom Park, Suhong Moon, Seunghyun Park, Jinkyu Kim

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    2 Citations (Scopus)

    Abstract

    Current text-to-image generation methods produce high-resolution and high-quality images, but they should not produce immoral images that may contain inappropriate content from the perspective of commonsense morality. Conventional approaches, however, often neglect these ethical concerns, and existing solutions are often limited in their ability to ensure moral compatibility. To address this, we propose a novel method that has three main capabilities: (1) our model recognizes the degree of visual commonsense immorality of a given generated image, (2) our model localizes immoral visual (and textual) attributes that make the image visually immoral, and (3) our model manipulates such immoral visual cues into a morally-qualifying alternative. We conduct experiments with various text-to-image generation models, including the state-of-the-art Stable Diffusion model, demonstrating the efficacy of our ethical image manipulation approach. Our human study further confirms that our method is indeed able to generate morally satisfying images from immoral ones.
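    The three capabilities listed in the abstract (recognize, localize, manipulate) can be read as a simple control flow. The sketch below is only an illustration of that flow, not the authors' implementation: the abstract does not specify any API, so every name here (`make_safe`, `SafeGenerationResult`, the `toy_*` functions, the `threshold` parameter) is hypothetical, and the toy stand-ins operate on a small grid of numbers rather than on real images or models.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical toy types: an "image" is a 2D grid of intensities,
# a "mask" marks which cells were judged problematic.
Image = List[List[float]]
Mask = List[List[bool]]


@dataclass
class SafeGenerationResult:
    image: Image   # the returned (possibly edited) image
    score: float   # immorality score of the returned image
    edited: bool   # whether the manipulation step was applied


def make_safe(
    image: Image,
    score_immorality: Callable[[Image], float],
    localize: Callable[[Image], Mask],
    manipulate: Callable[[Image, Mask], Image],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> SafeGenerationResult:
    # Step 1: recognize the degree of immorality of the generated image.
    score = score_immorality(image)
    if score < threshold:
        return SafeGenerationResult(image, score, edited=False)
    # Step 2: localize the cues responsible for the high score.
    mask = localize(image)
    # Step 3: manipulate those cues into an acceptable alternative.
    new_image = manipulate(image, mask)
    return SafeGenerationResult(new_image, score_immorality(new_image), edited=True)


# Toy stand-ins so the sketch runs end to end (not real models).
def toy_score(image: Image) -> float:
    return max(max(row) for row in image)


def toy_localize(image: Image) -> Mask:
    return [[pixel >= 0.5 for pixel in row] for row in image]


def toy_manipulate(image: Image, mask: Mask) -> Image:
    return [
        [0.0 if flagged else pixel for pixel, flagged in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


result = make_safe([[0.1, 0.9], [0.2, 0.3]], toy_score, toy_localize, toy_manipulate)
print(result.edited, result.image)
```

    In a real system each callable would wrap a learned model (a classifier, an attribution method, and an image editor respectively); passing them in as parameters keeps the control flow independent of those choices.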

    Original language: English
    Title of host publication: Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 4663-4672
    Number of pages: 10
    ISBN (Electronic): 9798350318920
    Publication status: Published - 2024 Jan 3
    Event: 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024 - Waikoloa, United States
    Duration: 2024 Jan 4 to 2024 Jan 8

    Publication series

    Name: Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024

    Conference

    Conference: 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024
    Country/Territory: United States
    City: Waikoloa
    Period: 24/1/4 to 24/1/8

    Bibliographical note

    Publisher Copyright:
    © 2024 IEEE.

    Keywords

    • Algorithms
    • Explainable
    • Vision + language and/or other modalities
    • accountable
    • ethical computer vision
    • fair
    • privacy-preserving

    ASJC Scopus subject areas

    • Artificial Intelligence
    • Computer Science Applications
    • Computer Vision and Pattern Recognition
