ON THE DETECTION OF IMAGES GENERATED FROM TEXT
- Submitted by: Yuqing Yang
- Last updated: 8 November 2024 - 7:44pm
- Document Type: Presentation Slides
The introduction of diverse text-to-image generation models has sparked significant interest across various sectors. While these models provide the groundbreaking capability to convert textual descriptions into visual data, their widespread use has raised concerns about the misuse of realistic synthesized images. Despite the pressing need, research on detecting such synthetic images remains limited. This paper aims to bridge this gap by evaluating the ability of several existing detectors to identify synthesized images produced by text-to-image generation models. Our study covers four popular text-to-image generation models: Stable Diffusion (SD), Latent Diffusion (LD), GLIDE, and DALL·E-mini (DM), and leverages two benchmark prompt-image datasets as sources of real images. Additionally, our research focuses on identifying robust, efficient, lightweight detectors that minimize computational resource usage. Recognizing the limitations of current detection approaches, we propose a novel detector grounded in latent space analysis, tailored to recognizing text-to-image synthesized visuals. Experimental results demonstrate that the proposed detector not only achieves high prediction accuracy but also exhibits enhanced robustness against image perturbations, while maintaining lower computational complexity than existing models for detecting text-to-image generated synthetic images.
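The abstract does not spell out how the latent-space detector works, but the general family of techniques it names can be illustrated with a toy sketch: fit a low-dimensional latent basis on real images, reconstruct each test image from its latent code, and flag images whose reconstruction error exceeds a threshold calibrated on real data. Everything below (the PCA-style basis, the synthetic toy data, the max-error threshold) is a hypothetical stand-in for illustration, not the paper's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data (purely illustrative): "real" images lie near a
# low-dimensional subspace of pixel space; "synthetic" images drift off it.
basis = rng.normal(size=(8, 64))                  # 8-dim latent basis, 64-dim "pixels"
real = rng.normal(size=(100, 8)) @ basis
fake = rng.normal(size=(100, 8)) @ basis + rng.normal(scale=2.0, size=(100, 64))

def fit_latent_basis(X, k=8):
    """PCA-style latent basis learned from real training images."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def recon_error(X, mean, Vt):
    """Distance between each image and its latent-space reconstruction."""
    Z = (X - mean) @ Vt.T                          # encode into latent space
    Xhat = Z @ Vt + mean                           # decode back to pixel space
    return np.linalg.norm(X - Xhat, axis=1)

mean, Vt = fit_latent_basis(real)
tau = recon_error(real, mean, Vt).max()            # threshold calibrated on real images
pred_fake = recon_error(fake, mean, Vt) > tau      # True = flagged as synthetic
print(f"flagged {pred_fake.mean():.0%} of synthetic images")
```

The appeal of this style of detector, echoed in the abstract, is that it is lightweight: after a one-time basis fit, scoring an image costs only two small matrix products, and the decision depends on a single scalar threshold.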