StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale | Data Science by ODS.ai 🦜
StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
In this paper, the authors propose StyleGAN-T, a GAN designed for large-scale text-to-image synthesis. It combines large capacity, stable training on diverse datasets, strong text alignment, and a controllable trade-off between variation and text alignment. StyleGAN-T outperforms previous GANs and even surpasses distilled diffusion models, the previous frontrunners in fast text-to-image synthesis, in terms of both sample quality and speed.
StyleGAN-T achieves a better zero-shot MS COCO FID than current state-of-the-art diffusion models at a resolution of 64×64. At 256×256, it halves the zero-shot FID previously achieved by a GAN but still trails state-of-the-art diffusion models.