GLIGEN: Open-Set Grounded Text-to-Image Generation.
GLIGEN (Grounded-Language-to-Image Generation) is a novel approach that extends pre-trained text-to-image diffusion models by enabling them to also be conditioned on grounding inputs.
Project page: https://gligen.github.io/
Paper: https://arxiv.org/abs/2301.07093
GitHub (coming soon): https://github.com/gligen/GLIGEN
Demo: https://huggingface.co/spaces/gligen/demo