Speedy Synthetic Reality: Stable Diffusion XL Turbo Generates AI Images in Real Time
Stable Diffusion XL Turbo is an AI image-synthesis model that can generate images from a written prompt in real time.
A Breakthrough in Image Generation Speed
Stability AI has introduced Stable Diffusion XL Turbo (SDXL Turbo), an AI image-synthesis model that rapidly generates imagery from a written prompt. What sets the model apart is its ability to produce an image in a single step, far fewer than its predecessor, SDXL, requires. It achieves this with a technique called Adversarial Diffusion Distillation (ADD), which combines score distillation with an adversarial loss to keep the generated images realistic at such low step counts.
The research paper Stability AI released on Tuesday details the ADD technique and notes the model's similarity to Generative Adversarial Networks (GANs), which also produce images in a single step. Images generated by SDXL Turbo may not be as detailed as those SDXL produces at higher step counts, but they are generated far faster.
Unleashing the Full Potential of SDXL Turbo
The speed of SDXL Turbo is evident when tested locally on an Nvidia RTX 3060. A 3-step 1024x1024 SDXL Turbo image takes about 4 seconds to generate, while a 20-step SDXL image with similar detail takes 26.4 seconds. Smaller images render even faster: a 512x768 image takes less than one second. A more powerful graphics card, such as an RTX 3090 or 4090, reduces generation times further.
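To put the RTX 3060 timings in perspective, the short Python sketch below works out the overall speedup and the average time per step. The function names are illustrative only, and the 4.0-second figure is a rounding of "about 4 seconds", so the results are approximate.

```python
# Timings quoted in the article for a 1024x1024 image on an RTX 3060.
SDXL_SECONDS = 26.4    # 20-step SDXL image
SDXL_STEPS = 20
TURBO_SECONDS = 4.0    # 3-step SDXL Turbo image ("about 4 seconds", approximate)
TURBO_STEPS = 3


def speedup(baseline_seconds: float, faster_seconds: float) -> float:
    """Return how many times shorter the second timing is than the first."""
    if baseline_seconds <= 0 or faster_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / faster_seconds


def seconds_per_step(total_seconds: float, steps: int) -> float:
    """Average wall-clock time per step, with all fixed overhead folded in."""
    if steps <= 0:
        raise ValueError("steps must be positive")
    return total_seconds / steps


if __name__ == "__main__":
    print(f"Speedup: {speedup(SDXL_SECONDS, TURBO_SECONDS):.1f}x")
    print(f"SDXL per step: {seconds_per_step(SDXL_SECONDS, SDXL_STEPS):.2f} s")
    print(f"Turbo per step: {seconds_per_step(TURBO_SECONDS, TURBO_STEPS):.2f} s")
```

On these numbers the average time per step is nearly the same for both models (roughly 1.3 seconds), which suggests the saving comes almost entirely from running fewer steps rather than from faster ones. That reading is a rough one, since the totals also include encoding and decoding time.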
Stability AI's real-time claim is based on an Nvidia A100, on which it says the model can generate a 512x512 image in 207 milliseconds, including encoding, de-noising, and decoding. These speeds open up possibilities for real-time generative AI video filters and experimental video game graphics generation, if coherency challenges can be overcome.
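As a rough illustration of what 207 milliseconds per image means for video-style use, the sketch below converts that latency into frames per second. It assumes images are generated strictly one after another with no batching or pipelining, and the 30 fps target is a hypothetical reference point chosen for illustration, not a figure from Stability AI.

```python
A100_LATENCY_MS = 207.0   # per 512x512 image, as reported by Stability AI
TARGET_FPS = 30.0         # hypothetical video frame rate, for comparison only


def frames_per_second(latency_ms: float) -> float:
    """Frames per second if each frame takes latency_ms and frames run back to back."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


if __name__ == "__main__":
    print(f"Throughput at 207 ms: {frames_per_second(A100_LATENCY_MS):.1f} fps")
    print(f"Budget per frame at 30 fps: {frame_budget_ms(TARGET_FPS):.1f} ms")
```

Under these assumptions the reported latency works out to a little under five images per second, fast enough to feel interactive but still well short of typical video frame rates.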
Availability and Future Prospects
Currently, Stable Diffusion XL Turbo is available under a non-commercial research license, limiting its use to personal and non-commercial purposes. While some members of the Stable Diffusion community have criticized this restriction, Stability AI has expressed openness to future commercial applications and encourages interested parties to reach out for more information.
Despite recent internal management issues, which have included calls for its CEO to resign, Stability AI continues to release new and innovative AI models. In addition to Stable Diffusion XL Turbo, the company recently introduced Stable Video Diffusion, a model capable of transforming still images into short video clips. Stability AI also offers beta demonstrations of SDXL Turbo's capabilities on its image-editing platform, Clipdrop, and a free unofficial live demo is available on Hugging Face.