New AI Model Can Animate Still Images into Videos
Stability AI has released Stable Video Diffusion, a new AI tool that turns any still image into a short video clip. The tool uses an image-to-video diffusion technique and can run locally on machines with Nvidia GPUs.
About Stable Video Diffusion
Stable Video Diffusion (SVD) is an AI tool released by Stability AI that transforms still images into short video clips. It is an open-weights preview comprising two models, SVD and SVD-XT. It runs on machines with Nvidia GPUs, with generation speed depending on the hardware.
The SVD model performs image-to-video synthesis at 14 frames per clip, while the SVD-XT model generates 25 frames. Both models output short MP4 clips at a resolution of 576x1024 pixels. The generated videos typically add panning or zooming effects, or animate elements such as smoke or fire.
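The article does not show how local generation is invoked. One plausible route is Hugging Face's diffusers library, which ships a StableVideoDiffusionPipeline for these checkpoints. The sketch below is a minimal, untested outline; the model id, the fp16 variant, and the 7 fps export rate are assumptions based on the public release, not details from the article:

```python
def animate(image_path: str, out_path: str = "clip.mp4") -> None:
    """Turn a still image into a short SVD clip (illustrative sketch only)."""
    # Imports are deferred so the sketch can be read without the heavy
    # dependencies installed; requires `pip install diffusers torch`.
    import torch
    from diffusers import StableVideoDiffusionPipeline
    from diffusers.utils import load_image, export_to_video

    # "img2vid-xt" is assumed to be the 25-frame SVD-XT checkpoint; the
    # 14-frame SVD model would use "stable-video-diffusion-img2vid".
    pipe = StableVideoDiffusionPipeline.from_pretrained(
        "stabilityai/stable-video-diffusion-img2vid-xt",
        torch_dtype=torch.float16,
        variant="fp16",
    )
    pipe.to("cuda")  # an Nvidia GPU, per the article's requirements

    # SVD expects a 1024x576 (W x H) input, matching its output resolution.
    image = load_image(image_path).resize((1024, 576))
    frames = pipe(image, decode_chunk_size=8).frames[0]
    export_to_video(frames, out_path, fps=7)  # 7 fps is diffusers' default
```

Deferring the imports keeps the function cheap to define; the multi-gigabyte model download happens only when `animate("photo.jpg")` is actually called.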
Testing and Limitations
In local testing, generating a 14-frame video clip took approximately 30 minutes on an Nvidia RTX 3060 graphics card. Users can run the models faster on cloud services such as Hugging Face and Replicate. In these tests, the animations typically kept static portions of the scene unchanged while adding dynamic effects.
The model is still in the early stages and intended for research purposes only. Stability AI advises against using it for real-world or commercial applications. The company welcomes user feedback to improve the safety and quality of the model for future releases.
Availability and Future Plans
The Stable Video Diffusion source code and weights are available on GitHub. Users can also test the tool locally using the Pinokio platform, which simplifies installation and provides its own environment for running the model.
Stability AI is actively working on expanding its AI capabilities and is developing a text-to-video model. This upcoming model will enable the creation of short video clips based on written prompts instead of images.