stable-diffusion-animation

Want to create a simple animation or transition from one scene to another? Stable Diffusion Animation lets you provide two prompts, and the model does the rest of the work for you.

Input
prompt_start
Prompt to start the animation with
prompt_end
Prompt to end the animation with
width
Width of output image
height
Height of output image
num_inference_steps
Number of denoising steps (minimum: 1; maximum: 100)
prompt_strength
Lower prompt strength generates more coherent GIFs; higher prompt strength respects the prompts more but can be jumpy
num_animation_frames
Number of frames to animate (minimum: 2; maximum: 50)
num_interpolation_steps
Number of steps to interpolate between animation frames (minimum: 1; maximum: 50)
guidance_scale
Scale for classifier-free guidance (minimum: 1; maximum: 20)
gif_frames_per_second
Frames/second in output GIF (minimum: 1; maximum: 50)
gif_ping_pong
Whether to reverse the animation and go back to the beginning before looping
film_interpolation
Whether to use FILM for between-frame interpolation (film-net.github.io)
intermediate_output
Whether to display intermediate outputs during generation
seed
Random seed. Leave blank to randomize the seed
Hold at least 25,000 wRAI to use this model.
The Multi AI platform is completely free, but most models are only accessible to wRAI token holders. If you have any questions, feel free to ask in our Telegram chat.
Example input
seed: 0
width: 512
height: 512
prompt_end: siberia, snow, night
prompt_start: jungle, rain, sunny
gif_ping_pong: 1
output_format: mp4
guidance_scale: 7
prompt_strength: 0.9
film_interpolation: 1
num_inference_steps: 50
num_animation_frames: 25
gif_frames_per_second: 20
num_interpolation_steps: 5

Readme

Animate Stable Diffusion by interpolating between two prompts

Code: https://github.com/andreasjansson/cog-stable-diffusion/tree/animate

How does it work?

Starting from noise, we use Stable Diffusion to denoise for n steps towards the mid-point between the start prompt and the end prompt, where n = num_inference_steps * (1 - prompt_strength). The higher the prompt strength, the fewer steps are spent on this shared path towards the mid-point.
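As a quick illustration of that formula (a sketch, not the repository's code; shared_denoising_steps is a hypothetical helper):

```python
# Minimal sketch of the step-count formula above.
# round() is used instead of int() so that 50 * (1 - 0.9), which evaluates to
# 4.999... in floating point, still comes out as 5 shared steps.

def shared_denoising_steps(num_inference_steps: int, prompt_strength: float) -> int:
    """Denoising steps spent moving towards the mid-point prompt
    before the individual animation frames branch off."""
    return round(num_inference_steps * (1 - prompt_strength))

# With the example input above (50 steps, prompt_strength 0.9) only 5 steps
# are shared, which is why a high prompt strength respects the prompts more
# but can look jumpy.
print(shared_denoising_steps(50, 0.9))  # -> 5
```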

We then denoise from that intermediate noisy output towards num_animation_frames interpolation points between the start and end prompts. By starting with an intermediate output, the model will generate samples that are similar to each other, resulting in a smoother animation.
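A minimal sketch of this branching stage, assuming plain linear interpolation of the prompt embeddings; all names here (lerp, branch_frames, denoise, start_emb, end_emb, intermediate_latent, remaining_steps) are hypothetical stand-ins rather than the repository's API:

```python
import numpy as np

def lerp(a: np.ndarray, b: np.ndarray, t: float) -> np.ndarray:
    """Linearly interpolate between two prompt embeddings."""
    return (1.0 - t) * a + t * b

def branch_frames(start_emb, end_emb, intermediate_latent,
                  num_animation_frames, remaining_steps, denoise):
    """Denoise the shared intermediate latent towards each interpolated prompt.

    Because every frame starts from the same intermediate latent, the
    generated samples stay close to each other, which is what makes the
    animation smooth.
    """
    frames = []
    for i in range(num_animation_frames):
        t = i / (num_animation_frames - 1)        # 0.0 at prompt_start, 1.0 at prompt_end
        prompt_emb = lerp(start_emb, end_emb, t)  # interpolation point between the prompts
        frames.append(denoise(intermediate_latent, prompt_emb, steps=remaining_steps))
    return frames
```

Whether the model interpolates embeddings linearly or in some other way is not stated here; the sketch only shows where num_animation_frames enters the process.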

Finally, the generated samples are interpolated with Google's FILM (Frame Interpolation for Large Scene Motion) for extra smoothness.
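A sketch of where num_interpolation_steps fits in, assuming each step inserts one in-between frame between a pair of generated frames; a simple crossfade stands in for FILM here, which the real model uses instead:

```python
import numpy as np

def interpolate_frames(frames: list[np.ndarray], num_interpolation_steps: int) -> list[np.ndarray]:
    """Insert in-between frames between each pair of generated frames.

    A linear crossfade is used purely as a stand-in for FILM, which predicts
    motion-aware in-between frames rather than blending pixels.
    """
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        for k in range(1, num_interpolation_steps + 1):
            t = k / (num_interpolation_steps + 1)
            out.append(((1.0 - t) * a + t * b).astype(a.dtype))  # crossfade stand-in for FILM
    out.append(frames[-1])
    return out
```

Under this reading, the example input above (25 animation frames, 5 interpolation steps) would expand to 25 + 24 * 5 = 145 output frames.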