Sora 2 is OpenAI's media generation model for producing videos with synchronized audio, including dialogue and sound effects. Its videos are more physically accurate, realistic, and controllable than those of prior systems, and it can create richly detailed, dynamic clips from natural language or images. Sora 2 can do things that are exceptionally difficult, and in some instances outright impossible, for prior video generation models. It is also a big leap forward in controllability: it can follow intricate instructions spanning multiple shots while accurately persisting world state. It excels at realistic, cinematic, and anime styles.
Image to use as the input reference for video generation. The image should have the same aspect ratio as the video size; otherwise it will be cropped to fit the video size.
optional:true
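To avoid an unexpected crop of the reference image, you can check its aspect ratio against the video size before uploading. The following is a minimal sketch, not the service's actual behavior: it assumes a centered crop (the text above only says the image "will be cropped"), and the function names `matches_aspect` and `center_crop_box` and the example sizes are hypothetical.

```python
def matches_aspect(img_w, img_h, video_w, video_h):
    """True if the image and video have exactly the same aspect ratio."""
    # Cross-multiply so the comparison uses integers only (no float rounding).
    return img_w * video_h == img_h * video_w


def center_crop_box(img_w, img_h, video_w, video_h):
    """Return (left, top, right, bottom) of the largest centered region
    of the image that has the video's aspect ratio.

    ASSUMPTION: a centered crop is used here for illustration; the real
    service may choose a different region.
    """
    if min(img_w, img_h, video_w, video_h) <= 0:
        raise ValueError("all dimensions must be positive")

    if img_w * video_h > img_h * video_w:
        # Image is wider than the target: keep full height, trim the sides.
        new_w = img_h * video_w // video_h
        left = (img_w - new_w) // 2
        return (left, 0, left + new_w, img_h)

    # Image is taller than (or equal to) the target: keep full width.
    new_h = img_w * video_h // video_w
    top = (img_h - new_h) // 2
    return (0, top, img_w, top + new_h)


if __name__ == "__main__":
    # A landscape photo used as the reference for a portrait video.
    print(matches_aspect(1920, 1080, 720, 1280))
    print(center_crop_box(1920, 1080, 720, 1280))
```

The returned box uses the (left, top, right, bottom) convention, so it can be passed to an image library's crop function to preview what would survive the crop.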
Model
dropdown
The model to use for video generation. Sora 2 is the default and offers a good balance of quality and speed. Sora 2 Pro is a more powerful model that produces higher-quality videos, but it is slower and more expensive.