Wan-2.2 is a leading-edge, highly capable image and video generation model developed by Tongyi Lab at Alibaba Group. It achieves professional cinematic narratives through a deep command of shot language, offering fine-grained control over lighting, color, and composition for versatile styles with delicate detail. It recreates all kinds of complex motion with enhanced fluidity and control, and shows better understanding and execution of prompts for complex scenes and multi-object generation. Wan-2.2 generalizes across multiple dimensions such as motion, semantics, and aesthetics. In addition to text-to-video and image-to-video, Wan-2.2 also supports video-to-video generation, with the ability to perform a wide range of edits on an input video such as adding, removing, and transforming objects.
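Which mode a request falls into follows from the inputs supplied: a text prompt alone is text-to-video, adding a start image (and optionally an end image) is image-to-video, and supplying an input video switches to video-to-video editing. The Python sketch below illustrates that selection logic; it is not an official client API, and the parameter names simply mirror the fields documented below.

```python
def select_mode(image=None, end_image=None, video=None):
    """Pick the Wan-2.2 generation mode from the supplied media inputs.

    A minimal sketch: video-to-video when an input video is given,
    image-to-video when a start (and optionally an end) frame is given,
    and text-to-video when only a text prompt is provided.
    """
    if video is not None:
        return "video-to-video"
    if image is not None or end_image is not None:
        return "image-to-video"
    return "text-to-video"
```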
The image used as the first frame of the video. If the image does not match the chosen aspect ratio, it is resized and center cropped.
optional:true
End Image
image
The image used as the last frame of the video. Only supported by the Pro model.
optional:true
Video Prompt
video
Input video, used for video-to-video generation and editing.
optional:true
Negative Prompt
text
The negative prompt is used to guide the model to avoid generating videos that contain certain elements or concepts.
optional:true
Model
dropdown
The model to use for video generation.
default:wan-2.2-5b•Accepts: Lite (5B), Pro (14B)
Frames
number
The total number of frames to generate. Must be between 81 and 121.
default:81•minimum:81•maximum:121
FPS
number
The number of frames per second to generate.
default:24•minimum:5•maximum:60
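Frames and FPS together determine clip length: duration in seconds is simply frames divided by FPS, so the defaults (81 frames at 24 FPS) yield roughly 3.4 seconds and the maximum 121 frames at 24 FPS yield about 5 seconds. A quick sketch:

```python
def clip_duration(frames: int = 81, fps: int = 24) -> float:
    """Clip length in seconds for a given frame count and frame rate."""
    return frames / fps

print(clip_duration())           # 81 / 24 = 3.375 seconds (the defaults)
print(clip_duration(121, 24))    # 121 / 24 ≈ 5.04 seconds (maximum frames)
```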
Seed
seed
The same seed and prompt will output the same video every time.
default:58004•minimum:0•maximum:65535
Resolution
dropdown
The resolution of the generated video.
default:720p•Accepts: 480p, 580p, 720p
Aspect Ratio
dropdown
The aspect ratio of the generated video. If 'auto', the aspect ratio will be determined automatically based on the image or video prompts.
default:auto•Accepts: Auto, 16:9, 9:16, 1:1
Steps
number
The number of inference steps to perform. Higher values can help with preserving details, but take longer to generate. We recommend using 40 steps for Lite and 27 steps for Pro.
default:27•minimum:2•maximum:50
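If you switch between the Lite and Pro models programmatically, it can be convenient to pin the recommended step counts to each variant. The sketch below does that; note that the Pro identifier is an assumption by analogy with the documented wan-2.2-5b default, not a confirmed model name.

```python
# Recommended inference steps per model variant (from the note above).
# "wan-2.2-14b" is an assumed identifier for the Pro (14B) model, by
# analogy with the documented "wan-2.2-5b" default for Lite (5B).
RECOMMENDED_STEPS = {
    "wan-2.2-5b": 40,    # Lite (5B)
    "wan-2.2-14b": 27,   # Pro (14B)
}
```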
Guidance Scale
number
Higher guidance scales make the output follow the prompt more closely and can help preserve detail, but risk oversaturated colors.
default:3.5•minimum:1•maximum:10
Shift
number
Shift controls how the model moves through the denoising process, affecting motion and time flow in your video. Lower values result in smoother, more predictable movement. Higher values result in more dynamic but sometimes chaotic motion.
default:5•minimum:1•maximum:10
Expand Prompt
toggle
Whether to expand the prompt using the model's own capabilities.
default:true
Block NSFW
toggle
Whether to block NSFW content.
default:false
Turbo Mode
toggle
When enabled, the video is generated faster with no noticeable degradation in visual quality. This property is not supported when using video-to-video generation.
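Putting the parameters above together, the sketch below assembles a text-to-video request using the documented defaults and ranges. The endpoint URL, authentication scheme, and exact JSON field names are assumptions for illustration only; consult the provider's API reference for the real request format.

```python
import requests

# Hypothetical endpoint and API key; the real URL, auth scheme, and
# field names are assumptions -- check the provider's API reference.
API_URL = "https://api.example.com/v1/wan-2.2/generate"
API_KEY = "YOUR_API_KEY"

payload = {
    "prompt": "A slow dolly shot through a rain-soaked neon alley at night",
    "negative_prompt": "blurry, low quality, watermark",
    "model": "wan-2.2-5b",    # Lite (5B); Pro (14B) is the other option
    "frames": 81,             # 81-121
    "fps": 24,                # 5-60
    "seed": 58004,            # 0-65535; same seed + prompt -> same video
    "resolution": "720p",     # 480p, 580p, or 720p
    "aspect_ratio": "16:9",   # auto, 16:9, 9:16, or 1:1
    "steps": 40,              # 40 recommended for Lite, 27 for Pro
    "guidance_scale": 3.5,    # 1-10
    "shift": 5,               # 1-10
    "expand_prompt": True,
    "block_nsfw": False,
    "turbo_mode": True,       # not supported for video-to-video
}

response = requests.post(
    API_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=600,
)
response.raise_for_status()
print(response.json())
```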