Wan-2.1

Provided by Fal

Wan-2.1 is an advanced visual generation model developed by Tongyi Lab that generates videos from text, images, and other control signals. It excels at realistic videos with extensive body movement, complex rotations, dynamic scene transitions, and fluid camera motion. It can accurately simulate real-world physics and object interactions while offering movie-like visuals with rich textures and a variety of stylized effects. It can also create text and dynamic text effects in videos directly from text prompts.

Inputs

Prompt
text

The prompt to generate a video from.

Image Prompt
image

The image used as the first frame of the video. If the image does not match the chosen aspect ratio, it is resized and center cropped.

optional: true
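
The resize-and-center-crop behavior described for Image Prompt can be estimated ahead of time. The following is a minimal sketch of the crop arithmetic only; the function name is hypothetical, and the exact rounding and resize order used by the service are assumptions, since the page does not specify them.

```python
from fractions import Fraction


def center_crop_box(width, height, aspect_w, aspect_h):
    """Return (left, top, right, bottom) of the largest centered region
    of a width x height image that has the aspect ratio aspect_w:aspect_h.

    Assumption: fractional pixel sizes are truncated toward zero.
    """
    target = Fraction(aspect_w, aspect_h)
    if Fraction(width, height) > target:
        # Image is wider than the target: keep full height, trim the sides.
        new_w = int(height * target)
        new_h = height
    else:
        # Image is taller than (or equal to) the target: keep full width.
        new_w = width
        new_h = int(width / target)
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return (left, top, left + new_w, top + new_h)


# A square 1000x1000 image cropped for a 16:9 video loses its top and bottom bands.
print(center_crop_box(1000, 1000, 16, 9))  # (0, 219, 1000, 781)
```

Running this on an image before upload shows which parts of the frame would fall outside the chosen aspect ratio.
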
End Image
image

The image used as the last frame of the video.

optional: true
Negative Prompt
text

The negative prompt is used to guide the model to avoid generating videos that contain certain elements or concepts.

default: bright colors, overexposed, static, blurred details, subtitles, style, artwork, painting, picture, still, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, malformed limbs, fused fingers, still picture, cluttered background, three legs, many people in the background, walking backwards
Model
dropdown

The model to use for video generation.

default: wan-2.1
accepts: Wan-2.1, Wan-2.1 Pro
Aspect Ratio
dropdown

The aspect ratio of the generated video.

default: 16:9
accepts: 16:9, 9:16, 1:1
FPS
number

The number of frames per second to generate.

default: 16
minimum: 5
maximum: 24
Frames
number

The total number of frames to generate. Must be between 81 and 100.

default: 81
minimum: 81
maximum: 100
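
FPS and Frames together determine how long the clip runs. The sketch below works through that arithmetic, assuming the video plays back at the same frame rate it was generated at (the page does not state this explicitly). The function name is hypothetical; the ranges are the ones listed for the two inputs.

```python
def clip_duration_seconds(frames=81, fps=16):
    """Estimate clip length in seconds as frames / fps.

    Ranges mirror the documented limits: frames 81-100, fps 5-24.
    """
    if not 81 <= frames <= 100:
        raise ValueError("frames must be between 81 and 100")
    if not 5 <= fps <= 24:
        raise ValueError("fps must be between 5 and 24")
    return frames / fps


print(clip_duration_seconds())         # defaults: 81 / 16 = 5.0625
print(clip_duration_seconds(81, 24))   # shortest case: 3.375
print(clip_duration_seconds(100, 5))   # longest case: 20.0
```

Under that assumption, the defaults give a clip of roughly five seconds, and the allowed ranges span about 3.4 to 20 seconds.
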
Resolution
dropdown

The resolution of the generated video.

default: 720p
accepts: 480p, 580p, 720p
Steps
number

The number of inference steps to perform.

default: 30
minimum: 2
maximum: 40
Seed
seed

The same seed and prompt will output the same video every time.

default: 35680
minimum: 0
maximum: 65535
Guidance Scale
number

Higher guidance scales can help preserve detail, but risk oversaturated colors.

default: 5
minimum: 1
maximum: 10
Shift
number

Shift controls how the model moves through the denoising process, affecting motion and time flow in your video. Lower values result in smoother, more predictable movement. Higher values result in more dynamic but sometimes chaotic motion.

default: 6
minimum: 1
maximum: 10
Expand Prompt
toggle

Whether to expand the prompt using the model's own capabilities.

default: true
Block NSFW
toggle

Whether to block NSFW content.

default: false
Turbo Mode
toggle

When enabled, the video will be generated faster with no noticeable degradation in the visual quality. This property is not supported when using an image prompt.

default: false
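
The defaults and limits listed above can be checked locally before a request is submitted. The following is a rough sketch, not the provider's client: the snake_case key names and the build_inputs helper are hypothetical, and the Model and Negative Prompt inputs are left out for brevity. Only the ranges, choices, defaults, and the Turbo Mode restriction come from this page.

```python
# Limits copied from the input list; the key names themselves are hypothetical.
NUMERIC_RANGES = {
    "fps": (5, 24),
    "frames": (81, 100),
    "steps": (2, 40),
    "seed": (0, 65535),
    "guidance_scale": (1, 10),
    "shift": (1, 10),
}
CHOICES = {
    "aspect_ratio": {"16:9", "9:16", "1:1"},
    "resolution": {"480p", "580p", "720p"},
}
DEFAULTS = {
    "aspect_ratio": "16:9",
    "resolution": "720p",
    "fps": 16,
    "frames": 81,
    "steps": 30,
    "seed": 35680,
    "guidance_scale": 5,
    "shift": 6,
    "expand_prompt": True,
    "block_nsfw": False,
    "turbo_mode": False,
}


def build_inputs(prompt, image_prompt=None, **overrides):
    """Merge overrides into the documented defaults and validate them."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown inputs: {sorted(unknown)}")
    inputs = {**DEFAULTS, **overrides, "prompt": prompt}
    if image_prompt is not None:
        inputs["image_prompt"] = image_prompt
    for name, (low, high) in NUMERIC_RANGES.items():
        if not low <= inputs[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    for name, allowed in CHOICES.items():
        if inputs[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    # Turbo Mode is documented as unsupported together with an image prompt.
    if inputs["turbo_mode"] and image_prompt is not None:
        raise ValueError("turbo_mode is not supported with an image prompt")
    return inputs


inputs = build_inputs("a paper boat drifting down a rainy street", steps=20)
print(inputs["steps"], inputs["frames"], inputs["resolution"])  # 20 81 720p
```

Catching an out-of-range value or the Turbo Mode and Image Prompt conflict locally avoids a failed generation request.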

Outputs

Generated Video
video

The generated video from the model.