Wan-2.1

Provided by Fal

Wan-2.1 is a visual generation model developed by Tongyi Lab. It generates videos from text, images, and other control signals. Wan-2.1 excels at realistic videos featuring extensive body movement, complex rotations, dynamic scene transitions, and fluid camera motion. It can accurately simulate real-world physics and realistic object interactions, and it offers movie-like visuals with rich textures and a variety of stylized effects. It can also create text and dynamic text effects in videos directly from text prompts.

Inputs

Prompt
text
The prompt to generate a video from.
Image Prompt
image
The image used as the first frame of the video. If the image does not match the chosen aspect ratio, it is resized and center cropped.
optional: true
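
The center crop described for the Image Prompt can be sketched as below. This is a minimal illustration, not the service's actual implementation: the helper name `center_crop_box` is hypothetical, and it assumes the crop keeps the largest centered region that matches the chosen aspect ratio, with any resizing to the output resolution happening separately.

```python
from fractions import Fraction


def center_crop_box(width: int, height: int, aspect: str) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered region
    of a width x height image that has the requested aspect ratio.

    Hypothetical helper: the exact crop used by the service is not documented.
    """
    ratio_w, ratio_h = (int(part) for part in aspect.split(":"))
    target = Fraction(ratio_w, ratio_h)

    if Fraction(width, height) > target:
        # Image is too wide: keep the full height and trim the sides.
        crop_w = int(height * target)
        crop_h = height
    else:
        # Image is too tall (or already matches): keep the full width
        # and trim the top and bottom.
        crop_w = width
        crop_h = int(width / target)

    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)


if __name__ == "__main__":
    # A square 1000x1000 image cropped for a 16:9 video.
    print(center_crop_box(1000, 1000, "16:9"))
```

Content near the edges of a mismatched image is lost under this assumption, so it may be worth cropping the first frame to the target aspect ratio before uploading.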
End Image
image
The image used as the last frame of the video.
optional: true
Negative Prompt
text
The negative prompt is used to guide the model to avoid generating videos that contain certain elements or concepts.
default: bright colors, overexposed, static, blurred details, subtitles, style, artwork, painting, picture, still, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, malformed limbs, fused fingers, still picture, cluttered background, three legs, many people in the background, walking backwards
Model
dropdown
The model to use for video generation.
default: wan-2.1
Accepts: Wan-2.1, Wan-2.1 Pro
Aspect Ratio
dropdown
The aspect ratio of the generated video.
default: 16:9
Accepts: 16:9, 9:16, 1:1
FPS
number
The number of frames per second to generate.
default: 16
minimum: 5
maximum: 24
Frames
number
The total number of frames to generate. Must be between 81 and 100.
default: 81
minimum: 81
maximum: 100
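
FPS and Frames together determine how long the clip runs. The sketch below works through that arithmetic, assuming the FPS value is also the playback rate of the output file (the page does not say so explicitly). The function name `clip_duration_seconds` is hypothetical; the limits are the ones listed for the two inputs.

```python
def clip_duration_seconds(frames: int = 81, fps: int = 16) -> float:
    """Return the clip length in seconds for a given frame count and frame rate.

    Hypothetical helper; the ranges mirror the documented input limits.
    """
    if not 81 <= frames <= 100:
        raise ValueError("frames must be between 81 and 100")
    if not 5 <= fps <= 24:
        raise ValueError("fps must be between 5 and 24")
    return frames / fps


if __name__ == "__main__":
    # Defaults: 81 frames at 16 fps.
    print(clip_duration_seconds())
    # Longest clip the limits allow: 100 frames at 5 fps.
    print(clip_duration_seconds(frames=100, fps=5))
```

Under this assumption the defaults give a clip of roughly five seconds, and lowering FPS stretches the same frames over a longer duration at the cost of choppier motion.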
Resolution
dropdown
The resolution of the generated video.
default: 720p
Accepts: 480p, 580p, 720p
Steps
number
The number of inference steps to perform.
default: 30
minimum: 2
maximum: 40
Seed
seed
The same seed and prompt will output the same video every time.
default: 18018
minimum: 0
maximum: 65535
Guidance Scale
number
Higher guidance scales can help preserve garment detail but risk oversaturated colors.
default: 5
minimum: 1
maximum: 10
Shift
number
Shift controls how the model moves through the denoising process, affecting motion and time flow in your video. Lower values result in smoother, more predictable movement. Higher values result in more dynamic but sometimes chaotic motion.
default: 6
minimum: 1
maximum: 10
Expand Prompt
toggle
Whether to expand the prompt using the model's own capabilities.
default: true
Block NSFW
toggle
Whether to block NSFW content.
default: false
Turbo Mode
toggle
When enabled, the video will be generated faster with no noticeable degradation in the visual quality. This property is not supported when using an image prompt.
default: false

Outputs

Generated Video
video
The generated video from the model.