Vidu Reference
Provided by Fal
Vidu Reference is a video generation model that enables seamless interaction between multiple subjects, including characters, props, objects, and environments, within the same scene. It is well suited to complex scenes in which several characters interact naturally. Vidu also supports feature fusion, allowing elements from different subjects, such as the front of Character A and the back of Character B, to merge seamlessly into a new character or object.
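A minimal sketch of calling a reference-based Vidu endpoint through fal's Python client (`fal_client`), assuming reference images of each subject plus a prompt describing their interaction. The endpoint ID, the argument names (`reference_image_urls`, `prompt`), and the response shape are assumptions for illustration; check the model page for the exact schema.

```python
# Sketch of a Vidu Reference request via the fal Python client.
# Endpoint ID and argument names are assumptions, not the documented schema.
import fal_client

result = fal_client.subscribe(
    "fal-ai/vidu/reference-to-video",  # assumed endpoint ID
    arguments={
        # Reference images of the subjects to keep consistent in the scene
        "reference_image_urls": [
            "https://example.com/character_a.png",
            "https://example.com/character_b.png",
            "https://example.com/prop.png",
        ],
        # Prompt describing how the subjects should interact
        "prompt": "Character A hands the prop to Character B in a sunlit cafe",
    },
)
print(result["video"]["url"])  # assumed response shape
```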
Vidu
Vidu is a state-of-the-art video generation model for producing high-fidelity videos with cinematic transitions and consistent characters from simple text or image prompts. Leveraging advanced semantic understanding, Vidu delivers smooth, coherent motion and studio-quality results, and it particularly excels at anime and manga video generation. The Q2 variant emphasizes subtle facial expressions, believable micro-acting (blinks, eye movements, lip sync), and smooth camera grammar with stable push-ins, pull-backs, and tracking shots. Q2 offers faster, more predictable generation with clip lengths of up to 8 seconds, first-frame control, and adjustable resolution. It excels at character-driven shorts, product reveals, and stylized motion, with improved expression fidelity, camera stability, and prompt adherence compared to Q1.
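A hedged sketch of a Q2 image-to-video call, using the first frame for control along with the clip length and resolution the description mentions. The endpoint ID and parameter names (`image_url`, `duration`, `resolution`) are assumptions; consult the endpoint schema before relying on them.

```python
# Minimal sketch of a Vidu Q2 image-to-video request via fal_client.
# Endpoint ID and parameter names below are assumptions.
import fal_client

result = fal_client.subscribe(
    "fal-ai/vidu/q2/image-to-video",  # assumed endpoint ID
    arguments={
        # First-frame control: the generated clip starts from this image
        "image_url": "https://example.com/first_frame.png",
        "prompt": "Slow push-in on the character as she blinks and smiles",
        "duration": 8,          # clip length in seconds (up to 8 per the description)
        "resolution": "1080p",  # assumed name of the adjustable-resolution parameter
    },
)
print(result["video"]["url"])  # assumed response shape
```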
Wan-2.1
Wan-2.1 is an advanced and powerful visual generation model developed by Tongyi Lab. It generates videos from text, images, and other control signals. Wan-2.1 excels at realistic videos featuring extensive body movement, complex rotations, dynamic scene transitions, and fluid camera motion; it accurately simulates real-world physics and object interactions while delivering movie-like visuals with rich textures and a variety of stylized effects. It can also render text and dynamic text effects in videos directly from text prompts.
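A sketch of a Wan-2.1 text-to-video request through the same `fal_client` interface, exercising the model's animated-text capability. The endpoint ID is an assumption for illustration; the real ID and argument schema are listed on the model page.

```python
# Hedged sketch of a Wan-2.1 text-to-video request via fal_client.
# The endpoint ID and argument names are assumptions, not the documented schema.
import fal_client

result = fal_client.subscribe(
    "fal-ai/wan-t2v",  # assumed endpoint ID for Wan-2.1 text-to-video
    arguments={
        # Prompt combining camera motion, body movement, and dynamic text
        "prompt": (
            "A dancer spins through a rain-slicked neon street, the camera "
            "tracking in a smooth arc; the word 'WAN' ripples as animated text"
        ),
    },
)
print(result["video"]["url"])  # assumed response shape
```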