F5 TTS
Provided by Fal
F5 TTS is a text-to-speech model that generates natural, expressive speech from text input.
Wan-2.6 Video
Wan 2.6 is an AI video generation model with smart shot scheduling for multi-shot storytelling and higher-quality voice generation, including stable multi-speaker dialogue and natural-sounding voices. It generates multi-shot sequences that tell a full story while keeping key details consistent between shots, and it can plan scenes automatically from simple prompts. Wan 2.6 generates videos up to 15 seconds long at 1080p and 24 fps with native audio-visual synchronization, so dialogue, music, and sound effects stay aligned with character movements and lip-sync. Its video reference generation takes a reference video for look and voice, then follows your prompt to create new clips. The model supports text-to-video, image-to-video, and video-to-video generation.
MiniMax Music
Generates music from text prompts using the MiniMax model, producing high-quality, diverse musical compositions.