F5 TTS

Provided by Fal

Text-to-speech synthesis tool that generates natural, expressive speech from text input, in a voice cloned from a short reference audio clip.

Inputs

Text
text
The text to be converted to speech.
Reference Audio
audio
A short audio clip of the voice you want to clone.
Reference Text
text
The text being spoken in the reference audio. Providing this will dramatically improve the speed of generation.
optional: true
Model Type
dropdown
The TTS model to use for voice synthesis.
default: F5-TTS
accepts: F5-TTS (better quality, slower), E2-TTS (faster, lower quality)
Remove Silence
toggle
Automatically remove silence from the generated audio.
default: true
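
The inputs above can be sketched as a small Python container that applies the listed defaults and checks the dropdown value before anything is sent to the tool. This is only an illustration: the class name `F5TTSInputs`, the snake_case field names, the `to_payload` helper, and the sample file name are hypothetical and are not taken from the provider's API, which may use different parameter names.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# The two dropdown options listed for "Model Type".
MODEL_TYPES = ("F5-TTS", "E2-TTS")


@dataclass
class F5TTSInputs:
    """Hypothetical container mirroring the documented inputs."""

    text: str                             # Text: the text to convert to speech
    reference_audio: str                  # Reference Audio: path or URL of the voice clip
    reference_text: Optional[str] = None  # Reference Text: optional transcript of the clip
    model_type: str = "F5-TTS"            # Model Type: documented default
    remove_silence: bool = True           # Remove Silence: documented default

    def to_payload(self) -> dict:
        """Validate the fields and return them as a plain dict."""
        if not self.text.strip():
            raise ValueError("text must not be empty")
        if not self.reference_audio:
            raise ValueError("reference_audio is required")
        if self.model_type not in MODEL_TYPES:
            raise ValueError(f"model_type must be one of {MODEL_TYPES}")
        payload = asdict(self)
        # Leave the optional transcript out entirely when it was not given.
        if self.reference_text is None:
            del payload["reference_text"]
        return payload


# Example: only the two required inputs are set; defaults fill in the rest.
example = F5TTSInputs(text="Hello there.", reference_audio="voice_sample.wav")
example_payload = example.to_payload()
```

Keeping validation in one place means an unsupported model type fails early with a clear message, rather than after an upload or a remote call.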

Outputs

Audio
audio
The generated audio file with the cloned voice.