F5 TTS

Provided by Fal

Text-to-speech tool that generates natural, expressive speech from text input, in a voice cloned from a short reference audio clip.

Inputs

Text
text

The text to be converted to speech.

Reference Audio
audio

A short audio clip of the voice you want to clone.

Reference Text
text

The exact text spoken in the reference audio. Providing it significantly speeds up generation.

optional: true

Model Type
dropdown

The TTS model to use for voice synthesis.

default: F5-TTS
Accepts: F5-TTS (better quality, slower), E2-TTS (faster, lower quality)

Remove Silence
toggle

Automatically remove silence from the generated audio.

default: true
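
As a rough illustration of how the inputs above fit together, the following minimal sketch collects them into a validated dictionary. It is not the provider's API: the class name `F5TTSInputs`, the field names, and the `to_payload` helper are hypothetical, and the reference audio is assumed to be passed as a file path or URL string. Only the defaults and the accepted model types come from the listing above.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Accepted values, as listed under "Model Type".
MODEL_TYPES = ("F5-TTS", "E2-TTS")


@dataclass
class F5TTSInputs:
    """Hypothetical container mirroring the inputs listed above."""

    text: str                             # Text: the text to convert to speech
    reference_audio: str                  # Reference Audio: assumed to be a path or URL string
    reference_text: Optional[str] = None  # Reference Text: optional
    model_type: str = "F5-TTS"            # Model Type: default taken from the listing
    remove_silence: bool = True           # Remove Silence: default taken from the listing

    def to_payload(self) -> dict:
        """Validate the fields and return a plain dict, omitting an unset reference text."""
        if not self.text.strip():
            raise ValueError("text must not be empty")
        if not self.reference_audio.strip():
            raise ValueError("reference_audio must not be empty")
        if self.model_type not in MODEL_TYPES:
            raise ValueError(f"model_type must be one of {MODEL_TYPES}")
        payload = asdict(self)
        if payload["reference_text"] is None:
            del payload["reference_text"]
        return payload


if __name__ == "__main__":
    # Illustrative values only; "voice_sample.wav" is a placeholder file name.
    inputs = F5TTSInputs(
        text="Hello, this is a cloned voice.",
        reference_audio="voice_sample.wav",
        reference_text="This is my reference recording.",
    )
    print(inputs.to_payload())
```

Checking the model type before submission catches typos early, and leaving out an unset reference text keeps the optional input genuinely optional in the resulting dictionary.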

Outputs

Audio
audio

The generated audio file with the cloned voice.