F5-TTS
Flow-matching based text-to-speech with natural prosody and zero-shot voice cloning from short audio samples.
About
F5-TTS is a text-to-speech system based on flow matching. It produces natural-sounding speech with natural prosody, and it supports zero-shot voice cloning: given a short reference audio sample, it synthesizes speech in that voice without any fine-tuning.
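
For readers unfamiliar with flow matching, the general idea can be sketched as follows. This is a toy illustration, not F5-TTS code: a flow-matching model generates a sample by starting from random noise and integrating a velocity field from t = 0 to t = 1. Here the velocity field is a hand-written stand-in that points straight at a known target. In a real system it would be a trained neural network conditioned on the text and the reference audio. The names `velocity`, `sample`, `target`, and `steps` are all hypothetical.

```python
import random


def velocity(x, t, target):
    """Toy velocity field for a straight-line path from noise to `target`.

    Assumption: a real flow-matching model replaces this closed form
    with a learned network; here the target is simply known in advance.
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample(target, steps=8, seed=0):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with Euler steps."""
    rng = random.Random(seed)
    # Start from Gaussian noise, one value per output dimension.
    x = [rng.gauss(0.0, 1.0) for _ in target]
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays strictly below 1, so (1 - t) is never zero
        v = velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


target = [0.5, -1.0, 2.0]  # stand-in for a few audio feature values
result = sample(target)
```

Because the toy field always points directly at the target, the Euler integration lands on it by the final step. A learned field is only an approximation, so practical samplers trade the number of integration steps against output quality.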