StyleTTS2

yl4579

Style-based text-to-speech achieving human-level naturalness through style diffusion and adversarial training with prosody modeling.

About

StyleTTS2 is a text-to-speech model that targets human-level naturalness by combining style diffusion with adversarial training. It treats speech style as a latent random variable and samples it through diffusion, which yields expressive, natural-sounding output.

Deployment Options

1 stack
