Fish Speech is an advanced tool that gives you reel power to train your own custom TTS models. While I've made it easy with an installer, don’t expect everything to be handed to you on a platter. This software is for those ready to dive deep into the world of TTS training—it’s not entirely intuitive. You’ll need to experiment and figure out how to make the most of it, but the rewards are worth it. Once you get the hang of it, you’ll be creating custom voices like a pro.
✨ Features that Will Have You Hooked:
Custom Model Training: The true magic lies in zero-shot and few-shot TTS. Feed it a voice sample of just 10 to 30 seconds to create a voice that’s uniquely yours.
Multilingual Magic: No need to worry about language barriers—support for English, Japanese, Korean, Chinese, French, German, Arabic, and Spanish means you're ready for anything!
Phoneme-Free Power: No phoneme dependency here! The model adapts seamlessly to any language script, giving you more freedom in training.
Precision Performance: Achieve top-notch accuracy, with a character error rate (CER) and word error rate (WER) of around 2% on 5-minute English texts after training.
Fast and Furious: Thanks to fish-tech acceleration, expect speedy output: a real-time factor of about 1:5 on an Nvidia RTX 4060 and 1:15 on an RTX 4090.
🌐 WebUI & GUI Inference: Whether you prefer a Gradio-based browser UI or a PyQt6 graphical interface, Fish Speech works perfectly across Linux, Windows, and macOS.
⚡ Deploy-Friendly: Setting up an inference server on Linux, Windows, or macOS is a breeze, with minimal speed loss.
🤖 Fish Agent: End-to-end ASR + TTS integration means no need for extra models. Just plug and play!
Timbre Control: Use reference audio to adjust the speech timbre to match your vision.
Emotional Depth: Create speech with emotion, bringing your TTS to life like never before.
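If you want to sanity-check the accuracy and speed numbers in the list above on your own clips, here is a rough Python sketch. It is not part of Fish Speech: the function names are made up for illustration, the error rates use the usual edit-distance definition, and the speed helper assumes that "1:5" means one second of compute produces about five seconds of audio.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (free on a match)
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


def synthesis_seconds(audio_seconds, audio_per_compute_second):
    """Estimated compute time, assuming a 1:N ratio means N seconds of audio
    per second of compute (an interpretation, not a documented guarantee)."""
    return audio_seconds / audio_per_compute_second


if __name__ == "__main__":
    script = "the cat sat on the mat"       # text you asked the model to speak
    transcript = "the cat sat on a mat"     # what an ASR tool heard back
    print(f"WER: {wer(script, transcript):.3f}")
    print(f"CER: {cer(script, transcript):.3f}")
    print(f"5 min of audio at 1:5:  {synthesis_seconds(300, 5):.0f} s")
    print(f"5 min of audio at 1:15: {synthesis_seconds(300, 15):.0f} s")
```

To use it, transcribe a generated clip with any speech recognizer and compare that transcript against the text you asked for. Lower is better for both rates.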
Ready to reel in your own custom voices and create truly lifelike speech? Fish Speech is your catch of the day! 🎧
FREE FOR ALL PRAIRIE DOGS AND ABOVE
Gary
2025-01-06 19:34:36 +0000 UTC
Wei Choong Lee
2025-01-06 13:09:19 +0000 UTC