Mistral Releases Voxtral TTS, an Open-Weight Voice Model Built for On-Device Use
Summary
Mistral released Voxtral TTS, an open-weight text-to-speech model billed as a 3-billion-parameter system. The model splits into three parts: a 3.4B language model that processes text, a 390M model that generates speech features, and a 300M model that produces the final audio. Together these come to about 4.1B parameters, more than the headline figure suggests. After quantization, it reportedly runs on laptops with 90 ms latency, 6x real-time generation speed, and 3 GB of RAM.
The model handles nine languages and can clone voices from just 5 seconds of audio—including cloning a voice in one language and having it speak another. In Mistral’s internal tests, people preferred Voxtral over ElevenLabs 62.8% of the time for default voices and 69.9% for custom ones. The open-weight release lets companies run TTS on their own hardware, avoiding the cost and privacy concerns of sending audio through external APIs.
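The size and speed figures quoted above lend themselves to a quick sanity check. The Python sketch below sums the three component sizes and works out what 6x real-time speed would mean for a one-minute clip. It assumes that "6x real-time" means six seconds of audio generated per second of compute, which the article does not spell out. The function names are illustrative and not part of any Mistral API.

```python
# Component sizes as quoted in the summary (parameters).
COMPONENT_PARAMS = {
    "text language model": 3.4e9,
    "speech feature generator": 390e6,
    "audio decoder": 300e6,
}


def total_params_billions(components):
    """Sum component parameter counts and express the total in billions."""
    return sum(components.values()) / 1e9


def generation_seconds(audio_seconds, realtime_speed=6.0):
    """Estimate compute time for a clip.

    Assumption: 'N x real-time' means N seconds of audio are produced
    per second of compute, so the time needed is audio length / N.
    """
    return audio_seconds / realtime_speed


if __name__ == "__main__":
    print(f"Total size: {total_params_billions(COMPONENT_PARAMS):.2f}B parameters")
    print(f"One minute of audio: about {generation_seconds(60.0):.0f} s to generate")
```

Under that reading, a one-minute clip would take roughly ten seconds to synthesize, excluding the quoted 90 ms startup latency.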
Analysis
The modular design reflects a broader shift toward AI architectures optimized for consumer hardware rather than data center GPUs. By splitting text understanding, speech generation, and audio output into separate components, Mistral made the system more flexible—companies can potentially swap or fine-tune individual pieces.
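To make the idea of swappable components concrete, here is a minimal Python sketch of a three-stage pipeline in which any one stage can be replaced without touching the others. It is a toy illustration of the design described above, not Mistral's implementation: the class, the stage names, and the stand-in functions are all hypothetical, and the stages do trivial arithmetic in place of real models.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class TTSPipeline:
    """Hypothetical three-stage pipeline: text -> tokens -> features -> audio."""
    text_model: Callable[[str], List[int]]
    acoustic_model: Callable[[List[int]], List[float]]
    audio_decoder: Callable[[List[float]], bytes]

    def synthesize(self, text: str) -> bytes:
        tokens = self.text_model(text)
        features = self.acoustic_model(tokens)
        return self.audio_decoder(features)


# Stand-in stages (assumptions for illustration only, not real models).
def toy_text_model(text: str) -> List[int]:
    return [ord(ch) for ch in text]


def toy_acoustic_model(tokens: List[int]) -> List[float]:
    # Scale token ids into [0, 1], clamping anything above 255.
    return [min(t / 255.0, 1.0) for t in tokens]


def toy_audio_decoder(features: List[float]) -> bytes:
    # One 8-bit sample per feature value.
    return bytes(int(round(f * 255)) for f in features)


def silent_decoder(features: List[float]) -> bytes:
    # Alternative decoder: same length, all-zero samples.
    return bytes(len(features))


pipeline = TTSPipeline(toy_text_model, toy_acoustic_model, toy_audio_decoder)

# Swap only the final stage; the text and acoustic stages are reused as-is.
swapped = replace(pipeline, audio_decoder=silent_decoder)
```

Because each stage is only a callable with a fixed input and output type, replacing or fine-tuning one piece leaves the rest of the pipeline unchanged, which is the flexibility the modular split is meant to offer.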
This positions Mistral against ElevenLabs in a market where most high-quality TTS requires API calls to external servers. For applications like voice assistants or customer service systems, on-device processing eliminates round-trip latency and keeps audio data local. That matters more as regulations around AI and data privacy tighten.
The cross-language voice cloning is worth watching. If it works as advertised, it could make multilingual content production much cheaper. But Mistral’s preference numbers come from internal testing—independent benchmarks will show whether the quality holds up against ElevenLabs and other competitors in real-world use.
Impact Assessment