Voxtral-4B-TTS-2603 (MLX 4bit)
MLX 4-bit version of mistralai/Voxtral-4B-TTS-2603, a 4B-parameter multilingual text-to-speech model with 20 voice presets across 9 languages.
Size: ~2.5GB
Use with mlx-audio
pip install -U mlx-audio
from mlx_audio.tts.utils import load
model = load("mlx-community/Voxtral-4B-TTS-2603-mlx-4bit")
for result in model.generate(
    text="Hello, this is a test of Voxtral text-to-speech!",
    voice="casual_male",
):
    # result.audio is an mx.array of 24kHz audio samples
    print(f"Generated {result.audio_duration} of audio")
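The loop above only prints the duration of each result. To keep the audio, one option is to write the samples to a WAV file. The following is a minimal sketch using only the Python standard library. `write_wav` is a hypothetical helper, not part of mlx-audio, and it assumes the samples are mono floats in the range [-1, 1] at the 24 kHz rate mentioned in the comment above.

```python
import struct
import wave

SAMPLE_RATE = 24_000  # assumption: matches the 24 kHz noted in the snippet above


def write_wav(path, chunks, sample_rate=SAMPLE_RATE):
    """Write chunks of float samples to a mono 16-bit PCM WAV file.

    Returns the number of seconds of audio written.
    """
    total_samples = 0
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono (assumed)
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        for chunk in chunks:
            # Clip to [-1, 1] (assumed input range), then scale to int16.
            pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in chunk]
            wav.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
            total_samples += len(pcm)
    return total_samples / sample_rate
```

With the model loaded as above, something like `write_wav("out.wav", (r.audio.tolist() for r in model.generate(text=..., voice="casual_male")))` should work, assuming each `result.audio` is a one-dimensional array of float samples.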
Available Voices
English: casual_male, casual_female, cheerful_female, neutral_male, neutral_female
French: fr_male, fr_female | Spanish: es_male, es_female | German: de_male, de_female
Italian: it_male, it_female | Portuguese: pt_male, pt_female | Dutch: nl_male, nl_female
Arabic: ar_male | Hindi: hi_male, hi_female
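For scripts that pick a voice by language, the list above can be kept as a small lookup table. This is only a sketch: the preset names are copied from the list, while `VOICES`, `voices_for`, and `default_voice` are hypothetical helpers and not part of mlx-audio.

```python
# Voice presets as listed above, keyed by language.
VOICES = {
    "English": ["casual_male", "casual_female", "cheerful_female",
                "neutral_male", "neutral_female"],
    "French": ["fr_male", "fr_female"],
    "Spanish": ["es_male", "es_female"],
    "German": ["de_male", "de_female"],
    "Italian": ["it_male", "it_female"],
    "Portuguese": ["pt_male", "pt_female"],
    "Dutch": ["nl_male", "nl_female"],
    "Arabic": ["ar_male"],
    "Hindi": ["hi_male", "hi_female"],
}


def voices_for(language):
    """Return the preset names for a language (case-insensitive)."""
    key = language.strip().capitalize()
    if key not in VOICES:
        raise ValueError(f"No voice presets listed for {language!r}")
    return list(VOICES[key])


def default_voice(language):
    """Return the first listed preset for a language."""
    return voices_for(language)[0]
```

The returned name can then be passed as the `voice` argument to `model.generate`, for example `voice=default_voice("German")`.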
Throughput (Apple Silicon)
| Variant | Short RTF | Long RTF | Size |
|---|---|---|---|
| 4-bit | 0.97x | 0.74x | ~2.5GB |
| 6-bit | 1.15x | 1.07x | ~3.5GB |
| bf16 | 6.50x | 6.32x | ~8GB |
RTF = Real-Time Factor (lower is faster, <1.0 = faster than real-time).
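To reproduce a figure like the ones in the table, RTF is commonly computed as generation time divided by the duration of the audio produced. The sketch below uses that common definition; the card does not state exactly how its numbers were measured, so treat this as an assumption. Both function names are hypothetical.

```python
def audio_seconds(num_samples, sample_rate=24_000):
    """Duration of a clip in seconds, assuming 24 kHz mono output."""
    return num_samples / sample_rate


def real_time_factor(generation_seconds, audio_duration_seconds):
    """RTF = time spent generating / duration of the generated audio.

    Values below 1.0 mean the audio was produced faster than real time.
    """
    if audio_duration_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return generation_seconds / audio_duration_seconds
```

Under this definition, generating 10 seconds of audio in 9.7 seconds of wall-clock time gives an RTF of about 0.97. Wall-clock time can be measured with `time.perf_counter()` around the `model.generate` loop.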
Model tree for mlx-community/Voxtral-4B-TTS-2603-mlx-4bit
Base model: mistralai/Ministral-3-3B-Base-2512
Finetuned: mistralai/Voxtral-4B-TTS-2603