Voxtral 4B TTS, MLX 4-bit Quantized

4-bit quantized weights for Voxtral 4B TTS on Apple Silicon via MLX.

  • Backbone: 4-bit (group_size=64), ~2.6 GB (down from ~6.8 GB BF16)
  • Acoustic transformer: BF16 (unchanged)
  • Vocoder: BF16, pre-processed (weight-norm reconstructed, conv weights transposed, codebook precomputed)

Total file size: 3.4 GB (vs 7.5 GB BF16)
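As a rough sanity check on the sizes above: MLX-style affine 4-bit quantization stores, per group of 64 weights, the packed 4-bit values plus one scale and one bias (16 bits each, per my understanding of the scheme), for 4.5 effective bits per weight; any gap to the ~2.6 GB figure would come from tensors left in higher precision.

```python
# Back-of-the-envelope storage cost of affine 4-bit quantization with
# group_size=64 (assumes one fp16 scale and one fp16 bias per group).
group_size = 64
bits = 4
bits_per_group = group_size * bits + 16 + 16  # packed weights + scale + bias
effective_bits = bits_per_group / group_size
print(effective_bits)  # 4.5 bits per weight, vs 16 for BF16
```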

Usage

Requires the inference code from redseaplume/Voxtral-4B-TTS-2603-MLX; this repo holds only the weights, config, and tokenizer. Point model_path at this repo.

What's in here

  • consolidated.safetensors: all three components in one file
  • params.json: model config
  • tekken.json: tokenizer
  • voice_embedding/: 20 pre-computed voice embeddings (.pt and .npz)
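The .npz voice embeddings can be inspected with plain NumPy. A minimal sketch of the round trip, using a stand-in file; the filename, key name, and shape here are illustrative assumptions, not the repo's actual layout (check a real file with np.load(...).files):

```python
import numpy as np

# Write and read back a stand-in voice embedding to show the .npz round trip.
np.savez("demo_voice.npz", embedding=np.zeros((1, 512), dtype=np.float32))
emb = np.load("demo_voice.npz")["embedding"]
print(emb.shape, emb.dtype)  # (1, 512) float32
```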

Notes

  • Only the backbone is quantized. Acoustic transformer and vocoder stay BF16.
  • Generation output differs slightly from BF16 (quantization is lossy). Frame counts may vary.
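A toy illustration of why the output differs: affine quantization over one group of 64 values (a simplification of the actual MLX kernels) reconstructs weights only to within half a quantization step.

```python
import numpy as np

# Quantize one group of 64 weights to 4 bits (16 levels) and reconstruct.
rng = np.random.default_rng(0)
w = rng.standard_normal(64).astype(np.float32)
scale = (w.max() - w.min()) / 15      # 4 bits -> 16 levels
q = np.round((w - w.min()) / scale)   # integer codes in [0, 15]
w_hat = q * scale + w.min()           # dequantized weights
err = np.abs(w - w_hat).max()
print(err <= scale / 2 + 1e-6)        # True: error bounded by half a step
```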