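[Figure: Architecture diagram of Moshi adapted for Hindi. Colour legend: Frozen, Retained, Reinitialised, New (Hindi). Symbol legend: ✦ = reinitialised for Hindi, ★ = new component. Panels: User Audio and Moshi Audio pass through the Mimi Encoder and Mimi Decoder (frozen) as audio tokens; the Temporal Transformer (7B language model, self-attention layers) with Text Embeddings (✦) and Audio Embeddings, producing z_s; the Depth Transformer (causal self-attention) with Text Embeddings (✦), Audio Embeddings, and Text Linear (✦), producing Hindi Text and Audio Tokens; Hindi SentencePiece (★) with a Hindi vocabulary.]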