NLLB-200 Distilled 600M Lezgi-Russian (v1)
This repository provides an NLLB-200 Distilled 600M model fine-tuned for Lezgi <-> Russian translation.
Model Description
- Base model: facebook/nllb-200-distilled-600M
- Architecture: M2M100ForConditionalGeneration
- Languages: Lezgi (lez_Cyrl), Russian (ru_Cyrl)
- Direction: bidirectional (Lezgi <-> Russian)
- Tokenizer: NllbTokenizer with SentencePiece model
Intended Uses
- Machine translation between Lezgi and Russian.
- Bootstrapping parallel data or assisting human translation workflows.
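The bootstrapping use case can be sketched as a small loop that sends monolingual sentences through a translation function and writes source/hypothesis pairs as TSV. This is only an illustration: `bootstrap_parallel_tsv` and `translate_fn` are hypothetical names, and `translate_fn` stands for any callable that maps a list of sentences to a list of translations (for example, a wrapper around this model). The output should be treated as draft data for human review.

```python
from typing import Callable, Iterable, List


def _clean(field: str) -> str:
    """Collapse tabs, newlines and repeated spaces so a field fits one TSV cell."""
    return " ".join(field.split())


def bootstrap_parallel_tsv(
    sources: Iterable[str],
    translate_fn: Callable[[List[str]], List[str]],  # hypothetical translation callable
    batch_size: int = 8,
) -> str:
    """Translate sentences in batches and return 'source<TAB>hypothesis' lines."""
    sentences = [s.strip() for s in sources if s.strip()]  # drop empty inputs
    lines = []
    for start in range(0, len(sentences), batch_size):
        batch = sentences[start:start + batch_size]
        hypotheses = translate_fn(batch)
        if len(hypotheses) != len(batch):
            raise ValueError("translate_fn must return one output per input")
        for src, hyp in zip(batch, hypotheses):
            lines.append(f"{_clean(src)}\t{_clean(hyp)}")
    return "".join(line + "\n" for line in lines)
```

Keeping the translation step behind a plain callable means the same loop works with this model, another system, or a stub during testing.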
Limitations and Bias
- Translation quality may vary across domains and dialects.
- The model may produce hallucinations or incorrect translations.
- Biases present in the training data may be reflected in outputs.
How to Use
Install dependencies:
pip install transformers sentencepiece torch
Example (RU -> LEZ):
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
model_id = "vadim-pashaev/nllb-200-distilled-600M-lez-rus-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="ru_Cyrl", tgt_lang="lez_Cyrl")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
text = "Привет, как дела?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Example (LEZ -> RU):
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
model_id = "vadim-pashaev/nllb-200-distilled-600M-lez-rus-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="lez_Cyrl", tgt_lang="ru_Cyrl")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
text = "Салам, гьикI я?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
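The two examples above can be folded into one helper that loads the model once and translates batches in either direction. This is a minimal sketch, not the author's code: `resolve_direction`, `load`, and `translate_batch` are hypothetical names, and the short keys `"lez"` and `"ru"` are a convenience introduced here. The `force_target_token` flag is an assumption as well. NLLB checkpoints are commonly run with `forced_bos_token_id` set to the target language token, but this card does not say whether the fine-tuned model needs it, so it is off by default to match the examples above.

```python
from typing import List, Tuple

MODEL_ID = "vadim-pashaev/nllb-200-distilled-600M-lez-rus-v1"
# Language codes exactly as used in this card's examples.
LANG_CODES = {"lez": "lez_Cyrl", "ru": "ru_Cyrl"}


def resolve_direction(src: str, tgt: str) -> Tuple[str, str]:
    """Map short keys ('lez', 'ru') to the tokenizer language codes."""
    if src not in LANG_CODES or tgt not in LANG_CODES or src == tgt:
        raise ValueError(f"unsupported direction: {src} -> {tgt}")
    return LANG_CODES[src], LANG_CODES[tgt]


def load():
    """Load tokenizer and model once (imported lazily; downloads on first use)."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    return tokenizer, model


def translate_batch(tokenizer, model, texts: List[str], src: str, tgt: str,
                    max_length: int = 200, force_target_token: bool = False) -> List[str]:
    """Translate a list of sentences in the given direction."""
    src_code, tgt_code = resolve_direction(src, tgt)
    tokenizer.src_lang = src_code  # source language tag added to the inputs
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    gen_kwargs = {"max_length": max_length}
    if force_target_token:
        # Assumption: optionally force the first generated token to the target code.
        gen_kwargs["forced_bos_token_id"] = tokenizer.convert_tokens_to_ids(tgt_code)
    outputs = model.generate(**inputs, **gen_kwargs)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)


# Usage (requires network access and PyTorch):
# tokenizer, model = load()
# print(translate_batch(tokenizer, model, ["Привет, как дела?"], "ru", "lez"))
```

Comparing outputs with and without `force_target_token=True` on a few sentences is a quick way to check whether the flag matters for this checkpoint.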
Training Data
Training data was built from Lezgi-language articles taken from Lezgi Wikipedia and the Lezgi Gazet website. The Lezgi texts were translated into Russian using the gpt-5.2-codex (medium) model, so the Russian side of the corpus is machine-generated. The resulting parallel data was used to train this model.
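According to the training settings, each TSV row of the parallel data is used in both directions. A rough sketch of that expansion is shown here. The column order (Lezgi first, Russian second), the function name `make_bidirectional_pairs`, and the dictionary keys are assumptions made for illustration; the actual preprocessing script is not part of this card.

```python
import csv
import io
from typing import Dict, List


def make_bidirectional_pairs(tsv_text: str,
                             lez_code: str = "lez_Cyrl",
                             rus_code: str = "ru_Cyrl") -> List[Dict[str, str]]:
    """Turn every (Lezgi, Russian) row into a Lez->Rus and a Rus->Lez example."""
    examples = []
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip malformed rows
        lez, rus = row[0].strip(), row[1].strip()
        if not lez or not rus:
            continue  # skip rows with an empty side
        examples.append({"src_lang": lez_code, "tgt_lang": rus_code, "src": lez, "tgt": rus})
        examples.append({"src_lang": rus_code, "tgt_lang": lez_code, "src": rus, "tgt": lez})
    return examples


# Illustrative row built from the example sentences in this card,
# not a row from the actual training set.
sample = "Салам, гьикI я?\tПривет, как дела?\n"
pairs = make_bidirectional_pairs(sample)
print(len(pairs))  # 2
```

Each example carries its own language codes, so a tokenization step can set the source and target language per example instead of per dataset.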
Training Procedure
Training settings:
- Base model: facebook/nllb-200-distilled-600M (NLLB-200 Distilled 600M).
- Data setup: bidirectional pairs (Lez->Rus and Rus->Lez) from the same TSV rows.
- Max lengths: 192 (source) / 192 (target).
- Batch size: 2 per device, gradient accumulation 16 (effective 32 per device).
- Epochs: 6.
- Optimizer: AdamW, LR 3e-5, cosine scheduler, warmup ratio 0.03, weight decay 0.01, label smoothing 0.0.
- Precision: bf16 enabled, tf32 enabled, fp16 disabled.
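For readers who want to reproduce a similar setup, the settings above can be written as keyword arguments in the style of the Transformers `Seq2SeqTrainingArguments` class. This is a sketch under assumptions: the card does not state which training script was used, and the argument names follow the transformers v4.x API, so they may differ in other versions. The optimizer is left at the Trainer default, which is an AdamW variant.

```python
# Tokenization limits reported in the card (applied when encoding, not by the Trainer).
MAX_SOURCE_LENGTH = 192
MAX_TARGET_LENGTH = 192

# Hyperparameters from the card, keyed by transformers v4.x argument names (assumed mapping).
training_kwargs = {
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 16,
    "num_train_epochs": 6,
    "learning_rate": 3e-5,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.03,
    "weight_decay": 0.01,
    "label_smoothing_factor": 0.0,
    "bf16": True,   # needs hardware with bfloat16 support
    "tf32": True,   # needs an NVIDIA Ampere or newer GPU
    "fp16": False,
}

# Effective batch size per device = micro-batch size x accumulation steps.
effective_batch_per_device = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
print(effective_batch_per_device)  # 32

# Possible usage (output_dir is a hypothetical path):
# from transformers import Seq2SeqTrainingArguments
# args = Seq2SeqTrainingArguments(output_dir="nllb-lez-rus", **training_kwargs)
```

With more than one GPU, the global batch size would be this per-device value multiplied by the number of devices; the card does not report how many devices were used.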
Model Versioning
This is version v1. Future updates will be released under new tags or versions.
Citation
If you use this model, please cite:
@misc{nllb_lez_rus_v1,
title = {NLLB-200 Distilled 600M Lezgi-Russian (v1)},
author = {Vadim Pashaev},
year = {2026},
howpublished = {Hugging Face Hub}
}
License
cc-by-4.0 (Creative Commons Attribution 4.0).