# potion-mxbai-256d-v2
A compact static embedding model that achieves near-SOTA quality at 8.2x smaller size: only 0.68 points behind our 512D baseline on the full MTEB English suite (7.5MB int8 vs ~62MB).
## Highlights
- 71.45 avg on full MTEB English (STS + Classification + PairClassification, 25 tasks)
- 7.5MB with int8 quantization (8.2x smaller than 512D baseline)
- 28.8MB at full precision (float32)
- 80-88x faster than all-MiniLM-L6-v2 on CPU (~15K vs ~200 sentences/sec)
- Pure numpy inference (no GPU needed)
- Native int8 support via model2vec v0.7 with zero quality loss
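Static-model inference is nothing more than an embedding-table lookup followed by mean pooling, which is where the speed comes from. A minimal numpy sketch of the idea (the tiny vocabulary, whitespace tokenizer, and random weights below are illustrative stand-ins, not the model's actual tokenizer or learned vectors):

```python
import numpy as np

# Toy stand-ins: the real model ships a 29,524-token vocabulary and a
# 29,524 x 256 embedding matrix learned by distillation.
vocab = {"hello": 0, "world": 1, "static": 2, "embeddings": 3, "are": 4, "fast": 5}
rng = np.random.default_rng(0)
table = rng.standard_normal((len(vocab), 256)).astype(np.float32)

def encode(sentence: str) -> np.ndarray:
    """Look up each token's vector, mean-pool, and L2-normalize."""
    ids = [vocab[t] for t in sentence.lower().split() if t in vocab]
    pooled = table[ids].mean(axis=0)
    return pooled / np.linalg.norm(pooled)

emb = encode("Hello world")
print(emb.shape)  # (256,)
```

No matrix multiplications per token, no attention, no GPU: a few table lookups and a mean, which is why throughput is orders of magnitude above a transformer encoder.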
## How It Was Made
- Teacher: mixedbread-ai/mxbai-embed-large-v1 (335M params, BERT-large architecture)
- Distillation: model2vec distillation with 256-dim PCA and corpus-informed vocabulary
- Tokenlearn pre-training: Contrastive loss training on ~217K C4 English sentences using tokenlearn
- Born-again self-distillation: A second round of contrastive training using the model's own sentence embeddings as targets instead of the teacher's. This closes the "representation gap" between what a static model can learn and what the transformer teacher produces, yielding +0.49 avg improvement.
### Key insight
The original teacher's representations are too complex for a static lookup table to fully capture. By self-distilling (training a fresh model to match our own model's outputs) we create "easier" targets that a static model can actually learn well. This simple technique improved all three benchmark categories.
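As a toy illustration of the born-again idea (this is not the actual tokenlearn training code; the linear student, random data, and hyperparameters below are made up for the sketch), a fresh student trained against the stage-1 model's own outputs can drive its error far below where it started, because those targets are realizable by the student's function class:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sent, dim = 512, 32

# X is a fixed featurization of each "sentence" (toy stand-in).
X = rng.random((n_sent, 64)).astype(np.float32)

# Stage 1: the trained model's sentence embeddings become the *targets*.
W_stage1 = rng.standard_normal((64, dim)).astype(np.float32) * 0.1
targets = X @ W_stage1                       # "our own model's outputs"

# Born-again: train a *fresh* model from scratch to match those targets.
W = np.zeros((64, dim), dtype=np.float32)
init_mse = float(np.mean(targets ** 2))      # loss at W = 0
for _ in range(200):
    pred = X @ W
    grad = X.T @ (pred - targets) / n_sent   # MSE gradient
    W -= 0.05 * grad

final_mse = float(np.mean((X @ W - targets) ** 2))
print(final_mse)  # well below init_mse: the student can fit its own targets
```

The contrast with teacher distillation is that transformer outputs generally are *not* realizable by a static lookup table, so some gap always remains; self-generated targets sit inside the student's hypothesis space.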
## Benchmark Results (Full MTEB English Suite)
| Model | STS | Classification | PairClassification | Avg | Size (int8) |
|---|---|---|---|---|---|
| potion-mxbai-2m-512d | 74.15 | 65.44 | 76.80 | 72.13 | ~125MB |
| potion-mxbai-256d-v2 (this) | 73.79 | 63.23 | 77.33 | 71.45 | 7.5MB |
| potion-mxbai-128d-v2 | 71.75 | 60.45 | 74.40 | 68.87 | 3.9MB |
Evaluated on 25 tasks (10 STS, 12 Classification, 3 PairClassification), English subsets only, identical eval code across all models.
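The Avg column is the unweighted mean of the three category scores; for this model:

```python
# Category scores from the benchmark table above.
sts, classification, pair_classification = 73.79, 63.23, 77.33
avg = round((sts + classification + pair_classification) / 3, 2)
print(avg)  # 71.45
```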
## Usage
```python
from model2vec import StaticModel

# Full precision (28.8MB)
model = StaticModel.from_pretrained("blobbybob/potion-mxbai-256d-v2")

# INT8 quantized (7.5MB, same quality)
model = StaticModel.from_pretrained("blobbybob/potion-mxbai-256d-v2", quantize_to="int8")

embeddings = model.encode(["Hello world", "Static embeddings are fast"])
```
With Sentence Transformers:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("blobbybob/potion-mxbai-256d-v2")
embeddings = model.encode(["Hello world", "Static embeddings are fast"])
```
## When to use this model
- You need fast, lightweight embeddings for search, clustering, or classification
- You're deploying on resource-constrained environments (mobile, edge, serverless)
- You want sub-millisecond inference without GPU
- You need embeddings in a small binary (7.5MB vs 100MB+ for transformer models)
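The 4x shrink from float32 to int8 comes from quantizing the embedding table. A minimal numpy sketch of symmetric int8 quantization (not necessarily model2vec's exact scheme; the random table stands in for the real weights):

```python
import numpy as np

rng = np.random.default_rng(0)
emb = rng.standard_normal((29_524, 256)).astype(np.float32)  # full-precision table

# Symmetric int8: map [-max_abs, max_abs] onto integer steps in [-127, 127].
scale = np.abs(emb).max() / 127.0
q = np.clip(np.round(emb / scale), -127, 127).astype(np.int8)
deq = q.astype(np.float32) * scale           # dequantize at load time

print(emb.nbytes / 1e6, q.nbytes / 1e6)      # ~30.2 MB -> ~7.6 MB (4x smaller)
print(float(np.max(np.abs(deq - emb))))      # worst-case error <= scale / 2
```

Because downstream similarity scores depend on vector directions rather than exact coordinates, rounding error this small is why int8 costs essentially no quality here.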
## Model Family
| Model | Avg | Size (int8) | Best for |
|---|---|---|---|
| potion-mxbai-2m-512d | 72.13 | ~125MB | Maximum quality |
| potion-mxbai-256d-v2 | 71.45 | 7.5MB | Best quality/size balance |
| potion-mxbai-128d-v2 | 68.87 | 3.9MB | Compact deployments |
| potion-mxbai-micro | 68.12 | 0.7MB | Ultra-tiny / embedded |
## Training Details
- Featurization: ~217K C4 sentences encoded by mxbai-embed-large-v1
- Training: Tokenlearn contrastive loss + born-again self-distillation, batch size 256
- Vocabulary: 29,524 tokens (corpus-informed vocabulary from mxbai teacher tokenizer)
- Dimensions: 256 (via PCA reduction from teacher's 1024-dim output)
- Compute: Local RTX 2070
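The 1024 to 256 reduction is standard PCA over the teacher's token vectors. A sketch via SVD in numpy (a smaller random matrix stands in for the real teacher embeddings so it runs quickly; the real reduction is over all 29,524 vocabulary vectors):

```python
import numpy as np

rng = np.random.default_rng(0)
teacher = rng.standard_normal((2_000, 1024)).astype(np.float32)  # stand-in vectors

# PCA: center, take the top 256 right-singular vectors, project.
mean = teacher.mean(axis=0)
_, _, vt = np.linalg.svd(teacher - mean, full_matrices=False)
reduced = (teacher - mean) @ vt[:256].T

print(reduced.shape)  # (2000, 256)
```

The projection keeps the directions of highest variance, so most of the pairwise-similarity structure of the 1024-dim teacher space survives at a quarter of the width.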
## Citation

```bibtex
@article{minishlab2024model2vec,
  author = {Tulkens, Stephan and {van Dongen}, Thomas},
  title = {Model2Vec: Fast State-of-the-Art Static Embeddings},
  year = {2024},
  url = {https://github.com/MinishLab/model2vec}
}
```