potion-mxbai-256d-v2

A compact static embedding model that achieves near-SOTA quality in a fraction of the size: only 0.68 points behind our 512D baseline on the full MTEB English suite while being 8.2x smaller (7.5MB int8 vs ~62MB).

Highlights

  • 71.45 avg on full MTEB English (STS + Classification + PairClassification, 25 tasks)
  • 7.5MB with int8 quantization (8.2x smaller than 512D baseline)
  • 28.8MB at full precision (float32)
  • 80-88x faster than all-MiniLM-L6-v2 on CPU (~15K vs ~200 sentences/sec)
  • Pure numpy inference (no GPU needed)
  • Native int8 support via model2vec v0.7 with zero quality loss
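The int8 size comes from quantizing the embedding table. model2vec handles this internally; the mechanics can be sketched in plain numpy (the symmetric per-tensor scaling scheme below is illustrative, not necessarily the exact scheme model2vec uses, and the table is random stand-in data):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a float32 embedding table (vocab_size x dims)
table = rng.normal(size=(29524, 256)).astype(np.float32)

# Symmetric quantization: map [-max_abs, max_abs] onto [-127, 127]
scale = np.abs(table).max() / 127.0
table_i8 = np.round(table / scale).astype(np.int8)  # 4x smaller than float32

# Dequantize on load; per-component error is bounded by scale / 2,
# which barely moves cosine similarities between 256-dim vectors
table_deq = table_i8.astype(np.float32) * scale
max_err = np.abs(table - table_deq).max()
print(f"max absolute error: {max_err:.4f} (scale = {scale:.4f})")
```

Because cosine similarity is what downstream tasks use, tiny per-component rounding noise on long vectors is effectively invisible, which is consistent with the "zero quality loss" observation above.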

How It Was Made

  1. Teacher: mixedbread-ai/mxbai-embed-large-v1 (335M params, BERT-large architecture)
  2. Distillation: model2vec distillation with 256-dim PCA and corpus-informed vocabulary
  3. Tokenlearn pre-training: contrastive-loss training on ~217K C4 English sentences
  4. Born-again self-distillation: A second round of contrastive training using the model's own sentence embeddings as targets instead of the teacher's. This closes the "representation gap" between what a static model can learn and what the transformer teacher produces, yielding +0.49 avg improvement.
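Step 4 can be sketched as follows: a frozen copy of the trained static model produces target sentence embeddings, and a freshly initialized embedding table is trained to reproduce them. This toy numpy version uses SGD on an MSE objective over mean-pooled lookups with made-up token ids; the real run uses tokenlearn's contrastive objective at batch size 256:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, dims = 50, 8

# Frozen copy of the previously trained static model.
frozen = rng.normal(size=(vocab, dims))

# Toy corpus: each "sentence" is an array of token ids.
sentences = [rng.integers(0, vocab, size=int(rng.integers(3, 10))) for _ in range(200)]

def embed(table, ids):
    return table[ids].mean(axis=0)  # lookup + mean pool

# Targets are the model's OWN outputs, not the transformer teacher's.
targets = np.stack([embed(frozen, s) for s in sentences])

# Fresh student table, trained with plain per-sentence gradient descent on MSE.
student = rng.normal(size=(vocab, dims)) * 0.1
lr = 0.5
losses = []
for epoch in range(200):
    loss = 0.0
    for ids, t in zip(sentences, targets):
        err = embed(student, ids) - t          # d(MSE)/d(prediction)
        loss += float(err @ err)
        # Each token in the sentence receives an equal share of the gradient.
        np.add.at(student, ids, np.tile(-lr * err / len(ids), (len(ids), 1)))
    losses.append(loss)
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.5f}")
```

Because the targets were themselves produced by a lookup-and-pool model, they are exactly representable by the student, which is the point of the technique.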

Key insight

The original teacher's representations are too complex for a static lookup table to fully capture. By self-distilling โ€” training a fresh model to match our own model's outputs โ€” we create "easier" targets that a static model can actually learn well. This simple technique improved all three benchmark categories.
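For intuition about what "a static lookup table" means here: the entire forward pass is a table lookup followed by mean pooling, nothing else. A toy sketch with a made-up six-word vocabulary and random 4-dim vectors (the real model uses the mxbai teacher tokenizer and a 29,524 x 256 table):

```python
import numpy as np

# Toy vocabulary and embedding table.
vocab = {"hello": 0, "world": 1, "static": 2, "embeddings": 3, "are": 4, "fast": 5}
rng = np.random.default_rng(0)
table = rng.normal(size=(len(vocab), 4)).astype(np.float32)

def encode(sentence: str) -> np.ndarray:
    """Lookup + mean pool: the whole forward pass of a static model."""
    ids = [vocab[w] for w in sentence.lower().split() if w in vocab]
    return table[ids].mean(axis=0)

e1 = encode("Hello world")
e2 = encode("Static embeddings are fast")
print(e1.shape)  # (4,)
```

There is no attention and no context mixing, which is why complex teacher representations can exceed what this function class can express, and why self-distilled targets are easier to fit.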

Benchmark Results (Full MTEB English Suite)

Model                        STS    Classification  PairClassification  Avg    Size (int8)
potion-mxbai-2m-512d         74.15  65.44           76.80               72.13  ~125MB
potion-mxbai-256d-v2 (this)  73.79  63.23           77.33               71.45  7.5MB
potion-mxbai-128d-v2         71.75  60.45           74.40               68.87  3.9MB

Evaluated on 25 tasks (10 STS, 12 Classification, 3 PairClassification), English subsets only, identical eval code across all models.
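The STS column is the Spearman correlation between cosine similarities of sentence pairs and human judgments, averaged over tasks. MTEB handles the datasets and aggregation; the core computation, shown here with random stand-ins for the encoded pairs and gold scores, is just this:

```python
import numpy as np

def cosine(a, b):
    return np.sum(a * b, axis=1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))

def spearman(x, y):
    # Spearman = Pearson correlation of the ranks (no-ties case).
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))

rng = np.random.default_rng(0)
emb_a = rng.normal(size=(20, 8))    # stand-in for model.encode(sentences_a)
emb_b = rng.normal(size=(20, 8))    # stand-in for model.encode(sentences_b)
gold = rng.uniform(0, 5, size=20)   # stand-in human similarity scores

score = spearman(cosine(emb_a, emb_b), gold)
```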

Usage

from model2vec import StaticModel

# Full precision (28.8MB)
model = StaticModel.from_pretrained("blobbybob/potion-mxbai-256d-v2")

# INT8 quantized (7.5MB, same quality)
model = StaticModel.from_pretrained("blobbybob/potion-mxbai-256d-v2", quantize_to="int8")

embeddings = model.encode(["Hello world", "Static embeddings are fast"])
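The returned embeddings are plain numpy arrays, so nearest-neighbor search is a normalized matrix product. A sketch with random vectors standing in for model.encode output on a corpus:

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(1000, 256)).astype(np.float32)  # stand-in for model.encode(corpus)
query = docs[42] + 0.01 * rng.normal(size=256).astype(np.float32)

# Normalize once, then cosine similarity is a single matrix-vector product.
docs_n = docs / np.linalg.norm(docs, axis=1, keepdims=True)
query_n = query / np.linalg.norm(query)
top5 = np.argsort(docs_n @ query_n)[::-1][:5]
print(top5[0])  # 42: the lightly perturbed source document ranks first
```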

With Sentence Transformers:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("blobbybob/potion-mxbai-256d-v2")
embeddings = model.encode(["Hello world", "Static embeddings are fast"])

When to use this model

  • You need fast, lightweight embeddings for search, clustering, or classification
  • You're deploying on resource-constrained environments (mobile, edge, serverless)
  • You want sub-millisecond inference without GPU
  • You need embeddings in a small binary (7.5MB vs 100MB+ for transformer models)

Model Family

Model                  Avg    Size (int8)  Best for
potion-mxbai-2m-512d   72.13  ~125MB       Maximum quality
potion-mxbai-256d-v2   71.45  7.5MB        Best quality/size balance
potion-mxbai-128d-v2   69.83  3.9MB        Compact deployments
potion-mxbai-micro     68.12  0.7MB        Ultra-tiny / embedded

Training Details

  • Featurization: ~217K C4 sentences encoded by mxbai-embed-large-v1
  • Training: Tokenlearn contrastive loss + born-again self-distillation, batch size 256
  • Vocabulary: 29,524 tokens (corpus-informed vocabulary from mxbai teacher tokenizer)
  • Dimensions: 256 (via PCA reduction from teacher's 1024-dim output)
  • Compute: Local RTX 2070
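The 1024 -> 256 reduction noted above is a standard PCA of the teacher's token embeddings. A minimal sketch via SVD, with random stand-ins for the teacher vectors (model2vec does this internally during distillation):

```python
import numpy as np

rng = np.random.default_rng(0)
teacher = rng.normal(size=(500, 1024))  # stand-in: tokens x teacher dims

# PCA via SVD: project centered vectors onto the top 256 right singular vectors.
mean = teacher.mean(axis=0)
_, _, vt = np.linalg.svd(teacher - mean, full_matrices=False)
reduced = (teacher - mean) @ vt[:256].T

print(reduced.shape)  # (500, 256)
```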

Citation

@article{minishlab2024model2vec,
  author = {Tulkens, Stephan and {van Dongen}, Thomas},
  title = {Model2Vec: Fast State-of-the-Art Static Embeddings},
  year = {2024},
  url = {https://github.com/MinishLab/model2vec}
}
Evaluation results

  • Spearman cosine on MTEB STS (English, 10 tasks): 73.79 (self-reported)
  • Accuracy on MTEB Classification (English, 12 tasks): 63.23 (self-reported)
  • AP on MTEB PairClassification (English, 3 tasks): 77.33 (self-reported)