Add exported openvino model 'openvino_model_qint8_quantized.xml'

#4, opened by tomaarsen (HF Staff)

Hello!

This pull request adds an exported OpenVINO model (openvino_model_qint8_quantized.xml), statically quantized to int8.

Config

OVQuantizationConfig(
    quant_method=<OVQuantizationMethod.DEFAULT: 'default'>
)

Testing this pull request

You can test this pull request before merging by loading the model from this PR with the revision argument:

from sentence_transformers import SentenceTransformer

# NOTE: Update this to the number of your pull request (this PR is #4)
pr_number = 4
model = SentenceTransformer(
    "tomaarsen/distilroberta-base-nli-v2-bf16-bf16",
    revision=f"refs/pr/{pr_number}",
    backend="openvino",
    model_kwargs={"file_name": "openvino_model_qint8_quantized.xml"},
)

# Verify that everything works as expected
embeddings = model.encode(["The weather is lovely today.", "It's so sunny outside!", "He drove to the stadium."])
print(embeddings.shape)

similarities = model.similarity(embeddings, embeddings)
print(similarities)
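
Beyond eyeballing the printed matrix, a small structural check can catch a badly broken export. The sketch below is not part of sentence_transformers: check_similarity_matrix is a hypothetical helper, and it assumes the model uses cosine similarity (the usual default for model.similarity), so that the diagonal is about 1.0, the matrix is symmetric, and every entry lies in [-1, 1]. The tolerance of 1e-3 is an arbitrary choice and may need loosening for a quantized model.

```python
import math


def check_similarity_matrix(sim, atol=1e-3):
    """Return a list of problems found in a square cosine-similarity matrix.

    `sim` is a nested list of numbers. An empty result means no problem
    was found. Hypothetical helper, not a sentence_transformers API.
    """
    rows = [[float(value) for value in row] for row in sim]
    n = len(rows)
    if any(len(row) != n for row in rows):
        return ["matrix is not square"]

    problems = []
    for i in range(n):
        # Each text compared with itself should score about 1.0.
        if not math.isclose(rows[i][i], 1.0, abs_tol=atol):
            problems.append(f"diagonal[{i}] = {rows[i][i]:.4f}, expected about 1.0")
        for j in range(i + 1, n):
            # Cosine similarity is symmetric.
            if not math.isclose(rows[i][j], rows[j][i], abs_tol=atol):
                problems.append(f"entries ({i},{j}) and ({j},{i}) differ")
            # Cosine similarity is bounded by -1 and 1.
            if not (-1.0 - atol <= rows[i][j] <= 1.0 + atol):
                problems.append(f"entry ({i},{j}) = {rows[i][j]:.4f} is outside [-1, 1]")
    return problems
```

With the snippet above, this could be called as check_similarity_matrix(similarities.tolist()), since model.similarity returns a tensor. For these three sentences, the two weather sentences would also be expected to score higher with each other than with the stadium sentence.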

This PR was auto-generated with export_static_quantized_openvino_model.
