looking forward to the quantized versions for edge deployment

#3
by tomasmcm - opened

pretty cool model you got here. I hope you are considering releasing a quantized version as well, so we can try running this on SBCs, phones, tablets 🙏

Reka AI org

Hi Tomas, we are working on quantizing the model and will update here when we have more to share!

Hello, do you have any news about a GGUF version?

Reka AI org

Hi @DeltaWhiplash, we should have it within the next few weeks to a month.

Reka AI org

I've made a PR upstream to llama.cpp to add support for Reka Edge - https://github.com/ggml-org/llama.cpp/pull/21616

We've also added a script for converting to GGUF (convert_reka_vlm_to_gguf.py) and basic quantization scripts (quantize_reka_q4_last8_q8.sh, quantize_reka_q4.sh).
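For anyone wanting to try this once the PR lands, the overall flow would look something like the sketch below. This is only an illustration: the conversion script name comes from the post above, but its flags, the paths, and the choice of quantization type are assumptions, not the actual interfaces of the scripts in the PR.

```shell
# Hypothetical end-to-end sketch, assuming a local llama.cpp build with the
# Reka Edge PR merged. Flags and paths are illustrative.

# 1. Convert the HF checkpoint to an f16 GGUF using the script named above
#    (arguments here are assumed, check the script's --help):
python3 convert_reka_vlm_to_gguf.py ./reka-edge --outfile reka-edge-f16.gguf

# 2. Quantize with llama.cpp's llama-quantize tool; Q4_K_M is a common
#    size/quality trade-off (the PR's own scripts may use a different recipe,
#    e.g. keeping the last layers at Q8 as quantize_reka_q4_last8_q8.sh suggests):
./llama-quantize reka-edge-f16.gguf reka-edge-Q4_K_M.gguf Q4_K_M

# 3. Try the quantized model:
./llama-cli -m reka-edge-Q4_K_M.gguf -p "Hello"
```

A ~Q4 quantization is usually what makes a model of this size practical on SBCs and phones, which is why the smaller GGUF variants matter for the edge-deployment use case raised in this thread.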
