looking forward to the quantized versions for edge deployment
#3
by tomasmcm - opened
Pretty cool model you've got here. I hope you're considering releasing a quantized version as well, so we can try running it on SBCs, phones, and tablets 🙏
Hi Tomas, we are working on quantizing the model and will update here when we have more to share!
Hello, do you have any news about GGUF?
I've opened a PR upstream in llama.cpp to add support for Reka Edge: https://github.com/ggml-org/llama.cpp/pull/21616
We've also added a script for converting to GGUF (convert_reka_vlm_to_gguf.py) and basic quantization scripts (quantize_reka_q4_last8_q8.sh, quantize_reka_q4.sh).
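For anyone wanting to try this, here's a rough sketch of how those scripts might fit together. The paths, the `--outfile` flag, and the exact script arguments are assumptions on my part (check each script's source or help output for its actual interface); only the script names come from the post above.

```shell
# Sketch of a convert-then-quantize flow; arguments are assumptions,
# not the scripts' documented interface.
MODEL_DIR=./reka-edge            # local checkout of the HF model repo (assumed path)
F16_GGUF=reka-edge-f16.gguf      # full-precision intermediate GGUF

# 1. Convert the HF checkpoint to GGUF (hypothetical invocation)
python convert_reka_vlm_to_gguf.py "$MODEL_DIR" --outfile "$F16_GGUF"

# 2. Quantize: plain Q4, or Q4 with the last 8 layers kept at Q8
#    (judging by the script names; exact behavior may differ)
./quantize_reka_q4.sh "$F16_GGUF"
./quantize_reka_q4_last8_q8.sh "$F16_GGUF"
```

The resulting .gguf files should then be loadable with a llama.cpp build that includes the Reka Edge support from the PR linked above.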