Please add support for GGUF‑quantized models

by makisekurisu-jp - opened 15 days ago

Discussion

makisekurisu-jp

15 days ago

https://huggingface.co/cmeka/SeedVR2-GGUF

dummy9996

11 days ago

Yes, I need the Q4K_M quantized model. Its size is roughly half that of an FP8 model. @makisekurisu-jp

ComfyUI won't support GGUF, but I've converted the model to 4-bit safetensors. Try it: it's 3.5x smaller with almost no degradation. https://huggingface.co/dummy9996/seedvr2_comfyui_bf16_mxfp8_nvfp4/blob/main/seedvr2_3b_nvfp4.safetensors
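For anyone wondering where a figure like 3.5x comes from, here is a rough sketch of the arithmetic. It assumes the comparison is against BF16 weights and that every weight is quantized. It also assumes NVFP4 stores 4-bit values with one 8-bit scale per 16-weight block, and ignores the per-tensor scale and file metadata. The helper names are made up for illustration.

```python
def bits_per_weight(value_bits: int, scale_bits: int = 0, block_size: int = 1) -> float:
    """Effective bits per weight for a block-scaled format.

    Each block of `block_size` weights shares one scale of `scale_bits` bits,
    so the scale cost is amortized across the block.
    """
    return value_bits + scale_bits / block_size


def compression_ratio(src_bpw: float, dst_bpw: float) -> float:
    """How many times smaller the destination format is than the source."""
    return src_bpw / dst_bpw


# Assumed layouts (not taken from this thread):
BF16 = bits_per_weight(16)                                # plain 16-bit weights
FP8 = bits_per_weight(8)                                  # plain 8-bit weights
NVFP4 = bits_per_weight(4, scale_bits=8, block_size=16)   # 4-bit values + FP8 block scale

print(f"NVFP4 bits per weight: {NVFP4}")                            # 4.5
print(f"BF16 -> NVFP4: {compression_ratio(BF16, NVFP4):.2f}x")      # 3.56x
print(f"FP8  -> NVFP4: {compression_ratio(FP8, NVFP4):.2f}x")       # 1.78x
```

Under these assumptions the BF16 to NVFP4 ratio lands close to the 3.5x quoted above, and a 4-bit file is a bit more than half the size of an FP8 one.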

makisekurisu-jp

11 days ago

Please also quantize the two 7B models, seedvr2_7b and seedvr2_7b_sharp, into FP4. Thank you.

makisekurisu-jp

6 days ago

https://huggingface.co/ApacheOne/SeedVR2_comfyUI-nvfp4_mixed

ApacheOne

about 24 hours ago

I am not sure whether my NVFP4 versions work with comfy's kitchen, but I am guessing they do, since you have linked them. The only useful input I can offer is that my NVFP4 versions are smaller because I also quantize the first and last blocks. The official NVFP4 paper leaves the first and last blocks in higher precision for training-stability reasons, but that is not needed for inference.
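To give a feel for how much the first and last blocks matter for file size, here is a minimal sketch. The block count and parameters per block are hypothetical placeholders, not SeedVR2's real architecture. The 4.5 bits per weight for NVFP4 assumes 4-bit values plus an 8-bit scale per 16 weights, and the function name is invented for this example.

```python
def model_size_gb(n_blocks: int, params_per_block: float,
                  low_bpw: float, high_bpw: float, keep_high: int = 0) -> float:
    """Approximate weight size in GB when `keep_high` blocks stay in high precision.

    The other `n_blocks - keep_high` blocks are stored at `low_bpw` bits per weight.
    """
    low_blocks = n_blocks - keep_high
    total_bits = params_per_block * (low_blocks * low_bpw + keep_high * high_bpw)
    return total_bits / 8 / 1e9  # bits -> bytes -> GB


# Hypothetical model: 32 transformer blocks with 100M parameters each.
N_BLOCKS = 32
PARAMS_PER_BLOCK = 100e6
NVFP4_BPW = 4.5   # assumed effective bits per weight
BF16_BPW = 16.0

# Every block quantized, as described in the post above.
all_quantized = model_size_gb(N_BLOCKS, PARAMS_PER_BLOCK, NVFP4_BPW, BF16_BPW)

# First and last block kept in BF16.
first_last_kept = model_size_gb(N_BLOCKS, PARAMS_PER_BLOCK, NVFP4_BPW, BF16_BPW,
                                keep_high=2)

print(f"all blocks NVFP4:        {all_quantized:.3f} GB")
print(f"first/last kept in BF16: {first_last_kept:.3f} GB")
print(f"extra size from keeping them: {first_last_kept - all_quantized:.3f} GB")
```

With these made-up numbers, keeping two of the 32 blocks in BF16 adds roughly 16% to the file, which is the kind of saving the post describes. The real difference depends on the actual block sizes.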
