jayzou3773/qwen3-moe-expert_drop-layerwise_pruning-r64-s1k-128samples-thinking 16B • Updated Apr 28 • 8
jayzou3773/qwen3-moe-expert_drop-pure_expert_gradient_pruning-r64-s1k-128samples-thinking 16B • Updated Apr 28 • 27
jayzou3773/qwen3-moe-expert_drop-pure_gradient_pruning-r64-s1k-128samples-thinking 16B • Updated Apr 28 • 22
charlie-li/Qwen3-Coder-30B-A3B-Instruct-Lora-ScaleSWE-Distilled-v1 Text Generation • 31B • Updated Apr 28 • 106