lsteno/Qwen3-4B-Instruct-2507-RLM-RLVR-depth2-recursive-r64-a128-lr1e-5-adapter Reinforcement Learning • Updated 1 day ago • 14
lsteno/Qwen3-4B-Instruct-2507-RLM-RLVR-FullFT-lr5e-6-depth1-v1 Text Generation • 4B • Updated 23 days ago • 116