arxiv:2409.20566
Zhen Yang
zhen-yang
models 31
zhen-yang/DAPO-DAPO-CtrlG-Decouple-1.7B-V19-Step20
2B • Updated
zhen-yang/DAPO-DAPO-CtrlG-Decouple-1.7B-V20-Step16
2B • Updated
zhen-yang/DAPO-DAPO-CtrlG-Decouple-1.7B-V19-Step16
2B • Updated
zhen-yang/DAPO-DAPO-CtrlG-Decouple-1.7B-V20-Step10
2B • Updated
zhen-yang/DAPO-DAPO-CtrlG-Decouple-1.7B-V19-Step10
2B • Updated
zhen-yang/DAPO-DAPO-CtrlG-Decouple-1.7B-V20-Step6
2B • Updated
zhen-yang/DAPO-DAPO-CtrlG-Decouple-1.7B-V19-Step6
2B • Updated
zhen-yang/DAPO-DAPO-DRM-Qwen3-8B-Base-Step30
8B • Updated • 2
zhen-yang/DAPO-DAPO-DRM-Qwen3-8B-Base-Step25
8B • Updated • 2
zhen-yang/DAPO-DAPO-Reward-Qwen3-8B-Base-Step30
8B • Updated • 2
datasets 14
zhen-yang/rollout_output_1d7b_v20
Viewer • Updated • 147k • 13
zhen-yang/validation_output_1d7b_v20
Viewer • Updated • 6.3k • 11
zhen-yang/rollout_output_1d7b_v19
Viewer • Updated • 188k • 15
zhen-yang/validation_output_1d7b_v19
Viewer • Updated • 8.4k • 14
zhen-yang/rollout_output_drm_qwen3_8b_base
Preview • Updated • 3
zhen-yang/rollout_output_reward_qwen3_8b_base
Viewer • Updated • 164k • 31
zhen-yang/validation_output_drm_qwen3_8b_base
Viewer • Updated • 27.3k • 12
zhen-yang/validation_output_reward_qwen3_8b_base
Viewer • Updated • 12.6k • 16
zhen-yang/rollout_output_v1
Viewer • Updated • 162k • 42
zhen-yang/validation_output_v1
Viewer • Updated • 12.6k • 17