arxiv:2504.16828
Muhammad Khalifa
mkhalifa
AI & ML interests
natural language generation, reinforcement learning
models 21
mkhalifa/flan-t5-large-gsm8k
Text Generation • Updated • 5
mkhalifa/flan-t5-large-svamp
Text Generation • Updated • 3
mkhalifa/flan-t5-large-mathqa
Text Generation • Updated • 1
mkhalifa/ThinkPRM-gptoss-20B
Updated • 15
mkhalifa/r1_14b_discriminative_prm
Text Generation • 15B • Updated • 1
mkhalifa/r1_14b_longthought-1K
Text Generation • 15B • Updated • 1
mkhalifa/r1-1.5b-longthought-outcome-matching
Text Generation • 2B • Updated • 1
mkhalifa/r1-1.5b-longthought-1K
Text Generation • 2B • Updated • 3
mkhalifa/r1_14b_longthought-1K-outcome-only
Text Generation • 15B • Updated • 6
mkhalifa/r1-1.5b-longthought-v2
Text Generation • 2B • Updated • 2
datasets 18
mkhalifa/agent
Updated • 5
mkhalifa/gpqa-diamond-physics
Viewer • Updated • 86 • 214
mkhalifa/short-to-long-5K
Viewer • Updated • 5k • 2
mkhalifa/CoGEX
Viewer • Updated • 51.8k • 40
mkhalifa/llama-3.1-8b-instruct-math-trajectories-64-sample-per-problem
Viewer • Updated • 736k • 70
mkhalifa/llama-3.1-8b-instruct-math-trajectories-48-sample-per-problem
Viewer • Updated • 552k • 48
mkhalifa/llama-3.1-8b-instruct-math-trajectories-32-sample-per-problem
Viewer • Updated • 368k • 37
mkhalifa/llama-3.1-8b-instruct-math-trajectories-16-sample-per-problem
Viewer • Updated • 184k • 23
mkhalifa/llama-3.1-8b-instruct-math-trajectories-8-sample-per-problem
Viewer • Updated • 92k • 11
mkhalifa/llama-3.1-70b-instruct-math-trajectories-8-sample-per-problem
Viewer • Updated • 92k • 12