Not All Disagreement Is Learnable: Token Teachability in On-Policy Distillation • Paper • 2605.26844 • Published 15 days ago • 26
Model Merging Scaling Laws in Large Language Models • Paper • 2509.24244 • Published about 1 month ago • 44
Geometry Conflict: Explaining and Controlling Forgetting in LLM Continual Post-Training • Paper • 2605.09608 • Published May 10 • 52