LoRA adapters trained for five progressively shorter chain-of-thought styles on GSM8K, plus the evaluation artifacts behind the Pareto curve.
Frolov Anatolii
ssurface
Recent Activity
updated a model about 22 hours ago: s-nlp/tool-calling-hallucination-modernbert-base-glaive-100pct
published a model about 22 hours ago: s-nlp/tool-calling-hallucination-modernbert-base-glaive-100pct
updated a model about 22 hours ago: s-nlp/tool-calling-hallucination-modernbert-large-glaive-100pct