- P-EAGLE: Parallel-Drafting EAGLE with Scalable Training
  Paper • 2602.01469 • Published • 3
- Direct Alignment of Draft Model for Speculative Decoding with Chat-Fine-Tuned LLMs
  Paper • 2403.00858 • Published • 1
- Gumiho: A Hybrid Architecture to Prioritize Early Tokens in Speculative Decoding
  Paper • 2503.10135 • Published
- LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding
  Paper • 2602.23881 • Published • 18
Ganesh Jawahar
ganeshjwhr
AI & ML interests
NLP, Efficiency
Recent Activity
updated a collection 1 day ago
SpecDec