AI & ML interests

MoE architectures, Chimera models, Assembly of Experts

Recent Activity

Articles

BM-TNG 
published an article 10 months ago

How Long Prompts Block Other Requests - Optimizing LLM Performance

11
SR-TNG 
published an article 12 months ago

Finetuning olmOCR to be a faithful OCR-Engine

19
BM-TNG 
published an article 12 months ago

Prefill and Decode for Concurrent Requests - Optimizing LLM Performance

67
BM-TNG 
published an article about 1 year ago

Efficient Request Queueing – Optimizing LLM Performance

24