RedHatAI/NVIDIA-Nemotron-3-Ultra-550B-A55B-FP8-Dynamic
Text Generation • 561B
OpenSource and AI
SNLP: Layer-Parallel Inference via Structured Newton Corrections
S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation