Nouamane Tazi

nouamanetazi

huggingface

https://nouamanetazi.github.io

AI & ML interests

Scale it 'til you make it

Recent Activity

upvoted an article 4 days ago

Shipping a Trillion Parameters With a Hub Bucket: Delta Weight Sync in TRL

published an article 3 months ago

Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries

liked a Space 3 months ago

tiiuae/tiny-h1-blogpost

upvoted an article 4 days ago

Article

Shipping a Trillion Parameters With a Hub Bucket: Delta Weight Sync in TRL

+6

aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, lvwerra, sergiopaniego

•

6 days ago

• 36

upvoted a paper 5 months ago

SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations

Paper • 2512.14080 • Published Dec 16, 2025 • 9

upvoted a collection 7 months ago

Fanar

A powerful and versatile family of Arabic Large Language Models (LLMs) designed for a wide range of tasks. • 3 items • Updated Feb 6 • 11

upvoted an article 7 months ago

Article

You could have designed state of the art positional encoding

FL33TW00D-HF

•

Nov 25, 2024

• 482

upvoted 5 collections 7 months ago

gpt-oss

Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. • 2 items • Updated Aug 7, 2025 • 441

DeepSeek-V3.2

4 items • Updated Dec 1, 2025 • 544

Llama 4

Llama 4 release • 13 items • Updated Apr 29, 2025 • 736

Google's Gemma models family

334 items • Updated Mar 12 • 820

Qwen3

84 items • Updated Dec 31, 2025 • 1.8k

upvoted an article 9 months ago

Article

AtlasOCR: Building the First Open-Source Darija OCR Model with Vision Language Models

imomayiz

•

Sep 16, 2025

• 19

upvoted an article 11 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

+21

eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf

•

Jul 8, 2025

• 777

upvoted 3 collections 11 months ago

SmolLM3 evaluation datasets

Datasets to decontaminate the post-training mixtures against. Use the subset and column values described per entry • 13 items • Updated Jul 8, 2025 • 9

SmolLM3 pretraining datasets

datasets used in SmolLM3 pretraining • 15 items • Updated Aug 12, 2025 • 51

🧠 SmolLM3

Smol, multilingual, long-context reasoner • 14 items • Updated Oct 9, 2025 • 104

upvoted an article 11 months ago

Article

Bringing Fusion Down to Earth: ML for Stellarator Optimization

cgeorgiaw

•

Jul 2, 2025

• 80

upvoted a paper 11 months ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published Jun 26, 2025 • 78

upvoted 2 articles 12 months ago

Article

Learn the Hugging Face Kernel Hub in 5 Minutes

+5

drbh, danieldk, Narsil, pcuenq, pagezyhf, merve, reach-vb

•

Jun 12, 2025

• 164

Article

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

+7

danaaubakirova, andito, merve, ariG23498, fracapuano, loubnabnl, pcuenq, mshukor, cadene

•

Jun 3, 2025

• 349

upvoted 2 articles about 1 year ago

Article

Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models

+1

loubnabnl, anton-l, davanstrien

•

Mar 20, 2024

• 114

Article

Tiny Agents: an MCP-powered agent in 50 lines of code

julien-c

•

Apr 25, 2025

• 308