Datasets used to train SmolDocling
HuggingFaceM4
Team
company
Organization Card
HuggingFaceM4 is the multimodal team at Hugging Face, working on vision-language models.
Within this organization on the Hugging Face Hub, you can access the Idefics models (version 1: IDEFICS, version 2: Idefics2, version 3: Idefics3), training datasets such as OBELICS, WebSight, The Cauldron, and Docmatix, and interactive tools for visualizing the results.
Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation.
- IDEFICS2 Playground
  🐨 169 • Chat with a visual AI that answers questions about images
- HuggingFaceM4/idefics2-8b
  Image-Text-to-Text • 8B • Updated • 89.9k • 624
- HuggingFaceM4/idefics2-8b-chatty
  Image-Text-to-Text • 8B • Updated • 117 • 95
- HuggingFaceM4/idefics2-8b-base
  Image-Text-to-Text • 8B • Updated • 772 • 28
spaces 22
pinned
- IDEFICS Playground 🐨
  Build error • Agents • Featured • 377
- Demo of Encoder-Free VLM Trained for $100 🖼
  Running on Zero • Agents
  Ask questions about images and get answers instantly
- Encoder-Free VLM 👁
  Running • 2
  Train Your Own Encoder-Free VLM in $100
- faster-qwen3-tts 🎙
  Paused • Featured • 251
  Generate natural speech from text or voice samples
- Reachy Mini Remote Control (Multi-User) 🤖
  Running • Agents • 10
  Remote control for Reachy Mini robots with authentication
- Reachy Mini Key Claim 🚀
  Sleeping • Agents
  Request an ephemeral API key using an order number
models 34
HuggingFaceM4/Idefics3-8B-Llama3
Image-Text-to-Text • 8B • Updated • 288k • 304
HuggingFaceM4/Florence-2-DocVQA
Image-Text-to-Text • 0.8B • Updated • 546 • 65
HuggingFaceM4/idefics2-8b
Image-Text-to-Text • 8B • Updated • 89.9k • 624
HuggingFaceM4/idefics2-8b-base
Image-Text-to-Text • 8B • Updated • 772 • 28
HuggingFaceM4/idefics2-8b-chatty
Image-Text-to-Text • 8B • Updated • 117 • 95
HuggingFaceM4/siglip-so400m-14-364-flash-attn2-navit
Zero-Shot Image Classification • 0.9B • Updated • 3 • 1
HuggingFaceM4/siglip-so400m-14-700-flash-attn2-navit
Zero-Shot Image Classification • 0.9B • Updated • 4 • 2
HuggingFaceM4/siglip-so400m-14-384-flash-attn2-navit
Zero-Shot Image Classification • 0.9B • Updated • 3 • 1
HuggingFaceM4/idefics2-8b-chatty-AWQ
Image-Text-to-Text • 8B • Updated • 8 • 5
HuggingFaceM4/idefics2-8b-AWQ
Image-Text-to-Text • 8B • Updated • 6 • 26
datasets 82
HuggingFaceM4/FineVisionMax
Viewer • Updated • 24.2M • 22.7k • 27
HuggingFaceM4/FineVision
Viewer • Updated • 24.2M • 155k • 492
HuggingFaceM4/lmms-eval-embeddings
Updated • 271 • 1
HuggingFaceM4/DoclingMatix
Viewer • Updated • 1.27M • 2.26k • 52
HuggingFaceM4/Caltech-101
Updated • 180 • 4
HuggingFaceM4/Docmatix
Viewer • Updated • 2.55M • 13.6k • 305
HuggingFaceM4/the_cauldron
Viewer • Updated • 1.88M • 331k • 546
HuggingFaceM4/FairFace
Viewer • Updated • 195k • 2.21k • 31
HuggingFaceM4/MMBench
Viewer • Updated • 11k • 296 • 4
HuggingFaceM4/WebSight
Viewer • Updated • 2.75M • 11.6k • 395