The Idea
Why?
Most research asks:
"How can we make models smarter?"
We asked:
"How many parameters can we delete before Git starts feeling sorry for us?"
This repository is the answer.
Specifications
| Property | Value |
|---|---|
| Parameters | 21 |
| Architecture | GPT-2 |
| Layers | 1 |
| Attention Heads | 1 |
| Embedding Size | 1 |
| Context Length | 1 |
| Vocabulary | 2 Tokens |
| Disk Size | <2 KB |
| Training Time | ~20 Seconds |
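For the curious, here is one way the 21 in the table can be reproduced by hand. This is a sketch, not the training config: it assumes the standard GPT-2 layout (biases everywhere, two LayerNorms per block plus a final one, output head tied to the token embedding) and a feed-forward width of 1, which the table does not state. The function and its argument names are illustrative and simply mirror the usual GPT-2 config fields.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer, n_inner):
    """Count the weights of a GPT-2 style model with a tied output head."""
    # Token embedding + position embedding
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Each LayerNorm has a scale and a bias per embedding dimension
    layer_norm = 2 * n_embd
    # Fused QKV projection (with bias) + attention output projection (with bias)
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Feed-forward: up-projection and down-projection, both with bias
    mlp = (n_embd * n_inner + n_inner) + (n_inner * n_embd + n_embd)
    block = 2 * layer_norm + attention + mlp
    # Final LayerNorm after the last block; the tied head adds nothing
    return embeddings + n_layer * block + layer_norm


# Values from the table; n_inner=1 is an assumption, not a documented setting
total = gpt2_param_count(vocab_size=2, n_positions=1, n_embd=1, n_layer=1, n_inner=1)
print(total)      # 21
print(total * 4)  # 84 bytes of raw weights, assuming float32 storage
```

With the default GPT-2 feed-forward width of four times the embedding size, the same arithmetic gives 30, so the smaller inner width is the likeliest explanation for the figure in the table. At 84 bytes of weights, most of the disk size would be file format overhead.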
Performance
| Task | Result |
|---|---|
| Copy "a" | ✅ |
| Copy "b" | ✅ |
| Understand Humans | ❌ |
| Understand Itself | ❌ |
| Break Records | ✅ |
Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the 21-parameter model from the Hub
tokenizer = AutoTokenizer.from_pretrained("WhirlwindAI/SubatomZephyr")
model = AutoModelForCausalLM.from_pretrained("WhirlwindAI/SubatomZephyr")

prompt = "a"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a completion; max_length counts the prompt tokens as well
outputs = model.generate(
    **inputs,
    max_length=2,
    do_sample=True,
    temperature=2.0,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Output
a
Peak artificial intelligence.
Example Conversation
User
Tell me a story.
SubatomZephyr
a
Oscar-worthy.
Scientific Explanation
SubatomZephyr doesn't generate language.
It doesn't reason.
It doesn't predict.
It doesn't even pretend anymore.
It has mastered exactly one skill:
Input:
a
Output:
a
100% accuracy.
Zero creativity.
Perfect confidence.
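If you want to check the 100% figure yourself, the measurement is about as small as the model. The sketch below is an illustration only: `copy_accuracy` and `echo_model` are hypothetical names, and `echo_model` is a stand-in that returns its input rather than a call to the real model. Swap in any function that takes a prompt string and returns the generated string.

```python
from typing import Callable, Iterable


def copy_accuracy(generate: Callable[[str], str], tokens: Iterable[str]) -> float:
    """Return the fraction of tokens that are echoed back unchanged."""
    tokens = list(tokens)
    correct = sum(1 for token in tokens if generate(token) == token)
    return correct / len(tokens)


# Hypothetical stand-in for the real model: it simply returns its input
def echo_model(prompt: str) -> str:
    return prompt


print(copy_accuracy(echo_model, ["a", "b"]))  # 1.0
```

A model that always answers "a" would score 0.5 on the same two-token vocabulary, which is why both tokens belong in the test set.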
World Records
🥇 Smallest Generative Transformer
🏆 Highest Accuracy On The Letter "a"
🥇 Lowest Grocery Bill (21 Parameters)
🥇 First Model Smaller Than Most README Files
🎖️ Certified Quantum Intelligence™
Benchmarks
MMLU : n/a
HumanEval : n/a
TruthfulQA : 🤨
Binary Copy : 100%
Entertainment : ⭐⭐⭐⭐⭐
Frequently Asked Questions
Is this useful?
No.
Is this funny?
Hopefully.
Why does it exist?
Curiosity.
Can it beat GPT-4?
Only if the task is copying the letter "a".
What's next?
QuarkZephyr.
Probably.
Fun Facts
- Smaller than many favicon files.
- Downloads before you click download.
- Has fewer parameters than this README has paragraphs.
- The tokenizer is more complicated than the model.
- Uses more electricity displaying this README than running inference.
Limitations
SubatomZephyr should not be used for:
- Chatbots
- Coding
- Translation
- Math
- Science
- Existing in production
It excels primarily at making ML engineers laugh.
License
MIT
If you somehow improve this model...
please tell us.
We're genuinely curious.