A newer version of this model is available: WhirlwindAI/SubatomZephyr




The Idea

What if a transformer became... microscopic?

AtomZephyr explores one of the smallest practical transformer architectures ever built.

Not because anyone asked for it.

Because someone eventually had to answer the question:

"How absurdly small can an AI become before it forgets how to AI?"

Turns out...

27 parameters is still technically enough.


Why?

Most AI models compete by getting bigger.

AtomZephyr competes by removing parameters until people start questioning whether it's still a neural network.

Every parameter had to earn its place.

Most didn't.


Specifications

Property Value
Parameters 27
Architecture GPT-2
Layers 1
Attention Heads 1
Embedding Size 1
FFN Size 1
Context Length 4
Vocabulary 5 Tokens
Model Size <5 KB
Training Time ~6 Seconds (CPU)

Performance

Test Result
Understand English ❌
Write Code ❌
Solve Math ❌
Generate "abba" βœ…
Break Expectations βœ…

Quick Start

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("WhirlwindAI/AtomZephyr")
model = AutoModelForCausalLM.from_pretrained("WhirlwindAI/AtomZephyr")

prompt = "a"

inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.7,
    max_length=4
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Possible output

abaa

Groundbreaking.


Example Conversation

User

Tell me a joke.

AtomZephyr

abba

Technically...

that's an answer.


Scientific Achievement

Removing parameters is easy.

Keeping a transformer alive afterwards...

isn't.

AtomZephyr exists purely to explore the absolute lower limits of transformer architectures while remaining a real, trainable language model.

Whether it's useful is a completely different discussion.


Awards

πŸ₯‡ Smallest Model That Still Has Self-Respect

πŸ† Best Binary Poetry Generator

πŸ₯ˆ Most Efficient Waste Of Six Seconds

πŸŽ–οΈ Official Representative Of Tiny AI


Limitations

AtomZephyr should not be used for:

  • Programming
  • Translation
  • Question Answering
  • Homework
  • Anything important

It performs significantly better when asked to do absolutely nothing useful.


Fun Facts

  • Fits inside most PNG images.
  • Smaller than many neural network tutorials.
  • Downloads faster than this README loads.
  • Has fewer parameters than some calculator manuals have pages.

License

MIT

Take it apart.

Make it smaller.

Break another record.


Built by WhirlwindAI

"Sometimes progress isn't measured in billions... it's measured in what you can remove."


Downloads last month
-
Safetensors
Model size
27 params
Tensor type
F32
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support