darwASCIInGPT: nGPT ASCII-art artists + Darwin-bred children
Char-level nGPT (Normalized GPT) checkpoints from the darwASCIInGPT experiments: small hypersphere transformers that draw ASCII art, plus the Darwin-style bred offspring produced by merging them with no gradient training. Companion knowledge base (observations, code, Spark setup):
All ASCII models are dim 256 / depth 4 (~3.18M params), char vocab ~106–109,
trained on the apehex hand-drawn ASCII corpus on a Quadro P4000. The enwik8
text models are dim 256–512 / depth 8, trained on a GTX 1060.
Special tokens (char-level)
SOL = \x02, SEP = \x03, EOA = \x04. Two framings:
| Framing | Prime with | Use |
|---|---|---|
| Conditional `<SOL> label <SEP> art <EOA>` | `<SOL>` + label + `<SEP>` | request a class (e.g. Cats, Swords) |
| Unconditional `<SOL> art <EOA>` | `<SOL>` | free-form draw (no label channel) |
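The two framings above can be sketched as a pair of string helpers. This is a minimal illustration rather than the repo's sampler: `make_prime` and `parse_sample` are hypothetical names introduced here, and only the three special characters come from this card.

```python
SOL, SEP, EOA = "\x02", "\x03", "\x04"  # special chars listed in this section


def make_prime(label=None):
    """Return the priming string: conditional if a label is given, else unconditional."""
    if label is None:
        return SOL
    return SOL + label + SEP


def parse_sample(text):
    """Split a generated string into (label, art); label is None for unconditional output."""
    body = text[1:] if text.startswith(SOL) else text
    body = body.split(EOA, 1)[0]  # drop everything from the first EOA onward
    if SEP in body:
        label, art = body.split(SEP, 1)
        return label, art
    return None, body
```

For example, `make_prime("Cats")` yields the conditional prefix, while `make_prime()` yields the bare `<SOL>` used by the unconditional models.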
These models are trained to very low loss (near-memorization), so:
T≈0.6, top_k≈20 → clean complete drawings; top_k=1 → one fixed canonical piece
per prefix; higher T → more variety with occasional whitespace drift.
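To make the effect of these knobs concrete, here is a small pure-Python sketch of temperature plus top-k sampling over one step's logits. It is not the sampler from the repo; `sample_token` is a hypothetical helper, and a real run would apply the same idea to the model's output logits at every step.

```python
import math
import random


def sample_token(logits, temperature=0.6, top_k=20, rng=random):
    """Sample one index from raw logits using temperature scaling and top-k filtering."""
    if top_k < 1:
        raise ValueError("top_k must be >= 1")
    # Keep only the top_k highest-logit indices.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    if top_k == 1 or temperature <= 0:
        return ranked[0]  # greedy: the same choice every time for a given prefix
    # Softmax over the surviving logits, shifted by the max for numerical stability.
    scaled = [logits[i] / temperature for i in ranked]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(ranked, weights=weights, k=1)[0]
```

With `top_k=1` the function always returns the argmax, which is why a fixed prefix produces one fixed piece; raising `temperature` flattens the weights and lets lower-ranked characters through.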
Contents
| Path | Type | Framing | Trained on / notes |
|---|---|---|---|
| `uncond/styleA` | artist | unconditional | apehex creatures & nature half. final stream_loss 0.031 (99.2% acc) |
| `uncond/styleB` | artist | unconditional | apehex objects & tech half. final stream_loss 0.089 (97.5% acc) |
| `apehex/styleA` | artist | conditional | GROUP_A subcategories (Cats, Dragons, Flowers, …) |
| `apehex/styleB` | artist | conditional | GROUP_B subcategories (Swords, Cars, Robots, …) |
| `apehex/breed/child_slerp` | bred | conditional | SLERP merge of styleA × styleB on the nGPT hypersphere |
| `apehex/breed/child_slerp_frozenattn` | bred | conditional | attention frozen from one parent, FFN SLERP-blended |
| `parents/domA`, `parents/domB` | artist | conditional | domain split: apehex art vs mrzjy sample |
| `parents/breed/child_slerp`, `child_discrete`, `child_slerp_frozenattn` | bred | conditional | recombinations of domA × domB |
| `smith-experiment/ngpt` | ablation | conditional | nGPT normalized-lerp residual. val_loss 1.0736 |
| `smith-experiment/smith` | ablation | conditional | Möbius geodesic residual (matched init/data/schedule). val_loss 1.0788, ~1.57× slower |
| `resonance/standard_g1`, `harmonic30_g1` | sweep | conditional | resonance-geometry sweep representatives |
| `enwik8-darwin/offspring_forkL__x__forkR.pt` | bred | enwik8 text | the hybrid-vigor offspring: bpc 2.4636 vs best parent 2.5047 (+0.0412) |
| `enwik8-darwin/darwin_log.json` | log | n/a | shared-ancestor breeding → vigor |
| `enwik8-darwin/darwin_log_independent.json` | log | n/a | independent-init breeding → no vigor (control) |
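The bred entries in the table are SLERP merges, some with attention frozen from one parent. The sketch below shows the general idea on plain Python lists and is not the repo's breeder: `slerp`, `breed`, and the `frozen_prefixes` argument are hypothetical, parameter names such as `attn.w` are made up, and each entry is treated as a single vector (real nGPT weights are matrices normalized along one dimension, so an actual breeder would likely work per row).

```python
import math


def slerp(a, b, t=0.5, eps=1e-8):
    """Spherical interpolation between two vectors, returned as a unit vector."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    ua = [x / norm_a for x in a]
    ub = [x / norm_b for x in b]
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(ua, ub))))
    omega = math.acos(dot)  # angle between the two parents
    so = math.sin(omega)
    if so < eps:
        return ua  # degenerate (parallel or antipodal): fall back to parent a
    wa = math.sin((1 - t) * omega) / so
    wb = math.sin(t * omega) / so
    return [wa * x + wb * y for x, y in zip(ua, ub)]


def breed(parent_a, parent_b, t=0.5, frozen_prefixes=()):
    """SLERP every named vector; names matching frozen_prefixes are copied from parent_a."""
    child = {}
    for name, vec_a in parent_a.items():
        if name.startswith(tuple(frozen_prefixes)):
            child[name] = list(vec_a)
        else:
            child[name] = slerp(vec_a, parent_b[name], t)
    return child
```

Unlike a linear average, the SLERP result stays on the unit sphere, which matters for a model whose weights are constrained to be normalized. Passing `frozen_prefixes=("attn",)` mimics the frozen-attention variants under the made-up naming used here.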
Headline result: genealogy decides hybrid vigor
Identical SLERP breeder, different parent relationship (enwik8 bpc, lower better):
| Parents | Origin | Gen-0 child | Champion | Best parent | Vigor? |
|---|---|---|---|---|---|
| independent inits | different basins | 3.26 | 2.3064 | 2.3063 | No |
| shared ancestor, split data | same basin | 2.47 | 2.4633 | 2.5047 | Yes (+0.041) |
Crossbreeding only works between mode-connected parents (shared ancestor,
specialized differently). See docs/darwin-breeding.md in the GitHub repo.
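The "Vigor?" column can be reproduced from the table's own numbers. In this small sketch, vigor is taken to be best-parent bpc minus champion bpc, so a positive value means the bred champion beats both parents; the `vigor` helper is introduced here for illustration only.

```python
def vigor(best_parent_bpc, champion_bpc):
    """Positive when the bred champion beats the best parent (lower bpc is better)."""
    return best_parent_bpc - champion_bpc


# (best parent bpc, champion bpc) copied from the table above
rows = {
    "independent inits": (2.3063, 2.3064),
    "shared ancestor, split data": (2.5047, 2.4633),
}

for name, (parent, champ) in rows.items():
    print(f"{name}: {vigor(parent, champ):+.4f}")
```

This prints -0.0001 for the independent-init pair and +0.0414 for the shared-ancestor pair, matching the No / Yes (+0.041) entries.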
Loading
```python
import torch
from nGPT_pytorch import nGPT
import ngpt_patch  # restore __hash__ on nGPT modules; import BEFORE constructing

ck = torch.load("uncond/styleA/model.pt", map_location="cuda", weights_only=False)
model = nGPT(**ck["config"]).cuda()
model.load_state_dict(ck["model"])
model.eval()
stoi, itos = ck["stoi"], ck["itos"]
# see code/sample.py in the GitHub repo for the full conditional/unconditional sampler
```
Checkpoints with `variant == "smith"` need the `SmithResidual` swap before construction (see `train_compare.make_model` in the source lab).
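The `stoi` / `itos` tables stored in each checkpoint map characters to token ids and back. As a rough sketch of how a prime string could be turned into ids before sampling, assuming `stoi` is a char-to-int dict and `itos` can be indexed by int (the exact container types in the checkpoints are not confirmed here), and using a toy vocabulary in place of a real checkpoint:

```python
SOL, SEP, EOA = "\x02", "\x03", "\x04"


def encode(text, stoi):
    """Map each character to its token id; raises KeyError for out-of-vocab chars."""
    return [stoi[ch] for ch in text]


def decode(ids, itos):
    """Map token ids back to characters and join them into a string."""
    return "".join(itos[i] for i in ids)


# Toy vocabulary standing in for ck["stoi"] / ck["itos"].
chars = [SOL, SEP, EOA] + sorted(set("Cats /\\_\n"))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}

prime_ids = encode(SOL + "Cats" + SEP, stoi)
assert decode(prime_ids, itos) == SOL + "Cats" + SEP
```

Because the vocabulary is character-level, labels containing characters the model never saw cannot be encoded, so a conditional prime should stick to the class names used in training.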
Example: uncond/styleA (unconditional, T=0.6)
_
( )
\ ( ) )
\ /\) (/\
\ /` `
| dlb
Per-checkpoint sample galleries are in the GitHub repo under galleries/.
Provenance & license
Models are derived from the apehex / mrzjy ASCII-art corpora and enwik8. The
model weights and code are released under the MIT license; the original ASCII art
belongs to its respective artists (signatures like dlb, jgs, sjw, ejm are preserved in outputs).