darwASCIInGPT: nGPT ASCII-art artists + Darwin-bred children

Char-level nGPT (Normalized GPT) checkpoints from the darwASCIInGPT experiments: small hypersphere transformers that draw ASCII art, plus the Darwin-style bred offspring produced by merging them with no gradient training. Companion knowledge base (observations, code, Spark setup):

GitHub: https://github.com/tinycrops/darwASCIInGPT-playbook

All ASCII models are dim 256 / depth 4 (~3.18M params), char vocab ~106–109, trained on the apehex hand-drawn ASCII corpus on a Quadro P4000. The enwik8 text models are dim 256–512 / depth 8, trained on a GTX 1060.

Special tokens (char-level)

SOL = \x02, SEP = \x03, EOA = \x04. Two framings:

| Framing | Sequence | Prime with | Use |
|---|---|---|---|
| Conditional | `<SOL> label <SEP> art <EOA>` | `<SOL>` + label + `<SEP>` | request a class (e.g. Cats, Swords) |
| Unconditional | `<SOL> art <EOA>` | `<SOL>` | free-form draw (no label channel) |
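
As a minimal sketch of the two framings, the helpers below build the prime string and cut the art out of a generated sequence. The function names `make_prime` and `extract_art` are hypothetical and not taken from the repo; only the three special characters come from this card.

```python
# Special characters as listed in this model card.
SOL, SEP, EOA = "\x02", "\x03", "\x04"


def make_prime(label=None):
    """Build the prompt prefix (hypothetical helper).

    Conditional:   SOL + label + SEP
    Unconditional: SOL
    """
    if label is None:
        return SOL
    return SOL + label + SEP


def extract_art(generated):
    """Return only the art portion of a full generated sequence (hypothetical helper)."""
    body = generated.split(EOA, 1)[0]      # drop EOA and anything after it
    if SEP in body:
        body = body.split(SEP, 1)[1]       # conditional: drop SOL + label + SEP
    return body.lstrip(SOL)                # unconditional: drop the leading SOL
```

For example, `make_prime("Cats")` gives the conditional prefix, while `make_prime()` gives the unconditional one.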

These models are trained to very low loss (near-memorization), so: T≈0.6, top_k≈20 → clean, complete drawings; top_k=1 → one fixed canonical piece per prefix; higher T → more variety, with occasional whitespace drift.
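
To make the effect of T and top_k concrete, here is a dependency-free sketch of temperature plus top-k sampling over one step's logits. It is an illustration only, not the sampler from code/sample.py; the function name `sample_top_k` and its defaults are assumptions chosen to match the settings above.

```python
import math
import random


def sample_top_k(logits, temperature=0.6, top_k=20, rng=random):
    """Pick one index from `logits` using top-k filtering and temperature scaling."""
    order = sorted(range(len(logits)), key=logits.__getitem__, reverse=True)
    if top_k == 1 or temperature <= 0:
        return order[0]                      # greedy: always the same token per prefix
    keep = order[:top_k]                     # the k highest-scoring indices
    scaled = [logits[i] / temperature for i in keep]
    peak = max(scaled)                       # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(keep, weights=weights, k=1)[0]
```

Lower temperatures concentrate probability on the top candidates, which is why top_k=1 reproduces a single canonical piece and higher T lets less likely characters (including stray whitespace) through.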

Contents

| Path | Type | Framing | Trained on / notes |
|---|---|---|---|
| `uncond/styleA` | artist | unconditional | apehex creatures & nature half. final stream_loss 0.031 (99.2% acc) |
| `uncond/styleB` | artist | unconditional | apehex objects & tech half. final stream_loss 0.089 (97.5% acc) |
| `apehex/styleA` | artist | conditional | GROUP_A subcategories (Cats, Dragons, Flowers, …) |
| `apehex/styleB` | artist | conditional | GROUP_B subcategories (Swords, Cars, Robots, …) |
| `apehex/breed/child_slerp` | bred | conditional | SLERP merge of styleA × styleB on the nGPT hypersphere |
| `apehex/breed/child_slerp_frozenattn` | bred | conditional | attention frozen from one parent, FFN SLERP-blended |
| `parents/domA`, `parents/domB` | artist | conditional | domain split: apehex art vs mrzjy sample |
| `parents/breed/child_slerp`, `child_discrete`, `child_slerp_frozenattn` | bred | conditional | recombinations of domA × domB |
| `smith-experiment/ngpt` | ablation | conditional | nGPT normalized-lerp residual. val_loss 1.0736 |
| `smith-experiment/smith` | ablation | conditional | Möbius geodesic residual (matched init/data/schedule). val_loss 1.0788, ~1.57× slower |
| `resonance/standard_g1`, `harmonic30_g1` | sweep | conditional | resonance-geometry sweep representatives |
| `enwik8-darwin/offspring_forkL__x__forkR.pt` | bred | enwik8 text | the hybrid-vigor offspring: bpc 2.4636 vs best parent 2.5047 (+0.0412) |
| `enwik8-darwin/darwin_log.json` | log | n/a | shared-ancestor breeding → vigor |
| `enwik8-darwin/darwin_log_independent.json` | log | n/a | independent-init breeding → no vigor (control) |

Headline result: genealogy decides hybrid vigor

Identical SLERP breeder, different parent relationship (enwik8 bpc, lower better):

| Parents | Origin | Gen-0 child | Champion | Best parent | Vigor? |
|---|---|---|---|---|---|
| independent inits | different basins | 3.26 | 2.3064 | 2.3063 | No |
| shared ancestor, split data | same basin | 2.47 | 2.4633 | 2.5047 | Yes (+0.041) |
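
The "Vigor?" column can be read as a simple difference. The sketch below assumes vigor is defined as best-parent bpc minus champion bpc, so a positive value means the bred champion beats both parents; this definition is inferred from the numbers in the table and is not quoted from the repo.

```python
def vigor(best_parent_bpc, champion_bpc):
    """Positive when the champion beats the best parent (lower bpc is better)."""
    return best_parent_bpc - champion_bpc


# (best parent bpc, champion bpc) copied from the table in this card.
rows = {
    "independent inits": (2.3063, 2.3064),
    "shared ancestor, split data": (2.5047, 2.4633),
}

for name, (parent, champion) in rows.items():
    print(f"{name}: {vigor(parent, champion):+.4f}")
```

Under that assumption the independent-init champion is marginally worse than its best parent, while the shared-ancestor champion improves on its best parent by roughly 0.041 bpc.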

In these experiments, crossbreeding produced hybrid vigor only between mode-connected parents (shared ancestor, specialized differently). See docs/darwin-breeding.md in the GitHub repo.
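
For readers unfamiliar with SLERP, the following sketch interpolates between two unit vectors along the great circle that joins them, which keeps the result on the hypersphere. It uses plain Python lists so it runs without dependencies. How the actual breeder applies this (per weight row, per tensor, which parameters are skipped) is not specified here, so treat the helper names and the per-vector framing as assumptions.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def slerp(a, b, t=0.5, eps=1e-7):
    """Spherical linear interpolation between directions a and b, 0 <= t <= 1.

    Exactly opposite vectors have no unique geodesic and are not handled.
    """
    a, b = normalize(a), normalize(b)
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    omega = math.acos(dot)                   # angle between the two parents
    if omega < eps:                          # nearly parallel: lerp, then renormalize
        return normalize([(1 - t) * x + t * y for x, y in zip(a, b)])
    s = math.sin(omega)
    wa = math.sin((1 - t) * omega) / s       # weight on parent a
    wb = math.sin(t * omega) / s             # weight on parent b
    return [wa * x + wb * y for x, y in zip(a, b)]
```

With t=0.5 the child sits midway between the parents in angle, and its norm stays at 1 without a separate renormalization step.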

Loading

import torch
from nGPT_pytorch import nGPT
import ngpt_patch  # restore __hash__ on nGPT modules; import BEFORE constructing

device = "cuda" if torch.cuda.is_available() else "cpu"
# weights_only=False unpickles arbitrary Python objects (config, vocab); load only checkpoints you trust
ck = torch.load("uncond/styleA/model.pt", map_location=device, weights_only=False)
model = nGPT(**ck["config"]).to(device)
model.load_state_dict(ck["model"])
model.eval()
stoi, itos = ck["stoi"], ck["itos"]  # char <-> id vocab tables
# see code/sample.py in the GitHub repo for the full conditional/unconditional sampler
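
The `stoi` and `itos` tables are what turn a prime string into token ids and generated ids back into characters. The sketch below assumes `stoi` maps each character to an integer id and `itos` maps ids back to characters (a dict or a list would both work); the toy vocabulary is made up for illustration and is not the checkpoint's real vocab.

```python
def encode(text, stoi):
    """Map each character to its id, failing loudly on out-of-vocab characters."""
    missing = sorted(set(text) - set(stoi))
    if missing:
        raise ValueError(f"characters not in vocab: {missing!r}")
    return [stoi[ch] for ch in text]


def decode(ids, itos):
    """Map ids back to characters and join them into a string."""
    return "".join(itos[i] for i in ids)


# Toy vocabulary (hypothetical); a real one comes from ck["stoi"] / ck["itos"].
toy_chars = ["\x02", "\x03", "\x04", "C", "a", "t", "s"]
toy_stoi = {ch: i for i, ch in enumerate(toy_chars)}
toy_itos = {i: ch for ch, i in toy_stoi.items()}

prime_ids = encode("\x02Cats\x03", toy_stoi)
```

Checking for out-of-vocab characters up front matters for conditional prompts, since a label containing a character the model never saw cannot be encoded.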

Checkpoints with variant == "smith" need the SmithResidual swap before construction (see train_compare.make_model in the source lab).

Example: uncond/styleA (unconditional, T=0.6)

    _
   (   )
  \ (  )  )
   \  /\) (/\
    \ /`    `
      |       dlb

Per-checkpoint sample galleries are in the GitHub repo under galleries/.

Provenance & license

Models are derived from the apehex / mrzjy ASCII-art corpora and enwik8. The model weights and code are released under the MIT license; the original ASCII art belongs to its respective artists (signatures such as dlb, jgs, sjw, ejm are preserved in outputs).
