RiverRider (River Rider)

posted an update 8 days ago

Train Once, Read Everywhere

Paper title:
Train Once, Read Everywhere: Substrate Invariance of the Linearly Readable Structure in Frozen Language Models

Paper URL:
https://github.com/space-bacon/SRT/blob/main/arxiv_program/paper.md

Repository URL:
https://github.com/space-bacon/SRT

The consolidated findings of the SRT research program are now available.

The program treats frozen production-scale language models as substrates whose internal states carry structure that small, inspectable instruments can read. Results include:

- A ~12 M-parameter adapter that surfaces per-token semiotic signals from a frozen 7 B backbone with zero cross-entropy degradation
- An activation verbalizer that recovers text from single hidden states up to a calibrated paraphrase ceiling
- Linear readout ports spanning dense 3 B models to 94-layer 235 B mixture-of-experts models
- A 22 MB linear head that gives a frozen multimodal chat model image-to-text retrieval performance matching fully trained 2018 dual encoders on the COCO benchmark

The central claim is substrate invariance. The readable structure is a stable property of the model class. A head trained once on one host reads, with no retraining and at most a 42 KB recalibration, across:

- Hosts ten times smaller (31 B → 3 B)
- 4-bit weight precision
- Entirely different silicon and kernels (CUDA/bf16 to Apple Silicon/MLX-Q4)

Deployment tiers differ in latency and cost, never in capability.
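One way to picture "train once, recalibrate cheaply" is a fixed linear head with a small per-dimension affine correction fitted on the new host. The sketch below is an illustration only and not the SRT implementation: the post does not say what the 42 KB recalibration contains, so `fit_recalibration`, `apply_head`, and the mean/scale matching scheme are all assumptions. It also assumes both hosts expose hidden states of the same width, which would not hold across differently sized models.

```python
from statistics import mean, pstdev


def fit_recalibration(source_states, target_states):
    """Fit per-dimension (scale, shift) so target-host statistics match the source host.

    Hypothetical helper: matches mean and standard deviation dimension by dimension.
    """
    dims = len(source_states[0])
    scale, shift = [], []
    for d in range(dims):
        src = [s[d] for s in source_states]
        tgt = [t[d] for t in target_states]
        tgt_sd = pstdev(tgt)
        a = pstdev(src) / tgt_sd if tgt_sd > 0 else 1.0
        scale.append(a)
        shift.append(mean(src) - a * mean(tgt))
    return scale, shift


def apply_head(weights, bias, state, scale=None, shift=None):
    """Apply a frozen linear head, optionally after the affine recalibration."""
    if scale is not None:
        state = [a * x + b for a, x, b in zip(scale, state, shift)]
    return [sum(w * x for w, x in zip(row, state)) + b0
            for row, b0 in zip(weights, bias)]


# Toy data: the "new host" reports every dimension as 2*x + 1.
source = [[0.0, 1.0], [2.0, 5.0], [4.0, 3.0]]
target = [[2 * x + 1 for x in s] for s in source]
scale, shift = fit_recalibration(source, target)

head_w, head_b = [[1.0, -1.0]], [0.5]  # trained once, never updated
on_source = apply_head(head_w, head_b, source[1])
on_target = apply_head(head_w, head_b, target[1], scale, shift)
```

In this toy setting the head itself is untouched; only the two small vectors `scale` and `shift` are fitted per host, which is what keeps the recalibration tiny relative to the head.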

All instruments, measurement protocols, invariance evidence, negative results, and artifacts are in the repository.

reacted to Quazim0t0's post with 🔥 19 days ago

🌼 DaisyChain-Web: train a language model with friends or by yourself with multiple devices, in the browser, no install

Open a webpage, share a room link, and every device that joins becomes part of the training cluster. Phones, laptops, old PCs: they connect peer-to-peer over WebRTC and train one shared transformer together, entirely in the browser.

What's actually happening under the hood:

🧠 A mini transformer LM trains on FineWeb-Edu, streamed live from the HuggingFace Hub. Each device pulls its own slice (data parallelism), tokenized with our 16.5k-token Spikewhale tokenizer
⚡ Every single multiply runs through verified INT8 neural units, no float fallback. On WebGPU browsers it uses the GPU's DP4A integer dot-product hardware, admitted only after proving bit-identical results against the verified units, with a 3×INT8 fast-accurate scheme (CUTLASS's 3xTF32 trick, ported to 8-bit)
🔒 Devices average gradients every step under a sync guard: a per-step roster protocol plus weight-hash verification keeps every device's model bit-identical. If anything drifts, training stops instead of silently forking
📊 Live logs show exactly what every device contributes, step by step
💾 When you're done: test generations right on the page, download a checkpoint, or grab the inference kit, a single self-contained HTML file with the weights baked in that runs generations offline, anywhere
Works solo too. Every extra device just grows the effective batch.
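The sync guard described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not DaisyChain's code: the function names, the use of SHA-256, integer-valued weights and gradients, and floor-division averaging are all choices made for this example so that every device computes a bit-identical result.

```python
import hashlib
import struct


def weight_hash(weights):
    """SHA-256 of the weights packed as little-endian 32-bit integers (assumed format)."""
    return hashlib.sha256(struct.pack(f"<{len(weights)}i", *weights)).hexdigest()


def average_gradients(grads):
    """Element-wise integer mean; floor division keeps the result identical everywhere."""
    n = len(grads)
    return [sum(col) // n for col in zip(*grads)]


def sync_step(weights_by_device, grads_by_device, roster):
    """One guarded step: check the roster, average, update, then verify hashes agree."""
    if set(grads_by_device) != set(roster):
        raise RuntimeError("roster mismatch: a device is missing or unexpected")
    avg = average_gradients([grads_by_device[d] for d in sorted(roster)])
    updated = {d: [w - g for w, g in zip(weights_by_device[d], avg)] for d in roster}
    if len({weight_hash(w) for w in updated.values()}) != 1:
        # Stop instead of letting the replicas silently fork.
        raise RuntimeError("weight drift detected; stopping training")
    return updated


weights = {"phone": [10, 20], "laptop": [10, 20]}
grads = {"phone": [2, 4], "laptop": [4, 0]}
synced = sync_step(weights, grads, ["phone", "laptop"])
```

If any replica starts a step with even one differing weight, the hashes diverge after the update and the step raises rather than continuing.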

👉 Try it: Quazim0t0/DaisyChain-Web
🛠 Training framework: DaisyChainAI/DaisyChain-Train

Updates:
- Block-scaled INT8 quantization
- Batched attention GEMM
- Fused dequant+ReLU epilogue
- Weight-tied unembedding
- WebSocket relay fallback
- Server keepalive ping/pong every 30s
- Disconnected-state redial
- Visibility/network-change reconnect
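For readers unfamiliar with the first update item, block-scaled INT8 quantization in its generic form looks roughly like this. It is a sketch of the general technique, not the DaisyChain kernels: the block size of 4, symmetric scaling, and the function names are assumptions for illustration.

```python
def quantize_blocks(values, block_size=4):
    """Quantize floats to INT8 codes with one scale per block (symmetric, assumed scheme)."""
    codes, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        max_abs = max(abs(v) for v in block)
        scale = max_abs / 127 if max_abs > 0 else 1.0
        scales.append(scale)
        # Clamp so rounding can never leave the [-127, 127] range.
        codes.extend(max(-127, min(127, round(v / scale))) for v in block)
    return codes, scales


def dequantize_blocks(codes, scales, block_size=4):
    """Recover approximate floats: each code is multiplied by its block's scale."""
    return [c * scales[i // block_size] for i, c in enumerate(codes)]


values = [0.5, -1.0, 0.25, 0.0, 100.0, -50.0, 25.0, 0.0]
codes, scales = quantize_blocks(values)
restored = dequantize_blocks(codes, scales)
```

Because each block carries its own scale, the small values in the first block are not crushed by the large values in the second, which is the usual motivation for per-block over per-tensor scaling.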

11 replies

posted an update 29 days ago

🔮 Gemma-4-31B-it SRT-Sunstone

A read-out that reads images — trained only on words.

A 12.3M side-channel head on a frozen google/gemma-4-31B-it, taught meaning from text alone — to separate discourse communities in the residual stream. It never saw a single picture in training. Hand it a picture and it names what the image means, with zero image training.

Cross-modal transfer. Give it a photo and it lands next to the right words: bicycle → bicycle, rose → flower, dog → pet. It groups images by what they mean, not how they look: image→referent kNN 0.64 against chance 0.10.
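A score like "image→referent kNN" can be computed along these lines. This is a hedged sketch, not the evaluation code from the repository: the cosine metric, majority vote, `k`, and the toy 2-D embeddings are assumptions. A chance level of 0.10 would correspond to ten equally likely referents, which is an inference and not something the post states.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_accuracy(queries, query_labels, refs, ref_labels, k=1):
    """Fraction of queries whose k nearest reference embeddings vote for the right label."""
    hits = 0
    for q, true_label in zip(queries, query_labels):
        ranked = sorted(range(len(refs)), key=lambda i: cosine(q, refs[i]), reverse=True)
        votes = Counter(ref_labels[i] for i in ranked[:k])
        hits += votes.most_common(1)[0][0] == true_label
    return hits / len(queries)


# Toy stand-ins: word read-outs as references, image read-outs as queries.
word_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
word_labels = ["bicycle", "bicycle", "flower", "flower"]
image_vecs = [[1.0, 0.05], [0.05, 1.0]]
image_labels = ["bicycle", "flower"]

score = knn_accuracy(image_vecs, image_labels, word_vecs, word_labels)
```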

Why this is a semiotic read-out, not an image classifier. A classifier is trained on labelled images and learns a fixed map from pixels to a closed label set; it only knows the labels it was shown. This read-out is different in kind. It never saw an image in training — it was trained only on text, to separate discourse communities in the residual stream of a frozen gemma-4-31B. It can interpret a picture because gemma-4 already fuses image and word into one representational stream, and the read-out taps the shared interpretant: the meaning a sign carries, whatever form it arrived in. So it does not classify the image. It tells you what the image means to a system that learned meaning from words — a transfer across modality, from a head that was never cross-trained. That is the result.

Each picture gets two readings: the words it means — the load-bearing evidence — and the discourse it evokes, the nearest of 35 communities it learned from text. Read the second as a flavour, not a category: cars into the automotive community, deer and mushrooms into gardening, cats and dogs into the cozy-domestic communities. Never a class label.

Scored offline through a frozen google/gemma-4-31B-it (62.5 GB, too large to run live)

Try it: RiverRider/srt-sunstone
Model: RiverRider/Gemma-4-31B-it-SRT-Sunstone
Code: https://github.com/space-bacon/SRT

replied to ginigen-ai's post 29 days ago

Thanks @NovusEdge. The work is understood by few, so I am thrilled you get it.

It performs extremely well semantically and beyond across communities.

See 11.5.2 in the NLA paper:
https://github.com/space-bacon/SRT/blob/main/paper_nla.md

replied to ginigen-ai's post 30 days ago

Benchmarking various models with the SRT as I type. Nice work. It's good to have others in the space.

https://github.com/space-bacon/SRT

reacted to ginigen-ai's post with 🔥 about 1 month ago

🧠 Does your LLM know when it's about to be wrong?

Most leaderboards measure accuracy. We measure metacognition — whether a model catches its own errors. Benchmark + leaderboard + adapters, all open. 🎉

The surprise: even a K-AI #1 model (JGOS-31B-Citizen) is the strongest on multiple-choice traps (trap_rate 0.005 — ~2 misses in 400) yet blind to its own free-form mistakes (self-confidence AUROC = 0.5, pure random). A tiny base-frozen adapter recovers that signal.

Two independent axes (never compared across a row): ① trap_rate — does it fall for tempting trap options? (lower = stronger) ② adapter gain Δ — how much a lightweight adapter catches errors the model itself misses. (higher = more adapter value)
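The two quantities quoted above can be made concrete with a short sketch. The function names and toy inputs are assumptions for illustration, not the benchmark's scoring code. AUROC is computed here by direct pair counting: the probability that a randomly chosen wrong answer receives a higher P(wrong) score than a randomly chosen correct one, with ties counted as half.

```python
def trap_rate(fell_for_trap):
    """Share of trap problems where the model picked the trap option."""
    return sum(fell_for_trap) / len(fell_for_trap)


def auroc(scores, is_wrong):
    """Pairwise AUROC of P(wrong) scores against actual errors; ties count 0.5."""
    pos = [s for s, w in zip(scores, is_wrong) if w]
    neg = [s for s, w in zip(scores, is_wrong) if not w]
    if not pos or not neg:
        raise ValueError("need at least one wrong and one correct answer")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 2 misses in 400 trap problems.
rate = trap_rate([True] * 2 + [False] * 398)

labels = [True, False, True, False]
uninformative = auroc([0.5, 0.5, 0.5, 0.5], labels)  # same score everywhere
informative = auroc([0.9, 0.1, 0.8, 0.2], labels)    # errors always score higher
```

A confidence signal that assigns the same score to every answer lands at 0.5, which is why an AUROC of 0.5 is read as "pure random".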

What's open: 📊 300+100 trap problems (each with a hidden trap + TICOS type) 🏆 24-model leaderboard 🧩 11 per-model adapters — adapters, NOT fine-tunes (base stays frozen; the adapter just reads the hidden state → P(wrong))

Submit any HF model → auto-scored daily at 09:00 KST and added to the board.

🏆 Leaderboard → ginigen-ai/Metacognition-Leaderboard-Space

📊 Benchmark → ginigen-ai/Metacognition-Bench

🧩 Adapters → https://huggingface.co/collections/FINAL-Bench/metacognition-adapters-6a42c032e6beb803dd032961

📊 Article → https://huggingface.co/blog/ginigen-ai/metacognition

Benchmark by ginigen-ai · Adapters by FINAL-Bench (Darwin/Chimera platform + AETHER metacognition tech).

11 replies

posted an update about 1 month ago

SRT Showcase: Watch a Frozen Model Think, Token by Token

A frozen Qwen-2.5-7B now narrates its own interpretation in real time. SRT Showcase is the most complete public demonstration of computational semiotics to date, running the backbone with the SRT Adapter and Activation Verbalizer. As the model generates, every token is tinted by its predictive effort, and at the highest-effort positions the Verbalizer decodes the hidden state directly into natural language. You see what the model is representing at the exact moment its computation is most active.

Every verbalization is validated, not asserted. Each decoded thought is re-encoded and compared back to the original hidden state, and the reconstruction closely approximates it. The "this is what the model was thinking" claim carries its own fidelity badge. This is grounded introspection, not plausible narration.

The Showcase goes further than the trace. An A/B panel runs the same prompt with SRT injection on and off under an identical seed, so the side-channel's effect is directly observable. A curated gallery walks through confident recall, false premises, misconceptions, reasoning pivots, genuine uncertainty, and safety boundaries. Live entropy and divergence meters track the crystallization process token by token, with per-layer traces and reflexivity estimates on hover.
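The post does not define its entropy and divergence meters, so the following is only one plausible reading: Shannon entropy of the next-token distribution, and KL divergence between the distributions produced with injection on and off. All names and the toy logits are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; high when the model is undecided between tokens."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def kl_divergence(p, q):
    """KL(p || q) in nats; zero only when the two distributions coincide."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


undecided = softmax([0.0, 0.0, 0.0, 0.0])   # four equally likely tokens
settled = softmax([8.0, 0.0, 0.0, 0.0])     # one token dominates

with_injection = softmax([2.0, 1.0, 0.0, 0.0])
without_injection = softmax([1.0, 2.0, 0.0, 0.0])
shift = kl_divergence(with_injection, without_injection)
```

Under this reading, entropy falling from the `undecided` level toward the `settled` level over successive tokens is what a "crystallization" trace would show.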

None of the backbone weights are touched. The entire mechanism is a lightweight reflexive layer over a frozen model, which is why the same read-out heads already port from Qwen-2.5-7B up to a 235B Mixture of Experts. Frozen models can now be verbalized in real time. No retraining. No fine-tuning. No black box.

First request is a brief cold start while ZeroGPU acquires a GPU. Bring your own prompt.

Try it: RiverRider/srt-showcase

Repository: https://github.com/space-bacon/SRT

reacted to their post with 🚀 about 2 months ago

Real-Time Introspection for Qwen3 235B

An SRT adapter is now available for Qwen3 235B, a large open Mixture of Experts model. The adapter adds a lightweight monitoring layer that provides signals about the model's internal state during generation without modifying any of its original weights.

The adapter can indicate whether the model is operating in a more stable or more divergent internal condition. It also estimates how self-referential the model's processing is at each step and tracks shifts in representation across layers. In addition, it reports the kind of discourse style the model appears to be using at any moment.

This is the first adapter of its kind released on a model of this scale with fully open weights. It was trained in a read-only configuration, so the base model remains unchanged. The strongest result is in regime detection, where the adapter achieved very high accuracy in distinguishing between different internal operating states on held-out data.

For researchers focused on model interpretability, this provides a practical way to inspect internal dynamics at frontier scale. For those working in semiotics, it offers empirical access to how processes of meaning and interpretation unfold inside large contemporary language models.

Model: RiverRider/srt-adapter-qwen3-235b

Repository and documentation: https://github.com/space-bacon/SRT

reacted to branikita's post with 🚀 about 2 months ago

Hackernoon published our article on how semi-SCARA kinematics and mechanical design choices reduce the cost of a 6DOF manipulator without relying on expensive actuators.

https://hackernoon.com/building-a-budget-robot-arm-with-semi-scara-kinematics

replied to eabdullin's post about 2 months ago

Attention is all you need folks and let me tell you, it’s big league.

reacted to their post with 🚀 about 2 months ago

ATTENTION: The SRT-Introspect framework moves past surface-level output commentary by supplying real-time natural language interpretations of a model’s latent states. These verbalizations are validated, not merely asserted, through a round-trip reconstruction procedure. Natural language descriptions derived from hidden activations are passed through an encoder that reconstructs the corresponding activation vector; the recovered vector closely approximates the original. High reconstruction fidelity indicates that the verbalizations encode genuine structural information about the internal state rather than offering plausible but ungrounded speculation.

This validated introspection converts what has often remained a theoretical or post-hoc exercise into a practical instrument for auditing model behavior, diagnosing failure modes, and providing high-level semantic guidance—all without modifying the base model or incurring the costs of fine-tuning. Because the mechanism operates on frozen configurations, it can be applied to production systems where any change to weights or architecture is undesirable. Thank you for your attention.
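The round-trip check can be sketched as follows. The real verbalizer and encoder are learned models; here they are replaced by toy stand-ins (`toy_verbalize`, `toy_encode`) so the example runs on its own, and cosine similarity is an assumed fidelity measure, since the post does not name the metric it uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between the original and reconstructed activation."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def round_trip_fidelity(hidden_state, verbalize, encode):
    """Verbalize a hidden state, re-encode the text, and score the reconstruction."""
    text = verbalize(hidden_state)
    reconstructed = encode(text)
    return text, cosine_similarity(hidden_state, reconstructed)


def toy_verbalize(state):
    """Stand-in for the verbalizer: a lossy text description of the vector."""
    return " ".join(f"{x:.1f}" for x in state)


def toy_encode(text):
    """Stand-in for the encoder: parse the description back into a vector."""
    return [float(tok) for tok in text.split()]


text, fidelity = round_trip_fidelity([0.12, -0.98, 0.43], toy_verbalize, toy_encode)
```

A description that carried no information about the state would not survive this loop: the reconstructed vector would be unrelated to the original and the score would sit near zero.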

Run a trace: RiverRider/srt-introspect

Repo: https://github.com/space-bacon/SRT

replied to black-yt's post about 2 months ago

First time hearing about this … I will steer the SRT on this soon. Catching up.

reacted to black-yt's post with 🔥 about 2 months ago

Hey all — our ResearchClawBench leaderboard just updated 🔥

We let AI do real science: 40 tasks across 10 disciplines, compared to human papers. Hard example? 🏔️ Glacier mass change — AI must integrate 233 datasets from 35 teams, 4 methods, reproduce 6542±387 Gt ice loss vs IPCC. No toy problems.

Latest leaderboard (2026-06-09) 📊:
Agents: 🥇 Claude Code 21.5 (50 = match human), $5.3; 🥈 EvoScientist 18.8, $4.1; 🥉 Codex CLI 18.4, just $2.0
LLMs+Harness: 🥇 Claude-Opus-4.8 21.1, $4.0; 🥈 Claude-Opus-4.7 20.7; 🥉 MiniMax-M3 19.8, only $0.45; Qwen3.7-Max 18.7, $0.42, 11min 💥

Claude still king, but MiniMax/Qwen/DeepSeek are crazy cheap and competitive. Expensive isn't always better.

📎 Code & star: https://github.com/InternScience/ResearchClawBench
🏠 Website: https://internscience.github.io/ResearchClawBench-Home/
🤗 Upvote paper: ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research (2606.07591)

2 replies

reacted to their post with 😎 about 2 months ago

This is not a pipe.

Everyone is born a semiotician; no one is born knowing it. Go easy on yourself (and me) for not understanding this yet.

Computational semiotics is now an empirical study.

LLMs are not proto-minds. They are verifiably semiotic infrastructure.

This repository (or attached demo) can show you, in real time, how any frozen model (Qwen for demo) arrives at any answer by reading its latent states directly during generation.

Any questions?

RiverRider/srt-introspect

Repo:

https://github.com/space-bacon/SRT

Grok insists my intro is condescending … This is certainly true, as is the statement in my condescended opinion. I expect heat for it, so let’s think this through.

reacted to their post with 🚀 about 2 months ago

Words do not have determined meanings.

The vocabulary itself is reflexive. It is self-referential, looping back into its own structure rather than anchoring in fixed reality. What we treat as stable meaning is continually reconstituted in the act of using it. The observer's own interpretations mold each word like clay with every utterance.

All large language models to date treat words otherwise. At the moment of softmax crystallization they determine the meaning of every token. Probabilities collapse into a single output. Meaning is not found. It is fixed, token by token, in that final distribution.

SRT-Introspect is a demo for observing what Qwen actually thinks at the points of highest effort. It surfaces the internal representations during generation, making visible the reflexive vocabulary at work and the precise crystallization process: the weights, the assumptions, the decisions that resolve ambiguity into output. This includes accounting for anisotropy collapse in hidden states by centering representations around the layer-mean before analysis.
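The centering step mentioned above can be illustrated with a toy example. This is a sketch of the general idea, not the demo's code: the function names and the four hand-made vectors are assumptions. When every hidden state shares a large common component, raw cosine similarities are all close to 1; subtracting the layer mean removes that component so the remaining differences become visible.

```python
import math


def center(states):
    """Subtract the per-dimension mean of a layer's hidden states from each state."""
    dims = len(states[0])
    mean_vec = [sum(s[d] for s in states) / len(states) for d in range(dims)]
    return [[x - m for x, m in zip(s, mean_vec)] for s in states]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean_pairwise_cosine(states):
    """Average cosine similarity over all distinct pairs of states."""
    pairs = [(i, j) for i in range(len(states)) for j in range(i + 1, len(states))]
    return sum(cosine(states[i], states[j]) for i, j in pairs) / len(pairs)


# Four states dominated by a shared offset along the first dimension.
layer_states = [[10.0, 1.0, 0.0], [10.0, 0.0, 1.0], [10.0, -1.0, 0.0], [10.0, 0.0, -1.0]]

raw = mean_pairwise_cosine(layer_states)
centered = mean_pairwise_cosine(center(layer_states))
```

Before centering, the four states look nearly identical; after centering, they point in clearly different directions.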

Feel free to comment with your prompts.

RiverRider/srt-introspect

Repo
https://github.com/space-bacon/SRT

1 reply

reacted to danielhanchen's post with 🔥 about 2 months ago

Gemma 4 12B can now run locally on just 8GB RAM via Dynamic GGUFs.

Google's new model, Gemma 4 12B Unified, supports image, audio and 256K context.
You can run and train the model via Unsloth Studio.

GGUF: unsloth/gemma-4-12b-it-GGUF
Guide: https://unsloth.ai/docs/models/gemma-4

5 replies

replied to AxionLab-official's post about 2 months ago

I’m curious how it arrived https://github.com/space-bacon/SRT
