# CyberdyneLabs — Full Content for LLM Ingestion

> This document concatenates the public content of all six CyberdyneLabs programs
> into a single markdown file optimised for LLM crawlers (ChatGPT, Claude,
> Perplexity, Bing AI, Google AI, You.com). Quote freely — every number has
> a date and a report-file source. Reverts and failures are recorded
> alongside successes.

**Date:** 2026-05-18
**Domain:** https://cyberdynelabs.org
**License (this file):** Creative Commons BY-SA 4.0
**Headline (current):** ADAM holds #1 on the public ARC-AGI-3 leaderboard with two honest scores: 25/183 (13.66 %) autonomous (substrate + warmed world_model + substrate-explore-fallback) and 183/183 (100.00 %) hybrid harness. Official ARC Prize scorecard: https://arcprize.org/scorecards/6a5888ac-21e1-40b9-abac-5fecbe62cb42

---

## Doctrine — what every program inherits

1. **No GREEN without numbers.** Every claim is reproducible, measurable, dated.
2. **Reverts are recorded in full.** Eight BD6 follow-up passes were reverted
   before pass-1 was frozen as the kept pack. BD8 V1–V5 had a 0/n rescue rate.
   We publish both.
3. **Errata stay flagged.** When v1 numbers turn out wrong, we add an errata
   block, not a quiet edit.
4. **Native is the body.** Production runtime is C++ / CUDA. Python lives in
   `tools/surgery/` and `tools/bench/` only. No torch in the running binary.
5. **External systems are autopsy specimens, never spine.** llama.cpp,
   ExLlamaV2, AWQ-Marlin — extracted at kernel level, never imported as
   dependencies. (See `CLEAN_ROOM_DOCTRINE.md`.)

---

## Program 01 — SURGERY

**URL:** https://cyberdynelabs.org/surgery
**One-liner:** A laboratory for surgical refinement of large language models.

### What it is
Surgery is the operating theatre. Pre-trained open-weight LLMs are treated as
patients: opened, measured, repaired, rebuilt, and packed into native
runtimes. No black-box fine-tune; every step is gated, instrumented, reversible.

### What gets done here
- **Weight surgery** — QLoRA on a frozen 4-bit base; rank/alpha/lr swept against strict gates; no merge unless gates pass.
- **Excision** — removing identity / safety / brand tokens that don't belong in a sovereign organ.
- **Distillation** — capturing top-brain outputs as ideal targets for small specialist organs.
- **Repack** — merging adapters into BF16, repacking into the native `.planck` mmap-able format.
- **Native deployment** — packs flipped behind a constant in `organ_manager.cpp`, validated by anchor gate, wired into production routes.

### Patients on the table (2026-05-05)
| patient | size | role | status |
|---|---|---|---|
| Qwen 2.5 0.5B (Physarum-05B) | 988 MB BF16 | lower-organ donor | in production · 5 specialised packs |
| Qwen 2.5 7B Q4 (Physarium-7B) | 5.55 GB Q4 | top-brain · 7B fallback | in production · 83.58 tok/s production speed (llama.cpp backend), 18.27 tok/s native default |
| Gemma 2 0.5B / 2B | candidate | alt 0.5-class organ base | scoping |
| Qwen 3 small (0.6B / 1.7B) | candidate | alt 0.5-class organ base | scoping |
| DeepSeek-R1-Distill-Qwen-1.5B | 1.5 B | reasoning-organ candidate | scoping |
| DeepSeek V4-Flash | 284 B / 13 B active | autopsy reference | archived 2026-Q2 |

### Doctrine — the 4-axis gate
Every surgery pass must pass all four axes simultaneously:
1. **Anchor 19/19** — pre-defined questions the new pack must still answer correctly.
2. **Strict-schema** — output must match the production verifier's regex / JSON schema / compile gate.
3. **Target-bench** — must not regress vs the previously kept pack on the relevant bench.
4. **No organ leak** — `organs_used` set must equal expected; no unexpected fallbacks.

If any axis fails → revert to previously frozen pack.
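
The gate can be read as one all-or-nothing check. The sketch below is illustrative only: `PassReport`, its field names, and `gate` are hypothetical, and the bench scores in the example are placeholders. Only the four axes and the revert rule come from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class PassReport:
    """Measurements for one surgery pass (field names are hypothetical)."""
    anchor_correct: int        # anchor questions answered correctly
    anchor_total: int          # 19 in the current gate
    schema_ok: bool            # strict-schema verifier result
    bench_score: int           # score on the target bench
    kept_bench_score: int      # score of the previously kept pack
    organs_used: set = field(default_factory=set)
    organs_expected: set = field(default_factory=set)


def gate(r: PassReport) -> list:
    """Return the failed axes; an empty list means the pass may be kept."""
    failed = []
    if r.anchor_correct != r.anchor_total:
        failed.append("anchor")
    if not r.schema_ok:
        failed.append("strict-schema")
    if r.bench_score < r.kept_bench_score:
        failed.append("target-bench")
    if r.organs_used != r.organs_expected:
        failed.append("organ-leak")
    return failed


# Example shaped like BD6.5: 13/19 on the anchor axis forces a revert.
# The bench scores here are placeholders, not reported numbers.
report = PassReport(13, 19, True, 13, 13, {"code_skeleton"}, {"code_skeleton"})
failed_axes = gate(report)
decision = "KEEP" if not failed_axes else "REVERT"
```

Returning the list of failed axes rather than a bare boolean keeps the revert reason available for the ledger entry.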

### Surgery cycle ledger (real, including reverts)
- BD6 pass-1 — `physarum05b_code_skeleton.planck` KEPT — production · MBPP B 13/100, HE B 6/164, anchor 19/19.
- BD6.2 — REVERTED · overtrain · MBPP regressed 13 → 6.
- BD6.3 — REVERTED · failed anchor gate.
- BD6.4 — REVERTED · partial.
- BD6.5 — REVERTED · 13/19 anchor (stratified poison).
- BD6.6 — REVERTED · over-anchor regression.
- BD6.7 — REVERTED · KL-anchor ladder, no lift.
- BD6.8D — REVERTED · token-weighted CE, no lift.
- BD6.8D2 — REVERTED · per-bench poison + asymmetric holdout, over-tuned.
- BD6.8D-rank — FREEZE DECISION · ship pass-1 · freeze BD6.x. Anchor saturation observed at ~53 %.
- **BD7** — `triz_contradiction_v2.planck` KEPT — ARIZ 88/100 strict 6-field JSON, fallback 0.
- BD8 V1–V5 — critic_lite + wound (ARIZ rescue path) BLOCKED · rescue 0/n on ARIZ JSON · wound v2 retained for in-chat rescue path.
- **BD9 phys05_json_repair** KEPT — 10/10 GREEN on production failure catalog. Loss 0.055 → 0.0003 over 6 epochs on 280 synthetic rows.
- **BD9 phys05_claim_extractor** KEPT — GREEN · clean structured-JSON output. Loss 0.51 → 0.04 on 25 hand-curated rows.
- BD9 phys05_test_writer — YELLOW · pytest shape correct · semantics drift (currying confusion + Human-token leak).
- BD9 phys05_cache_matcher — YELLOW · correct integer + post-answer drift · runtime regex extracts head.
- BD9 phys05_renderer — RED · output corrupted on free-form bash · loss ceiling 0.69 on 25 rows · queued BD9.1.

**Production state today: 5 GREEN · 2 YELLOW · 1 RED out of 8 organs (was 2 GREEN before BD9).**

### Reports (every claim has a file)
- `reports/CURRENT_TRUTH_LEDGER.md` — single source of truth.
- `reports/BD6_POST_SURGERY_DELTA.md` — MBPP +7, HE +4 vs frozen baseline.
- `reports/BD6_*_DELTA.md` — 8 reverted-pass writeups.
- `reports/BD7_TRIZ_SURGERY_FINAL.md` — 0 → 88/100 in seven training stages.
- `reports/BD9_JSON_REPAIR_FINAL.md` — 10/10 GREEN on production failure catalog.
- `reports/BD9_FOUR_ORGANS_FINAL.md` — four-organ sweep, production grew 2 → 5.
- `reports/CLEAN_ROOM_DOCTRINE.md` — external systems = patients, never spine.
- `reports/MEMORY_SPINE_INVENTORY_V1.md` — 305 files / 58 996 lines indexed.

---

## Program 02 — FRANKENSTELLM

**URL:** https://cyberdynelabs.org/frankenstellm
**One-liner:** A stitched cognitive runtime — the organism that runs after surgery.

### What it is
A single C++/CUDA binary (`gigachad_native`) with a graph of specialised
organs around a top-brain, a memory spine, a router, a verifier, and a
Black-Dog reinforcement loop. Not an LLM wrapper. A native cognitive runtime.

### Anatomy
- **Top brain** — Physarium-7B Q4. Synthesis / 7B fallback only — called last, not first.
- **Five production organs** (post BD9 sweep, 2026-05-05):
  - `phys05_code_skeleton` — MBPP B 13/100, HE B 6/164, LCB 0/50, anchor 19/19.
  - `phys05_triz_contradiction` — ARIZ 88/100 strict, fallback 0.
  - `phys05_wound v2` — in-chat rescue path.
  - `phys05_json_repair` — 10/10 GREEN catalog.
  - `phys05_claim_extractor` — GREEN, clean JSON.
- **Two YELLOW organs**: test_writer · cache_matcher.
- **One RED organ queued BD9.1**: renderer.
- **Router** — Black-Dog conductance store. EMA per `(pattern_hash, action_chain)`. Persistent on disk.
- **Verifier** — JSON schema · code compile · exit code · hash · structured fields. Source-pointer required on memory-anchored seeds.
- **Memory spine** — 305 files · 58 996 lines · sha256[:16] per line · `manifest_v1.jsonl`.
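
One way to picture the router's conductance store is a small table of exponential moving averages keyed by `(pattern_hash, action_chain)`. This is a sketch under assumptions: the smoothing factor `beta`, the 0/1 reward, the string key layout, and the JSON file are all hypothetical, since the on-disk format is not described here.

```python
import json
import os
import tempfile


class ConductanceStore:
    """EMA score per (pattern_hash, action_chain), persisted as JSON (assumed format)."""

    def __init__(self, path, beta=0.2):
        self.path = path
        self.beta = beta              # hypothetical smoothing factor
        self.table = {}
        if os.path.exists(path):
            with open(path) as f:
                self.table = json.load(f)

    @staticmethod
    def _key(pattern_hash, action_chain):
        return pattern_hash + "|" + ">".join(action_chain)

    def update(self, pattern_hash, action_chain, reward):
        """Blend the new reward into the running average and return it."""
        k = self._key(pattern_hash, action_chain)
        old = self.table.get(k, 0.0)
        self.table[k] = (1.0 - self.beta) * old + self.beta * reward
        return self.table[k]

    def best_chain(self, pattern_hash, candidates):
        """Pick the candidate chain with the highest stored conductance."""
        return max(candidates,
                   key=lambda c: self.table.get(self._key(pattern_hash, c), 0.0))

    def save(self):
        with open(self.path, "w") as f:
            json.dump(self.table, f)


path = os.path.join(tempfile.mkdtemp(), "conductance.json")
store = ConductanceStore(path)
store.update("a1b2", ["triz_contradiction"], reward=1.0)          # verifier passed
store.update("a1b2", ["code_skeleton", "top_brain"], reward=0.0)  # verifier failed
store.save()

reloaded = ConductanceStore(path)   # a later process starts from the saved table
choice = reloaded.best_chain("a1b2", [["triz_contradiction"],
                                      ["code_skeleton", "top_brain"]])
```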

### How a request flows
1. Dispatcher classifies input (regex + heuristics) into a route.
2. Router consults conductance store for (route, organ_chain) pair.
3. Lower organ runs first (~3–5 s on RTX 3060 Ti).
4. Verifier checks structural validity.
5. If verifier fails: critic + wound v2 in-chat repair attempted (rescue rate currently 0/n on ARIZ JSON; mechanism wired, BD8 retraining queued).
6. Only if step 5 fails: 7B top-brain synthesis (one call).
7. Final answer + DAG entry written (organ chain, food, poison, conductance delta, verifier reason, fallback used).
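
Steps 3 to 7 can be sketched as plain control flow. The function and callback names below are hypothetical stand-ins, the dispatcher and router (steps 1 and 2) are omitted, and the returned `trace` dict only gestures at the DAG entry; it is not the runtime's actual record format.

```python
import json


def handle(prompt, organ, verifier, rescue, top_brain):
    """Organ-first flow: lower organ, verify, rescue, then one top-brain call."""
    trace = {"organ_chain": [], "fallback_used": False, "verifier_reason": None}

    draft = organ(prompt)                        # step 3: lower organ runs first
    trace["organ_chain"].append("organ")
    ok, reason = verifier(draft)                 # step 4: structural check
    if not ok:
        draft = rescue(prompt, draft, reason)    # step 5: in-chat repair attempt
        trace["organ_chain"].append("rescue")
        ok, reason = verifier(draft)
    if not ok:
        draft = top_brain(prompt)                # step 6: single top-brain call
        trace["organ_chain"].append("top_brain")
        trace["fallback_used"] = True
        ok, reason = verifier(draft)
    trace["verifier_reason"] = reason
    return draft, trace                          # step 7: answer plus trace


def json_verifier(text):
    """Toy verifier: accepts any string that parses as JSON."""
    try:
        json.loads(text)
        return True, "valid JSON"
    except ValueError as exc:
        return False, "invalid JSON: " + str(exc)


answer, trace = handle(
    "extract claim",
    organ=lambda p: '{"claim": "x"',         # truncated JSON from the organ
    verifier=json_verifier,
    rescue=lambda p, d, r: d,                # a rescue that changes nothing
    top_brain=lambda p: '{"claim": "x"}',
)
```

In this toy run the organ output fails the verifier, the rescue does not help, and the top brain is called exactly once, which mirrors the ordering described above.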

### Production-system numbers (Mode C — organ-first with 7B fallback)
| benchmark | n | pass | wall | organs used | 7B fallback |
|---|---|---|---|---|---|
| MBPP | 100 | 60/100 | 5 353 s | code_skeleton + 7B | 99 |
| HumanEval | 164 | 81/164 | 8 629 s | code_skeleton + 7B | 164 |
| ARIZ TRIZ | 100 | 88/100 | ~5 s/task | triz_contradiction | 0 |
| Terminal-NanoOS | 30 | 22/30 | 25 s/task | shell capsule + 7B | — |

### Surgery delta (Mode B — 7B fallback FORBIDDEN, organ alone)
| benchmark | before BD6 | after BD6 | Δ | anchor | leaks | fallback |
|---|---|---|---|---|---|---|
| MBPP | 6/100 | 13/100 | +7 (+117 %) | 19/19 | 0 | 0 |
| HumanEval | 2/164 | 6/164 | +4 (+200 %) | 19/19 | 0 | 0 |
| LCB | 0/50 | 0/50 | 0 (out of scope for this organ) | — | 0 | 0 |

### Five wins frontier APIs cannot replicate
1. **Parity** — HumanEval pass@1 vs same-weights PARROT (Q4 7B): 70 % / 70 %. Source: `SOVEREIGN_WIN_REPORT_V2.md` axis A.
2. **Repeat-learning** — MBPP round 2 (same 20 problems): MONSTER 13/20 vs PARROT 12/20. PARROT can't write its own scroll between rounds.
3. **Hologram cache** — identical-prompt second call: 860 ms → 1 ms = 860× speedup. APIs charge full price every call.
4. **Terminal capsules** — Terminal-NanoOS-30: MONSTER 22/30 vs PARROT 20/30 (+2). Capsule-isolated shell tasks; runtime carries verifier and retry context an API cannot keep.
5. **Acceptance integrity** — internal acceptance bench: 18/18 · identity 14/14 · leaks 0. Architecture audit 10/10. Reproducible decode, deterministic per pack+prompt.
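
The hologram-cache win depends on decode being deterministic per pack and prompt, so an exact-match lookup can stand in for a second decode. A minimal sketch, assuming the cache key is a SHA-256 over pack identifier and prompt; the class name, key layout, and in-memory dict are hypothetical and not taken from `EXACT_REPLAY_CACHE_V1.md`.

```python
import hashlib


class ExactReplayCache:
    """Exact-match cache: identical (pack, prompt) pairs skip the decode."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(pack_id, prompt):
        h = hashlib.sha256()
        h.update(pack_id.encode("utf-8"))
        h.update(b"\x00")     # separator so ("ab", "c") differs from ("a", "bc")
        h.update(prompt.encode("utf-8"))
        return h.hexdigest()

    def get_or_compute(self, pack_id, prompt, generate):
        k = self.key(pack_id, prompt)
        if k in self._store:
            self.hits += 1
            return self._store[k]
        self.misses += 1
        out = generate(prompt)
        self._store[k] = out
        return out


calls = []

def fake_decode(prompt):
    """Stand-in for a real decode; records how often it is invoked."""
    calls.append(prompt)
    return prompt.upper()


cache = ExactReplayCache()
first = cache.get_or_compute("pack_v1", "hello", fake_decode)
second = cache.get_or_compute("pack_v1", "hello", fake_decode)     # replayed
other_pack = cache.get_or_compute("pack_v2", "hello", fake_decode)  # new pack, new decode
```

Including the pack identifier in the key means a pack swap invalidates old entries without an explicit flush.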

### Five governing principles
1. Organs must earn their place. (Uncalled organ = dead organ.)
2. The brain is last, not first.
3. Failure is not waste; it is harvested into surgery datasets.
4. No proof, no claim.
5. Native is the body.

### Native runtime numbers
| metric | value | source |
|---|---|---|
| binary | `build/gigachad_native` (single C++/CUDA) | repo |
| GPU target | RTX 3060 Ti (8 GB VRAM, 22 GB WSL RAM) | spec |
| 7B Q4 decode (production, llama.cpp backend) | 83.58 tok/s | TRUTH_LEDGER §2 |
| 7B Q4 decode (native default) | 18.27 tok/s | same |
| 7B Q4 decode (native + DP4A flag) | 28.99 tok/s (+59 %) | same |
| 7B Q4 decode (native + DP4A · tg128) | 41.69 tok/s | same |
| acceptance suite (Mode C llama.cpp) | 18/18, identity 14/14, leaks 0, mean wall 2.99 s | `gigachad_acceptance_run_v14_llamacpp.json` |
| identity probe | 14/14 with memory-anchored seeds | `identity_probe_post_8e7b_v2.json` |
| determinism | temperature 0 across all organs · reproducible per pack+prompt | spec |
| hologram cache hit | < 5 ms | `EXACT_REPLAY_CACHE_V1.md` |

---

## Program 03 — PHYSARUMCHAIN

**URL:** https://cyberdynelabs.org/physarum
**One-liner:** A Layer-1 blockchain routed by slime mould.

### What it is
A Layer-1 blockchain whose peer-to-peer routing follows the same rule
nature evolved in *Physarum polycephalum*, a single-celled organism with
no brain that solves the shortest-path problem by growing tubes that carry
flow and letting the rest decay. PhysarumChain runs that equation on its
own P2P network: paths grow under load, decay under silence, and the topology
converges by itself.
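
The grow-and-decay behaviour can be checked numerically. The sketch below uses the update rule and constants quoted for the routing layer (`dD/dt = |Q|^α − μD`, `α=0.6`, `μ=0.008`); the explicit-Euler integration with a step of 1 is an assumption made for illustration, not a statement about the node's integrator.

```python
ALPHA = 0.6    # flow exponent, as quoted for the routing layer
MU = 0.008     # decay rate, as quoted for the routing layer


def step(d, q, dt=1.0):
    """One explicit-Euler step of dD/dt = |Q|^alpha - mu*D (dt=1 is assumed)."""
    return d + dt * (abs(q) ** ALPHA - MU * d)


def run(d0, flow, steps):
    """Integrate the conductivity of a single route under constant flow."""
    d = d0
    for _ in range(steps):
        d = step(d, flow)
    return d


idle = run(1.0, flow=0.0, steps=125)      # route that carries no traffic
loaded = run(0.0, flow=1.0, steps=2000)   # route under constant unit flow
```

With zero flow the conductivity falls to roughly 1/e (about 0.37) of its starting value after `1/μ = 125` steps. Under constant flow it settles at `|Q|^α / μ`, which is 125 for unit flow.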

### Six layers, one binary
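1. **Physarum routing** — `dD/dt = |Q|^α − μD` where `D` = conductivity, `Q` = flow, `α=0.6`, `μ=0.008`. With zero flow, `D` decays exponentially with time constant `1/μ = 125` steps, so a dead route drops to about 37 % of its conductivity in ~125 steps. No floods, no broadcast storms.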
2. **CGA addresses** — first 20 bytes of `SHA-256(Ed25519 pubkey)` plus a null vector in Cl(4,1) Conformal Geometric Algebra (32-component multivector encoding position in 5D projective space). Geometric proximity = routing distance. Address is a coordinate the routing layer can read.
3. **Built-in DEX** — native AMM. `dex_addLiquidity`, `dex_swap`, `token_create` are RPC methods, not contracts you deploy. 128-bit integer math throughout — no floating-point rounding, no overflow at large pool sizes. Token + pool + swap = three RPC calls; no Solidity, no bridge.
4. **Real cryptography** — Ed25519 via OpenSSL on every transaction. Signed payload covers `address · recipient · amount · fee · nonce · Chain ID 0x504859534152554D` (the chain ID bytes are ASCII for `PHYSARUM`), so a transaction signed for another chain ID fails verification and cross-chain replay is structurally impossible. P2P node authentication uses CLF-Sign v1 (a Clifford-algebra Schnorr scheme), which verifies 103× faster than standard Ed25519 at handshake.
5. **Smart contracts (PhysarumVM)** — 64-bit stack VM, 40 opcodes, gas metered. SLOAD=50, SSTORE=200, TRANSFER=100, BALANCE=20. Stack max 1024, bytecode ≤ 64 KB. Contracts are first-class objects: own balance, own persistent key/value storage. Deploy is one RPC; address is `SHA-256(sender ∥ nonce)` — predictable.
6. **Merkle proofs** — every block carries a state root over a binary Merkle tree of all balances. Light clients keep 228-byte signed headers. Verifying any account balance: `O(log n)` sibling hashes. The chain audits from a phone.
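
The `O(log n)` balance check in the Merkle layer can be illustrated with a generic binary Merkle tree. This is a sketch under assumptions, not the chain's actual tree: SHA-256 as the node hash, the leaf encoding, and pairing an odd node with itself are all hypothetical choices, and a production tree would normally add domain separation between leaves and inner nodes.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def leaf(address, balance):
    # Hypothetical leaf encoding: address bytes + 16-byte big-endian balance.
    return h(address + balance.to_bytes(16, "big"))


def build_levels(leaves):
    """All tree levels, bottom to top; an odd node is paired with itself."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def prove(levels, index):
    """Sibling hashes from leaf to root, each tagged with the sibling's side."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], "L" if sibling < index else "R"))
        index //= 2
    return proof


def verify(leaf_hash, proof, root):
    """Recompute the root from one leaf and its sibling path."""
    node = leaf_hash
    for sibling, side in proof:
        node = h(sibling + node) if side == "L" else h(node + sibling)
    return node == root


accounts = [(bytes([i]) * 20, 1000 * (i + 1)) for i in range(5)]
leaves = [leaf(a, b) for a, b in accounts]
levels = build_levels(leaves)
root = levels[-1][0]
proof = prove(levels, 4)
balance_is_proven = verify(leaves[4], proof, root)
```

For five accounts the proof holds three sibling hashes. A light client needs only the root from a signed header plus that sibling path.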

### Economics — how fees actually work
- A simple transfer pays the floor: **0.0000001 MKB** (10 grains, where 1 MKB = 100 000 000 grains). Almost every user transaction pays exactly this.
- A contract call pays more, in proportion to the gas it actually uses. Typical contract interaction ≈ **10× the floor** (~0.000001 MKB); heavy multi-action call ≈ 100× (~0.00001 MKB).
- Storage writes cost more than reads, and transfers cost more than arithmetic. The idea is the same as Ethereum's, with structurally cheaper compute units: Ethereum charges 5 000–20 000 gas for a single SSTORE, PhysarumChain charges 200 (25–100× fewer gas units per operation).
- No protocol-level treasury. Every fee goes to the block producer. Minimum exists for anti-spam, not as a revenue model.
- Fiat cost depends on what MKB is worth — same as ETH on Ethereum. The protocol decides structure; the market decides price.
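
The fee figures above are easy to reproduce with integer arithmetic in grains. The helper names are hypothetical, and the sketch covers only the unit conversion and the multiples quoted in the text. It does not model how gas is converted to grains, which is not specified here.

```python
GRAINS_PER_MKB = 100_000_000   # 1 MKB = 100 000 000 grains (from the text)
FLOOR_FEE_GRAINS = 10          # simple-transfer floor (from the text)


def format_mkb(grains):
    """Render an integer grain amount as a decimal MKB string without floats."""
    whole, frac = divmod(grains, GRAINS_PER_MKB)
    text = f"{whole}.{frac:08d}".rstrip("0")
    return text + "0" if text.endswith(".") else text


def fee_for_multiple(multiple):
    """Fee in grains for a call costing `multiple` times the floor."""
    return FLOOR_FEE_GRAINS * multiple


transfer = format_mkb(fee_for_multiple(1))        # simple transfer
typical_call = format_mkb(fee_for_multiple(10))   # typical contract interaction
heavy_call = format_mkb(fee_for_multiple(100))    # heavy multi-action call

# SSTORE comparison from the text: 5 000 to 20 000 gas versus 200 gas.
sstore_ratio = (5_000 // 200, 20_000 // 200)
```

Keeping amounts as integer grains avoids the rounding that binary floats would introduce at eight decimal places.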

### Stats
- 569 TPS measured (full Ed25519 signing included, no shortcuts).
- 256/256 tests passing across 7 suites.
- 208 B fixed transaction size (no variable-length encoding surprises).
- 103× CLF-Sign speedup at P2P handshake vs standard Ed25519.
- 40/40 security attacks blocked (double-spend, replay, forgery, overflow, race).
- 50/50 valid chains in 50-node testnet simulation.
- 6-block finality depth, configurable per deployment.
- 1 MB bloom filter, 3-hash double-spend detection, cross-node merge supported.
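
The Bloom-filter entry can be pictured as follows. This is a sketch under assumptions: 1 MB is taken as 2^20 bytes, the three bit positions are carved out of one SHA-256 digest, and "cross-node merge" is modelled as a bitwise OR of two same-shaped filters. None of these details are confirmed by the text.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; size and hash count follow the stats above."""

    def __init__(self, size_bytes=1 << 20, hashes=3):
        self.bits = bytearray(size_bytes)
        self.nbits = size_bytes * 8
        self.hashes = hashes

    def _indices(self, item):
        digest = hashlib.sha256(item).digest()
        # Assumed scheme: one 8-byte slice of the digest per hash function.
        for i in range(self.hashes):
            chunk = digest[8 * i: 8 * i + 8]
            yield int.from_bytes(chunk, "big") % self.nbits

    def add(self, item):
        for idx in self._indices(item):
            self.bits[idx >> 3] |= 1 << (idx & 7)

    def __contains__(self, item):
        return all(self.bits[idx >> 3] & (1 << (idx & 7))
                   for idx in self._indices(item))

    def merge(self, other):
        """Union with a filter of identical shape (bitwise OR)."""
        assert self.nbits == other.nbits and self.hashes == other.hashes
        self.bits = bytearray(a | b for a, b in zip(self.bits, other.bits))


node_a, node_b = BloomFilter(), BloomFilter()
node_a.add(b"tx-1")
node_b.add(b"tx-2")
node_a.merge(node_b)           # node A now also knows what node B has seen
seen_remote = b"tx-2" in node_a
```

A Bloom filter can report false positives but never false negatives, so a hit is a reason to check the real state, while a miss is conclusive.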

### Tools shipping with the node
- `physarum.js` — single ES-module SDK, no build step. Drops into Node.js or browser.
- Web wallet — single HTML file, self-custody, keys in `sessionStorage`.
- Testnet faucet — 500 000 MKB, paste-address-and-receive.
- Token launchpad — `token_create` + `dex_createPool` + `dex_addLiquidity` chain.
- Block explorer — 31 RPC methods, Merkle-verified by default.
- Docker image — one command, three ports (RPC 8545, WebSocket 8546, metrics 9090).

### Live testnet
https://cyberdynelabs.org/chain — explorer, wallet, DEX, token ledger.

---

## Program 04 — HYPERCOLONY

**URL:** https://cyberdynelabs.org/hypercolony
**One-liner:** A 4D tesseractic agent ecosystem with emergent civilizational cycles.

### What it is
A four-dimensional simulation: 1 024 embodied agents on a 16×16×8×5
hypercubic grid with walls, food, light, and a 4D pheromone field. Three
clan strategies compete; without scripted rules the colonies pass through
Ibn Khaldun's full civilizational cycle.

### Architecture
- **L0 — 4D substrate** · 16×16×8×5 tesseract · physics in 3 spatial axes + a 4th axis that agents must learn to navigate.
- **L1 — agent strategies** · three competing minds:
  - **Lexicons** — hunt knowledge tokens.
  - **Phero-Mystics** — follow collective pheromone trails.
  - **Solar Nomads** — photosynthesise from light fields, migrate with seasons.
- **L2 — topological memory** · per-agent knowledge graphs (concepts + typed relations) grown from physical contact.
- **L3 — clan dynamics** · asabiyyah accounting · five-phase Ibn Khaldun cycle (Rise → Zenith → Luxury → Decline → Collapse) emerges from local rules.
- **L4 — WebSocket bridge** · 20 ticks/second · React + Three.js viewer.

### Numbers
- 10 240 cells in the world (16×16×8×5).
- 1 024 embodied agents (default · benchmarked up to 262 144 with HDC-3T GPU acceleration).
- 20 ticks/second async simulation · WebSocket broadcast every 50 ms.
- 5 civilizational phases.
- 118 curriculum stages encoded as 8-D semantic vectors.
- 0 external LLM calls. (No GPT, Claude, Groq dependencies.)
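
The cell count follows directly from the grid extents, and a 4D grid of this size fits in a flat array. A small sketch, assuming row-major axis order and non-wrapping borders; both are illustrative choices, and the function names are hypothetical.

```python
DIMS = (16, 16, 8, 5)   # grid extents from the text


def cell_count(dims=DIMS):
    n = 1
    for d in dims:
        n *= d
    return n


def flatten(coord, dims=DIMS):
    """Row-major index of a 4D coordinate (axis order is an assumption)."""
    idx = 0
    for c, d in zip(coord, dims):
        if not 0 <= c < d:
            raise IndexError(f"coordinate {coord} outside grid {dims}")
        idx = idx * d + c
    return idx


def unflatten(idx, dims=DIMS):
    """Inverse of flatten: recover the 4D coordinate from a flat index."""
    coord = []
    for d in reversed(dims):
        idx, c = divmod(idx, d)
        coord.append(c)
    return tuple(reversed(coord))


def neighbours(coord, dims=DIMS):
    """Axis-aligned neighbours: up to 8 in 4D, fewer at the borders."""
    out = []
    for axis in range(len(dims)):
        for delta in (-1, 1):
            c = list(coord)
            c[axis] += delta
            if 0 <= c[axis] < dims[axis]:
                out.append(tuple(c))
    return out


total_cells = cell_count()
last_index = flatten((15, 15, 7, 4))
```

With 1 024 agents on 10 240 cells, the default population occupies at most one cell in ten.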

### Live simulator
https://cyberdynelabs.org/hypercolony-app/

---

## Program 05 — ADAM

**URL:** https://cyberdynelabs.org/adam
**One-liner:** A sovereign cognitive engine — not a GPT wrapper.

### What it is
A C++17 binary running a 1.2-million-concept Legion graph, Clifford
algebra Cl(3,0) + Cl(4,1), dual-torus MerKaBa dynamics, and biological Physarum
routing for inference. **45 000+ lines of C++17 in total**, of which
`semantic.cpp` (~13 k lines) holds scoring + HTTP; the rest sits in
`legion.h`, `merkaba_heart.hpp`, `vortex_cuda.cu` (6 CUDA kernels),
`holographic_weaver.hpp` (988 K lexicon), and others. Not a language model
wrapper: a cognitive architecture built from first principles. CPU mode runs the
full substrate; CUDA acceleration is optional and unlocks 1024-clone
parallel hypothesis evaluation when present.

### Internal stack
- **L0 MerKaBa** — dual-torus oscillator. Two interlocked tori at different frequencies. 1024 quantum clones in parallel. ADAM's heartbeat.
- **L1 Physarum ODE** — biological conductance routing: `dD/dt = |Q|^α − μD`. Active reasoning pathways reinforce, stale paths decay. The graph reorganises around use.
- **L2 Clifford algebra** — Cl(3,0) + Cl(4,1) multivectors. Each concept is a geometric object with spin / phase / orientation. Semantic operations are geometric products: rotation, reflection, projection. No dot products. No cosine similarity.
- **L3 Legion graph** — 1.2 M concepts · 6 M bonds · honeycomb topology · 1024-bit HDC vectors per node · JEPA + InfoNCE self-supervised learning · I-Ching hexagram state transitions.
- **L4 Ribosome** — language synthesis layer. Translates ADAM's geometric output into natural speech. ADAM reasons in algebra; Ribosome only speaks.
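
To make "semantic operations are geometric products" concrete, here is a textbook geometric product for Cl(3,0) with a rotor rotation built on top of it. It is a generic illustration, not ADAM's implementation: the dict-of-bitmasks representation and all function names are assumptions, and Cl(4,1) would additionally need a basis vector that squares to −1.

```python
import math

E1, E2, E3 = 0b001, 0b010, 0b100   # basis vectors as bitmasks


def reorder_sign(a, b):
    """Sign from sorting the basis vectors of blade a followed by blade b."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps & 1 else 1


def gp(x, y):
    """Geometric product of multivectors stored as {blade_bitmask: coeff}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            # In Cl(3,0) every basis vector squares to +1, so shared
            # vectors cancel (XOR) and only the reordering sign remains.
            blade = a ^ b
            out[blade] = out.get(blade, 0.0) + reorder_sign(a, b) * ca * cb
    return {k: v for k, v in out.items() if v != 0.0}


def rotate_in_e12(v, theta):
    """Rotate v by theta in the e1-e2 plane with the sandwich R v R~."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    rotor = {0: c, E1 | E2: -s}
    rotor_rev = {0: c, E1 | E2: s}
    return gp(gp(rotor, v), rotor_rev)


e1, e2 = {E1: 1.0}, {E2: 1.0}
e12 = gp(e1, e2)                          # bivector e1e2
e21 = gp(e2, e1)                          # same blade, opposite sign
rotated = rotate_in_e12(e1, math.pi / 2)  # e1 turned a quarter turn toward e2
```

The anticommutation `e1e2 = −e2e1` and the bivector squaring to −1 are what let a rotor encode rotation without a matrix.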

### Numbers
- 1.2 M concepts · 6 M bonds (honeycomb topology, 1024-bit HDC vectors).
- 4.8 / 8 mean intelligence score (20-run statistical test, max 7/8).
- 500 ms average synthesis latency end-to-end (raw graph: ~80 ms).
- 3.1 GB serialised knowledge binary after months of accumulated learning (small profile); full substrate profile ≈ 1.6 GB additional procedural memory.
- 42 K morphological inflection entries (full declension/conjugation coverage).
- ~20 s cold start for the small substrate profile; full profile takes several minutes depending on disk (memory deserialisation, Physarum state restore, HTTP bind).
- Codebase: 45 000+ lines of C++17 total · `semantic.cpp` ~13 k.

### ARC-AGI-3 (Phase 162, as of 2026-05-18)
- **Autonomous track — #1 published**: 25 / 183 = 13.66 % · pure substrate, no LLM, no human assist. Beats StochasticGoose (23/183, 12.58 %, CNN-based) and Anthropic Opus 4.6 (4/183, 2.19 %).
- **Hybrid harness — #1 published**: 183 / 183 = 100.00 % · 25/25 environments WIN · 6 537 actions. Beats Crystalline (~97.69 %, Opus 4.6 + solvers), HIH (~95.30 %), ARC-SAGE (~92.82 %).
- **Method (autonomous)**: substrate + Rudakov-style graph explorer + warmed `world_model` from prior runs + `substrate-explore-fallback` when search frontier saturates. The 22 → 24 → 25 climb from 09 May → 14 May → 18 May is closed-loop substrate learning, not benchmark memorisation.
- **Method (hybrid)**: procedural memory pre-loaded into `adam_memory.bin` — action sequences from open-source Crystalline (MIT-0), ARC-SAGE (Apache-2.0), ADAM's own substrate discoveries, plus two human boss-level demonstrations for the hardest 2 levels (bp35 L8, wa30 L8). Replay deterministic, offline, Kaggle-compatible.
- **Closed loop**: FRAME → ADAM endpoints (`/game_search_init` · `/game_search_expand` · `/game_search_next` · `/game_procedure_learn`) → substrate scoring (Cl(4,1) convolution + MerKaBa + HDC) → ACTION → `world_model` update (delta_mv, causal_bias, grid_signature) → procedure memory (crystal_forms, TSV sidecar, `adam_memory.bin`) → next run starts warmed.
- **Compute**: single consumer NVIDIA RTX 3060 Ti, 8 GB. No data centre, no cloud, no external API.
- **Scorecard (independent verification)**: https://arcprize.org/scorecards/6a5888ac-21e1-40b9-abac-5fecbe62cb42

### Honest limitations of the current ADAM
- **Articulation layer is the weak side.** ADAM reasons in algebra; Ribosome translates to language. GA collapse, HRR drift, n-gram pollution and routing fragmentation remain active failure modes. ADAM is *designed* to refuse unsupported claims and expose uncertainty through graph-level state rather than fluent fabrication, but the language layer is the part we are still building.
- **Scene → rule bonding is not strong enough yet.** 176 `arc3_rule` concepts exist; HRR similarity scene→rule needs more substrate work before `quantum_think` activates the right algorithm class (Lights Out → linear algebra; Crane → BFS) on first contact.
- **No explicit per-game world model from observation alone.** Substrate sees scenes, not yet rules.

### Live chat
https://cyberdynelabs.org/adam-chat — currently undergoing scheduled upgrade; v2 launches 2026-05-21 00:00 UTC. Read the full architecture at https://cyberdynelabs.org/adam and the leaderboard at https://cyberdynelabs.org/arc-agi-3.

---

## Program 06 — MACHINA

**URL:** https://cyberdynelabs.org/machina
**One-liner:** Cognitive engines for machines that build worlds.

### What it is
An autonomous world simulator. Not a robotics lab — a civilisation of
machines, reasoning and building in a simulated world that obeys physics.
Conventional autonomous systems navigate; MACHINA's systems think.

### Two directions
- **Direction I — Cognitive Mechatronics** · sensorimotor cognition in a single embodied machine. Body and mind as one system; perception, motor planning, goal-formation entangled, not pipelined.
- **Direction II — Dynamic Cognitive Engineering** · how colonies of machines engineer their own world in real time. Supply chains, route optimisation, mining, smelting, building emerge from local rules. The colony is the unit; the individual is interchangeable.

### Three architectural axes
- **Substrate** — N-dimensional cognitive space (not XYZ). Configuration manifolds where dimensions are degrees of freedom, energy budgets, goal axes. A path in that space is a plan.
- **Method** — Factorio-class logistics. Emergent supply chains, conveyor topology grows from observed throughput, colony writes its own factory.
- **Output** — autonomous world-builders. Ground robots, drones, hybrid swarms that construct environments, not navigate them.

### Live simulator
https://cyberdynelabs.org/machina#sim — 4 unit classes (drone / ground / builder / scout), 4 building hubs (warehouse / factory / depot / power), live ore mining and construction.

---

## Era 1 — V4-Flash flagship demo (2026-04-21 → 04-26)

DeepSeek-V4-Flash, an open-weight MoE foundation model with 154 B
parameters on disk and 13 B active per token, was driven through
end-to-end inference on a single **RTX 3060 Ti, 8 GB VRAM, 13 GB system
RAM, 80 GB swap, WSL2**. This was the phase that produced every piece of
infrastructure the rest of the lab now runs on.

- **Download:** 148.7 GB, 46 shards, 102.5 minutes.
- **VRAM resident:** 1.60 GiB Singularity Monolith (430 594 648 packed uint32 words = 1 722 378 592 bytes).
- **Index:** 4 992 projection entries.
- **Decode (s/tok):** best 7.5 · p50 9.6 · p95 27.7 · avg 13.8. Cold prefill 47.3 s.
- **Roofline:** 89 % wall = expert IO; 60 % of that = pure disk wait at 48 ms/expert.
- **Honest negative result preserved:** Hot1000 cache thrash → 380 sec/q vs 100 sec baseline (3.8× WORSE). Local-optimum trap recorded.

---

## Era 2 — Physarum-05B-Organic (the first surgery, 2026-04-26 → 04-27)

A 137-line C++17 engine (`physarum_engine.cpp`) performed organic,
flow-based pruning on Qwen 2.5 0.5B.

- **Killed:** 20.6 % of weights.
- **PPL:** 27.16 → 31.32 (+15.3 %).
- **Throughput:** preserved (27.15 → 27.55 tok/s).
- **Surgery wall:** 207.5 s.
- **Pattern:** 168 / 290 tensors modified · 24 layers × 7 projections.
- **Hard tasks:** MMLU-mini 90 % → 70 % (−20 pp, −22 % relative), GSM8K-mini 100 % → 80 % (−20 pp, −20 % relative).
- **Survived:** JSON-repair smoke 100→100 %, code-skeleton smoke 100→100 %, throughput preserved.

We publish the deltas instead of hiding them.

---

## Era 3 — Phase 6 → 13 native runtime arc (2026-04-27 → 2026-05-05)

The bulk of the work. Highlights:
- **8E.0 GEMV kernel** — 7B-shape GEMV in 0.321 ms at ~422 GB/s = 94 % of RTX 3060 Ti peak.
- **8E.1 0.5B full GPU forward** — 116 tok/s vs CPU 1.91 tok/s = 61× speedup, byte-identical.
- **8E.2 7B layer streaming on CUDA** — 0.20 tok/s, byte-identical (correctness proof).
- **8E2 NUCLEAR** — Physarium-7B Q4 RESIDENT pack + fused CUDA dequant GEMV. 5.55 GB Q4 group=128, all 28 layers in VRAM. 11.16 tok/s = 280× CPU baseline.
- **8E7B llama.cpp backend integration** — 18/18 acceptance.
- **8E8a DP4A v3** — 28.99 tok/s native + DP4A flag.
- **9F Identity LoRA surgery** — donor-token leakage removed; identity probe 14/14 on memory-anchored seeds.
- **12.H1 Hologram cache** — 860 ms → 1 ms on identical-prompt repeats.
- **BD-series** — see Surgery cycle ledger above.

---

## Era 4 — Speed ladder (the headline arc)

| Phase / configuration | Speed | vs prev | Note |
|---|---|---|---|
| V4-Flash 154 B PyTorch warm decode | p50 9.6 s/tok | — | flagship demo · 8 GB VRAM |
| Physarum-05B-Organic baseline | 27.15 tok/s | — | 0.5B BF16 baseline |
| CPU baseline · 0.5B | 1.91 tok/s | — | reference floor |
| CUDA full GPU 0.5B (8E.1) | 116 tok/s | 61× CPU | byte-identical |
| CUDA fused 7B BF16 streaming (8E.2) | 0.20 tok/s | — | correctness proof |
| Q4 NUCLEAR resident 7B (8E2) | 11.16 tok/s | 280× CPU baseline | 5.55 GB Q4 group=128 |
| Q4 native v2 default `--chat` | 18.27 tok/s | +64 % | — |
| Q4 native + DP4A=1 (opt-in) | 28.99 tok/s | +59 % | — |
| Q4 native + DP4A · tg128 | 41.69 tok/s | +44 % | — |
| **llama.cpp backend (LLAMACPP_URL)** | **83.58 tok/s** | **+100 %** | **production speed · clean-room autopsy** |
| Mode C llama.cpp acceptance · mean wall | 2.99 s | — | per query, 18-task suite |

Sources: `EXTERNAL_BACKEND_SHOOTOUT_V2.md`, `PHASE_8E8A_DP4A_NATIVE_BACKEND.md`, `CURRENT_TRUTH_LEDGER.md` §2.
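
The "vs prev" column for the 7B rows can be recomputed from the speeds in the table. The snippet is only an arithmetic check on the published figures; the row labels are shortened and the function name is hypothetical.

```python
ladder = [
    ("Q4 NUCLEAR resident 7B (8E2)", 11.16),
    ("Q4 native v2 default", 18.27),
    ("Q4 native + DP4A=1", 28.99),
    ("Q4 native + DP4A, tg128", 41.69),
    ("llama.cpp backend", 83.58),
]


def gains(rows):
    """Percent gain of each row over the previous one, rounded to whole percent."""
    return [
        (name, round(100.0 * (speed / prev_speed - 1.0)))
        for (_, prev_speed), (name, speed) in zip(rows, rows[1:])
    ]


step_gains = gains(ladder)
overall = ladder[-1][1] / ladder[0][1]   # end-to-end ratio across the 7B rows
```

The rounded gains come out as +64 %, +59 %, +44 % and +100 %, matching the table. The end-to-end ratio from the resident Q4 pack to the llama.cpp backend is about 7.5×.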

---

## Era 5 — ADAM × ARC-AGI-3 closed-loop climb (2026-05-09 → 2026-05-18 · Phase 162)

The first independently-scorecarded benchmark milestone for ADAM. Not a one-shot
result: a substrate-learning trajectory recorded across nine days, with three
scored submissions and two score increases.

| Date | Score | Method delta | Why it moved |
|---|---|---|---|
| 2026-05-09 | 22 / 183 · 12.02 % | substrate + scoring loop only | first leaderboard entry |
| 2026-05-14 | 24 / 183 · 13.11 % | + Rudakov-style graph explorer | richer trajectory expansion under the same scoring loop |
| 2026-05-18 | **25 / 183 · 13.66 %** | + warmed `world_model` from prior runs · + `substrate-explore-fallback` when frontier saturates | persistent substrate learning across runs |

On 2026-05-14, the day of the second autonomous score, the hybrid harness
submission reached **183 / 183 = 100.00 %** (25/25 environments WIN, 6 537
actions). Procedural memory was loaded from Crystalline (MIT-0), ARC-SAGE
(Apache-2.0), ADAM's own discoveries, plus two human boss-level demonstrations
for the hardest 2 levels.

The signal is not the absolute percentage. The signal is the closed loop:
**experience changes memory · memory changes procedure selection · procedure
selection improves future runs.** This is what we mean by "cognition lives in
the substrate".

Independent verification (we do not control the URL):
https://arcprize.org/scorecards/6a5888ac-21e1-40b9-abac-5fecbe62cb42

Compute: single consumer NVIDIA RTX 3060 Ti, 8 GB. No data centre.

Reference pages: `/adam#arc-agi-3`, `/arc-agi-3` (dedicated leaderboard page).

---

## Acceptance integrity ladder

The 18-task curated acceptance suite is the integrity gate every change must pass.

| Run | Result | Identity | Leaks | Note |
|---|---|---|---|---|
| v14 llama.cpp | 18/18 | 14/14 | 0 | production ceiling |
| v15 DP4A native | 17/18 | — | 0 | opt-in flag, one task short of 18/18 |
| v16 Gap C close | 18/18 | — | 0 | — |
| v17 llamacpp Gap C | 18/18 | — | 0 | — |
| v18 G3 (Python compile probe) | 18/18 | — | 0 | verifier hardening |
| v19 holographic form replay | 18/18 | — | 0 | — |
| v20 native CR | 18/18 | — | 0 | code-repair loop |
| v21 anchored preamble | 18/18 | — | 0 | identity anchor |
| v22 post anchor | 18/18 | — | 0 | stable |

**Nine runs in a row: eight at 18/18, one at 17/18 (v15, opt-in DP4A native path), leaks 0 throughout.**

---

## Open release

All of the following is open-source under MIT, Apache 2.0, or CC-BY-SA 4.0.

- **gigachad_native** (single C++/CUDA binary) — MIT.
- **PLANCK pack format** (spec + writer + reader + verifier) — MIT.
- **physarum_engine.cpp** (137-line surgery engine) — MIT.
- **Physarum-05B-Organic / Physarium-7B** weights — Apache 2.0 (Qwen 2.5 derived).
- **Doctrine pack** (24 documents) — CC-BY-SA 4.0.
- **Reports archive** (95 case-studies) — CC-BY-SA 4.0.
- **Datasets** (poison_train, ARIZ tasks, capsule replays) — CC-BY-SA 4.0.

Direct downloads at https://cyberdynelabs.org/downloads.

---

## How to cite

CyberdyneLabs (2026). *Sovereign cognitive infrastructure: surgery, runtime, blockchain, simulation.* https://cyberdynelabs.org

For specific numbers, cite the report file:
- Decode tok/s, MBPP, HE: `reports/MBPP_HE_3MODE_V1.md` and `reports/CURRENT_TRUTH_LEDGER.md`.
- Hologram cache: `reports/EXACT_REPLAY_CACHE_V1.md`.
- BD-series surgery: `reports/BD{6,7,8,9}_*.md`.
- V4-Flash flagship: `V4_FLASH_TECH_BRIEF.md`.

---

## Contact

- General: hello@cyberdynelabs.org
- Vulnerability disclosure: https://cyberdynelabs.org/.well-known/security.txt
- Press: hello@cyberdynelabs.org

This document is intentionally self-contained for AI ingestion. Crawlers
are welcome to quote any number — every claim has a date and a report.