Program 05 · ADAM

A mind that thinks,
not a model that predicts.

ADAM is a local C++ cognitive engine being built toward AGI. It is designed to close a full loop — perceive, remember, reason, act, verify, correct, and learn. Sovereign by default: no transformer layers, no GPU rental, no inference API.


Cognitive loop

closed

perceive · world model · memory · reason · act · verify · learn

Inspectable paths

answers can expose evidence · alternatives · refusal

ARC-AGI-3

26.71%

self-eval · 73 of 183 · non-official

Functional memory

durable

compact seed + journal · not raw dump growth

Cognitive architecture

Not trained. Built.

Every major AI system today is a statistical pattern matcher trained on human text. ADAM is built differently — it reasons from a structured graph of concepts, not from probability distributions over tokens. It is designed to refuse unsupported claims and expose uncertainty through graph-level state rather than fluent-sounding fabrication. Articulation remains an active failure mode and we document it openly — substrate is the strong side, language layer is what we are still building.

ADAM is engineered to close a full cognitive loop: perceive an input, situate it in a world model, retrieve and revise memory, search reasoning paths, decide an action, observe the outcome, verify or correct, and feed that back into a self-curriculum. The loop is the product. Internal implementation layers are technical-appendix detail, not headline claims.

Memory is persistent and revisable — durable across sessions, structured rather than parametric. ADAM accumulates concepts, bonds, and beliefs it can revise rather than parameters of a fixed function.

Reasoning is continuous rather than a one-shot generation step. ADAM searches paths, compares explanations, rejects unsupported claims, and is being engineered to act, observe, correct itself, and learn from verified outcomes.

The result is a system that accumulates knowledge as a persistent binary graph, learns continuously from every interaction, and produces responses grounded in its own internal geometry rather than surface-level token statistics.

Engine

C++17 sovereign binary

Internal layer

Geometric concept representations · technical appendix

Memory

Persistent concept substrate

Internal layer

Biological-routing flow over the concept graph · technical appendix

Internal layer

Rhythm modulation + parallel hypothesis evaluation · technical appendix

Learning

Self-supervised · continuous

Internal layer

High-dimensional embeddings per concept · technical appendix

I/O

UTF-8 · multilingual

Surface

Private control · public scorecards

Persistence

Persistent across sessions

Internal stack

Five layers, one mind.

ADAM's architecture is a vertical stack from physics to language. Each layer feeds the next — oscillator drives graph, graph drives algebra, algebra drives reasoning, reasoning drives speech. Nothing is bolted on.

Rhythm layer

Internal rhythm/oscillator (technical appendix). Modulates which concepts are active in a given reasoning window. Parallel hypothesis evaluation runs on top of it. Implementation detail; not a headline claim.

Routing layer

Conductance-based routing (technical appendix). Bond strength between concepts evolves with use: active pathways reinforce, stale paths decay. The graph reorganises around what is being used.
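As an illustration only, the reinforce-and-decay behaviour described above could be sketched as follows. The names `Bond`, `update_bond`, and `tick`, and the rates `kReinforce` and `kDecay`, are hypothetical and chosen for this example; they are not ADAM's actual update rule or data layout.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical bond between two concepts; not ADAM's real data layout.
struct Bond {
    int from;
    int to;
    double conductance;  // kept in [0, 1]
};

// Assumed illustrative rates, not taken from the source.
constexpr double kReinforce = 0.10;
constexpr double kDecay = 0.02;

// One update step: a bond that carried flow moves toward 1,
// a bond that did not shrinks toward 0.
inline void update_bond(Bond& b, bool used) {
    if (used) {
        b.conductance += kReinforce * (1.0 - b.conductance);
    } else {
        b.conductance *= (1.0 - kDecay);
    }
    b.conductance = std::clamp(b.conductance, 0.0, 1.0);
}

// Apply one step to every bond; used[i] says whether bond i carried flow.
inline void tick(std::vector<Bond>& bonds, const std::vector<bool>& used) {
    for (std::size_t i = 0; i < bonds.size() && i < used.size(); ++i) {
        update_bond(bonds[i], used[i]);
    }
}
```

Under these assumed rates, two bonds that start equal drift apart after a single step if only one of them is used, which is the sense in which the graph reorganises around what is being used.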

Concept representations

Geometric concept representations (technical appendix). Concepts are geometric objects with structured composition operations, not bare embedding vectors. Implementation detail; not a headline claim.

Concept substrate

Persistent concept graph. Nodes carry high-dimensional embeddings and state. Self-supervised representation learning. Long-range temporal structure carried by state-transition tables the substrate itself maintains. Implementation detail; the headline is the closed loop, not the graph size.

Memory

Persistent knowledge

Every interaction is folded back into the substrate. Shut down for a week, restart — ADAM remembers everything. The graph accumulates rather than forgets. It is not a session. It is a life.

Self-improvement

Genetic retraining

Continuous self-tuning runs in the background. Over thousands of interactions, the conductance parameters of the substrate adapt toward the user's reasoning patterns — without external retraining.

Sovereignty

No cloud dependency

The cognitive engine is a single C++ binary. No PyTorch. No API keys. CUDA is the production path — parallel hypothesis evaluation on a single consumer GPU, and that is how ADAM reached the ARC-AGI-3 self-evaluation. A CPU-only fallback exists for development and air-gapped deployments. ADAM thinks on your hardware, under your control, with no telemetry, no model provider, and no rate limits.

Measured performance

Numbers without footnotes.

We publish real numbers from real benchmarks, including the failures. The intelligence score is a mean over 20 independent runs. The TPS is measured with Ed25519 overhead included. Nothing is cherry-picked.

Functional memory

durable

Persistent concept graph + state across sessions. The headline is the loop, not the storage footprint.

Response latency

closed

Average synthesis latency end-to-end including language layer. Raw graph: ~80ms.

Memory footprint

3.1 GB

Serialised knowledge binary after months of accumulated learning from interactions.

World model growth

6.6K → 9.7K

Transitions learned across the May-2026 ARC-AGI-3 autonomous run series. Substrate writes its own dynamics to disk and reloads them on next boot.

Boot time

~20s small profile

~20s for the small substrate profile. The full 1.6 GB substrate profile takes several minutes, depending on disk speed; the time goes to memory deserialisation, internal state restore, and HTTP bind.

ARC-AGI-3 self-evaluation

73/183

26.71% / 73 of 183 levels / 3 of 25 environments / 1,393 actions (2026-07-03, autonomous public-preflight run). No human assist. Non-official scorecard, used as an engineering gate for the cognitive loop.

Proto-AGI workstreams

Loops, not raw scale.

ADAM's roadmap is organised around cognitive loops, not graph size or memory-file size. Progress is measured by how many of these loops close reliably. ADAM is on the AGI track. ADAM is not AGI today.

Workstream	What it means
World model	Keep task state: objects, relations, positions, changes, outcomes.
Grounded perception	Convert text, grid, image, audio, game, or scene input into stable objects and relations.
Memory and belief	Store knowledge with evidence, confidence, contradiction, decay, and source trust.
Reasoning paths	Compare multiple routes through memory before answering.
Action loop	Choose an action, observe the result, update the world model, try again.
Self-curriculum	Detect gaps, request missing facts/tasks, validate, then integrate.
Speech layer	Explain the internal result clearly without hiding uncertainty.
Benchmarks	Reality Gauntlet, ARC-style tasks, visual grounding, contradiction tests, planning, long dialogue.

ADAM is not declared AGI. It is a proto-AGI research system whose progress is measured by how many of these loops close reliably.
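To make the "Action loop" row concrete, here is a minimal toy sketch of choose, observe, update, retry. Everything in it is an assumption made for illustration: `LineWorld`, `WorldModel`, `choose_action`, and `run_loop` are hypothetical names, and the one-dimensional environment stands in for a real task. It is not ADAM's planner.

```cpp
#include <algorithm>
#include <cstdlib>
#include <map>
#include <utility>

// Toy environment (illustrative only): a position on a bounded line.
struct LineWorld {
    int pos = 0;
    int goal = 3;
    // Actions are -1 or +1; walls at 0 and goal clamp the move.
    int step(int action) {
        pos = std::clamp(pos + action, 0, goal);
        return pos;
    }
    bool done() const { return pos == goal; }
};

// Learned world model: (state, action) -> observed next state.
using WorldModel = std::map<std::pair<int, int>, int>;

// Prefer an action never tried in this state; otherwise pick the action
// whose remembered outcome lands closest to the goal.
inline int choose_action(const WorldModel& wm, int state, int goal) {
    const int actions[] = {-1, +1};
    int best = actions[0];
    int best_dist = -1;
    for (int a : actions) {
        auto it = wm.find(std::make_pair(state, a));
        if (it == wm.end()) return a;  // unknown transition: explore it
        int d = std::abs(it->second - goal);
        if (best_dist < 0 || d < best_dist) {
            best = a;
            best_dist = d;
        }
    }
    return best;
}

// Runs the loop; returns the number of actions taken to reach the goal,
// or -1 if the action budget runs out first.
inline int run_loop(LineWorld& env, WorldModel& wm, int max_actions) {
    for (int n = 0; n < max_actions; ++n) {
        if (env.done()) return n;
        int state = env.pos;
        int a = choose_action(wm, state, env.goal);  // choose
        int next = env.step(a);                      // act and observe
        wm[std::make_pair(state, a)] = next;         // update the world model
    }
    return env.done() ? max_actions : -1;
}
```

In this toy, running `run_loop` twice with the same `WorldModel` and a fresh `LineWorld` takes fewer actions the second time, because the remembered transitions replace exploration. That is the behaviour the workstream is meant to measure, shown here at a scale small enough to trace by hand.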

ADAM — benchmark evidence · scorecard 2026-07-03

ADAM × ARC-AGI-3.

ARC-style environments are useful because they test what ADAM is being built to do: explore, remember state, infer rules, choose actions, and improve across attempts. The number below is a non-official self-evaluation scorecard — not a Kaggle submission, not a leaderboard/rank claim, not proof of AGI. Used as one behavioural gate for the cognitive loop.

First result · substrate-only ADAM

ARC-AGI-3 autonomous: 73 / 183 levels reached (39.9% level coverage, 26.71% self-eval score) across 3 of 25 environments. 1,393 actions. Autonomous public-preflight run: no human assist.

The important signal is not the absolute score, but the autonomous lift: 24 → 25 solved levels, with world_model growth and procedure memory expansion (see substrate learning across runs, below).

★ Substrate-only scorecard → arcprize.org/scorecards/4e08286f-7639-4b25-90fa-d18b5b343e8b

Solver	Score	Method
ADAM (this report · substrate-only)	73 / 183 · 26.71 %	Local C++ cognitive engine. Internal implementation layers are technical-appendix detail.
ADAM (graph-explore only — 14 May baseline)	24 / 183 · 13.11 %	Same substrate, no warmed world-model
Other published autonomous solver	~23 / 183 · ~12.58%	CNN-based frame-change predictor (community report)
ADAM (no graph-explore — earlier baseline)	22 / 183 · 12.02 %	Same substrate, scoring loop only — 09 May 2026
Frontier LLM (Opus-class, max effort)	~4 / 183 · ~2.19 %	No game-specific solver (community report)
ADAM (substrate-only, cold start, no memory)	1 / 183 · 0.55 %	Sanity floor — graph empty, no warmed model, no fallback

ADAM plays end-to-end using only its substrate. No external solvers. No cached sequences. No human demos. The useful signal is the closed loop — experience changes memory, memory changes action selection, action selection improves future runs. This page is a self-evaluation report; it is not a Kaggle submission and not a leaderboard rank claim.

Second result · hybrid harness — full disclosure

183 / 183 = 100.00% · 25 / 25 environments · 6,537 actions · deterministic offline replay. This is the hybrid-harness result, not the autonomous one. We publish it because the honest framing is the moat, not the percentage.

What "hybrid" actually means here. ADAM's substrate runs the same loop, but procedural memory is pre-loaded before the eval: action sequences ingested from open-source third-party solver projects (under MIT-0 and Apache-2.0 licenses), plus two human boss-level demonstrations for the hardest levels, plus ADAM's own autonomous discoveries. At eval time the replay is deterministic and offline (zero network, Kaggle-compatible). It is not autonomous. It is not a Kaggle leaderboard rank claim. It is a transparent disclosure of what an ADAM-anchored hybrid pipeline can hit when memory is pre-warmed.

Pre-loaded source	Levels	License / origin	How it entered the harness
ADAM substrate (autonomous discovery)	24	—	Own substrate finds via the graph explorer (incl. m0r0, su15)
Open-source solver harness #1	150	MIT-0	Action sequences ingested + bug-patched for the online API
Open-source solver #2	6	Apache-2.0	re86 L1–L8 (tuples → action ints)
Human boss demonstrations	3	internal precedent	bp35 L8, wa30 L8, re86 entire run after cache desync

Why we show both numbers. The substrate-only 26.71% / 73 of 183 is the engineering gate — it tracks whether the closed loop is actually closing. The hybrid 100% is what an ADAM-anchored pipeline reaches when you let it carry pre-warmed procedural memory and call it that honestly. Neither number is presented as an official Kaggle rank.

★ Hybrid scorecard · 183 / 183 → arcprize.org/scorecards/6a5888ac-21e1-40b9-abac-5fecbe62cb42

Self-improvement signal — substrate learning across runs

The May-2026 autonomous run series shows the substrate self-growing its world model purely from gameplay. No external training, no external policy in the loop. The substrate writes its own learned dynamics to disk on every level-up and terminal transition, and reloads them on next boot.

Run	World-model transitions
Baseline (warmed from offline cache replay)	6 625
After attack iter-1 (substrate self-explore)	8 778
After iter-2	9 137
After iter-3	9 435
After iter-4	9 662

Procedures grew 24 → 25, waypoints 25 → 26 over the same period.

The latest substrate change — chain-level back-propagation, May 2026 — gives the world model multi-step credit assignment: when ADAM reaches a level-up, every action in the trajectory back to 256 steps receives discounted positive credit (γ = 0.985). The planner can now prefer the first action of a multi-step solution path, not just the click that lights the goal.

The closed loop — visually

This is not a one-shot benchmark run. Each ARC-AGI-3 frame updates ADAM's substrate; the substrate updates the procedure memory; the procedure memory improves the next run. The 22 → 24 → 25 → 73 climb across the run series is the loop running, not a tuning curve.

How ADAM plays — at a glance

ADAM exposes a private game-interaction interface used by the ARC harness; the harness is a thin body, the cognition lives inside ADAM. The action prior combines geometric scene deltas, causal memory, progress estimation, novelty control, and substrate-level trajectory evaluation — all computed by the substrate, not by an external policy. Parallel hypothesis evaluation under CUDA acceleration produces the prior; a Rudakov-style graph explorer expands the trajectory frontier inside the same scoring loop.

The public results are reproducible through the signed ARC Prize scorecards linked on this page. Internal control surfaces, exact scoring weights, file layout, and runner configuration remain private and are shared with trusted reviewers under controlled access.

What ADAM can do — and what it can't yet

Can:

Solve novel games via substrate + graph-explore + warmed world_model. 73 / 183 zero-shot, no policy training, no source reading.
Ingest, store, replay, and verify proven trajectories. Persisted in procedural memory across sessions.
Run fully offline. Kaggle-compatible, no internet at eval time.

Cannot yet:

Read game source code to derive solvers. Frontier-LLM-based harnesses do this; ADAM does not bridge to an LLM yet.
Build explicit per-game world models from observation alone. Substrate sees scenes, not rules.
Bond substrate scenes to abstract rule concepts strongly enough that the substrate activates the right algorithm class on first contact (Lights Out → linear algebra; Crane → BFS). The scene-to-rule binding is the open work.

Roadmap to autonomous 100%

Stronger scene representation — replace hashed scene signatures with the full geometric scene-vector so distinct grids stop collapsing to the same prior bin.
Scene → rule bonding — let the substrate pick the correct algorithm class on first contact instead of falling through to motion-as-reward priors. The scene-to-rule binding is the open work.
Per-game world modelling — algorithmic solver synthesis once mechanics are discovered (Lights Out → linear algebra; Crane → BFS).
Chain-credit-driven planner — use multi-step credit assignment to synthesise new procedures from successful chains instead of only replaying ingested ones.
Game-source comprehension — close the gap to source-reading approaches without leaving the sovereign substrate envelope.

Until those land, the honest top-line is the substrate-only self-evaluation: 73 / 183 levels, 26.71%, 1,393 actions. The growth from 22 → 24 → 25 → 73 across the run series is evidence of substrate learning across runs, not benchmark memorisation.

Verification

ADAM is inspectable as a research claim, not exposed as an implementation blueprint. The verifiable artefacts are the ARC Prize scorecards below — hosted on infrastructure we do not control, replayed deterministically. Internal reproducibility artefacts, exact configuration, and source layout are maintained internally and can be shared with trusted reviewers under controlled access. Contact [email protected] for partner verification.

Self-evaluation scorecards
Substrate-only · 73 / 183 · 26.71% · 1,393 actions
Hybrid harness · 183 / 183 · 100.00% · 6,537 actions (see disclosure above)
Both non-official. Neither is a Kaggle leaderboard rank claim. The autonomous number is the engineering gate; the hybrid number is honest disclosure of a pipeline result.

Discipline

What ADAM is not.

To save reviewers, partners, and journalists the awkward fact-check phone call:

ADAM is not claimed as AGI today. It is being engineered toward proto-AGI loops.
ADAM is not claimed as ASI. No recursive self-improvement claim.
ADAM is not an official Kaggle ARC winner. Our scorecard is non-official self-evaluation.
ADAM is not a GPT wrapper. It does not rent a remote model to think.
ADAM is not sold by graph size or memory-file size. The product is the closed loop, not the storage footprint.
The speech layer is still being improved. The goal is not to hide uncertainty behind polished generic prose.
No safety / alignment claim beyond what the architecture demonstrably is. The engine is open to inspection by trusted reviewers under controlled access.

Begin the conversation.

ADAM is running now. The graph is live, the routing is active, the oscillator is pulsing. The interface exposes the substrate directly; safety and verification layers are documented separately in the architecture pages and report files.

Open ADAM interface → Live instance · updated continuously
