Program 05 · ADAM

A mind that thinks,
not a model that predicts.

ADAM is a local C++ cognitive engine being built toward AGI. It is designed to close a full loop — perceive, remember, reason, act, verify, correct, and learn. Sovereign by default: no transformer layers, no GPU rental, no inference API.

Cognitive loop
closed
perceive · world model · memory · reason · act · verify · learn
Inspectable paths
on
answers can expose evidence · alternatives · refusal
ARC-AGI-3
6.77%
self-eval · 25 of 183 · non-official
Functional memory
durable
compact seed + journal · not raw dump growth

Cognitive architecture

Not trained. Built.

Every major AI system today is a statistical pattern matcher trained on human text. ADAM is built differently — it reasons from a structured graph of concepts, not from probability distributions over tokens. It is designed to refuse unsupported claims and expose uncertainty through graph-level state rather than fluent-sounding fabrication. Articulation remains an active failure mode and we document it openly — substrate is the strong side, language layer is what we are still building.

ADAM is engineered to close a full cognitive loop: perceive an input, situate it in a world model, retrieve and revise memory, search reasoning paths, decide an action, observe the outcome, verify or correct, and feed that back into a self-curriculum. The loop is the product. Internal implementation layers are technical-appendix detail, not headline claims.
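The ordering of that loop can be illustrated with a minimal C++17 sketch. It assumes a toy one-dimensional world; `WorldModel`, `environment_step`, and `run_loop` are hypothetical names chosen for illustration and are not ADAM's interface. The sketch only shows the perceive, decide, act, observe, verify, correct, learn sequence, not how ADAM implements any step.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical learned world model: (state, action) -> observed next state.
using WorldModel = std::map<std::pair<int, int>, int>;

struct LoopResult {
    int steps = 0;         // actions taken
    int corrections = 0;   // stored predictions contradicted by observation
    bool reached = false;  // goal reached within the step budget
};

// Toy environment (an assumption for illustration): a corridor of cells 0..9.
int environment_step(int pos, int action) {
    int next = pos + action;
    if (next < 0) next = 0;
    if (next > 9) next = 9;
    return next;
}

LoopResult run_loop(int start, int goal, WorldModel& model, int max_steps) {
    assert(max_steps >= 0);
    LoopResult result;
    int pos = start;                                         // perceive
    while (result.steps < max_steps && pos != goal) {
        const int action = (goal > pos) ? +1 : -1;           // reason and decide
        const auto key = std::make_pair(pos, action);
        const auto known = model.find(key);                  // retrieve memory
        const int observed = environment_step(pos, action);  // act and observe
        if (known == model.end()) {
            model[key] = observed;                           // learn a new transition
        } else if (known->second != observed) {
            known->second = observed;                        // verify failed: correct memory
            ++result.corrections;
        }
        pos = observed;
        ++result.steps;
    }
    result.reached = (pos == goal);
    return result;
}
```

Calling `run_loop` twice with the same `WorldModel` shows the feedback path: the second run starts with the transitions the first run stored, and any stored transition that no longer matches observation is counted as a correction.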

Memory is persistent and revisable — durable across sessions, structured rather than parametric. ADAM accumulates concepts, bonds, and beliefs it can revise rather than parameters of a fixed function.

Reasoning is continuous rather than a one-shot generation step. ADAM searches paths, compares explanations, rejects unsupported claims, and is being engineered to act, observe, correct itself, and learn from verified outcomes.

The result is a system that accumulates knowledge as a persistent binary graph, learns continuously from every interaction, and produces responses grounded in its own internal geometry rather than surface-level token statistics.

Engine: C++17 sovereign binary
Internal layer: Geometric concept representations · technical appendix
Memory: Persistent concept substrate
Internal layer: Biological-routing flow over the concept graph · technical appendix
Internal layer: Rhythm modulation + parallel hypothesis evaluation · technical appendix
Learning: Self-supervised · continuous
Internal layer: High-dimensional embeddings per concept · technical appendix
I/O: UTF-8 · multilingual
Surface: Private control · public scorecards
Persistence: Persistent across sessions

Internal stack

Five layers, one mind.

ADAM's architecture is a vertical stack from physics to language. Each layer feeds the next — oscillator drives graph, graph drives algebra, algebra drives reasoning, reasoning drives speech. Nothing is bolted on.

L0
Rhythm layer
Internal rhythm/oscillator (technical appendix). Modulates which concepts are active in a given reasoning window. Parallel hypothesis evaluation runs on top of it. Implementation detail; not a headline claim.
L1
Routing layer
Conductance-based routing (technical appendix). Bond strength between concepts evolves with use: active pathways reinforce, stale paths decay. The graph reorganises around what is being used.
L2
Concept representations
Geometric concept representations (technical appendix). Concepts are geometric objects with structured composition operations, not bare embedding vectors. Implementation detail; not a headline claim.
L3
Concept substrate
Persistent concept graph. Nodes carry high-dimensional embeddings and state. Self-supervised representation learning. Long-range temporal structure carried by state-transition tables the substrate itself maintains. Implementation detail; the headline is the closed loop, not the graph size.
L4
Ribosome
Language synthesis layer. Translates ADAM's internal cognitive state into natural language. The speech layer is still being improved; the goal is not to hide uncertainty behind polished generic prose.
Memory
Persistent knowledge
Every interaction is folded back into the substrate. Shut down for a week, restart, and ADAM reloads the same graph and state from disk. The graph accumulates across sessions instead of resetting. It is not a session. It is a life.
Self-improvement
Genetic retraining
Continuous self-tuning runs in the background. Over thousands of interactions, the conductance parameters of the substrate adapt toward the user's reasoning patterns — without external retraining.
Sovereignty
No cloud dependency
The cognitive engine is a single C++ binary. No PyTorch. No API keys. CUDA is the production path — parallel hypothesis evaluation on a single consumer GPU, and that is how ADAM reached the ARC-AGI-3 self-evaluation. A CPU-only fallback exists for development and air-gapped deployments. ADAM thinks on your hardware, under your control, with no telemetry, no model provider, and no rate limits.
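The routing layer's behaviour described above (active pathways reinforce, stale paths decay) can be sketched in a few lines of C++17. This is a minimal illustration under stated assumptions: the `RoutingGraph` class, the `Bond` type, and the specific update rule (move a fraction toward 1 on use, multiply by a decay factor otherwise) are hypothetical and are not taken from ADAM's implementation.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>

using Bond = std::pair<int, int>;  // (from concept id, to concept id), hypothetical

class RoutingGraph {
public:
    RoutingGraph(double reinforce, double decay)
        : reinforce_(reinforce), decay_(decay) {
        assert(reinforce > 0.0 && reinforce <= 1.0);
        assert(decay >= 0.0 && decay < 1.0);
    }

    // One reasoning window: unused bonds decay toward 0,
    // used bonds move a fraction of the way toward 1.
    void step(const std::set<Bond>& used) {
        for (auto& [bond, g] : conductance_) {
            if (used.count(bond) == 0) g *= (1.0 - decay_);
        }
        for (const Bond& bond : used) {
            double& g = conductance_[bond];  // a new bond starts at 0
            g += reinforce_ * (1.0 - g);
        }
    }

    // Conductance in [0, 1]; bonds never used report 0.
    double conductance(const Bond& bond) const {
        const auto it = conductance_.find(bond);
        return it == conductance_.end() ? 0.0 : it->second;
    }

private:
    double reinforce_;
    double decay_;
    std::map<Bond, double> conductance_;
};
```

With `reinforce = 0.5` and `decay = 0.1`, a bond used in two consecutive windows rises to 0.5 and then 0.75, and drops to 0.675 after one window without use. Keeping conductance bounded in [0, 1] is a design choice of this sketch so that repeated use saturates instead of growing without limit.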

Measured performance

Numbers without footnotes.

We publish real numbers from real benchmarks, including the failures. The intelligence score is a mean over 20 independent runs. The TPS is measured with Ed25519 overhead included. Nothing is cherry-picked.

Functional memory
durable
Persistent concept graph + state across sessions. The headline is the loop, not the storage footprint.
Response latency
closed
Average synthesis latency end-to-end including language layer. Raw graph: ~80ms.
Memory footprint
3.1GB
Serialised knowledge binary after months of accumulated learning from interactions.
World model growth
6.6K → 9.7K
Transitions learned across the May-2026 ARC-AGI-3 autonomous run series. Substrate writes its own dynamics to disk and reloads them on next boot.
Boot time
~20s small profile
~20s for the small substrate profile. Full 1.6 GB substrate profile takes several minutes depending on disk — memory deserialisation, internal state restore, HTTP bind.
ARC-AGI-3 self-evaluation
25/183
6.77% / 25 of 183 levels / 2,673 actions (2026-05-18). Substrate-only, no LLM, no human assist. Non-official scorecard — used as an engineering gate for the cognitive loop. Details →

Proto-AGI workstreams

Loops, not raw scale.

ADAM's roadmap is organised around cognitive loops, not graph size or memory-file size. Progress is measured by how many of these loops close reliably. ADAM is on the AGI track. ADAM is not AGI today.

| Workstream | What it means |
| --- | --- |
| World model | Keep task state: objects, relations, positions, changes, outcomes. |
| Grounded perception | Convert text, grid, image, audio, game, or scene input into stable objects and relations. |
| Memory and belief | Store knowledge with evidence, confidence, contradiction, decay, and source trust. |
| Reasoning paths | Compare multiple routes through memory before answering. |
| Action loop | Choose an action, observe the result, update the world model, try again. |
| Self-curriculum | Detect gaps, request missing facts/tasks, validate, then integrate. |
| Speech layer | Explain the internal result clearly without hiding uncertainty. |
| Benchmarks | Reality Gauntlet, ARC-style tasks, visual grounding, contradiction tests, planning, long dialogue. |

ADAM is not declared AGI. It is a proto-AGI research system whose progress is measured by how many of these loops close reliably.
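The "Memory and belief" workstream (evidence, confidence, contradiction, decay, source trust) lends itself to a small data-structure sketch. The C++17 below is an illustration only: the `Evidence` and `Belief` structs, the trust-weighted confidence formula, and the half-life decay are assumptions made for this example, not ADAM's stored format or scoring.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

struct Evidence {
    std::string source;
    double trust;    // source trust in [0, 1]
    bool supports;   // false marks contradicting evidence
};

struct Belief {
    std::string claim;
    std::vector<Evidence> evidence;
    double age_days;  // time since the belief was last confirmed
};

// Confidence in [0, 1]: trust-weighted share of supporting evidence,
// halved for every `half_life_days` that pass without confirmation.
double confidence(const Belief& b, double half_life_days) {
    assert(half_life_days > 0.0);
    double support = 0.0;
    double total = 0.0;
    for (const Evidence& e : b.evidence) {
        total += e.trust;
        if (e.supports) support += e.trust;
    }
    if (total == 0.0) return 0.0;  // no evidence: the claim is unsupported
    const double decay = std::pow(0.5, b.age_days / half_life_days);
    return (support / total) * decay;
}

// Refuse to assert a claim whose confidence is below the threshold.
bool should_refuse(const Belief& b, double half_life_days, double threshold) {
    return confidence(b, half_life_days) < threshold;
}
```

With one supporting source at trust 0.8 and one contradicting source at trust 0.2, a fresh belief scores 0.8, and the same belief scores 0.4 after one half-life. A belief with no evidence scores 0, so `should_refuse` returns true for any positive threshold.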

ADAM — benchmark evidence · scorecard 2026-05-18

ADAM × ARC-AGI-3.

ARC-style environments are useful because they test what ADAM is being built to do: explore, remember state, infer rules, choose actions, and improve across attempts. The number below is a non-official self-evaluation scorecard — not a Kaggle submission, not a leaderboard/rank claim, not proof of AGI. Used as one behavioural gate for the cognitive loop.

First result · substrate-only ADAM

ARC-AGI-3 substrate-only: 25 / 183 levels solved — 13.66 % level coverage, 6.77% self-eval score. 2,673 actions · 83 min wall-clock. Substrate-only configuration: no LLM, no human assist.

The important signal is not the absolute score, but the autonomous lift: 24 → 25 solved levels, with world_model growth and procedure memory expansion (see substrate learning across runs, below).

★ Substrate-only scorecard → arcprize.org/scorecards/293d0e49-9d63-46d0-9cdb-16bdac40fbf2

| Solver | Levels solved · level coverage | Method |
| --- | --- | --- |
| ADAM (this report · substrate-only) | 25 / 183 · 13.66 % (self-eval score 6.77 %) | Local C++ cognitive engine. Internal implementation layers are technical-appendix detail. |
| ADAM (graph-explore only, 14 May baseline) | 24 / 183 · 13.11 % | Same substrate, no warmed world-model |
| Other published autonomous solver | ~23 / 183 · ~12.58 % | CNN-based frame-change predictor (community report) |
| ADAM (no graph-explore, earlier baseline) | 22 / 183 · 12.02 % | Same substrate, scoring loop only (09 May 2026) |
| Frontier LLM (Opus-class, max effort) | ~4 / 183 · ~2.19 % | No game-specific solver (community report) |
| ADAM (substrate-only, cold start, no memory) | 1 / 183 · 0.55 % | Sanity floor: graph empty, no warmed model, no fallback |
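The level-coverage percentages in this report follow directly from the solved-level counts. A minimal C++ sketch of that arithmetic is shown here; the function name `level_coverage_percent` is hypothetical. The 6.77% self-eval score is a separate scorecard metric and is not derived by this calculation.

```cpp
#include <cassert>

// Percentage of levels solved out of the total number of levels.
double level_coverage_percent(int solved, int total) {
    assert(total > 0 && solved >= 0 && solved <= total);
    return 100.0 * static_cast<double>(solved) / static_cast<double>(total);
}
```

`level_coverage_percent(25, 183)` evaluates to roughly 13.66, and the same call with 24, 22, 4, and 1 solved levels gives the 13.11, 12.02, 2.19, and 0.55 figures quoted for the other solvers.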

ADAM plays end-to-end using only its substrate. No external solvers. No cached sequences. No human demos. The useful signal is the closed loop — experience changes memory, memory changes action selection, action selection improves future runs. This page is a self-evaluation report; it is not a Kaggle submission and not a leaderboard rank claim.

Second result · hybrid harness — full disclosure

183 / 183 = 100.00 %: 25 / 25 environments · 6,537 actions · deterministic offline replay. This is the hybrid-harness result, not the autonomous one. We publish it because the honest framing is the moat, not the percentage.

What "hybrid" actually means here. ADAM's substrate runs the same loop, but procedural memory is pre-loaded before the eval: action sequences ingested from open-source third-party solver projects (under MIT-0 and Apache-2.0 licenses), plus two human boss-level demonstrations for the hardest levels, plus ADAM's own autonomous discoveries. At eval time the replay is deterministic and offline (zero network, Kaggle-compatible). It is not autonomous. It is not a Kaggle leaderboard rank claim. It is a transparent disclosure of what an ADAM-anchored hybrid pipeline can hit when memory is pre-warmed.

| Pre-loaded source | Levels | License / origin | How it entered the harness |
| --- | --- | --- | --- |
| ADAM substrate (autonomous discovery) | 24 | Own substrate | Finds via the graph explorer (incl. m0r0, su15) |
| Open-source solver harness #1 | 150 | MIT-0 | Action sequences ingested + bug-patched for the online API |
| Open-source solver #2 | 6 | Apache-2.0 | re86 L1–L8 (tuples → action ints) |
| Human boss demonstrations | 3 | internal precedent | bp35 L8, wa30 L8, re86 entire run after cache desync |

Why we show both numbers. The substrate-only 6.77% / 25 of 183 is the engineering gate — it tracks whether the closed loop is actually closing. The hybrid 100% is what an ADAM-anchored pipeline reaches when you let it carry pre-warmed procedural memory and call it that honestly. Neither number is presented as an official Kaggle rank.

★ Hybrid scorecard · 183 / 183 → arcprize.org/scorecards/6a5888ac-21e1-40b9-abac-5fecbe62cb42

Self-improvement signal — substrate learning across runs

The May-2026 autonomous run series shows the substrate self-growing its world model purely from gameplay. No external training, no external policy in the loop. The substrate writes its own learned dynamics to disk on every level-up and terminal transition, and reloads them on next boot.

| Run | World-model transitions |
| --- | --- |
| Baseline (warmed from offline cache replay) | 6,625 |
| After attack iter-1 (substrate self-explore) | 8,778 |
| After iter-2 | 9,137 |
| After iter-3 | 9,435 |
| After iter-4 | 9,662 |

Procedures grew 24 → 25, waypoints 25 → 26 over the same period.
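The write-to-disk and reload-on-boot behaviour described in this section can be sketched as a save/load pair. The C++17 below is an assumption-laden illustration: the `TransitionTable` layout (scene signature and action id mapped to the next scene signature), the function names, and the plain-text encoding are all hypothetical. ADAM's actual on-disk format is not published.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>  // lets a std::stringstream stand in for a file in examples
#include <utility>

// Hypothetical layout: (scene signature, action id) -> next scene signature.
using TransitionTable = std::map<std::pair<std::uint64_t, int>, std::uint64_t>;

// Writes the entry count, then one "scene action next" triple per line.
void save_transitions(const TransitionTable& table, std::ostream& out) {
    out << table.size() << '\n';
    for (const auto& [key, next] : table) {
        out << key.first << ' ' << key.second << ' ' << next << '\n';
    }
}

// Reads back what save_transitions wrote. An empty or unreadable stream
// yields an empty table (cold start); a truncated stream keeps what parsed.
TransitionTable load_transitions(std::istream& in) {
    TransitionTable table;
    std::size_t count = 0;
    if (!(in >> count)) return table;
    for (std::size_t i = 0; i < count; ++i) {
        std::uint64_t scene = 0;
        std::uint64_t next = 0;
        int action = 0;
        if (!(in >> scene >> action >> next)) break;
        table[std::make_pair(scene, action)] = next;
    }
    return table;
}
```

Passing `std::ofstream` and `std::ifstream` objects gives the across-boot behaviour; passing a `std::stringstream` is enough to check that a saved table reloads unchanged. Taking streams instead of file paths is a choice of this sketch so the round trip can be tested without touching the filesystem.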

The latest substrate change — chain-level back-propagation, May 2026 — gives the world model multi-step credit assignment: when ADAM reaches a level-up, every action in the trajectory back to 256 steps receives discounted positive credit (γ = 0.985). The planner can now prefer the first action of a multi-step solution path, not just the click that lights the goal.
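The credit rule in the paragraph above can be sketched in C++. The values γ = 0.985 and the 256-step horizon come from the text; the function name `chain_credit`, the unit reward, and the exact form of the discount (the action k steps before the level-up receives γ^k) are assumptions of this sketch, not a description of ADAM's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns one credit value per action in a trajectory that ended in a level-up.
// The final action receives `reward`; the action k steps earlier receives
// reward * gamma^k. Actions further back than `horizon` receive no credit.
std::vector<double> chain_credit(std::size_t trajectory_len,
                                 double reward = 1.0,
                                 double gamma = 0.985,
                                 std::size_t horizon = 256) {
    assert(gamma > 0.0 && gamma <= 1.0);
    std::vector<double> credits(trajectory_len, 0.0);
    const std::size_t credited = std::min(trajectory_len, horizon);
    double credit = reward;
    for (std::size_t k = 0; k < credited; ++k) {
        credits[trajectory_len - 1 - k] = credit;  // k steps before the level-up
        credit *= gamma;
    }
    return credits;
}
```

For a three-action trajectory this yields credits of about 0.9702, 0.985, and 1.0 from first to last action. Because the first action of a long solution path still receives a positive value, a planner that ranks actions by accumulated credit can prefer it over actions that never led to a level-up.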

The closed loop — visually

This is not a one-shot benchmark run. Each ARC-AGI-3 frame updates ADAM's substrate; the substrate updates the procedure memory; the procedure memory improves the next run. The 22 → 24 → 25 climb between 09 May and 18 May 2026 is the loop running, not a tuning curve.

[Diagram: the closed loop. FRAME (64×64 grid · state) → ADAM CORE (private game interface · substrate-side reasoning) → ACTION PRIOR (geometric scene deltas · causal memory · novelty) → ACTION (click · key · key_click) → WORLD MODEL UPDATE (scene transition learned · causal credit assigned) → PROCEDURE MEMORY (proven trajectories · persisted across sessions) → NEXT RUN, WARMED (prior on first frame · 22 → 24 → 25 / 183). Legend: forward pass; substrate-learning loop (closes across runs).]

How ADAM plays — at a glance

ADAM exposes a private game-interaction interface used by the ARC harness; the harness is a thin body, the cognition lives inside ADAM. The action prior combines geometric scene deltas, causal memory, progress estimation, novelty control, and substrate-level trajectory evaluation — all computed by the substrate, not by an external policy. Parallel hypothesis evaluation under CUDA acceleration produces the prior; a Rudakov-style graph explorer expands the trajectory frontier inside the same scoring loop.
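One way to picture an action prior that combines several signals is a weighted sum per action followed by a softmax. The C++17 below is a sketch under loud assumptions: the signal names mirror the components listed above, but the `ActionSignals` and `PriorWeights` structs, the linear combination, and the softmax normalisation are all hypothetical. The exact scoring weights are private, so any weight values passed in are placeholders.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Per-action signals named after the components in the text; scales are assumed.
struct ActionSignals {
    double scene_delta;
    double causal_memory;
    double progress;
    double novelty;
    double trajectory_value;
};

// Hypothetical weights; ADAM's real scoring weights are not published.
struct PriorWeights {
    double scene_delta;
    double causal_memory;
    double progress;
    double novelty;
    double trajectory_value;
};

double weighted_score(const ActionSignals& s, const PriorWeights& w) {
    return w.scene_delta * s.scene_delta + w.causal_memory * s.causal_memory +
           w.progress * s.progress + w.novelty * s.novelty +
           w.trajectory_value * s.trajectory_value;
}

// Softmax over the weighted scores: one probability per candidate action.
std::vector<double> action_prior(const std::vector<ActionSignals>& actions,
                                 const PriorWeights& w) {
    std::vector<double> prior;
    if (actions.empty()) return prior;
    double max_score = weighted_score(actions.front(), w);
    for (const ActionSignals& a : actions) {
        max_score = std::max(max_score, weighted_score(a, w));
    }
    double sum = 0.0;
    for (const ActionSignals& a : actions) {
        // Subtracting the maximum keeps std::exp in a safe numeric range.
        prior.push_back(std::exp(weighted_score(a, w) - max_score));
        sum += prior.back();
    }
    for (double& p : prior) p /= sum;
    return prior;
}
```

The returned values sum to 1, so they can be sampled from or used to rank candidate actions such as click, key, or key_click.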

The public results are reproducible through the signed ARC Prize scorecards linked below. Internal control surfaces, exact scoring weights, file layout, and runner configuration remain private and are shared with trusted reviewers under controlled access.

What ADAM can do — and what it can't yet

Can:

  • Solve novel games via substrate + graph-explore + warmed world_model. 25 / 183 zero-shot, no policy training, no source reading.
  • Ingest, store, replay, and verify proven trajectories. Persisted in procedural memory across sessions.
  • Run fully offline. Kaggle-compatible, no internet at eval time.

Cannot yet:

  • Read game source code to derive solvers. Frontier-LLM-based harnesses do this; ADAM does not bridge to an LLM yet.
  • Build explicit per-game world models from observation alone. Substrate sees scenes, not rules.
  • Bond substrate scenes to abstract rule concepts strongly enough that the substrate activates the right algorithm class on first contact (Lights Out → linear algebra; Crane → BFS). The scene-to-rule binding is the open work.

Roadmap to autonomous 100%

  1. Stronger scene representation — replace hashed scene signatures with the full geometric scene-vector so distinct grids stop collapsing to the same prior bin.
  2. Scene → rule bonding — let the substrate pick the correct algorithm class on first contact instead of falling through to motion-as-reward priors. The scene-to-rule binding is the open work.
  3. Per-game world modelling — algorithmic solver synthesis once mechanics are discovered (Lights Out → linear algebra; Crane → BFS).
  4. Chain-credit-driven planner — use multi-step credit assignment to synthesise new procedures from successful chains instead of only replaying ingested ones.
  5. Game-source comprehension — close the gap to source-reading approaches without leaving the sovereign substrate envelope.

Until those land, the honest top-line is the substrate-only self-evaluation: 25 / 183 levels, 6.77%, 2,673 actions. The growth from 22 → 24 → 25 since 09 May is evidence of substrate learning across runs, not benchmark memorisation.

Verification

ADAM is inspectable as a research claim, not exposed as an implementation blueprint. The verifiable artefacts are the ARC Prize scorecards below — hosted on infrastructure we do not control, replayed deterministically. Internal reproducibility artefacts, exact configuration, and source layout are maintained internally and can be shared with trusted reviewers under controlled access. Contact [email protected] for partner verification.

Self-evaluation scorecards
Substrate-only · 25 / 183 · 6.77% · 2,673 actions
Hybrid harness · 183 / 183 · 100.00% · 6,537 actions (see disclosure above)
Both non-official. Neither is a Kaggle leaderboard rank claim. The autonomous number is the engineering gate; the hybrid number is honest disclosure of a pipeline result.

Language synthesis

The Ribosome layer.

ADAM thinks in algebra and graph paths. The Ribosome is the translation gateway that converts ADAM's symbolic output into natural, fluent language — while keeping the intelligence entirely inside the graph engine.

In biology, a ribosome translates genetic code into proteins. Here, Ribosome translates ADAM's internal cognitive state into language. The cognitive work — search, comparison of explanations, refusal of unsupported claims — is done by ADAM itself. Ribosome only speaks. Internal layer names (rhythm, routing, geometric concept representations) are technical-appendix detail.

This separation is deliberate. It means ADAM's reasoning is auditable at the graph level, independent of language style — the geometric work that produced an answer can be inspected separately from the words that explain it.

The Ribosome also runs a fractal self-learning loop — after each user query, it recursively questions new concepts ADAM encounters, enriching the graph with depth-3 exploration. Every conversation makes ADAM slightly more informed about the topics you care about.

01
User query
Query arrives at the Ribosome gateway. Session continuity is preserved across turns for conversational memory.
02
Graph reasoning
ADAM receives the query through its private cognitive interface. Internal routing activates relevant concepts. The cognitive loop runs reasoning over the activated state and returns a structured answer.
03
Structural extraction
Ribosome extracts the compact graph — active concepts, bond paths, geometric state — and reads the substrate-side resonance signature for the response.
04
Language synthesis
A language model receives ADAM's structured reasoning as context and translates it into natural speech. The language model never reasons — it only speaks what ADAM has already concluded.
05
Fractal deepening
New concepts encountered during the query trigger background fractal exploration (depth 3, breadth 3). ADAM's graph grows richer after every exchange.
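The depth-3, breadth-3 exploration in step 05 can be sketched as a bounded recursive expansion. This C++17 example is illustrative only: `explore`, the `Neighbours` callback, and the rule that breadth limits how many related concepts are followed per node are assumptions, not ADAM's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical callback: returns the concepts related to a topic.
using Neighbours = std::function<std::vector<std::string>(const std::string&)>;

// Expands `topic` up to `depth` levels, following at most `breadth` related
// concepts per node. Concepts already in `visited` are not expanded again.
// Returns the number of concepts newly added to `visited`.
std::size_t explore(const std::string& topic, int depth, std::size_t breadth,
                    const Neighbours& related, std::set<std::string>& visited) {
    if (depth <= 0) return 0;
    std::size_t added = 0;
    std::size_t followed = 0;
    for (const std::string& next : related(topic)) {
        if (followed == breadth) break;
        ++followed;
        if (!visited.insert(next).second) continue;  // already in the graph
        ++added;
        added += explore(next, depth - 1, breadth, related, visited);
    }
    return added;
}
```

With depth 3 and breadth 3, one query adds at most 3 + 9 + 27 = 39 concepts, which bounds the background work triggered by each exchange. Repeating the same exploration adds nothing, because every followed concept is already marked as visited.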

Discipline

What ADAM is not.

To save reviewers, partners, and journalists the awkward fact-check phone call:

  • ADAM is not claimed as AGI today. It is being engineered toward proto-AGI loops.
  • ADAM is not claimed as ASI. No recursive self-improvement claim.
  • ADAM is not an official Kaggle ARC winner. Our scorecard is non-official self-evaluation.
  • ADAM is not a GPT wrapper. It does not rent a remote model to think.
  • ADAM is not sold by graph size or memory-file size. The product is the closed loop, not the storage footprint.
  • The speech layer is still being improved. The goal is not to hide uncertainty behind polished generic prose.
  • No safety / alignment claim beyond what the architecture demonstrably is. The engine is open to inspection.

Begin the conversation.

ADAM is running now. The graph is live, the routing is active, the oscillator is pulsing. The interface exposes the substrate directly; safety and verification layers are documented separately in the architecture pages and report files.

Open ADAM interface Live instance · updated continuously