# Physarium results — reconcile (errata-grade)

**Date:** 2026-04-27
**Status:** This document is a **mandatory errata insert** for every Physarium-v1 claim. Read it before reading any other Physarium number.
**Companion audit:** `PHYSARIUM_COVERAGE_AUDIT.md` (denominator forensics).

## The two rules

1. **Two distinct experiments — never mix.**
2. **Every kill-percentage must state its denominator.**

## Experiment A — `final_results` / organic surgery line

Source: surgery binary's running totals on
`/home/pc/gigachad_native/Physarium-7B-Native/` and
`/mnt/c/Users/pc/Desktop/folder/Physarum-05B-Organic/`.

| Metric                                       | Value         | Denominator                                             |
|----------------------------------------------|---------------|---------------------------------------------------------|
| Killed weights (7B)                          | 1,450,103,613 | 6,525,288,448 target proj weights                       |
| Kill rate over target proj weights           | **22.22 %**   | 6,525,288,448 target proj weights (100 % tile coverage) |
| Kill rate over full 7B model                 | 19.04 %       | 7,615,616,512 total weights                             |
| Held-out perplexity delta (organic 0.5B run) | **+15.3 %**   | held-out test set                                       |
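
The two kill rates in the table above follow from the same killed-weight count divided by two different denominators. A minimal sketch of that arithmetic is below; the `KillRate` class and its field names are hypothetical helpers for illustration, not part of the surgery code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KillRate:
    """A kill percentage that cannot be built without its denominator."""
    killed: int
    denominator: int
    denominator_label: str

    @property
    def percent(self) -> float:
        return 100.0 * self.killed / self.denominator

    def __str__(self) -> str:
        return (f"{self.percent:.2f} % of {self.denominator:,} "
                f"{self.denominator_label}")


KILLED_7B = 1_450_103_613  # killed weights (7B), from the table above

over_target = KillRate(KILLED_7B, 6_525_288_448, "target proj weights")
over_model = KillRate(KILLED_7B, 7_615_616_512, "total 7B weights")

print(over_target)  # 22.22 % of 6,525,288,448 target proj weights
print(over_model)   # 19.04 % of 7,615,616,512 total 7B weights
```

Carrying the denominator inside the record is one way to make rule 2 hard to violate: the percentage cannot be printed without it.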

`processed / target = 100 %` per `PHYSARIUM_COVERAGE_AUDIT.md`: the
non-overlapping 256×256 tiling visits every weight in every target tensor,
so the 22.22 % denominator is the whole target set, not a sub-window.
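
Why a non-overlapping tiling gives `processed / target = 100 %` can be checked by counting. The sketch below is illustrative only: `tile_coverage` is a hypothetical helper, the tensor shapes are made up, and it assumes edge tiles are clipped to the tensor boundary rather than dropped (the audit, not this sketch, is the authority on what the binary does).

```python
def tile_coverage(rows: int, cols: int, tile: int = 256) -> tuple[int, int]:
    """Return (processed, target) element counts for a non-overlapping tiling.

    Assumption: edge tiles are clipped to the tensor boundary, not dropped.
    """
    processed = 0
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            h = min(tile, rows - r0)  # tile height, clipped at the bottom edge
            w = min(tile, cols - c0)  # tile width, clipped at the right edge
            processed += h * w
    return processed, rows * cols


# Hypothetical tensor shapes, chosen only for illustration.
for shape in [(512, 3584), (3584, 3584), (300, 500)]:
    processed, target = tile_coverage(*shape)
    print(shape, f"{100.0 * processed / target:.0f} %")
```

Because tiles never overlap and never skip a row or column, each weight is counted exactly once, so the ratio is 100 % for every shape under the clipping assumption.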

Per-projection 7B kill range:
- Aggregated by projection type: min 21.79 % (`o_proj`), max 23.19 % (`k_proj`).
- Individual tensors can fall outside that range: layer-1 MLP projections were the most heavily pruned (43–46 %).

## Experiment B — `lm-eval metric_results`

Source: `lm-eval-harness` external probe on a separately hosted run.

| Metric                                  | Value                |
|-----------------------------------------|----------------------|
| Perplexity                              | 19.62 → 19.94        |
| MMLU `machine_learning` subset          | +0.9 pp              |

These numbers are not directly comparable to Experiment A's perplexity
delta: the test sets, the tokenization, and the surgery snapshot all differ.
Cite this experiment as *"lm-eval B"* in any joint report.
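
For readers who want the Experiment B perplexity pair as a relative change, the arithmetic is sketched below. It assumes 19.62 is the pre-surgery value and 19.94 the post-surgery value, which the table implies but does not state; `relative_change_percent` is a hypothetical helper.

```python
def relative_change_percent(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Assumption: 19.62 is the pre-surgery value and 19.94 the post-surgery value.
delta_b = relative_change_percent(19.62, 19.94)
print(f"lm-eval B perplexity change: {delta_b:+.2f} %")  # +1.63 %
```

This is arithmetic only. It does not make the figure comparable to Experiment A's +15.3 %, which comes from a different model size, test set, and snapshot.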

## What Physarium v1 actually is

`physarum_block()` operates on the **magnitude** of each weight:

- input feature: `D_i = |w_i|` (or a smooth function of magnitude)
- dynamics: nutrient flow + decay biased toward high-magnitude paths
- output: a binary `keep / kill` mask over each tile's weights

It does **not** see activations. It does **not** see gradients. It does not
see contribution to the loss or to specific output logits. The slime-mold
geometry is honest, but the food signal it eats is static weight magnitude
inside each non-overlapping 256×256 tile.

**Physarium v1 = static magnitude-flow surgery, tile-local.**
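
To make "tile-local" and "magnitude only" concrete, here is a toy sketch. It is not the Physarium dynamics: the nutrient-flow and decay steps are replaced by plain per-tile magnitude thresholding, and `tile_magnitude_mask`, `kill_fraction`, and the toy matrix are all hypothetical. The only inputs are the weights themselves, with no activations or gradients.

```python
def tile_magnitude_mask(weights, tile=256, kill_fraction=0.22):
    """Return a keep(True)/kill(False) mask computed independently per tile.

    Stand-in rule (an assumption, not the Physarium dynamics): inside each
    tile, kill the `kill_fraction` of weights with the smallest |w|.
    """
    rows, cols = len(weights), len(weights[0])
    mask = [[True] * cols for _ in range(rows)]
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            # Collect (magnitude, row, col) for every weight in this tile only.
            cells = [
                (abs(weights[r][c]), r, c)
                for r in range(r0, min(r0 + tile, rows))
                for c in range(c0, min(c0 + tile, cols))
            ]
            cells.sort()  # smallest magnitudes first
            n_kill = int(kill_fraction * len(cells))
            for _, r, c in cells[:n_kill]:
                mask[r][c] = False
    return mask


# Toy 2x4 matrix split into two 2x2 tiles; kill half of each tile.
w = [[0.9, -0.1, 0.01, 0.02],
     [0.2, -0.8, 0.03, -0.04]]
mask = tile_magnitude_mask(w, tile=2, kill_fraction=0.5)
for row in mask:
    print(row)
# [True, False, False, False]
# [False, True, True, True]
```

In this stand-in, every weight in the right-hand tile is smaller than every weight in the left-hand tile, yet half of each tile survives, because the decision never looks outside its own tile.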

## What Physarium v2 needs

A proper activation-aware Physarium-v2 must compute per-weight importance as

    importance(w_ij) = act_norm(input_i) · stability(w_ij) · contribution(w_ij → output)

…where:

- `act_norm(input_i)` = per-feature activation magnitude over a calibration set
- `stability(w_ij)` = how stable `w_ij`'s saliency is under input noise (low variance means high stability)
- `contribution(w_ij → output)` = signed influence on downstream loss (e.g. Hessian-based)

**Physarium v2 = activation-aware flow.**
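
A minimal sketch of how the three factors could combine into one score is shown below. No v2 implementation exists, so everything here is hypothetical: the function name, the `1 / (1 + variance)` mapping for stability, the use of the absolute value of the contribution, and the example numbers.

```python
from statistics import pvariance


def importance(act_norm_i, saliency_samples, contribution):
    """importance(w_ij) = act_norm(input_i) * stability(w_ij) * contribution.

    Assumptions (not from the source):
      - stability = 1 / (1 + variance of the saliency under input noise),
        so a noisier saliency gives a lower score;
      - contribution enters by absolute value, so the sign of the influence
        does not flip the ranking.
    """
    stability = 1.0 / (1.0 + pvariance(saliency_samples))
    return act_norm_i * stability * abs(contribution)


# Hypothetical numbers for two weights fed by the same input feature.
steady = importance(2.0, [0.5, 0.5, 0.5], contribution=-0.3)
noisy = importance(2.0, [0.1, 0.9, 0.5], contribution=-0.3)
print(steady > noisy)  # True: the weight with the steadier saliency ranks higher
```

Unlike v1, each factor here needs data beyond the weight tensor: activations from a calibration set, saliency under perturbed inputs, and a loss-based influence estimate.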

Until v2 exists, every v1 number must travel with this errata insert.

## Wording template (mandatory in every report that quotes v1)

> Physarium-v1 numbers must be read through `PHYSARIUM_RESULTS_RECONCILE.md`:
> two different experiments (organic surgery vs lm-eval), v1 is
> magnitude-flow (not activation-aware), and every killed-% must state its
> denominator. Tile coverage of target proj tensors is 100 % (audit:
> `PHYSARIUM_COVERAGE_AUDIT.md`); the 22.22 % figure is over those target
> weights, not a sub-sample.

## Where the numbers live

- 7B surgery raw log: `reports/physarium7b_surgery_run.log`
- 7B surgery summary: `reports/PHYSARIUM7B_SURGERY_REPORT.md`
- Coverage audit (denominator forensics): `reports/PHYSARIUM_COVERAGE_AUDIT.md` + `reports/physarium_coverage_audit.json`
- Phase-7 consolidated: `reports/GIGACHAD_PHASE7_CONSOLIDATED.md`
- 0.5B organic source: `/mnt/c/Users/pc/Desktop/folder/Physarum-05B-Organic/`
- Surgery code: `src/physarium/physarium_engine.cpp::prune_matrix()`

## TL;DR

- Two experiments, not one.
- v1 is magnitude-flow, not activation-aware.
- Tile coverage of target proj = **100 %** (not 6 %).
- 22.22 % kills relative to **all target proj weights** (denominator stated).
- 19.04 % kills relative to the **full 7B model** (denominator stated).
- v2 is the next research line and requires real activations.
