PHYSARIUM-7B SURGERY REPORT

Phase 7: Physarium-7B top brain surgery

Date: 2026-04-27

Binary: build/physarium7b_surgery (native C++17, no Python in hot path)

Read errata first: numbers below are Physarium-v1 magnitude-flow. Read through reports/PHYSARIUM_RESULTS_RECONCILE.md and reports/PHYSARIUM_COVERAGE_AUDIT.md. Tile coverage of target proj tensors = 100 %; the 22.22 % kill rate is over those weights (denominator = 6,525,288,448), or 19.04 % if denominator = full 7B model (7,615,616,512).
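The two kill rates quoted above differ only in the denominator. A minimal sketch of that arithmetic follows; `kill_rate_percent` and the constant names are hypothetical helpers for illustration, not code from the surgery binary.

```cpp
#include <cassert>
#include <cstdint>

// Figures quoted in this report.
constexpr std::uint64_t kKilled        = 1'450'103'613ULL;  // weights set to 0
constexpr std::uint64_t kTargetWeights = 6'525'288'448ULL;  // target proj tensors only
constexpr std::uint64_t kFullModel     = 7'615'616'512ULL;  // full 7B parameter count

// Percentage of `killed` weights relative to `denominator`.
// Hypothetical helper, not part of build/physarium7b_surgery.
double kill_rate_percent(std::uint64_t killed, std::uint64_t denominator) {
    assert(denominator > 0);
    return 100.0 * static_cast<double>(killed) / static_cast<double>(denominator);
}
```

Calling `kill_rate_percent(kKilled, kTargetWeights)` gives roughly 22.22, and `kill_rate_percent(kKilled, kFullModel)` gives roughly 19.04, matching the two figures in the errata note.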

Inputs / outputs

| Item             | Value                                     |
|------------------|-------------------------------------------|
| Donor model      | ~/qwen7b/instruct/ (Qwen2.5-7B-Instruct)  |
| Donor disk size  | 15 GB                                     |
| Output dir       | ~/gigachad_native/Physarium-7B-Native/    |
| Output disk size | 15 GB (15,231,233,024 bytes per index)    |
| Run log          | reports/physarium7b_surgery_run.log       |
| Surgery params   | block_size=256, n_iter=30, beta=2.0       |

Donor used as DONOR ONLY per ARCHITECTURE_LOCK.md. The output directory is the top brain physarium_7b; the original Qwen weights are not part of the runtime.

Aggregate sparsity

| Metric                    | Value                     |
|---------------------------|---------------------------|
| Target proj weights total | 6,525,288,448             |
| Weights killed (set to 0) | 1,450,103,613             |
| Killed % (target weights) | 22.22 %                   |
| Tensors logged            | 196 (28 layers × 7 projs) |
| Min per-tensor sparsity   | 16.32 %                   |
| Max per-tensor sparsity   | 45.92 %                   |
| Mean per-tensor sparsity  | 22.33 %                   |

Per-projection sparsity (28 layers each)

| Projection       | n  | mean   | min    | max    |
|------------------|----|--------|--------|--------|
| mlp.down_proj    | 28 | 22.41% | 19.91% | 43.54% |
| mlp.gate_proj    | 28 | 22.15% | 18.48% | 45.92% |
| mlp.up_proj      | 28 | 22.09% | 19.42% | 45.52% |
| self_attn.k_proj | 28 | 23.19% | 16.50% | 30.04% |
| self_attn.o_proj | 28 | 21.79% | 18.86% | 26.19% |
| self_attn.q_proj | 28 | 22.63% | 19.35% | 32.48% |
| self_attn.v_proj | 28 | 22.06% | 16.32% | 29.85% |

Layer 1 was the most heavily pruned (43–46% on MLP projections): physarium found many weak channels to flatten there. Layers 0 and 2 also showed elevated MLP sparsity (~30–36%). Deeper layers settled in the 18–23% band, which is consistent with the 0.5B Phase-1 surgery profile.
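As a cross-check on the tables, the 22.33 % mean per-tensor sparsity can be recovered as the unweighted mean of the seven per-projection means, since every projection has n = 28. The sketch below assumes only the rounded table values; `unweighted_mean` is a hypothetical helper. It also shows why this figure differs from the 22.22 % aggregate: the aggregate is a ratio of weight counts, so large tensors count for more, while the per-tensor mean treats every tensor equally.

```cpp
#include <array>
#include <numeric>

// Per-projection mean sparsity (percent), copied from the table:
// down_proj, gate_proj, up_proj, k_proj, o_proj, q_proj, v_proj.
constexpr std::array<double, 7> kProjectionMeans = {
    22.41, 22.15, 22.09, 23.19, 21.79, 22.63, 22.06};

// Unweighted mean; valid as the overall per-tensor mean only because each
// projection contributes the same number of tensors (n = 28).
double unweighted_mean(const std::array<double, 7>& values) {
    return std::accumulate(values.begin(), values.end(), 0.0) /
           static_cast<double>(values.size());
}
```

`unweighted_mean(kProjectionMeans)` comes out at about 22.33, up to rounding in the table inputs.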

Output integrity

| Check                                  | Result           |
|----------------------------------------|------------------|
| 4 shards present                       | ✅ all 4         |
| model.safetensors.index.json parseable | ✅               |
| weight_map entry count                 | 339              |
| total_size field                       | 15,231,233,024   |
| All shards open via safetensors loader | ✅ 339/339       |
| Sample tensor dtype                    | torch.bfloat16   |
| Sample shape (embed_tokens.weight)     | [152064, 3584]   |
| dtype preserved (BF16 → BF16)          | ✅               |
| Shapes preserved (donor → output)      | ✅ (header copy) |
| Failed tensors                         | 0                |
| Tokenizer / config / merges copied     | ✅ all           |

The native surgery binary streams BF16 in, expands to FP32 inside the block, runs the physarium block, contracts to BF16, and writes back into a header-preserving copy of each shard. Tensor offsets in the safetensors header are reused unchanged, so the index file from the donor remains valid.
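The expand/contract step described above can be sketched as follows. This is an illustration under stated assumptions, not the binary's actual code: the function names are hypothetical, round-to-nearest-even is assumed for the FP32 → BF16 contraction (the report does not say which rounding mode is used), and `transform` is a stand-in for the physarium block, whose internals are not described here.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Expand a BF16 bit pattern to FP32. BF16 is the upper 16 bits of an
// IEEE-754 binary32 value, so expansion is a shift into the high half.
float bf16_to_f32(std::uint16_t h) {
    std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Contract FP32 to BF16 with round-to-nearest-even (assumed rounding mode).
std::uint16_t f32_to_bf16(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u) {
        // NaN: set a mantissa bit that survives truncation so the result stays NaN.
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
    }
    // Add 0x7FFF plus the lowest kept bit, so exact ties round to even.
    std::uint32_t rounding_bias = 0x7FFFu + ((bits >> 16) & 1u);
    return static_cast<std::uint16_t>((bits + rounding_bias) >> 16);
}

// Round-trip one block in place: expand to FP32, apply `transform`,
// contract back to BF16. The buffer size and layout never change, which is
// what lets the shard's header and tensor offsets be reused unchanged.
template <typename Transform>
void process_block_bf16(std::uint16_t* data, std::size_t n, Transform transform) {
    std::vector<float> scratch(n);
    for (std::size_t i = 0; i < n; ++i) scratch[i] = bf16_to_f32(data[i]);
    transform(scratch.data(), n);
    for (std::size_t i = 0; i < n; ++i) data[i] = f32_to_bf16(scratch[i]);
}
```

Because every BF16 value is exactly representable in FP32, a weight the transform leaves untouched round-trips to the same bit pattern, and a weight set to 0.0f is written back as the all-zero BF16 pattern.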

Wall-clock

| Stage                         | Time                |
|-------------------------------|---------------------|
| Total wall                    | 2775.6 s (46.3 min) |
| Shard 3 (layers 14–22)        | ~14 min             |
| Shard 4 (layers 22–27 + head) | ~9 min              |
| Shard 1 (layers 0–6)          | ~11 min             |
| Shard 2 (layers 7–13 + head6) | ~16 min             |

Shards were processed in directory-iterator order (3, 4, 1, 2 by file mtime). Cumulative killed count at each shard boundary:

after shard 3: 402,974,047
after shard 4: 653,435,633 (Δ 250,461,586)
after shard 1: 1,048,031,185 (Δ 394,595,552)
after shard 2: 1,450,103,613 (Δ 402,072,428, final)
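The per-shard deltas above follow directly from the cumulative counts. A small sketch, assuming the four cumulative values are taken from the run log in processing order; `per_shard_killed` is a hypothetical helper:

```cpp
#include <array>
#include <cstdint>
#include <numeric>

// Cumulative killed counts after shards 3, 4, 1, 2 (processing order).
constexpr std::array<std::uint64_t, 4> kCumulativeKilled = {
    402'974'047ULL, 653'435'633ULL, 1'048'031'185ULL, 1'450'103'613ULL};

// Per-shard kill counts: the first entry is copied, each later entry is
// cumulative[i] - cumulative[i - 1].
std::array<std::uint64_t, 4> per_shard_killed(
    const std::array<std::uint64_t, 4>& cumulative) {
    std::array<std::uint64_t, 4> deltas{};
    std::adjacent_difference(cumulative.begin(), cumulative.end(), deltas.begin());
    return deltas;
}
```

The four deltas sum back to the final cumulative count of 1,450,103,613, which is the "Weights killed" figure in the aggregate table.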

What is not in this report

Inference quality. No forward pass was run against the pruned model. The 7B has no native CUDA backend in this tree yet (organ farm is the near-term path); a quality probe needs either an HF transformers eval or a native model runner that does not yet exist.

Per-layer sparsity for layers 22–27. Log skipped some boundary tensors (shard 4 only contained the tail of layer 22 and the lm_head), so the 196 logged tensors cover 28 layers × 7 projs and not the full 28×7. See reports/physarium7b_surgery_run.log for raw per-tensor output.

Comparison to a Python pipeline_organic.py run. This was a C++-native run; no parity probe was performed against the legacy Python surgery script.

Honest assessment

✅ Native C++ surgery binary works end-to-end on a real 15 GB BF16 donor.
✅ Output is a structurally valid safetensors model (loadable, indexable, dtype-preserving, shape-preserving).

✅ 22.22 % weights killed in target projections, within the expected "organic" sparsity band for physarium block surgery.

⚠️ "It loads" ≠ "it generates". Inference quality of Physarium-7B-Native is untested; the next step needed to claim a working top brain is a forward-pass eval, which is out of scope for this Phase-7 task.

⚠️ Output sits at 15 GB BF16. With the tier_default: VRAM policy in organs/organ_farm.json, this does not fit on a single 8 GB RTX 3060 Ti and will need quantization or partial offload before live use. This is a known constraint, not a regression.
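To put the VRAM constraint in numbers, the sketch below estimates raw weight storage at a few uniform bit widths. It is a rough lower bound under loud assumptions: it ignores quantization scales and zero-points, the KV cache, and activations, and it does not model any particular quantization scheme. `weight_bytes` and `to_gib` are hypothetical helpers.

```cpp
#include <cstdint>

// Full-model parameter count from the errata note.
constexpr std::uint64_t kParams = 7'615'616'512ULL;

// Raw weight storage in bytes at a uniform bit width.
// Assumption: no per-group scales/zero-points, no KV cache, no activations.
constexpr std::uint64_t weight_bytes(std::uint64_t params, unsigned bits_per_weight) {
    return params * bits_per_weight / 8;
}

// Convert bytes to GiB (2^30 bytes).
constexpr double to_gib(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}

// 16 bits per weight reproduces the total_size field of the index exactly.
static_assert(weight_bytes(kParams, 16) == 15'231'233'024ULL,
              "BF16 footprint should match total_size");
```

Under these assumptions, 8 bits per weight needs 7,615,616,512 bytes (about 7.1 GiB), which leaves almost no headroom on an 8 GB card, while 4 bits per weight needs 3,807,808,256 bytes (about 3.5 GiB) before any overhead.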