PHYSARIUM-7B SURGERY REPORT
Phase 7: Physarium-7B top brain surgery
Date: 2026-04-27
Binary: build/physarium7b_surgery (native C++17, no Python in hot path)
Read errata first: the numbers below are Physarium-v1 magnitude-flow results.
Interpret them through reports/PHYSARIUM_RESULTS_RECONCILE.md and
reports/PHYSARIUM_COVERAGE_AUDIT.md. Tile coverage of the target proj
tensors is 100 %; the 22.22 % kill rate is over those weights
(denominator = 6,525,288,448), or 19.04 % if the denominator is the full 7B model
(7,615,616,512).
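The two percentages differ only in the denominator. A minimal sketch of that arithmetic, using the counts quoted in this report (the killed count comes from the Aggregate sparsity table; the function name `kill_rate` is hypothetical and not part of the surgery binary):

```python
# Counts quoted in this report.
KILLED = 1_450_103_613          # weights set to 0
TARGET_WEIGHTS = 6_525_288_448  # weights in the target proj tensors
FULL_MODEL = 7_615_616_512      # all weights in the 7B model


def kill_rate(killed: int, denominator: int) -> float:
    """Return killed / denominator as a percentage."""
    return 100.0 * killed / denominator


rate_target = kill_rate(KILLED, TARGET_WEIGHTS)
rate_full = kill_rate(KILLED, FULL_MODEL)

print(f"{rate_target:.2f} % of target proj weights")  # 22.22 %
print(f"{rate_full:.2f} % of the full model")         # 19.04 %
```

Stating the denominator next to each percentage avoids the two figures being read as conflicting results.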
Inputs / outputs

| Item             | Value                                            |
|------------------|--------------------------------------------------|
| Donor model      | /home/pc/qwen7b/instruct/ (Qwen2.5-7B-Instruct)  |
| Donor disk size  | 15 GB                                            |
| Output dir       | /home/pc/gigachad_native/Physarium-7B-Native/    |
| Output disk size | 15 GB (15,231,233,024 bytes per index)           |
| Run log          | reports/physarium7b_surgery_run.log              |
| Surgery params   | block_size=256, n_iter=30, beta=2.0              |

Donor used as DONOR ONLY per ARCHITECTURE_LOCK.md. The output directory is the top brain physarium_7b; the original Qwen weights are not part of the runtime.
Aggregate sparsity

| Metric                    | Value                     |
|---------------------------|---------------------------|
| Target proj weights total | 6,525,288,448             |
| Weights killed (set to 0) | 1,450,103,613             |
| Killed % (target weights) | 22.22 %                   |
| Tensors logged            | 196 (28 layers × 7 projs) |
| Min per-tensor sparsity   | 16.32 %                   |
| Max per-tensor sparsity   | 45.92 %                   |
| Mean per-tensor sparsity  | 22.33 %                   |

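The aggregate kill rate (22.22 %) and the mean per-tensor sparsity (22.33 %) need not agree. Assuming the mean is an unweighted average over the 196 tensors (the report does not say how it was computed), small tensors count as much as large ones, whereas the aggregate pools all weights. A toy sketch with made-up tensor sizes, not taken from the run log:

```python
def pooled_sparsity(tensors):
    """Percentage killed over all weights pooled together.

    `tensors` is a list of (killed, total) pairs, one per tensor.
    """
    killed = sum(k for k, _ in tensors)
    total = sum(n for _, n in tensors)
    return 100.0 * killed / total


def mean_per_tensor_sparsity(tensors):
    """Unweighted mean of each tensor's own killed percentage."""
    return sum(100.0 * k / n for k, n in tensors) / len(tensors)


# Hypothetical data: one large tensor at 20 % and one small tensor at 40 %.
toy = [(200, 1000), (40, 100)]

pooled = pooled_sparsity(toy)          # dominated by the large tensor
mean = mean_per_tensor_sparsity(toy)   # both tensors weigh the same
print(f"pooled={pooled:.2f} %  mean-per-tensor={mean:.2f} %")
```

In this toy case the small, heavily pruned tensor pulls the unweighted mean above the pooled figure, which is the same direction as the gap in the table.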
Per-projection sparsity (28 layers each)

| Projection       | n  | mean   | min    | max    |
|------------------|----|--------|--------|--------|
| mlp.down_proj    | 28 | 22.41% | 19.91% | 43.54% |
| mlp.gate_proj    | 28 | 22.15% | 18.48% | 45.92% |
| mlp.up_proj      | 28 | 22.09% | 19.42% | 45.52% |
| self_attn.k_proj | 28 | 23.19% | 16.50% | 30.04% |
| self_attn.o_proj | 28 | 21.79% | 18.86% | 26.19% |
| self_attn.q_proj | 28 | 22.63% | 19.35% | 32.48% |
| self_attn.v_proj | 28 | 22.06% | 16.32% | 29.85% |

Layer 1 was the most heavily pruned (43–46% on MLP projections): physarium found many weak channels to flatten there. Layers 0 and 2 also showed elevated MLP sparsity (~30–36%). Deeper layers settled in the 18–23% band, consistent with the 0.5B Phase-1 surgery profile.
Output integrity

| Check                                  | Result           |
|----------------------------------------|------------------|
| 4 shards present                       | ✅ all 4         |
| model.safetensors.index.json parseable | ✅               |
| weight_map entry count                 | 339              |
| total_size field                       | 15,231,233,024   |
| All shards open via safetensors loader | ✅ 339/339       |
| Sample tensor dtype                    | torch.bfloat16   |
| Sample shape (embed_tokens.weight)     | [152064, 3584]   |
| dtype preserved (BF16 → BF16)          | ✅               |
| Shapes preserved (donor → output)      | ✅ (header copy) |
| Failed tensors                         | 0                |
| Tokenizer / config / merges copied     | ✅ all           |

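The index-level checks in the table can be sketched in a few lines. This is not the script used for this report; it assumes the usual Hugging Face sharded-checkpoint layout (`metadata.total_size` plus a `weight_map` from tensor name to shard file), and `check_index` is a hypothetical helper name:

```python
import json


def check_index(index, expected_entries, expected_shards, expected_total_size):
    """Return a list of problems found in a parsed index; empty means it passed."""
    problems = []
    weight_map = index.get("weight_map", {})
    if len(weight_map) != expected_entries:
        problems.append(
            f"weight_map has {len(weight_map)} entries, expected {expected_entries}"
        )
    shards = set(weight_map.values())  # distinct shard files referenced
    if len(shards) != expected_shards:
        problems.append(f"{len(shards)} shards referenced, expected {expected_shards}")
    total_size = index.get("metadata", {}).get("total_size")
    if total_size != expected_total_size:
        problems.append(f"total_size is {total_size}, expected {expected_total_size}")
    return problems


# Tiny synthetic index for demonstration only (not the real 339-entry file).
demo_index = json.loads(
    '{"metadata": {"total_size": 24},'
    ' "weight_map": {"a.weight": "model-00001-of-00002.safetensors",'
    '                "b.weight": "model-00002-of-00002.safetensors"}}'
)
print(check_index(demo_index, expected_entries=2, expected_shards=2,
                  expected_total_size=24))
```

For the real output one would load model.safetensors.index.json from the output directory with `json.load` and pass the values from the table: 339 entries, 4 shards, and a total_size of 15,231,233,024.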
The native surgery binary streams BF16 in, expands to FP32 inside the block, runs the physarium block, contracts to BF16, and writes back into a header-preserving copy of each shard. Tensor offsets in the safetensors header are reused unchanged, so the index file from the donor remains valid.
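The expand and contract steps can be illustrated on raw bit patterns. The sketch below is in Python for readability and is not the C++ code in the binary. BF16 is the upper 16 bits of an IEEE-754 FP32 value, so expansion is a left shift; for contraction this sketch assumes round-to-nearest-even, which the report does not confirm. Both function names are hypothetical.

```python
import struct


def bf16_bits_to_float(bits: int) -> float:
    """Expand a 16-bit bfloat16 pattern to a float via its FP32 encoding."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def float_to_bf16_bits(value: float) -> int:
    """Contract an FP32-representable value to bfloat16 bits.

    Rounding mode is an assumption: round-to-nearest, ties to even.
    """
    u32 = struct.unpack("<I", struct.pack("<f", value))[0]
    is_nan = (u32 & 0x7F800000) == 0x7F800000 and (u32 & 0x007FFFFF) != 0
    if is_nan:
        # Truncate and force a mantissa bit so the result stays a NaN.
        return ((u32 >> 16) | 0x0040) & 0xFFFF
    # Add 0x7FFF plus the lowest kept bit, then drop the low 16 bits.
    rounding_bias = 0x7FFF + ((u32 >> 16) & 1)
    return ((u32 + rounding_bias) >> 16) & 0xFFFF


one = float_to_bf16_bits(1.0)
killed = float_to_bf16_bits(0.0)  # a weight set to 0 is the all-zero pattern
print(hex(one), hex(killed))      # 0x3f80 0x0
```

Because every BF16 value is exactly representable in FP32, weights that the block leaves untouched survive the round trip bit-for-bit; only values modified in FP32 are subject to rounding on the way back.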
Wall-clock

| Stage                         | Time                |
|-------------------------------|---------------------|
| Total wall                    | 2775.6 s (46.3 min) |
| Shard 3 (layers 14–22)        | ~14 min             |
| Shard 4 (layers 22–27 + head) | ~9 min              |
| Shard 1 (layers 0–6)          | ~11 min             |
| Shard 2 (layers 7–13 + head6) | ~16 min             |

Shards were processed in directory-iterator order (3, 4, 1, 2 by file mtime). Cumulative killed weights at each shard boundary:
- after shard 3: 402,974,047
- after shard 4: 653,435,633 (Δ 250,461,586)
- after shard 1: 1,048,031,185 (Δ 394,595,552)
- after shard 2: 1,450,103,613 (Δ 402,072,428, final)
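The per-shard deltas follow from the cumulative counts by simple differencing. A small sketch that reproduces them from the figures in the list (the helper name `per_shard_killed` is hypothetical):

```python
# Cumulative killed counts in processing order, as listed in this report.
cumulative = [
    ("shard 3", 402_974_047),
    ("shard 4", 653_435_633),
    ("shard 1", 1_048_031_185),
    ("shard 2", 1_450_103_613),
]


def per_shard_killed(cumulative_counts):
    """Turn running totals into the number of weights killed in each shard."""
    deltas = []
    previous = 0
    for name, total in cumulative_counts:
        deltas.append((name, total - previous))
        previous = total
    return deltas


deltas = per_shard_killed(cumulative)
for name, killed in deltas:
    print(f"{name}: {killed:,}")
```

The deltas sum back to the final total of 1,450,103,613, which matches the Weights killed figure in the Aggregate sparsity table.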
What is not in this report
- Inference quality. No forward pass was run against the pruned model.
The 7B has no native CUDA backend in this tree yet (organ farm is the near-term path); a quality probe needs either an HF transformers eval or a native model runner that does not yet exist.
- Per-layer sparsity for layers 22–27. The log skipped some boundary
tensors (shard 4 only contained the tail of layer 22 and the lm_head), so this report gives no per-layer breakdown for those layers, even though the 196 logged tensors match the full 28 layers × 7 projs count. See reports/physarium7b_surgery_run.log for raw per-tensor output.
- Comparison to a Python pipeline_organic.py run. This was a
C++-native run; no parity probe was performed against the legacy Python surgery script.
Honest assessment
- ✅ Native C++ surgery binary works end-to-end on a real 15 GB BF16 donor.
- ✅ Output is a structurally valid safetensors model (loadable, indexable,
dtype-preserving, shape-preserving).
- ✅ 22.22 % of weights killed in target projections, within the expected
"organic" sparsity band for physarium block surgery.
- ⚠️ "It loads" ≠ "it generates". Inference quality of Physarium-7B-Native is untested; the next step needed to claim a working top brain is a forward-pass eval, which is out of scope for this Phase-7 task.
- ⚠️ Output sits at 15 GB BF16. With the tier_default: VRAM policy in organs/organ_farm.json, this does not fit on a single 8 GB RTX 3060 Ti and will need quantization or partial offload before live use. This is a known constraint, not a regression.
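The VRAM constraint can be checked with back-of-envelope arithmetic. The sketch below counts weight storage only and assumes the card's 8 GB means 8 GiB. It ignores quantization scales, KV cache, activations and runtime overhead, so real headroom is smaller than these numbers suggest. The bit widths are illustrative, not a chosen quantization scheme.

```python
PARAMS = 7_615_616_512     # full-model weight count from this report
VRAM_BYTES = 8 * 1024**3   # assumption: 8 GB card treated as 8 GiB


def weight_bytes(params: int, bits_per_weight: int) -> int:
    """Bytes needed to store `params` weights at the given bit width."""
    return params * bits_per_weight // 8


def fits_in_vram(params: int, bits_per_weight: int,
                 vram_bytes: int = VRAM_BYTES) -> bool:
    """True if the raw weights alone fit in the given VRAM budget."""
    return weight_bytes(params, bits_per_weight) <= vram_bytes


for bits in (16, 8, 4):
    size = weight_bytes(PARAMS, bits)
    verdict = "fits" if fits_in_vram(PARAMS, bits) else "does not fit"
    print(f"{bits:>2}-bit: {size:,} bytes ({size / 1024**3:.2f} GiB), {verdict}")
```

At 16 bits the result equals the index total_size of 15,231,233,024 bytes. At 8 bits the weights nominally fit but leave under 1 GiB for everything else, so a lower bit width or partial offload looks like the more realistic route under these assumptions.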