# BD9 — phys05_json_repair surgery, **10/10 GREEN** (2026-05-04)

**TL;DR:** phys05_json_repair was a placeholder organ riding on the
default base 0.5B pack with no LoRA. BD9 forged a 280-row synthetic
training corpus covering the 10 broken-JSON shapes the runtime actually
sees, trained a 6-epoch LoRA (loss 0.055 → 0.0003), merged and repacked
it, and flipped the pack. The organ now repairs **all 10 production
failure modes** end-to-end in the `--organ-probe` smoke test. It is the
first organ under the doctrine with a clean 100% on a real
failure-catalog bench.

## Headline

```
10 broken-JSON cases × phys05_json_repair (BD9 v1 pack):

case                       result   organ output
--------------------------------------------------------------------------
missing_close_brace        OK       {"a":1,"b":2}
trailing_comma             OK       {"a":1,"b":2}
unquoted_keys              OK       {"name": "alice", "age": 30}
single_quoted              OK       {"food": 1, "poison": 0}
missing_colon              OK       {"a": 1, "b": 2}
comma_as_colon             OK       {"key": "value", "k2": 5}
markdown_fence             OK       {"x":1,"y":2}
leading_prose              OK       {"x":1,"y":2}
trailing_prose             OK       {"x":1,"y":2}
missing_close_bracket      OK       {"items":[1,2,3]}

== 10/10 pass
```

`comma_as_colon` is exactly the wound v2 quirk that BD8 had to repair
with a Python-side shim. The fresh json_repair organ handles it natively.
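
The `OK` column in the table can be cross-checked independently of the organ. A minimal sketch, assuming the outputs are copied verbatim from the table; `is_valid_json` and `score` are illustrative names, not part of the repo. It only confirms that each output parses as JSON, not that the repair matches the intended content.

```python
import json

# Organ outputs copied verbatim from the headline table.
ORGAN_OUTPUTS = {
    "missing_close_brace": '{"a":1,"b":2}',
    "trailing_comma": '{"a":1,"b":2}',
    "unquoted_keys": '{"name": "alice", "age": 30}',
    "single_quoted": '{"food": 1, "poison": 0}',
    "missing_colon": '{"a": 1, "b": 2}',
    "comma_as_colon": '{"key": "value", "k2": 5}',
    "markdown_fence": '{"x":1,"y":2}',
    "leading_prose": '{"x":1,"y":2}',
    "trailing_prose": '{"x":1,"y":2}',
    "missing_close_bracket": '{"items":[1,2,3]}',
}


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as strict JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def score(outputs: dict) -> tuple:
    """Map each case to a pass/fail flag and count the passes."""
    results = {case: is_valid_json(out) for case, out in outputs.items()}
    return results, sum(results.values())


results, passed = score(ORGAN_OUTPUTS)
print(f"== {passed}/{len(results)} pass")
```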

## Pipeline

1. **Forge synthetic data** — `tools/surgery/build_json_repair_dataset.py`
   takes a 30-seed pool of realistic JSON shapes and mutates each through
   11 break-classes from the runtime failure catalog. Output: 280 rows in
   `data/organ_surgery/phys05_json_repair/json_repair_train_v1.jsonl`.

2. **Train LoRA** — `tools/surgery/train_json_repair_lora.py` (clone of
   BD7 trainer adapted for the json_repair prompt template).
   Hyperparameters: r=8, α=16, lr=5e-5, 6 epochs, max-len=1024.

   ```
   epoch 0 avg_loss=0.0547
   epoch 1 avg_loss=0.0082
   epoch 2 avg_loss=0.0040
   epoch 3 avg_loss=0.0019
   epoch 4 avg_loss=0.0005
   epoch 5 avg_loss=0.0003
   ```

3. **Merge + repack** — `merge_code_skeleton_lora.py` (reused script).
   Output: `physarum05b_json_repair.planck` (988 MB BF16).

4. **Wire** — `src/organs/organ_manager.cpp`:
   * new constant `PHYS05_JSON_REPAIR_PACK`
   * `specs_["phys05_json_repair"].pack_path = PHYS05_JSON_REPAIR_PACK`
   * `max_tokens 96 → 256` (repaired JSON often >100 tokens)
   * `repetition_penalty 1.03`, `cuda_repetition_penalty 1.00`
   * `json_output = true`

5. **Smoke** — 10 hand-curated broken JSONs from the runtime catalog
   probed via `./build/gigachad_native --organ-probe phys05_json_repair`.
   All 10 returned valid JSON with the correct repair.
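
The forge in step 1 can be sketched as a seeds × mutators grid. This is a hedged illustration, not the contents of `build_json_repair_dataset.py`: the six mutators, the row fields (`break_class`, `broken`, `fixed`), and the three seeds are all assumptions chosen to mirror case names from the headline table. Mutations that leave the text unchanged, or that still parse, are skipped so every row is a genuine broken → fixed pair.

```python
import json
import re

FENCE = "`" * 3  # built at runtime to avoid a literal fence inside this block


def parses(text: str) -> bool:
    """Return True if `text` is strict JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Hypothetical mutators: each takes valid JSON text and returns a broken variant.
MUTATORS = {
    "missing_close_brace": lambda s: s[:-1] if s.endswith("}") else s,
    "trailing_comma": lambda s: s[:-1] + ",}" if s.endswith("}") else s,
    "unquoted_keys": lambda s: re.sub(r'"(\w+)"(\s*:)', r"\1\2", s),
    "single_quoted": lambda s: s.replace('"', "'"),
    "markdown_fence": lambda s: FENCE + "json\n" + s + "\n" + FENCE,
    "leading_prose": lambda s: "Here is the JSON: " + s,
}

# Tiny stand-in for the 30-seed pool.
SEEDS = [
    {"a": 1, "b": 2},
    {"items": [1, 2, 3]},
    {"name": "alice", "age": 30},
]


def forge(seeds, mutators):
    """Apply every mutator to every seed and keep only rows that are really broken."""
    rows = []
    for seed in seeds:
        fixed = json.dumps(seed)
        for name, mutate in mutators.items():
            broken = mutate(fixed)
            if broken == fixed or parses(broken):
                continue  # mutation did not apply to this seed
            rows.append({"break_class": name, "broken": broken, "fixed": fixed})
    return rows


rows = forge(SEEDS, MUTATORS)
for row in rows[:3]:
    print(json.dumps(row))
```

Writing one `json.dumps(row)` per line would give a JSONL file in the same spirit as `json_repair_train_v1.jsonl`, though the real schema may differ.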

## Why this trained without overfitting (vs BD7.3)

BD7.3 (TRIZ 9-epoch) regressed because:
- 80 training rows × 9 epochs of free-form prose JSON → memorization of
  specific token sequences in long fields.
- Loss kept descending past the generalization point.

BD9 (json_repair 6-epoch) didn't regress because:
- 280 training rows × 6 epochs: more data and fewer passes over each row.
- Each row is short and structurally regular (broken pattern → fixed
  pattern), so the LoRA learns transformations, not specific tokens.
- 30 distinct seeds × 11 mutation classes give up to 330 seed/mutation
  combinations (280 rows in the forged corpus), so no single sequence
  dominates the gradient.

The same 6-epoch sweet-spot from BD7 applies; the **data shape** is what
made loss-near-zero safe to ship.
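
The rows × epochs comparison works out as follows. A small arithmetic sketch using only the row and epoch counts quoted above; `sample_passes` is an illustrative helper, and batch size (hence optimizer step count) is not stated in this report, so it is left out.

```python
def sample_passes(rows: int, epochs: int) -> int:
    """Total number of times any training row is shown to the model."""
    return rows * epochs


RUNS = {
    "BD7.3 (TRIZ)": (80, 9),
    "BD9 (json_repair)": (280, 6),
}

for name, (rows, epochs) in RUNS.items():
    total = sample_passes(rows, epochs)
    print(f"{name}: {rows} rows x {epochs} epochs = {total} passes, "
          f"each row repeated {epochs} times")
```

BD9 sees more samples in total (1680 vs 720) while repeating each individual row fewer times (6 vs 9).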

## Production state (after BD9)

```
PHYS05_PACK              physarum05b_code_skeleton.planck       MBPP B 13/100 · HE B 6/164 · LCB 0/50 · anchor 19/19
PHYS05_TRIZ_PACK         physarum05b_triz_contradiction_v2      ARIZ 88/100 strict · fb=0
PHYS05_CRITIC_PACK       physarum05b_critic_lite_v2.planck      out of ARIZ rescue path
PHYS05_WOUND_PACK        physarum05b_wound_v2.planck            live in --chat ARIZ rescue (BD8 V9 + TRACK 2 C++ port)
PHYS05_JSON_REPAIR_PACK  physarum05b_json_repair.planck         NEW · 10/10 on broken-JSON catalog
PHYS7B_PACK              physarium7b_identity.q4planck          Q4 · 11.16 tok/s · identity 14/14
acceptance bench         18/18 · leaks 0
```

## Files this surgery produced

```
tools/surgery/build_json_repair_dataset.py                     280-row forge
tools/surgery/train_json_repair_lora.py                        BD9 trainer
tools/surgery/output/json_repair_lora_v1/                      PEFT adapter
tools/surgery/output/Physarum05B-JsonRepair/                   merged BF16 HF dir
data/organ_surgery/phys05_json_repair/json_repair_train_v1.jsonl
physarum05b_json_repair.planck                                 988 MB pack
src/organs/organ_manager.cpp                                   PHYS05_JSON_REPAIR_PACK + spec overrides
reports/BD9_JSON_REPAIR_FINAL.md                               this file
```

## Engineering takeaways

1. **Synthetic mutation forge beats sparse real poison.** 280 forged rows
   from a 30-seed pool covered 10/10 production failure modes. The real DAG
   poison set had only 1 row, too little for a LoRA to learn anything useful.
2. **Mechanical-repair tasks tolerate 6 epochs without overtraining.** The
   BD7.3 trap was specific to long prose targets, not short structured
   transformations.
3. **The wound v2's `"key", "value"` quirk is a json_repair concern.**
   Now that json_repair handles it, the wound rescue chain in
   `run_chat_ariz_organ_first` could optionally invoke json_repair as
   a final cleanup step — queued as BD9.1.
4. **Same 5-step template** (forge → train → merge → flip → smoke) is
   now proven across phys05_code_skeleton (BD6), phys05_triz_contradiction
   (BD7), critic_lite + wound (BD8), and phys05_json_repair (BD9). It
   is the project's universal organ-surgery loop.
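
The "final cleanup step" idea in takeaway 3 could look roughly like the sketch below. It is written in Python for illustration only; the actual rescue chain lives in C++ (`run_chat_ariz_organ_first`), and `parse_with_repair` and `toy_repair` are hypothetical names. `toy_repair` stands in for a call to the json_repair organ and only handles a trailing comma.

```python
import json
from typing import Any, Callable, Optional


def parse_with_repair(raw: str, repair: Callable[[str], str]) -> Optional[Any]:
    """Parse `raw`; on failure, run it through `repair` once and retry.

    Returns the parsed value, or None if the repaired text is still invalid.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass  # fall through to the repair path
    try:
        return json.loads(repair(raw))
    except json.JSONDecodeError:
        return None


def toy_repair(text: str) -> str:
    """Stand-in for the organ call: strips a trailing comma before '}'."""
    return text.replace(",}", "}")


print(parse_with_repair('{"a":1,"b":2,}', toy_repair))
print(parse_with_repair("not json at all", toy_repair))
```

Keeping the repair behind a plain callable means valid output never pays for an organ call, and a failed repair surfaces as `None` rather than as an exception in the chat path.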

## Queued next (per same template)

* phys05_renderer        — needs (broken_render → fixed_render) pairs
* phys05_test_writer     — needs (function → unit_test) pairs
* phys05_cache_matcher   — needs (input_hash → cache_decision) pairs
* phys05_claim_extractor — needs (text → structured_claims) pairs
* BD8.2 wound v3         — re-train wound on broader quirk catalog
* BD8.1 critic_lite v3   — re-train critic on ARIZ schema failures (not stderr)

Each is the same forge → 6-epoch LoRA → merge → flip → smoke loop.