# BD7 — teacher hand-fix to 100/100 (2026-05-02)

**TL;DR: added 24 hand-written ideal TRIZ targets for the rows that
still failed after v2, and replaced 2 confirmed-weak v1 winners
(ARIZ/44 inverted direction, ARIZ/82 generic IFR). All 100 rows now
pass the strict 6-field validator. A spot-check of 20 random rows
confirms they are schema-correct and the content is defensible. Output
is ready for the QLoRA train/eval split and surgery.**

## Numbers

| stage           | rows from teacher | rows hand-written/replaced | total strict-pass |
|-----------------|-------------------|----------------------------|--------------------|
| forge v1 (200 prompts, 2 variants)     | 70 winners | — | 70/100 |
| retry v2 (90 prompts, 3 variants on 30 losers) | +6 | — | 76/100 |
| **hand-fix v3** | 74 (the 76 teacher winners after v2, minus ARIZ/44 and /82) | **+24 losers + 2 weak winners** | **100/100** |

## Source distribution

| source                 | count |
|------------------------|-------|
| 7B teacher (variant A) | ~40   |
| 7B teacher (variant B) | ~30   |
| 7B teacher (retry C)   | ~6    |
| hand_v3                | 26    |

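`hand_v3` covers all 24 rows that still failed after v2 (ARIZ/01, /02,
/03, /06, /07, /18, /30, /32, /34, /41, /43, /48, /61, /67, /68, /72,
/77, /79, /81, /86, /88, /93, /94, /99) plus the ARIZ/44 and ARIZ/82
replacements. The three teacher counts are approximate; per the Numbers
table they account for the remaining 74 rows.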
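
The distribution above can be re-derived by tallying rows per source. The following is a minimal sketch, assuming each JSONL row carries a `source` field; that field name is hypothetical, and only the value `hand_v3` is taken from this report.

```python
import json
from collections import Counter


def count_sources(jsonl_text):
    """Tally rows per source label in a JSONL string.

    Assumption: each row has a "source" key (hypothetical field name).
    Rows without it are counted under "unknown".
    """
    counts = Counter()
    for line in jsonl_text.splitlines():
        if line.strip():  # skip blank lines
            counts[json.loads(line).get("source", "unknown")] += 1
    return counts
```

With the real file, `count_sources(open(path).read())` should report 26 for `hand_v3` and 74 across the teacher sources if the labels match the table.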

## Validation result

```
[handfix] strict-validated 100/100
[handfix] hand-written / replaced: 26
```

All 100 rows have:
* technical_contradiction (str, non-empty)
* physical_contradiction  (str, non-empty)
* ifr                     (str, non-empty)
* resources               (list len ≥ 2)
* triz_operators          (list len ≥ 2, TRIZ-40 names)
* candidate_moves         (list len ≥ 2, concrete interventions)
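
For reference, the structural part of these checks can be sketched as below. This is an illustration of the six rules listed above, not the validator in `build_triz_teacher_handfix_v3.py`. The function name `strict_errors` is hypothetical, the requirement that list items be non-empty strings is an added assumption, and the TRIZ-40 name check and the "concrete interventions" judgment are not covered.

```python
from typing import Any

STR_FIELDS = ("technical_contradiction", "physical_contradiction", "ifr")
LIST_FIELDS = ("resources", "triz_operators", "candidate_moves")


def strict_errors(row: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the row passes."""
    errors = []
    for name in STR_FIELDS:
        value = row.get(name)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"{name}: expected non-empty str")
    for name in LIST_FIELDS:
        value = row.get(name)
        if not isinstance(value, list) or len(value) < 2:
            errors.append(f"{name}: expected list with at least 2 items")
        # Assumption (not stated in the report): items are non-empty strings.
        elif not all(isinstance(item, str) and item.strip() for item in value):
            errors.append(f"{name}: every item must be a non-empty str")
    return errors
```

Returning a list of messages rather than a boolean makes it easier to see which field caused a row to fail.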

## Spot-check 20 random rows (seed=7)

All 20 sampled rows state the technical contradiction (TC) in the
proper "X improves while Y worsens" form. The sample mixes teacher and
hand-written sources and contains no empty fields and no markdown.

Mild caveat: 21/100 IFR strings contain generic engineering phrases
like "optimal", "efficient", "without compromising". Most are
context-appropriate (e.g. "optimal heat exchange") rather than vacuous
fluff. The two weak rows flagged in the v1 review, ARIZ/82 (generic
IFR) and ARIZ/44 (inverted direction), were replaced.
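
A seeded sample and a generic-phrase scan like the ones described above could look like the following sketch. The helper names are hypothetical, the phrase list contains only the three examples quoted in this report (the list actually used may be longer), and whether the original spot-check used `random.Random(7).sample` is an assumption.

```python
import random

# Only the three phrases quoted in this report; the real list may differ.
GENERIC_PHRASES = ("optimal", "efficient", "without compromising")


def sample_rows(rows, k=20, seed=7):
    """Draw k distinct rows reproducibly from a list of row dicts."""
    return random.Random(seed).sample(rows, k)


def has_generic_phrase(ifr):
    """True if the IFR string contains any listed phrase (case-insensitive)."""
    lowered = ifr.lower()
    return any(phrase in lowered for phrase in GENERIC_PHRASES)


def count_generic(rows):
    """Count rows whose 'ifr' field contains a generic phrase."""
    return sum(has_generic_phrase(row["ifr"]) for row in rows)
```

Substring matching also flags words such as "efficiently" or "inefficient", so the count is an upper bound that still needs a manual read of each hit.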

## Files

* `data/organ_surgery/phys05_triz_contradiction/teacher_targets_v3_100.jsonl`
  — 100 strict-validated rows, ordered by task_id
* `tools/surgery/build_triz_teacher_handfix_v3.py` — script with
  the 26 hand-written TRIZ analyses inline
* `reports/BD7_TEACHER_HAND_FIX_V3.md` — this file

## Ready for next phase

Per BD7 spec step 7: `train_set: 80, eval_set: 20, +10 anchor rows`.
With 100 strict targets:
* 80 train + 20 eval: the full split is possible
* 10 anchor rows (cleanest, broadest domain coverage) drawn from
  the hand-written rows and the strongest 7B rows
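
One possible shape for the split step is sketched below. It assumes rows are dicts with a `task_id` key, a plain seeded shuffle (the seed value 0 is arbitrary), and a manually curated list of anchor ids. The BD7 spec may call for stratification or a different relationship between anchors and the train/eval sets, so treat the helper names and behavior as hypothetical.

```python
import random


def split_rows(rows, n_eval=20, seed=0):
    """Shuffle a copy of rows deterministically; return (train, eval)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_eval:], shuffled[:n_eval]


def pick_anchors(rows, anchor_ids):
    """Return the rows whose task_id is in anchor_ids, in input order.

    Anchor choice is a manual judgment (cleanest, broadest coverage),
    so the ids are passed in rather than selected automatically.
    """
    wanted = set(anchor_ids)
    picked = [row for row in rows if row["task_id"] in wanted]
    missing = wanted - {row["task_id"] for row in picked}
    if missing:
        raise KeyError(f"unknown anchor ids: {sorted(missing)}")
    return picked
```

With 100 input rows, `split_rows` yields 80 train and 20 eval rows with no overlap. `pick_anchors` leaves the split untouched, so whether anchors may also appear in eval is left to the spec.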

Per BD7 spec step 8: QLoRA surgery → `physarum05b_triz_contradiction.planck`.
The trainer can be cloned from `train_code_skeleton_lora_bd6_5.py`
(stratified anchor/poison), but BD7 has no "poison" pool: this is
supervised fine-tuning, not anti-forgetting, so a simpler trainer
suffices.

Per BD7 spec step 9: PHYS05_TRIZ_PACK is already wired into
organ_manager (it currently falls back to code_skeleton.planck; flip it
to the new TRIZ .planck after train+gate).

Awaiting GO for Phase 4 (train/eval split) or Phase 5 (QLoRA surgery
with the chosen split).
