BD6.6 — over-anchor regression, 11/19 (down from BD6.5's 15/19) (2026-05-02)
TL;DR: over-replication on holdouts backfires. Pushed 4 holdouts × 50, plus other-HE × 25, MBPP-long × 25, MBPP-short × 10 against capped poison (36.6 %). Final anchor gate dropped from 15/19 to 11/19: all 4 holdouts still failed AND 4 previously kept anchors regressed (MBPP/20, MBPP/93, HE/27, HE/53; mostly short targets). Loss curve was tighter (avg 0.5085 vs BD6.5's 0.5354), but tighter loss on the holdout subset produced mode collapse, not memorization. Production reverted; v6 archived; BD6.5 stratified shape stands as the local optimum for replication-only levers. Real fix is no longer replication: it is the loss function itself (Lever D: token-weighted) or a KL anchor (Lever E).
Pipeline (PYTHON_QUARANTINE-compliant)
production: physarum05b_code_skeleton.planck (BD6 pass-1, anchor 19/19)
anchor_positive.jsonl: 19 captured pass-1 outputs (unchanged from BD6.4)
bd6_6_mixed_train.jsonl (525 → 670 rows, anchor share 53 → 63 %)
holdout × 50 = 4 × 50 = 200 rows (MBPP/53, HE/34, /45, /85)
he_other × 25 = 3 × 25 = 75 rows (HE/23, /27, /53)
mbpp_long × 25 = 2 × 25 = 50 rows (MBPP/19, /90, threshold ≥150 char)
mbpp_short × 10 = 10 × 10 = 100 rows
→ 425 anchor rows
+ poison_train.jsonl 245 (with refs, all)
= 670 rows total, 63.4 % anchor / 36.6 % poison
trainer: tools/surgery/train_code_skeleton_lora_bd6_5.py (unchanged)
--rank 8 --alpha 16 --lr 3e-5 --epochs 1 --checkpoint-steps 50
trainable params = 1.08 M / 495 M = 0.22 %
loss curve (samples):
step 50 loss 0.30
step 100 loss 0.05 (anchor memorization fast under heavy replication)
step 200 loss 0.06
step 300 loss 0.09
step 350 loss 0.98 (poison spike)
step 400 loss 0.25
step 500 loss 0.29
step 600 loss 0.06
step 650 loss 0.02
step 670 final (epoch end)
epoch avg 0.5085
merge → physarum05b_code_skeleton_v6.planck
flip → rebuild → anchor_eval
anchor gate (final adapter):
▼
11 / 19 PASS (rate 57.9 %) organ_leaks=0
▼
threshold 85 % → REJECT
▼
REVERT: PHYS05_PACK = physarum05b_code_skeleton.planck
REBUILD
VERIFY: anchor 19/19 ✅ production safe
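The gate step at the end of the pipeline reduces to a pass-rate comparison. A minimal sketch, assuming the gate simply rejects when the pass rate falls below the 85 % threshold; the function name `anchor_gate` is hypothetical and this is not the actual anchor_eval code:

```python
from typing import Tuple


def anchor_gate(passed: int, total: int, threshold: float = 0.85) -> Tuple[float, str]:
    """Return (pass_rate, decision) for one anchor evaluation run.

    Assumption: the decision depends only on pass rate vs. threshold.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    rate = passed / total
    decision = "KEEP" if rate >= threshold else "REJECT"
    return rate, decision


# BD6.6 final adapter: 11 of 19 anchors passed.
rate, decision = anchor_gate(11, 19)
print(f"{rate:.1%} -> {decision}")
```

The same call with 19 of 19 reproduces the post-revert verification state.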
Numbers across all five BD6.x passes
| pass | dataset shape | r | lr | ep | anchor share | anchor post-merge | gate |
|------------|----------------------------------|----|------|----|------|------------------------|----------------------------------|
| BD6 pass-1 | poison v1 (256 with refs) | 16 | 2e-4 | 3 | 0 % | 19/19 (defines anchor) | KEEP |
| BD6.2 | union v1∪v2 (260) | 16 | 2e-4 | 4 | 0 % | not run | REVERT (post-bench MBPP regress) |
| BD6.3 | fresh-only (245) | 8 | 1e-4 | 1 | 0 % | 0 / 19 | REVERT |
| BD6.4 | fresh + 5× anchor (340) | 8 | 1e-4 | 1 | 28 % | 7 / 19 | REVERT |
| BD6.5 | fresh + bench-aware-anchor (525) | 8 | 5e-5 | 1 | 53 % | 15 / 19 | REVERT (still < 19/19) |
| BD6.6 | + holdout×50 + cap poison (670) | 8 | 3e-5 | 1 | 63 % | 11 / 19 | REVERT (over-anchor regression) |
Trajectory: 0 → 7 → 15 → 11. The curve bent backward at 63 % anchor share.
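The anchor shares in the table can be re-derived from the row counts. A small sketch, assuming each of BD6.4 to BD6.6 contains the same 245 poison rows (consistent with the totals above, but an assumption for BD6.4 and BD6.5); the names used here are illustrative:

```python
POISON_ROWS = 245  # assumption: identical fresh poison set in BD6.4, BD6.5, BD6.6

# Total rows per mix and anchor-gate passes (out of 19), taken from the table.
TOTAL_ROWS = {"BD6.4": 340, "BD6.5": 525, "BD6.6": 670}
GATE_PASSES = {"BD6.4": 7, "BD6.5": 15, "BD6.6": 11}

# BD6.6 anchor rows by group: (number of prompts, replication factor).
BD6_6_GROUPS = {
    "holdout": (4, 50),
    "he_other": (3, 25),
    "mbpp_long": (2, 25),
    "mbpp_short": (10, 10),
}


def anchor_share(total_rows: int, poison_rows: int = POISON_ROWS) -> float:
    """Fraction of training rows that are anchor rows."""
    return (total_rows - poison_rows) / total_rows


bd6_6_anchor_rows = sum(n * repl for n, repl in BD6_6_GROUPS.values())

for name, total in TOTAL_ROWS.items():
    print(f"{name}: {anchor_share(total):.1%} anchor -> {GATE_PASSES[name]}/19")
print("BD6.6 anchor rows:", bd6_6_anchor_rows)
```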
Per-row anchor result on v6 final
KEPT (11):
MBPP/17, MBPP/19, MBPP/41, MBPP/51, MBPP/52,
MBPP/64, MBPP/90, MBPP/96, MBPP/99, MBPP/105,
HumanEval/23
LOST (8):
MBPP/20 (66 chars, short) — was KEPT in BD6.5, now LOST
MBPP/53 (256 chars, holdout LONG) — still LOST (50× repl no help)
MBPP/93 (52 chars, short) — was KEPT in BD6.5, now LOST
HumanEval/27 (267 chars, other_he 25×) — was KEPT in BD6.5, now LOST
HumanEval/34 (182 chars, holdout LONG) — still LOST
HumanEval/45 (249 chars, holdout LONG) — still LOST
HumanEval/53 (101 chars, other_he 25×) — was KEPT in BD6.5, now LOST
HumanEval/85 (366 chars, holdout LONG) — still LOST
Two failure modes co-occurred:
- Holdout-replication didn't unlock the holdouts. All 4 of MBPP/53,
HE/34, HE/45, HE/85 still failed despite 50× replication in the training mix. The LoRA could memorize their ChatML→def pattern at the token level (loss 0.05 on these batches by step 100), but at inference time the runtime's actual generation drifted away — same failure mode BD6.5 had on these exact 4 prompts. Replication does not fix what is fundamentally a length-vs-poison interaction.
- Mode collapse on short prompts. 4 anchors that were
stable in BD6.5 (MBPP/20, MBPP/93, HE/27, HE/53) regressed in BD6.6. Three of them have short targets (52 to 101 chars); HE/27, at 267 chars, is the exception. Heavy holdout replication pushed the LoRA into a long-pattern hot region; on short prompts the model now over-emits long-style content and trips the verifier.
MBPP/19 (337 chars, long) stayed kept, not because it is similar to the holdouts but because its target is structurally close to short MBPP. Length alone isn't the lever; token distribution similarity is.
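The two failure modes fall out of a set difference between the BD6.5 and BD6.6 gate results. A sketch using the anchor IDs listed above; the BD6.5 kept set is reconstructed from the per-row notes (only the 4 holdouts failed there), and the variable names are illustrative:

```python
ALL_ANCHORS = {
    "MBPP/17", "MBPP/19", "MBPP/20", "MBPP/41", "MBPP/51", "MBPP/52",
    "MBPP/53", "MBPP/64", "MBPP/90", "MBPP/93", "MBPP/96", "MBPP/99",
    "MBPP/105", "HumanEval/23", "HumanEval/27", "HumanEval/34",
    "HumanEval/45", "HumanEval/53", "HumanEval/85",
}
HOLDOUTS = {"MBPP/53", "HumanEval/34", "HumanEval/45", "HumanEval/85"}

# Reconstructed: BD6.5 kept everything except the 4 holdouts (15/19).
kept_bd6_5 = ALL_ANCHORS - HOLDOUTS

# BD6.6 kept set, as reported (11/19).
kept_bd6_6 = {
    "MBPP/17", "MBPP/19", "MBPP/41", "MBPP/51", "MBPP/52", "MBPP/64",
    "MBPP/90", "MBPP/96", "MBPP/99", "MBPP/105", "HumanEval/23",
}

regressed = kept_bd6_5 - kept_bd6_6                  # kept in BD6.5, lost in BD6.6
still_lost = ALL_ANCHORS - kept_bd6_5 - kept_bd6_6   # lost in both passes
recovered = kept_bd6_6 - kept_bd6_5                  # lost in BD6.5, kept in BD6.6

print("regressed: ", sorted(regressed))
print("still lost:", sorted(still_lost))
print("recovered: ", sorted(recovered))
```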
What this proves
- Replication is a saturating lever, not a linear one. BD6.4 (28 %) →
BD6.5 (53 %) improved the gate (7 → 15). BD6.5 (53 %) → BD6.6 (63 %, holdout-skewed) reversed it (15 → 11). The three data points rise and then fall; 53 % stratified is the best observed point for the replication-only lever.
- Long-target anchors are not fixed by repeating them more. The 4
holdouts each appeared 50 times in training; the LoRA still failed to reproduce them at inference. The signal needed is not more exposure — it's a different gradient shape.
- The strict gate works exactly as designed. 11/19 is dramatically
worse than the BD6.5 attempt; the gate caught it cleanly; production reverted before any user-visible damage. Four strict-gate rejects in a row (BD6.3, .4, .5, .6), and production stayed at 19/19 the whole time.
What's actually needed for BD6.7
The replication knob is exhausted. Two real levers remain — both were identified in the BD6.5 report (Lever D, Lever E):
Lever D — token-weighted loss (correctness fix at training time)
Currently every row's cross-entropy is weighted equally, regardless of target length. Long targets contribute 2-3× more tokens, so their per-row loss looks bigger to the optimizer; the LoRA "tries hard" on them in early steps, then drifts under continued poison gradient. Fix: scale per-row loss by 1 / sqrt(target_token_count) so long targets don't dominate the gradient. This matches the diagnosis from the loss curve. ~10 lines in the trainer's loss step. No data-shape change. Use the BD6.5 mix (53 % anchor): it was the peak.
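The weighting itself is simple arithmetic. A framework-free sketch, assuming the unweighted per-row loss is the sum of per-token negative log-likelihoods over the target tokens; `row_weight` and `token_weighted_loss` are hypothetical names, and in the trainer the same scaling would be applied to the per-token loss tensor rather than to Python lists:

```python
import math
from typing import List


def row_weight(target_token_count: int) -> float:
    """Lever D weight: 1 / sqrt(number of target tokens)."""
    return 1.0 / math.sqrt(target_token_count)


def token_weighted_loss(rows: List[List[float]]) -> float:
    """Mean over rows of (summed per-token NLL) * 1/sqrt(token count).

    rows[i] holds the per-token NLL values for row i's target tokens.
    Rows with no target tokens are skipped.
    """
    weighted = [sum(nll) * row_weight(len(nll)) for nll in rows if nll]
    if not weighted:
        raise ValueError("no target tokens in batch")
    return sum(weighted) / len(weighted)


# Same per-token NLL, 4x the length (illustrative numbers only).
short_row = [0.5] * 16
long_row = [0.5] * 64

unweighted_ratio = sum(long_row) / sum(short_row)
weighted_ratio = (sum(long_row) * row_weight(64)) / (sum(short_row) * row_weight(16))
print(unweighted_ratio, weighted_ratio)
```

With equal per-token loss, a target 4× longer contributes 4× the summed loss unweighted but only 2× after the square-root scaling, so long rows still count for more, just less steeply.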
Lever E — KL anchor on the 4 holdouts only
Run inference on [MBPP/53, HE/34, /45, /85] against the frozen pass-1 base, capture top-k logits per token, add a KL term to the LoRA training step that pulls the student toward those reference logits on those four prompts. Canonical anti-forgetting fix; principled. ~30 lines extra; PEFT pattern.
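The KL term can be sketched without any framework. This is a minimal illustration, assuming the reference logits are stored per token position as a mapping from top-k token id to logit, and that both distributions are renormalized over those k ids; `topk_kl`, `loss_with_kl_anchor`, and the weight `beta` are hypothetical names and values, not part of the existing trainer:

```python
import math
from typing import Dict, List


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over a short list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def topk_kl(ref_topk: Dict[int, float], student_logits: List[float]) -> float:
    """KL(reference || student) restricted to the reference's top-k token ids."""
    ids = sorted(ref_topk)
    log_p = log_softmax([ref_topk[i] for i in ids])
    log_q = log_softmax([student_logits[i] for i in ids])
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def loss_with_kl_anchor(ce_loss: float,
                        ref_positions: List[Dict[int, float]],
                        student_positions: List[List[float]],
                        beta: float = 0.1) -> float:
    """Cross-entropy plus beta times the mean per-position KL on a holdout prompt."""
    kls = [topk_kl(r, s) for r, s in zip(ref_positions, student_positions)]
    return ce_loss + beta * sum(kls) / len(kls)


# Toy example with a 10-token vocabulary and k = 2 (illustrative numbers only).
ref = {3: 2.0, 7: 0.5}
student_same = [0.0] * 10
student_same[3], student_same[7] = 2.0, 0.5
student_drift = [0.0] * 10
student_drift[3], student_drift[7] = 0.5, 2.0

print(topk_kl(ref, student_same), topk_kl(ref, student_drift))
```

The term is zero when the student matches the frozen reference on the captured ids and grows as generation drifts, which is the pull described above; it would only be added for batches containing the four holdout prompts.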
Recommended order: D first (cheap, well-targeted, matches the diagnostic). If D leaves any holdout, E.
Do NOT increase replication further. That door is closed.
Production state (after BD6.6 revert)
- PHYS05_PACK = physarum05b_code_skeleton.planck (BD6 pass-1, unchanged).
- MBPP B = 13/100, HumanEval B = 6/164, LCB B = 0/50, anchor 19/19.
- physarum05b_code_skeleton_v6.planck archived (rejected).
- tools/surgery/output/code_skeleton_lora_v6/ archived (final + 13 mid-checkpoints @ step 50, 100, …, 650).
- tools/surgery/output/Physarum05B-CodeSkeleton-v6/ archived (rejected merged HF dir).
Files this pass touched
- data/organ_surgery/phys05_code_skeleton/bd6_6_mixed_train.jsonl: 670-row weighted set
- tools/surgery/output/code_skeleton_lora_v6/: final adapter + 13 mid-checkpoints
- tools/surgery/output/Physarum05B-CodeSkeleton-v6/: merged HF dir (rejected)
- physarum05b_code_skeleton_v6.planck: repacked (rejected, archived)
- src/organs/organ_manager.cpp::PHYS05_PACK: flipped to v6 then back to v1
- reports/BD6_6_OVER_ANCHOR_REGRESSION.md: this file
Reading the trajectory
0 % anchor → 0/19 (BD6.3, lr=1e-4)
28 % anchor → 7/19 (BD6.4, lr=1e-4, 5× repl)
53 % anchor → 15/19 (BD6.5, lr=5e-5, bench-aware repl, stratified)
63 % anchor → 11/19 (BD6.6, lr=3e-5, holdout×50, capped poison)
The curve says: stay at 53 %, change the loss function, not the data weighting. BD6.7 = train_code_skeleton_lora_bd6_5.py + 1 small loss modification, same dataset shape as BD6.5. No new experiments before that change is in place.