BLACK_DOG_ORGAN_AUDIT

Honest accounting of what Black-Dog actually does to organs vs what the architecture diagram promises. Read 2262 DAG entries; cross-checked against the three Black-Dog call sites in src/main.cpp.

1. Black-Dog code exists; it is BARELY wired

src/runtime/black_dog.cpp + include/black_dog.hpp define black_dog::Store, extract_features, hash_action_chain, get, update, save. The store reads/writes a JSON file on disk.

bd_store is invoked at exactly 3 call sites in the entire --chat/--task runtime:

| line    | function             | writes conductance to DAG? |
|---------|----------------------|----------------------------|
| 337-347 | run_chat_identity    | NO. (void)cond_before; (void)cond_after; at line 348 silently discards both. The store is updated; the DAG entry is not. |
| 3008+   | run_ariz_e2e         | YES. Line 3331-3334 writes both fields. |
| 3447+   | run_chat_organ_route | YES. Line 3485-3486 writes both fields. |

Every other --chat path does NOT call Black-Dog at all:

Result: the high-traffic paths that actually shipped this sprint do not feed Black-Dog. Food/poison labels are written via shortcut at lines 378/731/1157 (computed inline, not via bd_store.update), and conductance never moves.

2. Per-organ Black-Dog accounting (all 2262 DAG entries)

organ                                 n   pass  fail food≠0 poison≠0 cond_b≠0 cond_changed food_avg poison_avg
physarium_7b_chat                  1966   1465  501   1418      501        0            0     0.72       0.25
physarium_7b                        139    127   12    135      135      119          136     2.15       1.58
phys05_claim_extractor               57     57    0     57        0        0            0     1.00       0.00
json_repair (legacy stub)            40     27   13      9       13       10           13     0.27       0.26
phys05_code_skeleton                 28     28    0     28        0        0            0     1.00       0.00
phys05_json_repair                   20     20    0      0        0        0            0     0.00       0.00
phys05_test_writer                    0      –    –      –        –        –            –        –          –
phys05_triz_contradiction             0      –    –      –        –        –            –        –          –
phys05_renderer                       0      –    –      –        –        –            –        –          –
phys05_cache_matcher                  0      –    –      –        –        –            –        –          –
phys05_critic_lite                    0      –    –      –        –        –            –        –          –
phys05_wound  (NEW spec)              0      –    –      –        –        –            –        –          –
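The columns above are plain per-organ tallies over DAG entries. As a rough sketch of how such a table could be reproduced, the following assumes a simplified `DagEntry` struct whose field names mirror the ones used in this report; the real DAG schema is not shown here, and `OrganStats` and `tally_by_organ` are hypothetical names. The averages divide by n, which matches the table (for example 1418 / 1966 ≈ 0.72 for physarium_7b_chat).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified view of one DAG entry; the real schema may differ.
struct DagEntry {
    std::string organ;
    bool pass = false;
    double food_score = 0.0;
    double poison_score = 0.0;
    double conductance_before = 0.0;
    double conductance_after = 0.0;
};

// One row of the per-organ table.
struct OrganStats {
    std::size_t n = 0, pass = 0, fail = 0;
    std::size_t food_nonzero = 0, poison_nonzero = 0;
    std::size_t cond_before_nonzero = 0, cond_changed = 0;
    double food_sum = 0.0, poison_sum = 0.0;

    // Averages are taken over all n entries, not only the nonzero ones.
    double food_avg() const { return n ? food_sum / static_cast<double>(n) : 0.0; }
    double poison_avg() const { return n ? poison_sum / static_cast<double>(n) : 0.0; }
};

// Group entries by organ and accumulate the counters used in the table.
std::map<std::string, OrganStats> tally_by_organ(const std::vector<DagEntry>& entries) {
    std::map<std::string, OrganStats> out;
    for (const DagEntry& e : entries) {
        OrganStats& s = out[e.organ];
        ++s.n;
        if (e.pass) ++s.pass; else ++s.fail;
        if (e.food_score != 0.0) ++s.food_nonzero;
        if (e.poison_score != 0.0) ++s.poison_nonzero;
        if (e.conductance_before != 0.0) ++s.cond_before_nonzero;
        if (e.conductance_before != e.conductance_after) ++s.cond_changed;
        s.food_sum += e.food_score;
        s.poison_sum += e.poison_score;
        assert(s.pass + s.fail == s.n);  // every entry is either a pass or a fail
    }
    return out;
}
```

An organ with zero calls simply never appears in the returned map, which corresponds to the dash rows in the table.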

3. Three concrete failures the audit surfaces

A. physarium_7b_chat — food/poison labels written, conductance never moves (1966 entries)

food_score and poison_score ARE set (1418 nonzero / 501 nonzero respectively), but conductance_before = 0 for every single one of the 1966 entries. The store is updated only inside run_chat_identity, which discards cond_before/after; all other paths skip BD entirely.

Implication: Black-Dog has no influence on routing. The 86.9 % of traffic that goes to physarium_7b_chat (1966 of 2262 entries) will keep going there regardless of whether it fails. The dog has no leash.

B. 0.5B organs that DID fire — auto-pass, zero poison (105 entries)

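phys05_claim_extractor (57 calls): pass=57/57, poison_nonzero=0, food_avg=1.00. phys05_code_skeleton (28/28) has the same shape. phys05_json_repair also passes 20/20 with zero poison, but per the table its food_avg is 0.00, so it records no food signal either.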

This means: either the verifiers on those routes auto-pass, or the food/poison computation is "set 1.0 if return code 0" without checking anything semantic. Looking at run_chat_organ_route line 730-731:

e.food_score = 1.0;
e.poison_score = 0.0;

— hardcoded. Conductance does update via bd_store.update (cond_before nonzero appears 0 times because the keys are fresh on first call), but the food/poison fed in are placeholders. The dog isn't tasting the food; it's just clicking the bowl.
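For contrast with the hardcoded assignment, here is a minimal sketch of labels derived from a verifier outcome. It follows the mapping this report proposes in section 5 (food from verifier_pass, poison from an aborted capsule). `VerifierResult`, `Labels` and `labels_from_verifier` are illustrative names, not types from the codebase, and a real scorer would presumably use richer signals than two booleans.

```cpp
#include <cassert>

// Hypothetical verifier outcome; field names are illustrative only.
struct VerifierResult {
    bool verifier_pass = false;  // did the verifier accept the organ's output?
    bool aborted = false;        // did the capsule abort mid-run?
};

struct Labels {
    double food_score = 0.0;
    double poison_score = 0.0;
};

// Derive food/poison from the verifier instead of assigning constants.
Labels labels_from_verifier(const VerifierResult& v) {
    Labels l;
    l.food_score = v.verifier_pass ? 1.0 : 0.0;
    l.poison_score = v.aborted ? 1.0 : 0.0;
    // Sanity check: this binary mapping keeps both labels inside [0, 1].
    assert(l.food_score >= 0.0 && l.food_score <= 1.0);
    assert(l.poison_score >= 0.0 && l.poison_score <= 1.0);
    return l;
}
```

With a mapping like this, a failed verification yields food_score = 0.0, so an organ can no longer look perfect on every call by construction.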

C. 5 of 8 0.5B organs are stillborn — 0 calls, 0 BD signal

phys05_test_writer / triz_contradiction / renderer / cache_matcher / critic_lite and the new phys05_wound have never been invoked. They have no DAG entries, no food, no poison, no conductance. There is no poison data to harvest into surgery datasets. They are dark.

4. Where conductance ACTUALLY moves

Only physarium_7b on the ARIZ route shows conductance behaviour:

ariz_e2e_44b2fc775a633665   physarium_7b   cond  0.6416 → 0.6533
ariz_e2e_979dde753459b039   physarium_7b   cond  0.6935 → 0.6948
ariz_e2e_0a66cdcb747bfbfd   physarium_7b   cond  1.2325 → 1.1260

136 entries out of 139 had cond_before ≠ cond_after. ARIZ pipeline is the only place where Black-Dog is actually a learning loop. Everything else is decorative.

5. The four bugs that block PHASE-13.BLACK_DOG_ORGAN_COLONY

  1. (void)cond_before; (void)cond_after; at main.cpp:348. run_chat_identity discards the only BD signal it computes. Trivial fix: remove the (void) casts and write the values into e.

  2. run_native_* paths don't call bd_store at all. Each of run_native_terminal_task, run_native_code_repair, run_native_tool_call, run_form_replay, run_native_terminal_task_organ_first needs:

  3. run_chat_organ_route hard-codes food=1.0 poison=0.0 (line 730-731). Real food/poison must come from the verifier (food = verifier_pass ? 1 : 0; poison = capsule.aborted ? 1 : 0 etc.). Currently the 0.5B organs look perfect on every call because the verifier signal isn't being piped into the BD loop.

  4. 5 organs have never fired. Without DAG entries there is no poison to harvest, no surgery dataset, no QLoRA target. They need to be put in the path of --chat traffic on dispatcher routes that actually reach --chat. Today Route::ariz / renderer / cache_matcher are wired only in legacy --task.
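The first two bugs come down to the same missing step: read conductance, update the store, and keep both values on the DAG entry. The sketch below shows that shape under loud assumptions. `ToyStore` is a stand-in for black_dog::Store, whose real get/update signatures and update rule are not shown in this report; the 0.1 × (food − poison) rule is a placeholder, and `record_black_dog` is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Toy stand-in for black_dog::Store. Signatures and update rule are assumed.
class ToyStore {
public:
    // Unknown keys read as 0.0, i.e. a fresh key has no conductance yet.
    double get(const std::string& key) const {
        auto it = cond_.find(key);
        return it == cond_.end() ? 0.0 : it->second;
    }
    // Placeholder rule: reward food, penalise poison, scaled by a small rate.
    void update(const std::string& key, double food, double poison) {
        cond_[key] = get(key) + 0.1 * (food - poison);
    }
private:
    std::unordered_map<std::string, double> cond_;
};

// Hypothetical subset of a DAG entry.
struct DagEntry {
    double food_score = 0.0;
    double poison_score = 0.0;
    double conductance_before = 0.0;
    double conductance_after = 0.0;
};

// Read conductance, apply the update, and store both values on the entry
// instead of discarding them with (void) casts.
void record_black_dog(ToyStore& store, const std::string& key,
                      double food, double poison, DagEntry& e) {
    assert(!key.empty());  // an empty key would merge unrelated action chains
    e.food_score = food;
    e.poison_score = poison;
    e.conductance_before = store.get(key);
    store.update(key, food, poison);
    e.conductance_after = store.get(key);
}
```

A single helper of this kind could be called from each path so that the DAG write cannot be skipped independently of the store update. Note that the first call for a fresh key still records conductance_before = 0, consistent with the observation in section 3B.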

6. Recommended order (BD2 → BD3 → BD5 → BD6)

Before any organ "rewire" or QLoRA surgery:

  1. BD2 fix-and-instrument first. Patch the four bugs above so every --chat DAG carries real food/poison/cond_before/cond_after. Without this, every subsequent number is unfalsifiable.

  2. BD3 poison harvest. Once BD writes are real, point a script at dag/runs/*.json and stream all entries with poison_score > 0 or verifier_pass = false into data/organ_surgery/<organ>/poison_train.jsonl.

  3. BD5 learning curve probe. Run 30 tasks × 5 rounds. The 5th round must show: lower fallback_to_7B count, lower poison sum, higher conductance on successful chains, faster wall.

  4. BD6 QLoRA only after BD5 establishes a real baseline.
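The BD3 harvest step can be sketched as a filter plus a per-organ grouping. The version below works on already-parsed entries and returns the lines destined for each output file. Reading dag/runs/*.json and writing to disk are left out, since the report does not say which JSON library the project uses. `ParsedEntry`, `is_poison_sample`, `surgery_path` and `harvest_poison` are hypothetical names, and `raw_json` is assumed to hold the entry as a single line of JSON.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical parsed view of one DAG entry. raw_json is the entry's original
// single-line JSON text, kept so it can be emitted unchanged as a JSONL record.
struct ParsedEntry {
    std::string organ;
    double poison_score = 0.0;
    bool verifier_pass = true;
    std::string raw_json;
};

// Selection rule from BD3: poison_score > 0 or verifier_pass = false.
bool is_poison_sample(const ParsedEntry& e) {
    return e.poison_score > 0.0 || !e.verifier_pass;
}

// Output location named in BD3, parameterised by organ.
std::string surgery_path(const std::string& organ) {
    assert(!organ.empty());  // an empty organ name would yield a malformed path
    return "data/organ_surgery/" + organ + "/poison_train.jsonl";
}

// Map each output path to the JSONL lines that should be appended to it.
std::map<std::string, std::vector<std::string>>
harvest_poison(const std::vector<ParsedEntry>& entries) {
    std::map<std::string, std::vector<std::string>> by_path;
    for (const ParsedEntry& e : entries) {
        if (is_poison_sample(e)) {
            by_path[surgery_path(e.organ)].push_back(e.raw_json);
        }
    }
    return by_path;
}
```

Organs with no qualifying entries get no key in the result, so no empty poison_train.jsonl would be created for them.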

7. Stop conditions

are all zero in the DAG entry

in question is supposed to be served by a 0.5B chain

"live"

not yet ready for production claims — it does not call BD; its organs are untrained for the role. It is a baseline probe instrument, not a victory.

Authoritative artefacts: