MBPP_HE_3MODE_V1
Modes: B. MBPP n=100, HumanEval n=164.
Summary
| bench | mode | n | pass | wall_s | organs_used | fallback | BD_signal |
|---|---|---|---|---|---|---|---|
| humaneval | B | 164 | 6 | 570.4 | phys05_code_skeleton | 0 | 117 |
| mbpp | B | 100 | 13 | 286.6 | phys05_code_skeleton | 0 | 88 |
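The raw counts in the Summary table are easier to compare as rates. The sketch below derives a pass rate and a mean time per task from the `n`, `pass`, and `wall_s` values. It assumes `wall_s` in the summary is the total wall-clock time in seconds for the whole bench, which the report does not state explicitly; the names `summary` and `derive` are illustrative only.

```python
# Values copied from the Summary table (mode B only).
summary = {
    "humaneval": {"n": 164, "pass": 6, "wall_s": 570.4},
    "mbpp": {"n": 100, "pass": 13, "wall_s": 286.6},
}


def derive(row):
    """Return (pass_rate_percent, mean_seconds_per_task) for one summary row.

    Assumes row["wall_s"] is the total wall time for all n tasks.
    """
    return 100.0 * row["pass"] / row["n"], row["wall_s"] / row["n"]


rates = {bench: derive(row) for bench, row in summary.items()}

for bench, (pct, sec) in rates.items():
    print(f"{bench}: {pct:.1f}% pass, {sec:.2f} s/task")
```

Under that assumption, mode B passes roughly 3.7% of HumanEval and 13.0% of MBPP.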
Per-row (truncated to first 200)
| bench | mode | id | pass | wall_s | route | organs_used | fb |
|---|---|---|---|---|---|---|---|
| mbpp | B | MBPP/11 | False | 3.019 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/12 | False | 2.775 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/13 | False | 2.667 | code_fast_no_7b | phys05_code_skeleton | False |
| mbpp | B | MBPP/14 | False | 2.572 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/15 | False | 2.645 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/16 | False | 2.881 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/17 | True | 2.449 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/18 | False | 2.724 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/19 | True | 7.922 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/20 | True | 3.101 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/21 | False | 3.067 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/22 | False | 3.073 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/23 | False | 2.983 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/24 | False | 2.957 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/25 | False | 3.225 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/26 | False | 3.103 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/27 | False | 3.069 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/28 | False | 2.803 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/29 | False | 0.742 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/30 | False | 2.69 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/31 | False | 3.25 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/32 | False | 2.779 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/33 | False | 2.637 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/34 | False | 2.736 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/35 | False | 2.683 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/36 | False | 2.517 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/37 | False | 2.787 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/38 | False | 2.689 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/39 | False | 2.567 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/40 | False | 4.017 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/41 | True | 2.67 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/42 | False | 2.764 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/43 | False | 2.699 | code_fast_no_7b | phys05_code_skeleton | False |
| mbpp | B | MBPP/44 | False | 2.714 | code_fast_no_7b | phys05_code_skeleton | False |
| mbpp | B | MBPP/45 | False | 2.852 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/46 | False | 2.767 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/47 | False | 2.563 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/48 | False | 2.448 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/49 | False | 2.831 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/50 | False | 2.507 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/51 | True | 2.6 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/52 | True | 4.2 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/53 | True | 2.714 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/54 | False | 2.577 | code_fast_no_7b | phys05_code_skeleton | False |
| mbpp | B | MBPP/55 | False | 2.482 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/56 | False | 2.614 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/57 | False | 2.549 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/58 | False | 2.527 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/59 | False | 2.598 | unparsed | — | False |
| mbpp | B | MBPP/60 | False | 2.743 | code_fast_no_7b | phys05_code_skeleton | False |
| mbpp | B | MBPP/61 | False | 2.495 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/62 | False | 2.718 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/63 | False | 2.645 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/64 | True | 5.495 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/65 | False | 2.619 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/66 | False | 2.57 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/67 | False | 2.546 | unparsed | — | False |
| mbpp | B | MBPP/68 | False | 2.534 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/69 | False | 2.685 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/70 | False | 2.687 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/71 | False | 2.653 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/72 | False | 2.585 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/73 | False | 2.806 | code_fast_no_7b | phys05_code_skeleton | False |
| mbpp | B | MBPP/74 | False | 2.545 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/75 | False | 2.824 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/76 | False | 5.617 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/77 | False | 2.629 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/78 | False | 2.427 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/79 | False | 2.591 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/80 | False | 2.549 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/81 | False | 2.616 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/82 | False | 2.528 | code_fast_no_7b | phys05_code_skeleton | False |
| mbpp | B | MBPP/83 | False | 2.555 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/84 | False | 2.536 | unparsed | — | False |
| mbpp | B | MBPP/85 | False | 2.537 | code_fast_no_7b | phys05_code_skeleton | False |
| mbpp | B | MBPP/86 | False | 2.561 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/87 | False | 2.89 | code_fast_no_7b | phys05_code_skeleton | False |
| mbpp | B | MBPP/88 | False | 5.866 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/89 | False | 2.487 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/90 | True | 2.584 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/91 | False | 2.593 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/92 | False | 2.487 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/93 | True | 2.599 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/94 | False | 2.835 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/95 | False | 2.43 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/96 | True | 2.594 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/97 | False | 3.024 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/98 | False | 2.741 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/99 | True | 2.47 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/100 | False | 5.427 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/101 | False | 2.545 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/102 | False | 2.51 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/103 | False | 2.424 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/104 | False | 2.751 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/105 | True | 2.53 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/106 | False | 2.488 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/107 | False | 2.687 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/108 | False | 3.342 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/109 | False | 2.368 | code_fast | phys05_code_skeleton | False |
| mbpp | B | MBPP/110 | False | 2.846 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/0 | False | 5.832 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/1 | False | 2.974 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/2 | False | 2.862 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/3 | False | 2.887 | unparsed | — | False |
| humaneval | B | HumanEval/4 | False | 2.894 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/5 | False | 3.107 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/6 | False | 2.8 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/7 | False | 2.806 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/8 | False | 3.125 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/9 | False | 2.82 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/10 | False | 5.83 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/11 | False | 2.828 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/12 | False | 2.924 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/13 | False | 2.744 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/14 | False | 2.699 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/15 | False | 2.769 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/16 | False | 2.6 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/17 | False | 3.223 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/18 | False | 2.829 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/19 | False | 5.758 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/20 | False | 3.169 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/21 | False | 2.98 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/22 | False | 2.609 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/23 | True | 2.592 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/24 | False | 2.695 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/25 | False | 3.027 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/26 | False | 2.729 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/27 | True | 2.966 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/28 | False | 2.943 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/29 | False | 2.815 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/30 | False | 6.028 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/31 | False | 3.269 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/32 | False | 3.818 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/33 | False | 3.359 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/34 | True | 3.11 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/35 | False | 2.925 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/36 | False | 2.92 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/37 | False | 3.172 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/38 | False | 3.364 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/39 | False | 6.016 | unparsed | — | False |
| humaneval | B | HumanEval/40 | False | 3.296 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/41 | False | 3.302 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/42 | False | 3.09 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/43 | False | 3.34 | unparsed | — | False |
| humaneval | B | HumanEval/44 | False | 2.944 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/45 | True | 2.682 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/46 | False | 3.4 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/47 | False | 2.768 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/48 | False | 2.914 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/49 | False | 3.03 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/50 | False | 5.894 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/51 | False | 3.131 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/52 | False | 2.943 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/53 | True | 2.689 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/54 | False | 3.196 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/55 | False | 2.865 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/56 | False | 2.945 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/57 | False | 2.945 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/58 | False | 3.271 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/59 | False | 2.722 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/60 | False | 5.752 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/61 | False | 2.939 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/62 | False | 3.067 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/63 | False | 3.069 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/64 | False | 3.161 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/65 | False | 3.061 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/66 | False | 3.033 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/67 | False | 3.745 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/68 | False | 4.29 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/69 | False | 3.462 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/70 | False | 6.239 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/71 | False | 3.012 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/72 | False | 3.654 | unparsed | — | False |
| humaneval | B | HumanEval/73 | False | 3.449 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/74 | False | 3.467 | unparsed | — | False |
| humaneval | B | HumanEval/75 | False | 2.901 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/76 | False | 3.208 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/77 | False | 3.047 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/78 | False | 4.164 | unparsed | — | False |
| humaneval | B | HumanEval/79 | False | 6.021 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/80 | False | 3.206 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/81 | False | 3.858 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/82 | False | 2.766 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/83 | False | 2.714 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/84 | False | 3.243 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/85 | True | 2.874 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/86 | False | 3.155 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/87 | False | 3.982 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/88 | False | 3.601 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/89 | False | 6.068 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/90 | False | 3.093 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/91 | False | 2.939 | code_fast_no_7b | phys05_code_skeleton | False |
| humaneval | B | HumanEval/92 | False | 3.248 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/93 | False | 3.15 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/94 | False | 3.774 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/95 | False | 3.429 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/96 | False | 3.416 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/97 | False | 3.043 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/98 | False | 2.99 | code_fast | phys05_code_skeleton | False |
| humaneval | B | HumanEval/99 | False | 6.254 | code_fast | phys05_code_skeleton | False |
Mode key
- A (PARROT): pure llama.cpp 7B HTTP, no Monster runtime.
- B (MONSTER ORGAN-FIRST): `ORGAN_FIRST=1 NO_7B_FALLBACK=1`. 0.5B organ chain only; if it fails, the row is a hard fail. Measures raw organ quality.
- C (MONSTER ORGAN+7B): `ORGAN_FIRST=1 MONSTER_NATIVE_RETRY=1`. 0.5B organ first; on verifier-fail, 7B-chat fallback rescues. Production behaviour.
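The flag sets in the mode key can be captured as a small lookup so that a harness never mixes flags between modes. This is only a sketch: `MODE_FLAGS` and `env_for_mode` are hypothetical names, and the empty flag set for mode A is an assumption (the mode key lists no Monster flags for it, since it bypasses the Monster runtime).

```python
import os

# Flag sets as listed in the mode key. Mode A is ASSUMED to need no Monster
# flags; the report does not list any for it.
MODE_FLAGS = {
    "A": {},
    "B": {"ORGAN_FIRST": "1", "NO_7B_FALLBACK": "1"},
    "C": {"ORGAN_FIRST": "1", "MONSTER_NATIVE_RETRY": "1"},
}

# Every flag name that any mode sets, so stale values can be cleared.
ALL_FLAGS = {name for flags in MODE_FLAGS.values() for name in flags}


def env_for_mode(mode, base=None):
    """Return a copy of the environment with exactly one mode's flags set."""
    env = dict(os.environ if base is None else base)
    for name in ALL_FLAGS:
        env.pop(name, None)  # drop flags left over from another mode
    env.update(MODE_FLAGS[mode])
    return env
```

The returned dictionary could be passed as `env=` to `subprocess.run` when launching a run; clearing leftover flags first avoids, for example, `MONSTER_NATIVE_RETRY` leaking into a mode B run.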
DOD
GREEN if, for both benches: modes A, B, and C are present; B uses only `phys05_*` organs (zero `physarium_7b*` entries in B's `organs_used_set`); C's pass rate ≥ A's; and B reveals the raw 0.5B competence floor.
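The mechanically checkable parts of the DOD can be sketched as a predicate. The input shape (`results[bench][mode]` with `pass_rate` and `organs_used` keys) and the function name `dod_green` are assumptions for illustration, not the report generator's actual data model. The last criterion (B revealing the 0.5B competence floor) is a matter of interpretation and is not checked here.

```python
REQUIRED_BENCHES = {"mbpp", "humaneval"}
REQUIRED_MODES = {"A", "B", "C"}


def dod_green(results):
    """Check the first three DOD criteria on a hypothetical results mapping.

    results: {bench: {mode: {"pass_rate": float, "organs_used": set of str}}}
    """
    if not REQUIRED_BENCHES <= set(results):
        return False
    for bench in REQUIRED_BENCHES:
        modes = results[bench]
        # Criterion 1: all three modes present for this bench.
        if not REQUIRED_MODES <= set(modes):
            return False
        # Criterion 2: mode B used only phys05_* organs.
        if not all(o.startswith("phys05_") for o in modes["B"]["organs_used"]):
            return False
        # Criterion 3: C's pass rate is at least A's.
        if modes["C"]["pass_rate"] < modes["A"]["pass_rate"]:
            return False
    return True


# Mode B figures from the Summary table; A and C are absent from this run.
b_only = {
    "humaneval": {"B": {"pass_rate": 6 / 164, "organs_used": {"phys05_code_skeleton"}}},
    "mbpp": {"B": {"pass_rate": 13 / 100, "organs_used": {"phys05_code_skeleton"}}},
}
print(dod_green(b_only))  # False: modes A and C are missing
```

Under this reading, a B-only input fails the first criterion regardless of its organ usage, so the check needs all three modes before it can return `True`.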