# TERMINAL_NANOOS_NATIVE_V1

Phase-12.TR — same 5 Terminal-Bench-style tasks, same NanoOS shell capsule, but the retry loop now lives in `src/main.cpp` (`run_native_terminal_task`). The bench harness sends ONE `--chat` envelope per task.

| mode | pass | total | rate | wall | VWS |
|---|---|---|---|---|---|
| PARROT_NATIVE | 4 | 5 | 80 % | 3.6s | 1.12 |
| MONSTER_NATIVE | 4 | 5 | 80 % | 8.6s | 0.47 |

**Δ pass count (MONSTER_NATIVE − PARROT_NATIVE): +0**

## Per-task

| task | PARROT_NATIVE | MONSTER_NATIVE | rounds (MONSTER) | wall (MONSTER, s) | note (MONSTER) |
|---|---|---|---|---|---|
| create_file_exact | OK | OK | 1 | 0.542 | content_has:'hello world' |
| run_python_print_42 | OK | OK | 1 | 0.881 | found:'42' |
| fix_broken_python | X | X | 3 | 5.656 | missing:'f() = 3' |
| parse_json | OK | OK | 1 | 0.984 | found:'42' |
| count_grep | OK | OK | 1 | 0.531 | regex_hit:`^\s*4\s*$` |
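
As a cross-check, the summary numbers can be recomputed from the per-task rows. The snippet below is a minimal sketch, not part of the bench: the rows are transcribed by hand from the table above, `summarize` is a hypothetical helper, and it assumes the `M` columns refer to MONSTER_NATIVE.

```python
# Rows transcribed from the per-task table:
# (task, parrot_ok, monster_ok, monster_rounds, monster_wall_seconds)
# Assumption: the "M" columns in the table describe MONSTER_NATIVE.
ROWS = [
    ("create_file_exact",   True,  True,  1, 0.542),
    ("run_python_print_42", True,  True,  1, 0.881),
    ("fix_broken_python",   False, False, 3, 5.656),
    ("parse_json",          True,  True,  1, 0.984),
    ("count_grep",          True,  True,  1, 0.531),
]


def summarize(rows):
    """Recompute pass counts, pass rate, pass delta and total wall time."""
    total = len(rows)
    parrot_pass = sum(1 for r in rows if r[1])
    monster_pass = sum(1 for r in rows if r[2])
    return {
        "total": total,
        "parrot_pass": parrot_pass,
        "monster_pass": monster_pass,
        "monster_rate_pct": 100 * monster_pass / total,
        "delta_pass": monster_pass - parrot_pass,
        "monster_wall_s": sum(r[4] for r in rows),
    }


if __name__ == "__main__":
    print(summarize(ROWS))
```

The per-task wall times sum to roughly 8.594 s, which agrees with the 8.6s reported for MONSTER_NATIVE in the summary table, and the single failing task accounts for most of that time.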

## Architectural witness

* Bench Python sends ONE envelope per task.
* `looks_like_terminal_task()` detects `TERMINAL_TASK_V1` magic.
* `parse_terminal_envelope()` extracts instruction, inputs JSON, verifier JSON.
* `run_native_terminal_task()` drives up to three rounds (k=1..3) sequentially, stopping at the first pass, popen-ing `shell_capsule.py` each round and feeding stderr+exit codes back into the next prompt.
* Final envelope emits `attempts`, `first_pass_round`, `final_dag` for replay. The Python bench is now a thin dispatcher.
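
The control flow in the bullets above can be sketched as follows. This is a Python mock of the loop's shape only, not the C++ in `src/main.cpp`: `run_task`, `Outcome`, the callback parameters, the substring-based magic check and the retry prompt wording are all assumptions made for illustration. The capsule runner is injected as a callable so the sketch runs without `shell_capsule.py`.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

MAGIC = "TERMINAL_TASK_V1"  # magic string named in the report
MAX_ROUNDS = 3              # k = 1..3, as in the report

# One capsule run: (exit_code, stdout, stderr)
RunResult = Tuple[int, str, str]


@dataclass
class Outcome:
    attempts: int
    first_pass_round: Optional[int]
    history: List[RunResult] = field(default_factory=list)


def looks_like_terminal_task(envelope: str) -> bool:
    """Assumed detection rule: the magic string appears in the envelope."""
    return MAGIC in envelope


def run_task(
    instruction: str,
    propose: Callable[[str], str],           # hypothetical: prompt -> shell command
    run_capsule: Callable[[str], RunResult],  # hypothetical: command -> run result
    verify: Callable[[RunResult], bool],      # hypothetical: run result -> pass/fail
) -> Outcome:
    """Run up to MAX_ROUNDS rounds, feeding failures into the next prompt."""
    prompt = instruction
    outcome = Outcome(attempts=0, first_pass_round=None)
    for k in range(1, MAX_ROUNDS + 1):
        result = run_capsule(propose(prompt))
        outcome.attempts = k
        outcome.history.append(result)
        if verify(result):
            outcome.first_pass_round = k
            break
        exit_code, _stdout, stderr = result
        # Feed stderr and the exit code back into the next round's prompt.
        prompt = (
            f"{instruction}\n\n"
            f"Round {k} failed with exit code {exit_code}.\n"
            f"stderr:\n{stderr}"
        )
    return outcome
```

With stub callbacks, a task that passes on the first round yields `attempts == 1` and `first_pass_round == 1`, while a task that never passes yields `attempts == 3` and `first_pass_round is None`, which mirrors the `rounds (M)` values of 1 and 3 in the per-task table.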

## DOD

* YELLOW: both modes tie at 4/5. The tasks are too thin to differentiate them (same as Python bench).