author Ben Sima <ben@bensima.com> 2025-12-01 18:10:18 -0500
committer Ben Sima <ben@bensima.com> 2025-12-01 18:10:18 -0500
commit 838350a9afc27618abf9a78e721eb8902e99b6ab
tree 0d03e0545ca50e54f1bd0e7f728e90e6b635e509
parent 20665b023c5dcf13c01b692711568393cb1cdb61
Strengthen prompt to stop immediately after tests pass
Jr was completing tasks but then going into verification loops, re-reading files and 'tracing through logic' after tests passed. This burned ~4 cents of extra cost on t-221.

Made instructions more emphatic:
- 'STOP IMMEDIATELY' with explicit list of what NOT to do
- 'ANY further tool calls are wasted money'
- Repeated in BUILD SYSTEM NOTES section

Task-Id: t-227
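
The strengthened rules can be exercised on their own with a small sketch. The `stopRules` helper, the use of `String`, and the example namespace are all assumptions for illustration and are not part of `Worker.hs`; only the prompt wording comes from this commit. The `main` action does a cheap regression check that the emphatic wording stays in the assembled prompt.

```haskell
module Main where

import Data.List (isInfixOf)

-- Hypothetical helper (not in Worker.hs): assembles the stop-related
-- steps of the prompt for a given namespace, using the same (<>)
-- concatenation style as buildBasePrompt. String is an assumption here;
-- the real module may use a different text type.
stopRules :: String -> String
stopRules ns =
  "3. Run 'bild --test "
    <> ns
    <> "' ONCE after implementing.\n"
    <> "4. **CRITICAL**: If tests pass, STOP IMMEDIATELY. Do not verify, do not review, do not trace logic, do not search for usages. Just stop.\n"
    <> "8. After tests pass, ANY further tool calls are wasted money. The worker will commit your changes.\n\n"

main :: IO ()
main = do
  -- Example namespace chosen for illustration only.
  let prompt = stopRules "Omni/Agent/Worker.hs"
  putStr prompt
  -- Fail loudly if a later edit weakens the wording again.
  if all (`isInfixOf` prompt) ["STOP IMMEDIATELY", "wasted money"]
    then putStrLn "stop rules present"
    else error "stop rules missing from prompt"
```

Keeping the check next to the prompt text would make a future softening of the wording show up as a test failure rather than as extra cost on a later task.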
-rw-r--r-- Omni/Agent/Worker.hs 7
1 files changed, 4 insertions, 3 deletions
diff --git a/Omni/Agent/Worker.hs b/Omni/Agent/Worker.hs
index c52d4a9..8adb7c2 100644
--- a/Omni/Agent/Worker.hs
+++ b/Omni/Agent/Worker.hs
@@ -355,10 +355,11 @@ buildBasePrompt task ns repo =
<> "3. Run 'bild --test "
<> ns
<> "' ONCE after implementing.\n"
- <> "4. If tests pass, you are DONE - stop immediately.\n"
+ <> "4. **CRITICAL**: If tests pass, STOP IMMEDIATELY. Do not verify, do not review, do not trace logic, do not search for usages. Just stop.\n"
<> "5. If tests fail, fix the issue and run tests again.\n"
<> "6. If tests fail 3 times on the same issue, STOP - the task will be marked for human review.\n"
- <> "7. Do NOT update task status or manage git - the worker handles that.\n\n"
+ <> "7. Do NOT update task status or manage git - the worker handles that.\n"
+ <> "8. After tests pass, ANY further tool calls are wasted money. The worker will commit your changes.\n\n"
<> "AUTONOMOUS OPERATION (NO HUMAN IN LOOP):\n"
<> "- You are running autonomously without human intervention\n"
<> "- There is NO human to ask questions or get clarification from\n"
@@ -370,7 +371,7 @@ buildBasePrompt task ns repo =
<> ns
<> "' tests ALL dependencies transitively - run it ONCE, not per-file\n"
<> "- Do NOT run bild --test on individual files separately\n"
- <> "- Once tests pass, STOP - do not continue adding features or re-running tests\n"
+ <> "- Once tests pass, STOP IMMEDIATELY - no verification, no double-checking, no 'one more look'\n"
<> "- Use 'lint --fix' for formatting issues (not hlint directly)\n\n"
<> "EFFICIENCY REQUIREMENTS:\n"
<> "- Do not repeat the same action multiple times\n"