omni.git - not just a monorepo, it's an *omnirepo*

Age	Commit message	Author
7 days	Merge telegram bot system prompt with user's preferred style	Ben Sima

8 days	Add Telegram bot agent (t-251)	Ben Sima
	- Omni/Agent/Telegram.hs: Telegram API client with getUpdates/sendMessage - Omni/Bot.hs: Standalone CLI for running the bot - User identification via Memory.getOrCreateUserByTelegramId - Memory-enhanced agent with remember/recall tools - Run with: bot --token=XXX or TELEGRAM_BOT_TOKEN env var
8 days	Add cross-agent memory system (t-248)	Ben Sima
	- User management with Telegram ID identification - Memory storage with Ollama embeddings (nomic-embed-text) - Semantic similarity search via cosine similarity - remember/recall tools for agents - runAgentWithMemory wrapper for memory-enhanced agents - Separate memory.db database for user privacy
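The similarity search mentioned in this commit can be illustrated with a small, self-contained sketch. It assumes embeddings are plain lists of `Double`; the names `cosineSimilarity` and `recallTopK` are hypothetical and are not taken from the repository.

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))

-- | Cosine similarity of two embedding vectors.
-- Returns 0 when the lengths differ or either vector has zero magnitude
-- (an assumption made here to keep the function total).
cosineSimilarity :: [Double] -> [Double] -> Double
cosineSimilarity a b
  | length a /= length b = 0
  | normA == 0 || normB == 0 = 0
  | otherwise = dot / (normA * normB)
  where
    dot = sum (zipWith (*) a b)
    normA = sqrt (sum (map (^ (2 :: Int)) a))
    normB = sqrt (sum (map (^ (2 :: Int)) b))

-- | Rank stored memories by similarity to a query embedding, best first,
-- and keep the top k. Each memory is a (text, embedding) pair.
recallTopK :: Int -> [Double] -> [(String, [Double])] -> [(String, Double)]
recallTopK k query memories =
  take k (sortOn (Down . snd) scored)
  where
    scored = [(txt, cosineSimilarity query emb) | (txt, emb) <- memories]
```

Loaded in GHCi, `recallTopK 1 query memories` would return the single stored memory whose embedding points most nearly in the same direction as `query`.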
8 days	t-247: Add Provider abstraction for multi-backend LLM support	Ben Sima
	- Create Omni/Agent/Provider.hs with unified Provider interface - Support OpenRouter (cloud), Ollama (local), Amp (subprocess stub) - Add runAgentWithProvider to Engine.hs for Provider-based execution - Add EngineType to Core.hs (EngineOpenRouter, EngineOllama, EngineAmp) - Add --engine flag to 'jr work' command - Worker.hs dispatches to appropriate provider based on engine type Usage: jr work <task-id> # OpenRouter (default) jr work <task-id> --engine=ollama # Local Ollama jr work <task-id> --engine=amp # Amp CLI (stub)
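As a rough illustration of the `--engine` dispatch described above, the flag value could be mapped to `EngineType` as follows. The constructor names come from the commit message; `parseEngine` and the `openrouter` spelling of the default value are assumptions.

```haskell
-- Constructor names as listed in the commit message.
data EngineType = EngineOpenRouter | EngineOllama | EngineAmp
  deriving (Show, Eq)

-- | Hypothetical parser for the optional --engine value.
-- A missing flag falls back to OpenRouter, the documented default.
parseEngine :: Maybe String -> Either String EngineType
parseEngine Nothing = Right EngineOpenRouter
parseEngine (Just "openrouter") = Right EngineOpenRouter -- assumed spelling
parseEngine (Just "ollama") = Right EngineOllama
parseEngine (Just "amp") = Right EngineAmp
parseEngine (Just other) = Left ("unknown engine: " ++ other)
```

Returning `Either` lets the CLI report an unrecognised engine name instead of silently using the default.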
8 days	Add Omni/Agent/PLAN.md - agent infrastructure roadmap	Ben Sima
	Defines architecture for multi-agent system with: - Provider abstraction (OpenRouter, Ollama, Amp backends) - Shared memory system (sqlite-vss, multi-user, cross-agent) - Tool registry for pluggable tool sets - Evals framework for regression testing - Telegram bot as first concrete agent Tasks: t-247 through t-251
2025-12-02	Improve agent system prompt for better token efficiency	Ben Sima
	Add explicit guidance on: - Reading files with large ranges (500+ lines) instead of many small chunks - Using read_file directly when target file is known vs search_and_read - Cost awareness: planning refactors, avoiding redundant reads - Tool call limits for complex tasks
2025-12-02	Remove duplicate prompt content	Ben Sima
	The task was being added to the prompt twice, once in the base prompt and once in the user prompt.
2025-12-02	System prompt improvements	Ben Sima
	Worked with Gemini and Opus to improve the system prompt with learnings from the Amp prompt. Removed reference to Omni/Task/README.md because it is deprecated in favor of `jr task`.
2025-12-02	jr: add 'prompt' command to inspect agent system prompt	Ben Sima
	jr prompt <task-id> constructs and prints the full system prompt that would be sent to the agent, including: - Agent configuration (model, cost budget) - Base instructions - AGENTS.md content - Relevant facts from knowledge base - Retry/progress context if applicable Useful for debugging agent behavior and token usage.
2025-12-01	Compact amp-style timeline rendering and targeted file reading	Ben Sima
	Timeline tool display: - Grep/search: ✓ Grep pattern in filepath - Read file: ✓ Read filepath @start-end - Edit file: ✓ Edit filepath - Bash: ϟ command (lightning bolt prompt) - Tool results only shown for meaningful output New search_and_read tool: - Combines search + read in one operation - Uses ripgrep --context for surrounding lines - More efficient than separate search then read Worker prompt updated to prefer search_and_read over separate search + read_file calls
2025-12-01	Fix build errors in Jr modules	Ben Sima
	- Fix Worker.hs to use EngineError instead of tuple - Fix Types.hs imports for LazyText.encodeUtf8 and dayOfWeek - Remove duplicate SortOrder from Components.hs (import from Types.hs) - Add orphan instance pragmas to Pages.hs and Partials.hs - Clean up unused imports
2025-12-01	Fix timeline partial to include cost/token metrics and controls	Ben Sima
	The HTMX-refreshed AgentEventsPartial was missing: - Cost/token summary in header - Live toggle button - Autoscroll toggle button - Comment form Now matches the full page renderUnifiedTimeline output.
2025-12-01	Strengthen prompt to stop immediately after tests pass	Ben Sima
	Jr was completing tasks but then going into verification loops, re-reading files and 'tracing through logic' after tests passed. This burned ~4 cents of extra cost on t-221. Made instructions more emphatic: - 'STOP IMMEDIATELY' with explicit list of what NOT to do - 'ANY further tool calls are wasted money' - Repeated in BUILD SYSTEM NOTES section Task-Id: t-227
2025-12-01	Add prompt guidance for large file editing	Ben Sima
	Instructs the agent to: - Use line ranges when reading large files (>500 lines) - Use minimal context for edit_file old_str matching - Re-read exact lines after failed edits - Stop after 2-3 failed edits to reconsider approach - Flag very large files (>2000 lines) for refactoring Task-Id: t-225
2025-12-01	Add guardrail for repeated edit_file failures	Ben Sima
	Tracks 'old_str not found' errors from edit_file tool calls. After 5 consecutive failures, stops the agent to prevent burning tokens on impossible edits. This catches the pattern where the agent repeatedly tries to edit a large file with incorrect old_str matches, which was the root cause of t-222 exceeding its cost budget. Task-Id: t-224
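A minimal sketch of such a consecutive-failure guardrail, assuming that a successful edit resets the counter (the commit only says "consecutive"). All names here are hypothetical.

```haskell
-- Outcome of one edit_file tool call, reduced to what the guardrail needs.
data EditOutcome = EditOk | OldStrNotFound
  deriving (Show, Eq)

-- Threshold stated in the commit message.
maxConsecutiveEditFailures :: Int
maxConsecutiveEditFailures = 5

-- | Update the running count of consecutive failures.
-- Assumption: any successful edit resets the count to zero.
stepFailures :: Int -> EditOutcome -> Int
stepFailures _ EditOk = 0
stepFailures n OldStrNotFound = n + 1

-- | True once the history contains the threshold number of failures in a row.
shouldStop :: [EditOutcome] -> Bool
shouldStop = any (>= maxConsecutiveEditFailures) . scanl stepFailures 0
```

In a real engine the counter would live in mutable state next to the other guardrails; the pure fold above only shows the counting rule.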
2025-12-01	Scale cost guardrail by task complexity	Ben Sima
	Cost limits by complexity level: - Complexity 1: 50 cents - Complexity 2: 100 cents - Complexity 3: 200 cents (default) - Complexity 4: 400 cents - Complexity 5: 600 cents This prevents low-complexity tasks from burning budget while allowing complex tasks more room for iteration. Task-Id: t-223
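The limits listed above translate directly into a lookup. In this sketch the function names are hypothetical, and treating a missing or out-of-range complexity as the 200-cent default is an assumption.

```haskell
-- | Cost limit in cents for a task complexity of 1 to 5.
-- Assumption: Nothing and out-of-range values use the default of 200 cents.
costLimitCents :: Maybe Int -> Int
costLimitCents complexity = case complexity of
  Just 1 -> 50
  Just 2 -> 100
  Just 3 -> 200
  Just 4 -> 400
  Just 5 -> 600
  _ -> 200

-- | True when the cost spent so far (in cents) exceeds the task's limit.
exceedsBudget :: Maybe Int -> Double -> Bool
exceedsBudget complexity spentCents =
  spentCents > fromIntegral (costLimitCents complexity)
```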
2025-12-01	Clicking LIVE label toggles live updates on/off	Ben Sima
	- Add clickable LIVE toggle button that pauses/resumes timeline polling - Green pulsing when active, grey when paused - Uses htmx:beforeRequest event to cancel requests when paused - Increase duplicate tool call guardrail from 20 to 30 Task-Id: t-211
2025-12-01	Show tool call arguments inline instead of JSON blob	Ben Sima
	- Add formatToolCallSummary to extract key argument from JSON - Shows run_bash command, file paths for read/edit/write, patterns for search - Display summary inline in tool call header (e.g., run_bash: `ls -la`) - Increase token guardrail from 1M to 2M to prevent premature stops Task-Id: t-212
2025-12-01	Increase duplicate tool call guardrail limit from 5 to 20	Ben Sima
	The limit of 5 was too aggressive - reading 5 different files while exploring a codebase would trigger the guardrail. 20 allows for legitimate exploration while still catching infinite loops.
2025-12-01	Add actor tracking for status changes and use unified timeline	Ben Sima
	- updateTaskStatusWithActor logs status_change events to agent_events - Worker uses Junior actor for status changes - Jr review uses System/Human actors appropriately - CLI task update uses Human actor - Remove task_activity table schema (migrated to agent_events) - addComment now inserts into agent_events with event_type='comment' Task-Id: t-213
2025-12-01	Add actor column to agent_events table	Ben Sima
	- Add 'actor' column to agent_events table (human/junior/system) - Add System to CommentAuthor type (reused for actor) - Add SQL FromField/ToField instances for CommentAuthor - Update insertAgentEvent to accept actor parameter - Update all SELECT queries to include actor column - Update Worker.hs to pass actor for all event types - Guardrail events logged with System actor Migration: ALTER TABLE adds column with default 'junior' for existing rows. Task-Id: t-213.1
2025-12-01	Add guardrails and progress tracking to Jr agent	Ben Sima
	Implement runtime guardrails in Engine.hs: - Cost budget limit (default 200 cents) - Token budget limit (default 1M tokens) - Duplicate tool call detection (same tool called N times) - Test failure counting (bild --test failures) Add database-backed progress tracking: - Checkpoint events stored in agent_events table - Progress summary retrieved on retry attempts - Improved prompts emphasizing efficiency and autonomous operation Worker.hs improvements: - Uses guardrails configuration - Reports guardrail violations via callbacks - Better prompt structure for autonomous operation Task-Id: t-203
2025-12-01	Improve Jr agent structure with progress file and incremental workflow	Ben Sima
	Perfect! All changes are in place and working correctly. Let me create a I have successfully implemented the improvements to Jr Worker agent stru 1. Progress File Tracking - Added `readProgressFile` function to read `_/llm/${taskId}-progress - Added `buildProgressPrompt` function to include progress context in - Modified `runWithEngine` to load and include progress at the start 2. Incremental Workflow Enforcement - Updated base prompt to explicitly instruct: "Pick ONE specific chan - Added "INCREMENTAL WORKFLOW (IMPORTANT)" section with clear guidanc - Added instruction to write progress after each change - Emphasized that tasks may be run multiple times to complete all cha 3. Explicit Verification - Maintained existing requirement to run `bild --test` before complet - Added instruction to save progress only after tests pass - Clarified that code must be left in clean, testable state 4. Avoid Redundant Testing - Updated BUILD SYSTEM NOTES to clarify running `bild --test` on name - Added explicit instruction not to re-run tests unless more changes - Explained that bild handles dependencies transitively - `bild --test Omni/Agent/Worker.hs` - PASSED ✓ - `lint Omni/Agent/Worker.hs` - NO ISSUES ✓ - `_/llm/t-203-progress.md` - Progress file documenting this implementat - `_/llm/t-203-implementation-summary.md` - Detailed summary of changes The implementation follows industry best practices from Anthropic, OpenA - Reduced token usage through focused, incremental changes - Better code quality with isolated, tested changes - Improved reliability with progress tracking across sessions - Clear workflow preventing "declaring victory" too early Task-Id: t-203
2025-12-01	Improve Worker.hs prompt to avoid redundant test/lint runs	Ben Sima
	Perfect! The changes are exactly what we need. The diff shows that I've 1. ✅ `bild --test` on a namespace tests all its dependencies - no need t 2. ✅ Don't re-run tests after they already passed 3. ✅ Lint runs automatically via git hooks, so manual runs are optional I have successfully completed task t-202. The changes made to `Omni/Agen 1. Explaining dependency handling: Added explicit note that `bild -- 2. Preventing redundant testing: Added instruction to not re-run tes 3. Clarifying lint automation: Explained that lint runs automaticall All tests pass (`bild --test Omni/Agent/Worker.hs`) and all lint checks Task-Id: t-202
2025-12-01	Fix agent event content double-encoding in web UI	Ben Sima
	Perfect! Everything passes. Let me create a final summary of what was do I successfully fixed the agent event content double-encoding issue in th - Agent events showed raw JSON with escaped quotes and literal `\n` in t - Example: ToolResult displayed `{"output":"Replaced 1 occurrence(s)","s - Assistant messages showed literal `\n` instead of newlines In `Omni/Agent/Worker.hs`, the `logEvent` helper function was wrapping t Modified `Omni/Agent/Worker.hs` to distinguish between text and structur 1. Created two helper functions (lines 250-256): - `logEventText`: Stores text content as-is without JSON encoding - `logEventJson`: JSON-encodes structured data (for Cost events) 2. Updated all event logging calls to use the appropriate function: - `engineOnAssistant`: Uses `logEventText` with plain message text - `engineOnToolCall`: Uses `logEventText` with plain tool call descri - `engineOnToolResult`: Uses `logEventText` with plain output text - `engineOnError`: Uses `logEventText` with plain error message - `engineOnComplete`: Uses `logEventText` with empty string - `engineOnCost`: Uses `logEventJson` for structured JSON (preserves 3. No changes to Web.hs were needed - the rendering functions alread ✅ `bild --test Omni/Jr/Web.hs` - PASSED ✅ `lint Omni/Agent/Worker.hs` - PASSED ✅ `lint Omni/Jr/Web.hs` - PASSED The fix is complete and ready for commit. Agent events will now display Task-Id: t-200
2025-12-01	Fix cost reporting - parse actual cost from OpenRouter API response	Ben Sima
	Perfect! All tests pass for the affected modules. Now let me verify the I've successfully implemented the fix for cost reporting as specified in - Added `usageCost :: Maybe Double` field to the `Usage` data type - Updated `FromJSON` instance to parse the optional `cost` field from th - Modified `ChatCompletionRequest` ToJSON instance to include `"usage": - This enables OpenRouter to return actual cost information in the respo - Updated the `runAgent` loop to use actual cost from the API response w - Falls back to `estimateCost` when actual cost is not provided - Converts from dollars to cents (multiplies by 100) since OpenRouter re - The `engineOnCost` callback already uses `Double` for cost (not `Int`) - The `estimateCost` function already returns `Double`, avoiding integer - The `AgentResult` type already uses `Double` for `resultTotalCost` All tests pass successfully: - ✅ `Omni/Agent/Engine.hs` - All 14 tests pass, including new tests for - ✅ `Omni/Agent/Worker.hs` - Builds successfully - ✅ `Omni/Agent.hs` - All combined tests pass - ✅ All files pass lint checks (ormolu + hlint) The implementation correctly addresses all points in the task descriptio 1. ✅ Parses actual cost from OpenRouter API response 2. ✅ Enables usage accounting in requests 3. ✅ Uses Double for cost to avoid rounding issues 4. ✅ Falls back to estimation when actual cost is unavailable The previous error with `bild --test .` was due to `.` not being a valid Task-Id: t-197.8
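The fallback described in this commit (use the API-reported cost when present, otherwise estimate) might look roughly like the sketch below. Only `usageCost :: Maybe Double` and the dollars-to-cents conversion come from the commit message; the token fields, `estimateCostCents`, and its per-token rate are made up for illustration.

```haskell
-- Simplified usage record. Only usageCost is named in the commit message;
-- the token fields are assumptions.
data Usage = Usage
  { usagePromptTokens :: Int
  , usageCompletionTokens :: Int
  , usageCost :: Maybe Double -- actual cost in dollars, when the API returns it
  }

-- | Hypothetical estimator used when no actual cost is reported.
-- The per-token rate is a placeholder, not a real price.
estimateCostCents :: Usage -> Double
estimateCostCents u =
  fromIntegral (usagePromptTokens u + usageCompletionTokens u) * centsPerToken
  where
    centsPerToken = 0.001

-- | Prefer the actual cost (dollars converted to cents), else estimate.
costCents :: Usage -> Double
costCents u = maybe (estimateCostCents u) (* 100) (usageCost u)
```

Keeping the cost as `Double` rather than `Int` matches the commit's note about avoiding rounding problems on sub-cent amounts.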
2025-12-01	Fix Worker.hs prompt to use lint instead of hlint	Ben Sima
	Perfect! The changes have been successfully implemented. Let me summariz I've successfully updated the `buildBasePrompt` function in `Omni/Agent/ 1. Line 320: Changed "including hlint suggestions" → "including lint 2. Line 324: Changed "if hlint finds issues" → "if lint finds issues 3. Line 325: Changed "You must fix hlint suggestions like:" → "You m 4. Removed lines 326-328: Deleted the specific hlint suggestion exam - 'Use list comprehension' -> use [x \| cond] instead of if/else - 'Avoid lambda' -> use function composition - 'Redundant bracket' -> remove unnecessary parens - Ran `bild --test Omni/Agent/Worker.hs` ✓ PASSED with no errors The prompt now correctly references the `lint` command instead of `hlint Task-Id: t-199
2025-12-01	Fix cost reporting - parse actual cost from OpenRouter API response	Ben Sima
	I have successfully completed task t-197.8 to fix cost reporting by pars Omni/Agent/Engine.hs: 1. Added `usageCost :: Maybe Double` field to the `Usage` type to captur 2. Updated `FromJSON` instance to parse the optional `"cost"` field 3. Modified `ChatCompletionRequest` ToJSON instance to include `"usage": 4. Changed cost types from `Int` to `Double` throughout (engineOnCost ca 5. Updated `estimateCost` to use floating-point division instead of inte 6. Modified `runAgent` to use actual cost from API when available, conve 7. Added new test case for parsing usage with cost field Omni/Agent/Worker.hs: 1. Updated `runWithEngine` signature to return `Double` for cost 2. Changed `totalCostRef` from `IORef Int` to `IORef Double` 3. Added rounding when storing cost in DB metrics to maintain backward c ✅ All tests pass: - Omni/Agent/Engine.hs - 16 unit tests pass - Omni/Agent/Worker.hs - Builds successfully - Omni/Agent.hs - All integration tests pass - Omni/Jr.hs - All 12 tests pass ✅ All lint checks pass: - No hlint issues - No ormolu formatting issues The implementation correctly handles OpenRouter's cost format (credits w Task-Id: t-197.8
2025-12-01	Validate cwd exists before running bash commands	Ben Sima
	run_bash tool now checks if the working directory exists before executing. Previously invalid cwd caused system-level chdir error. Now returns clean tool error the agent can understand and react to.
2025-11-30	Extract facts from completed tasks after review acceptance	Ben Sima
	Perfect! Let me verify the complete implementation checklist against the ✅ 1. In Jr.hs, after accepting a task in review, call fact extraction: - Line 424: `extractFacts tid commitSha` - called in `autoReview` aft - Line 504: `extractFacts tid commitSha` - called in `interactiveRevi ✅ 2. Add extractFacts function: - Lines 585-600: Implemented with correct signature `extractFacts :: - Gets diff using `git show --stat` - Loads task context - Calls LLM CLI tool with `-s` flag - Handles success/failure cases ✅ 3. Add buildFactExtractionPrompt function: - Lines 603-620: Implemented with correct signature - Includes task ID, title, description - Includes diff summary - Provides clear instructions for fact extraction - Includes example format ✅ 4. Add parseFacts function: - Lines 623-627: Implemented with correct signature - Filters lines starting with "FACT: " - Calls `addFactFromLine` for each fact ✅ 5. Add addFactFromLine function: - Lines 630-636: Implemented with correct signature - Removes "FACT: " prefix - Parses file list from brackets - Calls `Fact.createFact` with project="Omni", confidence=0.7, source - Prints confirmation message ✅ 6. Add parseFiles helper function: - Lines 639-649: Implemented to parse `[file1, file2, ...]` format ✅ 7. Import for Omni.Fact module: - Line 22: `import qualified Omni.Fact as Fact` already present ✅ 8. Workflow integration: - Current: work -> review -> accept -> fact extraction -> done ✅ - Fact extraction happens AFTER status update to Done - Fact extraction happens BEFORE epic completion check The implementation is complete and correct. All functionality descri 1. ✅ Facts are extracted after task review acceptance (both auto and man 2. ✅ LLM is called with proper context (task info + diff) 3. ✅ Facts are parsed and stored with correct metadata (source_task, con 4. ✅ All tests pass (`bild --test Omni/Agent.hs`) 5. ✅ No linting errors (`lint Omni/Jr.hs`) The feature is ready for use and testing. When a task is completed and a 1. The LLM will be prompted to extract facts 2. Any facts learned will be added to the knowledge base 3. Each fact will have `source_task` set to the task ID 4. Facts can be viewed with `jr facts list` Task-Id: t-185
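The parsing steps named in this commit (keep lines starting with "FACT: ", strip the prefix, read a bracketed file list) can be sketched as below. The exact line layout is an assumption: the sketch expects `FACT: some text [file1, file2]`, and the helper names `trim`, `splitOnComma` and `parseFactLine` are hypothetical.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, stripPrefix)
import Data.Maybe (mapMaybe)

-- | Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Split a string on commas.
splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
  (chunk, []) -> [chunk]
  (chunk, _ : rest) -> chunk : splitOnComma rest

-- | Parse a "[file1, file2, ...]" list found anywhere in the string.
-- Returns an empty list when there is no opening bracket.
parseFiles :: String -> [String]
parseFiles s = case break (== '[') s of
  (_, '[' : rest) ->
    filter (not . null) (map trim (splitOnComma (takeWhile (/= ']') rest)))
  _ -> []

-- | Parse one line; Nothing unless it starts with "FACT: ".
parseFactLine :: String -> Maybe (String, [String])
parseFactLine line = do
  body <- stripPrefix "FACT: " line
  pure (trim (takeWhile (/= '[') body), parseFiles body)

-- | Extract all (fact text, related files) pairs from LLM output.
parseFacts :: String -> [(String, [String])]
parseFacts = mapMaybe parseFactLine . lines
```

Lines without the prefix are ignored, so surrounding chatter in the model's reply does not end up in the knowledge base.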
2025-11-30	Add agent observability: event logging and storage	Ben Sima
	- Add Omni/Agent/Event.hs with AgentEvent types - Add agent_events table schema and CRUD functions to Core.hs - Add new callbacks to Engine.hs: onAssistant, onToolResult, onComplete, onError - Wire event logging into Worker.hs with session tracking Events are now persisted to SQLite for each agent work session, enabling visibility into agent reasoning and tool usage. Task-Id: t-197.1 Task-Id: t-197.2 Task-Id: t-197.3
2025-11-30	Fix jr loop: update model IDs and dev shell	Ben Sima
	- Update OpenRouter model IDs to Claude 4.5 family: - anthropic/claude-sonnet-4.5 (default) - anthropic/claude-haiku-4.5 (simple tasks) - anthropic/claude-opus-4.5 (complex tasks) - Remove aider-chat from dev shell (broken, unused) - Simplify llm package (remove llm-ollama plugin) - Update nixos-unstable for llm 0.27.1 Task-Id: t-163
2025-11-30	Audit and verify Engine testing coverage	Ben Sima
	All tests pass and lint is clean. Let me verify the final test coverage Engine.hs Test Coverage (13 tests): - ✅ Tool JSON roundtrip - ✅ Message JSON roundtrip - ✅ ToolCall JSON roundtrip (NEW) - ✅ FunctionCall JSON roundtrip (NEW) - ✅ Role JSON roundtrip for all roles (NEW) - ✅ defaultLLM endpoint & headers - ✅ defaultAgentConfig defaults - ✅ defaultEngineConfig callbacks - ✅ buildToolMap correctness - ✅ Usage JSON parsing - ✅ AgentResult JSON roundtrip - ✅ estimateCost calculation Tools.hs Test Coverage (19 tests): - ✅ All 5 tool schemas are valid objects - ✅ allTools contains 5 tools - ✅ ReadFileArgs parsing - ✅ WriteFileArgs parsing - ✅ EditFileArgs parsing - ✅ RunBashArgs parsing - ✅ SearchCodebaseArgs parsing - ✅ ToolResult success/failure JSON roundtrip - ✅ readFileTool handles missing files (NEW) - ✅ editFileTool handles no-match case (NEW) - ✅ runBashTool captures exit codes (NEW) - ✅ runBashTool captures stdout (NEW) - ✅ searchCodebaseTool returns structured results (NEW) All unit tests from the checklist are now covered. The integration and m Task-Id: t-141.7
2025-11-30	Remove amp dependency entirely	Ben Sima
	The build and tests pass. Let me provide a summary of the changes made: Removed the amp dependency entirely from the codebase: - Removed `runAmp` function (was running amp subprocess) - Removed `shouldUseEngine` function (env var check `JR_USE_ENGINE`) - Removed `monitorLog` and `waitForFile` helpers (for amp.log parsing) - Removed unused imports: `System.IO`, `Data.Text.IO` - Made `runWithEngine` the default/only path - Updated error messages from "amp" to "engine" - Renamed `ampOutput` parameter to `agentOutput` in `formatCommitMessage - Added `Data.IORef` import for `newIORef`, `modifyIORef'`, `readIORef` - Removed amp.log parsing code: `LogEntry`, `processLogLine`, `updateFro - Removed unused imports: `Data.Aeson`, `Data.ByteString.Lazy`, `Data.Te - Renamed `activityAmpThreadUrl` to `activityThreadUrl` - Updated field references from `activityAmpThreadUrl` to `activityThrea - Updated UI label from "Amp Thread:" to "Session:" - Updated comment from "amp completes" to "engine completes" - Updated `Amp.execute` to `Engine.runAgent` - Updated logging section to describe Engine callbacks instead of amp.lo - Updated integration test guidance to mock Engine instead of amp binary Task-Id: t-141.6
2025-11-30	Add task complexity field and model selection	Ben Sima
	All tests pass. Let me summarize the changes made: - Added `taskComplexity :: Maybe Int` field to the `Task` data type (1-5 - Updated SQL schema to include `complexity INTEGER` column - Updated `FromRow` and `ToRow` instances to handle the new field - Updated `tasksColumns` migration spec for automatic schema migration - Updated `saveTask` to include complexity in SQL INSERT - Updated `createTask` signature to accept `Maybe Int` for complexity - Added `--complexity=<c>` option to the docopt help string - Added complexity parsing in `create` command (validates 1-5 range) - Added complexity parsing in `edit` command - Updated `modifyFn` in edit to handle complexity updates - Updated all unit tests to use new `createTask` signature with complexi - Added CLI tests for `--complexity` flag parsing - Added unit tests for complexity field storage and persistence - Updated `selectModel` to use `selectModelByComplexity` based on task c - Added `selectModelByComplexity :: Maybe Int -> Text` function with map - `Nothing` or 3-4 → `anthropic/claude-sonnet-4-20250514` (default) - 1-2 → `anthropic/claude-haiku` (trivial/low complexity) - 5 → `anthropic/claude-opus-4-20250514` (expert complexity) - Updated `createTask` calls to include `Nothing` for complexity Task-Id: t-141.5
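The complexity-to-model mapping listed above could be written as follows. The model IDs are copied from the commit message; the sketch returns `String` rather than `Text` to stay dependency-free, and sending out-of-range values to the default model is an assumption.

```haskell
-- | Pick a model ID from an optional task complexity (1 to 5).
-- Assumption: values outside 1-5 fall through to the default model.
selectModelByComplexity :: Maybe Int -> String
selectModelByComplexity complexity = case complexity of
  Just n | n == 1 || n == 2 -> "anthropic/claude-haiku"
  Just 5 -> "anthropic/claude-opus-4-20250514"
  _ -> "anthropic/claude-sonnet-4-20250514"
```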
2025-11-30	Replace amp subprocess with native Engine in Worker	Ben Sima
	Implementation complete. Summary of changes to [Omni/Agent/Worker.hs](fi 1. Added imports: `Omni.Agent.Engine`, `Omni.Agent.Tools`, `System.E 2. Added `shouldUseEngine` (L323-327): Checks `JR_USE_ENGINE=1` envi 3. Added `runWithEngine` (L329-409): Native engine implementation th - Reads `OPENROUTER_API_KEY` from environment - Builds `EngineConfig` with cost/activity/tool callbacks - Builds `AgentConfig` with tools from `Tools.allTools` - Injects AGENTS.md, facts, retry context - Returns `(ExitCode, Text, Int)` tuple 4. Added `buildBasePrompt` and `buildRetryPrompt` (L411-465): Help 5. Added `selectModel` (L467-471): Model selection (currently always 6. Updated `processTask` (L92-120): Checks feature flag and routes t Task-Id: t-141.4
2025-11-30	Define Tool protocol and LLM provider abstraction	Ben Sima
	The implementation is complete. Here's a summary of the changes made: 1. Updated LLM type to include `llmExtraHeaders` field for OpenRoute 2. Changed `defaultLLM` to use: - OpenRouter base URL: `https://openrouter.ai/api/v1` - Default model: `anthropic/claude-sonnet-4-20250514` - OpenRouter headers: `HTTP-Referer` and `X-Title` 3. Updated `chatWithUsage` to apply extra headers to HTTP requests 4. Added `case-insensitive` dependency for proper header handling 5. Added tests for OpenRouter configuration 6. Fixed hlint suggestions (Use `</` instead of `<$>`, eta reduce) Task-Id: t-141.1
2025-11-29	Implement core coding tools (read, write, bash, search)	Ben Sima
	Both `bild --test` passes for Engine.hs and Tools.hs, and lint passes. T 1. readFileTool - Reads file contents with optional line range 2. writeFileTool - Creates/overwrites files (checks parent dir exist 3. editFileTool - Search/replace with optional replace_all flag 4. runBashTool - Executes shell commands, returns stdout/stderr/exit 5. searchCodebaseTool - Ripgrep wrapper with pattern, path, glob, ca Plus ToolResult type and allTools export as required. Task-Id: t-141.3
2025-11-29	Implement agent loop with tool execution	Ben Sima
	The implementation is complete. Here's what was implemented: Types Added: - `EngineConfig`: Contains LLM provider config and callbacks (`engineOnC - `AgentResult`: Results of running an agent (finalMessage, toolCallCoun - `Usage`: Token usage from API responses - `ChatResult`: Internal type for chat results with usage Functions Added: - `runAgent :: EngineConfig -> AgentConfig -> Text -> IO (Either Text Ag - `buildToolMap` - Creates a lookup map from tool list - `executeToolCalls` - Executes tool calls and returns tool messages - `estimateCost` / `estimateTotalCost` - Cost estimation helpers - `chatWithUsage` - Chat that returns usage stats - `defaultEngineConfig` - Default no-op engine configuration Loop Logic: 1. Sends messages to LLM via `chatWithUsage` 2. If response has tool_calls, executes each tool via `executeToolCalls` 3. Appends tool results as ToolRole messages 4. Repeats until no tool_calls or maxIterations reached 5. Tracks cost/tokens and calls callbacks at appropriate points Task-Id: t-141.2
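The loop logic listed above can be modelled in pure code. This is only a sketch of the control flow, not the repository's `runAgent`: the LLM is a plain function from message history to a reply, tools are a function from a tool name to its output, and all names (`Reply`, `runLoop`) are hypothetical. Returning `Left` when the iteration cap is reached is also an assumption.

```haskell
-- A model reply: either a final answer or a batch of tool calls (by name).
data Reply = Final String | ToolCalls [String]

-- | Run the loop: ask the model, execute any tool calls, append the
-- results to the history, and repeat until the model stops calling tools
-- or the iteration cap is hit. Returns the final message and the number
-- of tool calls made.
runLoop ::
  Int ->                 -- maximum iterations
  ([String] -> Reply) -> -- stand-in for the LLM call
  (String -> String) ->  -- stand-in for tool execution
  [String] ->            -- initial message history
  Either String (String, Int)
runLoop maxIterations llm runTool = go 0 0
  where
    go iter calls history
      | iter >= maxIterations = Left "max iterations reached"
      | otherwise = case llm history of
          Final msg -> Right (msg, calls)
          ToolCalls names ->
            let results = map runTool names
             in go (iter + 1) (calls + length names) (history ++ results)
```

The real engine would run in `IO` and invoke cost and event callbacks at each step; keeping the sketch pure makes the termination conditions easy to see.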
2025-11-29	Define Tool protocol and LLM provider abstraction	Ben Sima
	The implementation is complete. I created [Omni/Agent/Engine.hs](file:// - Types: `Tool`, `LLM`, `AgentConfig`, `Message`, `Role`, `ToolCall` - Functions: `chat` for OpenAI-compatible HTTP via http-conduit, `de - Tests: JSON roundtrip for Tool, Message; validation of defaults All lints pass (hlint + ormolu) and tests pass. Task-Id: t-141.1
2025-11-29	Inject relevant facts into coder agent context	Ben Sima
	All checks pass. The implementation is complete: 1. Added imports for `Data.List` and `Omni.Fact` 2. Added `getRelevantFacts` function that retrieves facts for the task's 3. Added `formatFacts` and `formatFact` functions to format facts for in 4. Updated `runAmp` to call `getRelevantFacts`, format them, and append Task-Id: t-186
2025-11-29	Inject task comments into agent context during work and review	Ben Sima
	Build and lint both pass. The implementation: 1. Updated `formatTask` in [Omni/Agent/Worker.hs](file:///home/ben/omni/ 2. Extracted deps formatting to a separate `formatDeps` helper for consi 3. Added `formatComments` and `formatComment` helpers that show timestam Task-Id: t-184
2025-11-28	Add comments field to tasks for providing extra context	Ben Sima
	All tests pass. Here's a summary of the changes I made: 1. Added `Comment` data type in `Omni/Task/Core.hs` with `commentTex 2. Added `taskComments` field to the `Task` type to store a list of 3. Updated database schema with a `comments TEXT` column (stored as 4. Added SQL instances for `[Comment]` to serialize/deserialize 5. Added `addComment` function to add timestamped comments to tasks 6. Added CLI command `task comment <id> <message> [--json]` 7. Updated `showTaskDetailed` to display comments in the detailed vi 8. Added unit tests for comments functionality 9. Added CLI tests for the comment command 10. Fixed dependent files (`Omni/Agent/Worker.hs` and `Omni/Jr/Web.h Task-Id: t-167
2025-11-28	Fix llm tool installation - update nixpkgs hash in Biz/Bild.nix	Ben Sima
	The build passed. The task was to update nixpkgs hash in Biz/Bild.nix, b Task-Id: t-163
2025-11-28	Truncate task title to 52 characters in commit message subject line	Ben Sima
	The build and tests pass. The change is complete - the task title in com Task-Id: t-159
2025-11-27	Add human notes field for intervention tasks	Ben Sima
	All tests pass. Let me summarize the implementation: I've added a human notes field for intervention tasks with the following 1. Omni/Task/Core.hs: - Added `retryNotes :: Maybe Text` field to `RetryContext` data type - Added `notes` column to `retryContextColumns` for schema migration - Updated `getRetryContext` to fetch the notes field from DB - Updated `setRetryContext` to save the notes field to DB - Updated `getAllRetryContexts` to include notes - Added `updateRetryNotes :: Text -> Text -> IO ()` function to updat 2. Omni/Jr/Web.hs: - Added new API endpoint: `POST /tasks/:id/notes` - Added `NotesForm` type and `FromForm` instance - Added `taskNotesHandler` to save notes - Updated `renderRetryContextBanner` to accept task ID and display: - Notes textarea form when max retries exceeded (intervention tasks - Existing notes display for non-critical retry banners 3. Omni/Agent/Worker.hs: - Updated worker prompt to include human notes/guidance in the retry - Preserved existing notes when setting new retry context 4. Omni/Jr.hs: - Updated all `RetryContext` creations to preserve existing notes Task-Id: t-153.5
2025-11-27	Display worker metrics on task detail page	Ben Sima
	All tests pass. Let me summarize what was implemented: - Extended `TaskActivity` type with new fields: - `activityAmpThreadUrl` - Link to amp thread - `activityStartedAt` - Work start timestamp - `activityCompletedAt` - Work completion timestamp - `activityCostCents` - API cost in cents - `activityTokensUsed` - Token usage count - Updated `SQL.FromRow` and `SQL.ToRow` instances for the new fields - Updated schema to include new columns in `task_activity` table - Added `logActivityWithMetrics` function to log activities with all met - Added `updateActivityMetrics` function to update metrics on existing a - Added `getLatestRunningActivity` helper function - Captures execution timing (start/end timestamps) - Retrieves amp thread URL from `AgentLog.getStatus` - Converts credits to cents and logs to activity record - Uses `logActivityWithMetrics` and `updateActivityMetrics` for tracking - Added `getStatus` function to retrieve current status (thread URL, cre - Added `TaskMetricsPartial` type for HTMX auto-refresh - Extended `TaskDetailPage` to include `RetryContext` - Added Execution Details section on task detail page showing: - Amp Thread URL (clickable link) - Duration (formatted as "Xm Ys") - Cost (formatted as "$X.XX") - Retry Attempt count (if applicable) - Last Activity timestamp - Added `/partials/task/:id/metrics` endpoint for HTMX auto-refresh - Auto-refresh enabled while task is InProgress (every 5s) - Added `renderExecutionDetails` helper function - Added `executionDetailsStyles` for metric rows and execution section - Added dark mode support for execution details section Task-Id: t-148.4
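The two display formats mentioned above ("Xm Ys" for duration, "$X.XX" for cost) can be sketched with base-only helpers. The function names are hypothetical, and both assume non-negative inputs.

```haskell
import Text.Printf (printf)

-- | Format a duration in seconds as "Xm Ys". Assumes a non-negative input.
formatDuration :: Int -> String
formatDuration secs =
  show (secs `div` 60) ++ "m " ++ show (secs `mod` 60) ++ "s"

-- | Format a cost in whole cents as "$X.XX". Assumes a non-negative input.
formatCost :: Int -> String
formatCost cents = printf "$%d.%02d" (cents `div` 100) (cents `mod` 100)
```

Working in integer cents avoids floating-point rounding when rendering the value.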
2025-11-27	Fix filter dropdowns returning empty string for All option	Ben Sima
	The build passes. The fix I implemented: 1. Changed the API type in `Omni/Jr/Web.hs` to use `QueryParam "stat 2. Added manual parsing in `taskListHandler` with `parseStatus` and 3. Applied `emptyToNothing` to both status and priority params befor This ensures that when "All" is selected (empty string), it's treated as I also fixed two pre-existing issues that were blocking the build: - Type annotation for `show stage` in `Omni/Task/Core.hs` - `AesonKey.fromText` conversion in `Omni/Agent/Worker.hs` Task-Id: t-149.1
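The idea behind `emptyToNothing` is that an "All" option submits an empty string, which should mean "no filter" rather than a failed parse. The sketch below guesses at the shape: the signature, the `Status` constructors, the accepted spellings, and `statusFilter` are all assumptions for illustration.

```haskell
-- | Treat an empty query parameter as absent.
emptyToNothing :: Maybe String -> Maybe String
emptyToNothing (Just "") = Nothing
emptyToNothing other = other

-- Hypothetical subset of task statuses.
data Status = Open | InProgress | Done
  deriving (Show, Eq)

-- | Hypothetical manual parser for the status query parameter.
parseStatus :: String -> Maybe Status
parseStatus "Open" = Just Open
parseStatus "InProgress" = Just InProgress
parseStatus "Done" = Just Done
parseStatus _ = Nothing

-- | Nothing means "no filter", which covers both a missing parameter
-- and the empty string sent by the "All" option.
statusFilter :: Maybe String -> Maybe Status
statusFilter raw = emptyToNothing raw >>= parseStatus
```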
2025-11-27	Add logActivity helper and integrate into Worker.hs	Ben Sima
	Implementation complete. The task is done: 1. Created `logActivity` helper in `Omni/Task/Core.hs` that writes t 2. Integrated into Worker.hs at all key points: - `Claiming` - when claiming task - `Running` - when starting amp - `Reviewing` - when amp completes successfully - `Retrying` - on retry (includes attempt count in metadata) - `Completed` - on success (includes result type in metadata) - `Failed` - on failure (includes exit code or reason in metadata) Task-Id: t-148.2
2025-11-26	Improve worker prompt and fix output interleaving	Ben Sima
	- More explicit prompt: MUST run bild --test, fix hlint issues - Add workerQuiet flag to disable ANSI status bar in loop mode - Loop mode uses simple putText, manual jr work keeps status bar