# Subagent Hardening Design

**Status:** Draft

**Goal:** Robust background execution, async updates, audit logging, user confirmation.

Based on Anthropic's [Effective Harnesses for Long-Running Agents](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents).

## 1. Background Execution with Async Updates

### 1.1 SubagentHandle

Replace synchronous `runSubagent` with async spawn returning a handle:

```haskell
-- | Handle to a running subagent for status queries and control
data SubagentHandle = SubagentHandle
  { handleId :: SubagentId                -- Unique ID (UUID)
  , handleAsync :: Async SubagentResult   -- async thread handle
  , handleStartTime :: UTCTime
  , handleConfig :: SubagentConfig
  , handleStatus :: TVar SubagentRunStatus
  , handleEvents :: TQueue SubagentEvent  -- Event stream
  }

-- | Runtime status of a subagent (queryable)
data SubagentRunStatus = SubagentRunStatus
  { runIteration :: Int
  , runTokensUsed :: Int
  , runCostCents :: Double
  , runElapsedSeconds :: Int
  , runCurrentActivity :: Text            -- e.g. "Reading https://..."
  , runLastToolCall :: Maybe (Text, UTCTime) -- (tool_name, timestamp)
  }

-- | Subagent lifecycle events for logging/streaming
data SubagentEvent
  = SubagentStarted SubagentId SubagentConfig UTCTime
  | SubagentActivity SubagentId Text UTCTime
  | SubagentToolCall SubagentId Text Aeson.Value UTCTime
  | SubagentToolResult SubagentId Text Bool Text UTCTime
  | SubagentThinking SubagentId Text UTCTime    -- Extended thinking
  | SubagentCost SubagentId Int Double UTCTime  -- tokens, cents
  | SubagentCompleted SubagentId SubagentResult UTCTime
  | SubagentError SubagentId Text UTCTime
  deriving (Show, Eq, Generic)
```

### 1.2 New API

```haskell
-- | Spawn subagent in background, return handle immediately
spawnSubagentAsync :: SubagentApiKeys -> SubagentConfig -> IO SubagentHandle

-- | Query current status (non-blocking)
querySubagentStatus :: SubagentHandle -> IO SubagentRunStatus

-- | Check if complete (non-blocking)
isSubagentDone :: SubagentHandle -> IO Bool

-- | Wait for completion (blocking)
waitSubagent :: SubagentHandle -> IO SubagentResult

-- | Cancel a running subagent
cancelSubagent :: SubagentHandle -> IO ()

-- | Read all events so far (for logging/UI)
drainSubagentEvents :: SubagentHandle -> IO [SubagentEvent]
```

### 1.3 Ava Integration

Ava's orchestrator loop can now:

1. Spawn subagents in the background
2. Continue the conversation with the user
3. Periodically poll for updates and relay them, e.g. `"🔍 WebCrawler running (45s, 12k tokens)..."`
4. Receive the completion and synthesize the result

```haskell
-- In Ava's message handler:
handle <- spawnSubagentAsync keys config

-- Non-blocking check in conversation loop:
status <- querySubagentStatus handle
when (runElapsedSeconds status > 30) $
  sendMessage chat $ "⏳ Subagent still working: " <> runCurrentActivity status

-- When user asks for status:
status <- querySubagentStatus handle
sendMessage chat $ formatSubagentStatus status

-- On completion:
result <- waitSubagent handle
sendMessage chat $ "✅ " <> subagentSummary result
```

## 2. User Confirmation Before Spawning

### 2.1 Confirmation Flow

Before spawning any subagent or long-running process, Ava must state what she plans to run, give time and cost estimates, and wait for the user's approval:

```
User: Research competitors for podcast transcription

Ava: I'll spawn a WebCrawler subagent to research this. Estimated:
     • Time: ~5-10 minutes
     • Cost: up to $0.50
     • Tools: web_search, read_webpages

     Proceed? [Yes/No]

User: Yes

Ava: 🚀 Spawning WebCrawler subagent...
     🔍 [WebCrawler] Starting research...
```

### 2.2 Implementation

```haskell
data SpawnRequest = SpawnRequest
  { spawnConfig :: SubagentConfig
  , spawnEstimatedTime :: (Int, Int) -- (min, max) minutes
  , spawnEstimatedCost :: Double     -- max cents
  , spawnRationale :: Text           -- why we need this
  }

-- | Generate confirmation message for user
formatSpawnConfirmation :: SpawnRequest -> Text

-- | Parse user confirmation response
data ConfirmationResponse
  = Confirmed
  | Rejected
  | Modified SubagentConfig

parseConfirmation :: Text -> ConfirmationResponse
```

### 2.3 Tool Modification

The `spawn_subagent` tool becomes a two-phase operation:

1. **Phase 1 (propose):** Returns a confirmation request without spawning anything
2. **Phase 2 (confirm):** Once the user confirms, actually spawns the subagent

Alternative: add `confirm_spawn` as a separate tool that takes a pending spawn ID.

## 3. Audit Logging System

### 3.1 Log Storage

All agent activity is persisted to append-only JSONL files under `$AVA_DATA_ROOT/logs/`:

```
$AVA_DATA_ROOT/logs/          # e.g. /home/ava/logs/ or _/var/ava/logs/
├── ava/
│   ├── 2024-01-15.jsonl      # Daily Ava conversation logs
│   └── 2024-01-16.jsonl
└── subagents/
    ├── S-7f3a2b.jsonl        # Per-subagent trace (named by SubagentId)
    └── S-9e4c1d.jsonl
```

### 3.2 SubagentId Linking

Each subagent gets a unique `SubagentId` (short UUID prefix) that links:

- The `SubagentResult` returned to Ava
- The JSONL log file (`S-{id}.jsonl`)
- References in Ava's daily log

```haskell
-- | Unique identifier for a subagent run
newtype SubagentId = SubagentId { unSubagentId :: Text }
  deriving (Show, Eq, Generic, Aeson.ToJSON, Aeson.FromJSON)

-- | Generate a new subagent ID (first 6 chars of UUID)
newSubagentId :: IO SubagentId
newSubagentId = SubagentId . Text.take 6 . UUID.toText <$> UUID.nextRandom

-- | Path to subagent's log file (uses (</>) from System.FilePath)
subagentLogPath :: SubagentId -> FilePath
subagentLogPath (SubagentId sid) =
  avaDataRoot </> "logs" </> "subagents" </> Text.unpack sid <> ".jsonl"
```

The `SubagentResult` includes the ID for cross-referencing:

```haskell
data SubagentResult = SubagentResult
  { subagentId :: SubagentId -- NEW: links to S-{id}.jsonl
  , subagentOutput :: Aeson.Value
  , subagentSummary :: Text
  , ...
  }
```

### 3.3 Log Entry Schema

```haskell
data AuditLogEntry = AuditLogEntry
  { logTimestamp :: UTCTime
  , logSessionId :: SessionId     -- Conversation session
  , logAgentId :: AgentId         -- Ava or subagent ID
  , logUserId :: Maybe UserId     -- Human user (Telegram, etc.)
  , logEventType :: AuditEventType
  , logContent :: Aeson.Value
  , logMetadata :: LogMetadata
  }

data AuditEventType
  = UserMessage       -- Incoming user message
  | AssistantMessage  -- Ava response
  | ToolCall          -- Tool invocation
  | ToolResult        -- Tool response
  | SubagentSpawn     -- Subagent created
  | SubagentComplete  -- Subagent finished
  | ExtendedThinking  -- Thinking block content
  | CostUpdate        -- Token/cost tracking
  | ErrorOccurred     -- Any error
  | SessionStart      -- New conversation
  | SessionEnd        -- Conversation ended
  deriving (Show, Eq, Generic)

data LogMetadata = LogMetadata
  { metaInputTokens :: Maybe Int
  , metaOutputTokens :: Maybe Int
  , metaCostCents :: Maybe Double
  , metaModelId :: Maybe Text
  , metaParentAgentId :: Maybe AgentId -- For subagents
  , metaDuration :: Maybe Int          -- Milliseconds
  }
```

### 3.4 Logging Interface

```haskell
-- | Append entry to audit log
writeAuditLog :: AuditLogEntry -> IO ()

-- | Query logs by various criteria
data LogQuery = LogQuery
  { queryAgentId :: Maybe AgentId
  , queryUserId :: Maybe UserId
  , queryTimeRange :: Maybe (UTCTime, UTCTime)
  , queryEventTypes :: Maybe [AuditEventType]
  , querySessionId :: Maybe SessionId
  , queryLimit :: Int
  }

queryAuditLogs :: LogQuery -> IO [AuditLogEntry]

-- | Get recent logs for debugging
getRecentLogs :: AgentId -> Int -> IO [AuditLogEntry]

-- | Search logs by content
searchLogs :: Text -> IO [AuditLogEntry]
```

### 3.5 Tools for Querying Logs

**For Ben (CLI):**

```bash
# View recent Ava logs
ava logs --last 100

# View specific subagent trace by ID
ava logs S-7f3a2b

# Search for errors
ava logs --type error --since "1 hour ago"

# Follow live logs
ava logs -f

# Quick lookup with standard tools
tail -f $AVA_DATA_ROOT/logs/ava/$(date +%Y-%m-%d).jsonl
jq 'select(.eventType == "Error")' $AVA_DATA_ROOT/logs/ava/*.jsonl
cat $AVA_DATA_ROOT/logs/subagents/S-7f3a2b.jsonl | jq .
```

**For Ava (Agent Tool):**

```haskell
-- | Tool for Ava to query her own logs
readAvaLogsTool :: Engine.Tool
readAvaLogsTool = Engine.Tool
  { toolName = "read_ava_logs"
  , toolDescription =
      "Read Ava's audit logs or subagent traces. "
        <> "Use to diagnose issues, review past conversations, or inspect subagent runs."
  , toolJsonSchema = ...
  , toolExecute = executeReadLogs
  }

-- Parameters:
-- { "subagent_id": "S-7f3a2b" }          -- Read specific subagent trace
-- { "last_n": 50 }                       -- Last N entries from today's log
-- { "search": "error", "since": "1h" }   -- Search with time filter
```

This allows Ava to self-diagnose: "Let me check my logs for that subagent run..."

### 3.6 Automatic Logging Hook

Integrate into Engine callbacks so logging is automatic:

```haskell
auditingEngineConfig :: SessionId -> AgentId -> UserId -> EngineConfig
auditingEngineConfig session agent user = EngineConfig
  { engineOnActivity = \txt ->
      writeAuditLog $ mkActivityEntry session agent txt
  , engineOnToolCall = \name args ->
      writeAuditLog $ mkToolCallEntry session agent name args
  , engineOnToolResult = \name success output ->
      writeAuditLog $ mkToolResultEntry session agent name success output
  , engineOnCost = \tokens cents ->
      writeAuditLog $ mkCostEntry session agent tokens cents
  , engineOnError = \err ->
      writeAuditLog $ mkErrorEntry session agent err
  , ...
  }
```

## 4. Subagent Thinking Logs

Capture extended thinking for debugging:

```haskell
-- In Engine, when extended thinking is enabled:
onThinkingBlock :: Text -> IO ()
onThinkingBlock content = do
  ts <- getCurrentTime
  writeAuditLog $ AuditLogEntry
    { logTimestamp = ts -- timestamp captured above
    , logEventType = ExtendedThinking
    , logContent = object ["thinking" .= content]
    , ...
    }
```

## 5. Implementation Plan

### Phase 1: Audit Logging (Foundation)

1. Create `Omni/Agent/AuditLog.hs` with types and writers
2. Integrate into Engine callbacks
3. Add CLI commands: `jr agent logs`
4. Migrate existing status logging to audit system

### Phase 2: Async Subagent Execution

1. Create `SubagentHandle` and `SubagentRunStatus`
2. Implement `spawnSubagentAsync`, `querySubagentStatus`
3. Add event queue for real-time updates
4. Update Ava integration for background polling

### Phase 3: User Confirmation

1. Add confirmation prompt generation
2. Implement two-phase spawn flow
3. Update Telegram handler for confirmation UX
4. Add timeout for pending confirmations

### Phase 4: CLI & Diagnostics

1. Full `jr agent logs` implementation with queries
2. Live log streaming (`-f` flag)
3. Subagent dashboard in status output
4. Health checks and metrics

## 6. Example Session with All Features

```
[14:05:22] User (ben): Research podcast transcription pricing

[14:05:23] Ava → User: I'll spawn a WebCrawler subagent to research competitor pricing.
           Estimated: 5-10 min, up to $0.50
           Proceed? [Yes/No]

[14:05:28] User (ben): yes

[14:05:29] Ava → User: 🚀 Spawning WebCrawler subagent (S-7f3a2b)...
[14:05:29] [AUDIT] SubagentSpawn S-7f3a2b role=WebCrawler user=ben session=sess-123

[14:05:30] [AUDIT/S-7f3a2b] ToolCall web_search {"query": "podcast transcription pricing 2024"}
[14:05:32] [AUDIT/S-7f3a2b] ToolResult web_search success=true

[14:06:00] Ava → User: ⏳ Research in progress (30s, reading otter.ai/pricing...)

[14:07:45] [AUDIT/S-7f3a2b] SubagentComplete status=success cost=$0.24 tokens=45000

[14:07:46] Ava → User: ✅ Research complete! Found 5 competitors...
           [structured findings with citations]

# Later debugging:
$ jr agent logs S-7f3a2b
[14:05:30] ToolCall web_search {"query": "podcast transcription pricing 2024"}
[14:05:32] ToolResult web_search (success, 5 results)
[14:05:35] Thinking: "Looking at search results, otter.ai and descript appear most relevant..."
[14:05:40] ToolCall read_webpages {"urls": ["https://otter.ai/pricing"]}
...
```

## 7. References

- Anthropic: [Effective Harnesses for Long-Running Agents](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents)
- Current: `Omni/Agent/Subagent.hs`, `Omni/Agent/Event.hs`
- Async: `Control.Concurrent.Async`, `Control.Concurrent.STM`
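
## 8. Appendix: Handle API Sketch

The handle API from section 1.2 could be prototyped roughly as follows. This is a minimal sketch, not the planned implementation: `RunHandle`, `Status`, `Event`, `spawn`, and the other names are hypothetical, cut-down stand-ins for `SubagentHandle`, `SubagentRunStatus`, `SubagentEvent`, and `spawnSubagentAsync`, and the worker only steps through a list of activity labels instead of calling a model. It assumes the `async`, `stm`, and `text` packages.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (Async, async, cancel, poll, wait)
import Control.Concurrent.STM
  ( TQueue, TVar, atomically, flushTQueue, modifyTVar'
  , newTQueueIO, newTVarIO, readTVarIO, writeTQueue )
import Data.Maybe (isJust)
import Data.Text (Text)
import qualified Data.Text as Text

-- Hypothetical, reduced versions of the types in section 1.1.
data Status = Status { iteration :: Int, activity :: Text }
  deriving (Show, Eq)

data Event = Started | Activity Text | Completed Text
  deriving (Show, Eq)

data RunHandle = RunHandle
  { hAsync :: Async Text    -- worker thread; its result is a summary
  , hStatus :: TVar Status  -- latest status, readable without blocking
  , hEvents :: TQueue Event -- event stream for logging/UI
  }

-- | Start the worker in the background and return a handle immediately.
spawn :: [Text] -> IO RunHandle
spawn steps = do
  statusVar <- newTVarIO (Status 0 "starting")
  events <- newTQueueIO
  worker <- async $ do
    atomically $ writeTQueue events Started
    mapM_ (runStep statusVar events) steps
    let summary = Text.pack (show (length steps)) <> " steps done"
    atomically $ writeTQueue events (Completed summary)
    pure summary
  pure (RunHandle worker statusVar events)
  where
    -- Update status and emit an event in one STM transaction.
    runStep statusVar events step = do
      atomically $ do
        modifyTVar' statusVar $ \s ->
          s { iteration = iteration s + 1, activity = step }
        writeTQueue events (Activity step)
      threadDelay 10000 -- stand-in for real work (10 ms)

-- | Non-blocking status query.
queryStatus :: RunHandle -> IO Status
queryStatus = readTVarIO . hStatus

-- | Non-blocking completion check (True also when the worker failed).
isDone :: RunHandle -> IO Bool
isDone h = isJust <$> poll (hAsync h)

-- | Block until the worker finishes; rethrows if it failed.
waitResult :: RunHandle -> IO Text
waitResult = wait . hAsync

-- | Cancel the worker thread.
cancelRun :: RunHandle -> IO ()
cancelRun = cancel . hAsync

-- | Remove and return all events queued so far.
drainEvents :: RunHandle -> IO [Event]
drainEvents = atomically . flushTQueue . hEvents

main :: IO ()
main = do
  h <- spawn ["search", "read", "summarize"]
  result <- waitResult h
  status <- queryStatus h
  events <- drainEvents h
  print result
  print status
  print (length events)
```

After `waitResult` returns, the status shows iteration 3 with activity `"summarize"`, and five events have been drained (one `Started`, three `Activity`, one `Completed`). Keeping status in a `TVar` and events in a `TQueue` lets the orchestrator poll without blocking on the worker, which is the property the integration in section 1.3 relies on.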