# Subagent Hardening Design

**Status:** Draft  
**Goal:** Robust background execution, async updates, audit logging, user confirmation.

Based on Anthropic's [Effective Harnesses for Long-Running Agents](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents).

## 1. Background Execution with Async Updates

### 1.1 SubagentHandle

Replace the synchronous `runSubagent` with an asynchronous spawn that returns a handle:

```haskell
-- | Handle to a running subagent for status queries and control
data SubagentHandle = SubagentHandle
  { handleId :: SubagentId              -- Unique ID (short UUID prefix, see 3.2)
  , handleAsync :: Async SubagentResult -- Async thread handle
  , handleStartTime :: UTCTime
  , handleConfig :: SubagentConfig
  , handleStatus :: TVar SubagentRunStatus
  , handleEvents :: TQueue SubagentEvent  -- Event stream
  }

-- | Runtime status of a subagent (queryable)
data SubagentRunStatus = SubagentRunStatus
  { runIteration :: Int
  , runTokensUsed :: Int
  , runCostCents :: Double
  , runElapsedSeconds :: Int
  , runCurrentActivity :: Text  -- e.g. "Reading https://..."
  , runLastToolCall :: Maybe (Text, UTCTime)  -- (tool_name, timestamp)
  }

-- | Subagent lifecycle events for logging/streaming
data SubagentEvent
  = SubagentStarted SubagentId SubagentConfig UTCTime
  | SubagentActivity SubagentId Text UTCTime
  | SubagentToolCall SubagentId Text Aeson.Value UTCTime
  | SubagentToolResult SubagentId Text Bool Text UTCTime
  | SubagentThinking SubagentId Text UTCTime  -- Extended thinking
  | SubagentCost SubagentId Int Double UTCTime  -- tokens, cents
  | SubagentCompleted SubagentId SubagentResult UTCTime
  | SubagentError SubagentId Text UTCTime
  deriving (Show, Eq, Generic)
```

### 1.2 New API

```haskell
-- | Spawn subagent in background, return handle immediately
spawnSubagentAsync :: SubagentApiKeys -> SubagentConfig -> IO SubagentHandle

-- | Query current status (non-blocking)
querySubagentStatus :: SubagentHandle -> IO SubagentRunStatus

-- | Check if complete (non-blocking)
isSubagentDone :: SubagentHandle -> IO Bool

-- | Wait for completion (blocking)
waitSubagent :: SubagentHandle -> IO SubagentResult

-- | Cancel a running subagent
cancelSubagent :: SubagentHandle -> IO ()

-- | Remove and return all events queued so far (for logging/UI); each event is returned once
drainSubagentEvents :: SubagentHandle -> IO [SubagentEvent]
```
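
To make the semantics of these calls concrete, here is a minimal sketch of one way they could fit together. It is not the planned implementation: it swaps the design's `Async` for `forkFinally` plus a `TMVar` so that it depends only on `base`, `stm`, and `text`, and the names `RunHandle`, `RunStatus`, `Event`, `spawnAsync`, and `reportActivity` are simplified stand-ins invented for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Concurrent (ThreadId, forkFinally, killThread)
import Control.Concurrent.STM
  ( TMVar, TQueue, TVar, atomically, flushTQueue, isEmptyTMVar, modifyTVar'
  , newEmptyTMVarIO, newTQueueIO, newTVarIO, putTMVar, readTMVar, readTVarIO
  , writeTQueue )
import Control.Exception (SomeException)
import Data.Text (Text)

-- Simplified stand-ins for SubagentRunStatus and SubagentEvent (assumptions).
data RunStatus = RunStatus { runIteration :: Int, runCurrentActivity :: Text }
  deriving (Show, Eq)

data Event = Started | Activity Text | Finished
  deriving (Show, Eq)

data RunHandle = RunHandle
  { handleThread :: ThreadId
  , handleResult :: TMVar (Either SomeException Text) -- filled exactly once
  , handleStatus :: TVar RunStatus
  , handleEvents :: TQueue Event
  }

-- | Start the worker on its own thread and return immediately.
spawnAsync :: (TVar RunStatus -> TQueue Event -> IO Text) -> IO RunHandle
spawnAsync worker = do
  statusVar <- newTVarIO (RunStatus 0 "starting")
  events <- newTQueueIO
  resultVar <- newEmptyTMVarIO
  atomically (writeTQueue events Started)
  tid <- forkFinally (worker statusVar events) $ \outcome ->
    atomically $ do
      writeTQueue events Finished
      putTMVar resultVar outcome
  pure (RunHandle tid resultVar statusVar events)

-- | Worker-side helper: update the status snapshot and emit an event together.
reportActivity :: TVar RunStatus -> TQueue Event -> Text -> IO ()
reportActivity statusVar events what = atomically $ do
  modifyTVar' statusVar $ \s ->
    s { runIteration = runIteration s + 1, runCurrentActivity = what }
  writeTQueue events (Activity what)

-- | Non-blocking snapshot of the current status.
queryStatus :: RunHandle -> IO RunStatus
queryStatus = readTVarIO . handleStatus

-- | Non-blocking completion check.
isDone :: RunHandle -> IO Bool
isDone h = atomically (not <$> isEmptyTMVar (handleResult h))

-- | Block until the worker finishes, fails, or is cancelled.
waitResult :: RunHandle -> IO (Either SomeException Text)
waitResult h = atomically (readTMVar (handleResult h))

-- | Cancel the worker; the result then holds the kill exception.
cancelRun :: RunHandle -> IO ()
cancelRun = killThread . handleThread

-- | Remove and return every event queued so far.
drainEvents :: RunHandle -> IO [Event]
drainEvents h = atomically (flushTQueue (handleEvents h))
```

Because the status lives in a `TVar` and events in a `TQueue`, a status query never consumes anything, while draining is destructive: a second call returns only events emitted since the first.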

### 1.3 Ava Integration

Ava's orchestrator loop can now:
1. Spawn subagents in background
2. Continue conversation with user
3. Periodically poll for updates: `"🔍 WebCrawler running (45s, 12k tokens)..."`
4. Receive completion and synthesize result

```haskell
-- In Ava's message handler:
handle <- spawnSubagentAsync keys config

-- Non-blocking check in conversation loop:
status <- querySubagentStatus handle
when (runElapsedSeconds status > 30) $
  sendMessage chat $ "⏳ Subagent still working: " <> runCurrentActivity status

-- When user asks for status:
status <- querySubagentStatus handle
sendMessage chat $ formatSubagentStatus status

-- On completion (after isSubagentDone returns True, so this does not block):
result <- waitSubagent handle
sendMessage chat $ "✅ " <> subagentSummary result
```
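
The handler above calls `formatSubagentStatus` without defining it. A possible shape is sketched below, assuming a plain-text summary of elapsed time, tokens, cost, and current activity; the exact wording is a guess. `progressDue` is a hypothetical helper that throttles the "still working" nudge so it is not resent on every poll once the 30-second mark has passed.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T
import Text.Printf (printf)

-- | Subset of SubagentRunStatus from section 1.1.
data SubagentRunStatus = SubagentRunStatus
  { runTokensUsed :: Int
  , runCostCents :: Double
  , runElapsedSeconds :: Int
  , runCurrentActivity :: Text
  }

-- | Abbreviate token counts: 12000 becomes "12k", 950 stays "950".
formatTokens :: Int -> Text
formatTokens n
  | n >= 1000 = T.pack (show (n `div` 1000)) <> "k"
  | otherwise = T.pack (show n)

-- | One-line status summary for chat (layout is an assumption).
formatSubagentStatus :: SubagentRunStatus -> Text
formatSubagentStatus s =
  T.concat
    [ T.pack (show (runElapsedSeconds s)), "s, "
    , formatTokens (runTokensUsed s), " tokens, $"
    , T.pack (printf "%.2f" (runCostCents s / 100)) -- cents to dollars
    , ": "
    , runCurrentActivity s
    ]

-- | Hypothetical throttle: send a progress message only when at least
-- @interval@ seconds have elapsed since the last one was sent.
progressDue :: Int -> Int -> Int -> Bool
progressDue interval lastSentAt elapsed = elapsed - lastSentAt >= interval
```

With this throttle the conversation loop would remember the elapsed time at which it last sent an update and pass it as `lastSentAt` on the next poll.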

## 2. User Confirmation Before Spawning

### 2.1 Confirmation Flow

Before spawning any subagent or long-running process, Ava must present an estimate and get explicit approval from the user:

```
User: Research competitors for podcast transcription

Ava: I'll spawn a WebCrawler subagent to research this. Estimated:
     • Time: ~5-10 minutes
     • Cost: up to $0.50
     • Tools: web_search, read_webpages
     
     Proceed? [Yes/No]

User: Yes

Ava: 🚀 Spawning WebCrawler subagent...
     🔍 [WebCrawler] Starting research...
```

### 2.2 Implementation

```haskell
data SpawnRequest = SpawnRequest
  { spawnConfig :: SubagentConfig
  , spawnEstimatedTime :: (Int, Int)  -- (min, max) minutes
  , spawnEstimatedCost :: Double      -- max cents
  , spawnRationale :: Text            -- why we need this
  }

-- | Generate confirmation message for user
formatSpawnConfirmation :: SpawnRequest -> Text

-- | Parse user confirmation response
data ConfirmationResponse = Confirmed | Rejected | Modified SubagentConfig

parseConfirmation :: Text -> ConfirmationResponse
```
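
The two functions above are only declared. The sketch below shows one plausible behavior under simplifying assumptions: `spawnRole` and `spawnTools` are hypothetical fields standing in for `spawnConfig`, the `Modified` case is left out, and unrecognized replies yield `Nothing` so the caller can ask again rather than guess.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T
import Text.Printf (printf)

-- | Simplified request; spawnRole and spawnTools are illustrative stand-ins.
data SpawnRequest = SpawnRequest
  { spawnRole :: Text
  , spawnTools :: [Text]
  , spawnEstimatedTime :: (Int, Int) -- (min, max) minutes
  , spawnEstimatedCost :: Double     -- max cents
  }

data ConfirmationResponse = Confirmed | Rejected
  deriving (Show, Eq)

-- | Render the prompt shown in section 2.1.
formatSpawnConfirmation :: SpawnRequest -> Text
formatSpawnConfirmation req =
  T.unlines
    [ "I'll spawn a " <> spawnRole req <> " subagent. Estimated:"
    , "  • Time: ~" <> T.pack (show lo) <> "-" <> T.pack (show hi) <> " minutes"
    , "  • Cost: up to $" <> T.pack (printf "%.2f" (spawnEstimatedCost req / 100))
    , "  • Tools: " <> T.intercalate ", " (spawnTools req)
    , "Proceed? [Yes/No]"
    ]
  where
    (lo, hi) = spawnEstimatedTime req

-- | Interpret a free-text reply; Nothing means "unclear, ask again".
parseConfirmation :: Text -> Maybe ConfirmationResponse
parseConfirmation raw
  | answer `elem` ["yes", "y", "ok", "proceed"] = Just Confirmed
  | answer `elem` ["no", "n", "cancel"] = Just Rejected
  | otherwise = Nothing
  where
    answer = T.toLower (T.strip raw)
```

Treating anything ambiguous as "not confirmed" keeps the default safe: a subagent is never spawned on a reply the parser did not understand.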

### 2.3 Tool Modification

The `spawn_subagent` tool becomes a two-phase operation:

1. **Phase 1 (propose):** Returns confirmation request, doesn't spawn
2. **Phase 2 (confirm):** User confirms, actually spawns

Alternative: Add `confirm_spawn` as separate tool that takes a pending spawn ID.
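
Either variant needs somewhere to hold proposals between the two phases. A minimal sketch of such a registry follows, written as pure functions over a `Map`; `PendingSpawns`, `propose`, `confirm`, and `expireOlderThan` are hypothetical names, timestamps are plain seconds, and `req` stands in for `SpawnRequest`.

```haskell
import qualified Data.Map.Strict as Map

newtype PendingId = PendingId Int
  deriving (Show, Eq, Ord)

-- | Proposals awaiting a user decision, stored with their proposal time.
data PendingSpawns req = PendingSpawns
  { nextPendingId :: Int
  , pendingById :: Map.Map PendingId (Int, req) -- (proposedAt seconds, request)
  }

emptyPending :: PendingSpawns req
emptyPending = PendingSpawns 0 Map.empty

-- | Phase 1: record the proposal and hand back its ID; nothing is spawned.
propose :: Int -> req -> PendingSpawns req -> (PendingId, PendingSpawns req)
propose now req (PendingSpawns n m) =
  (pid, PendingSpawns (n + 1) (Map.insert pid (now, req) m))
  where
    pid = PendingId n

-- | Phase 2: remove the proposal and return it so the caller can spawn it.
-- A second confirm for the same ID returns Nothing.
confirm :: PendingId -> PendingSpawns req -> (Maybe req, PendingSpawns req)
confirm pid st =
  ( snd <$> Map.lookup pid (pendingById st)
  , st { pendingById = Map.delete pid (pendingById st) }
  )

-- | Drop proposals older than @maxAge@ seconds.
expireOlderThan :: Int -> Int -> PendingSpawns req -> PendingSpawns req
expireOlderThan maxAge now st =
  st { pendingById = Map.filter (\(t, _) -> now - t <= maxAge) (pendingById st) }
```

Removing the entry on confirmation makes the operation one-shot, so a repeated "yes" cannot start the same subagent twice. In a running system this state would sit behind a `TVar` or similar.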

## 3. Audit Logging System

### 3.1 Log Storage

All agent activity is persisted to append-only JSONL files under `$AVA_DATA_ROOT/logs/`:

```
$AVA_DATA_ROOT/logs/           # e.g. /home/ava/logs/ or _/var/ava/logs/
├── ava/
│   ├── 2024-01-15.jsonl       # Daily Ava conversation logs
│   └── 2024-01-16.jsonl
└── subagents/
    ├── S-7f3a2b.jsonl         # Per-subagent trace (named by SubagentId)
    └── S-9e4c1d.jsonl
```
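
As a rough illustration of this layout, the sketch below derives both kinds of path and appends one line per entry. The function names are hypothetical, the data root is passed in explicitly rather than read from the environment, and the entry is assumed to arrive already encoded as single-line JSON (for example, the output of `Aeson.encode`).

```haskell
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Data.Time (UTCTime (..), showGregorian)
import System.Directory (createDirectoryIfMissing)
import System.FilePath (takeDirectory, (</>))

-- | Daily Ava log: <root>/logs/ava/YYYY-MM-DD.jsonl (UTC date).
avaDailyLogPath :: FilePath -> UTCTime -> FilePath
avaDailyLogPath root ts =
  root </> "logs" </> "ava" </> (showGregorian (utctDay ts) <> ".jsonl")

-- | Per-subagent trace: <root>/logs/subagents/S-<id>.jsonl.
subagentTracePath :: FilePath -> Text -> FilePath
subagentTracePath root sid =
  root </> "logs" </> "subagents" </> ("S-" <> T.unpack sid <> ".jsonl")

-- | Append one already-encoded JSON value as a single line,
-- creating the parent directory on first use.
appendJsonLine :: FilePath -> Text -> IO ()
appendJsonLine path encoded = do
  createDirectoryIfMissing True (takeDirectory path)
  TIO.appendFile path (encoded <> T.pack "\n")
```

This sketch assumes a single writer per file. If several threads log to the same file, the append would need to be serialized, for instance through one writer thread fed by a queue.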

### 3.2 SubagentId Linking

Each subagent gets a unique `SubagentId` (short UUID prefix) that links:
- The `SubagentResult` returned to Ava
- The JSONL log file (`S-{id}.jsonl`)
- References in Ava's daily log

```haskell
-- | Unique identifier for a subagent run
newtype SubagentId = SubagentId { unSubagentId :: Text }
  deriving (Show, Eq, Generic, Aeson.ToJSON, Aeson.FromJSON)

-- | Generate a new subagent ID (first 6 chars of UUID)
newSubagentId :: IO SubagentId
newSubagentId = SubagentId . Text.take 6 . UUID.toText <$> UUID.nextRandom

-- | Path to subagent's log file: logs/subagents/S-{id}.jsonl
subagentLogPath :: SubagentId -> FilePath
subagentLogPath (SubagentId sid) =
  avaDataRoot </> "logs" </> "subagents" </> ("S-" <> Text.unpack sid <> ".jsonl")
```

The `SubagentResult` includes the ID for cross-referencing:

```haskell
data SubagentResult = SubagentResult
  { subagentId :: SubagentId        -- NEW: links to S-{id}.jsonl
  , subagentOutput :: Aeson.Value
  , subagentSummary :: Text
  , ...
  }
```

### 3.3 Log Entry Schema

```haskell
data AuditLogEntry = AuditLogEntry
  { logTimestamp :: UTCTime
  , logSessionId :: SessionId      -- Conversation session
  , logAgentId :: AgentId          -- Ava or subagent ID
  , logUserId :: Maybe UserId      -- Human user (Telegram, etc.)
  , logEventType :: AuditEventType
  , logContent :: Aeson.Value
  , logMetadata :: LogMetadata
  }

data AuditEventType
  = UserMessage           -- Incoming user message
  | AssistantMessage      -- Ava response
  | ToolCall              -- Tool invocation
  | ToolResult            -- Tool response
  | SubagentSpawn         -- Subagent created
  | SubagentComplete      -- Subagent finished
  | ExtendedThinking      -- Thinking block content
  | CostUpdate            -- Token/cost tracking
  | ErrorOccurred         -- Any error
  | SessionStart          -- New conversation
  | SessionEnd            -- Conversation ended
  deriving (Show, Eq, Generic)

data LogMetadata = LogMetadata
  { metaInputTokens :: Maybe Int
  , metaOutputTokens :: Maybe Int
  , metaCostCents :: Maybe Double
  , metaModelId :: Maybe Text
  , metaParentAgentId :: Maybe AgentId  -- For subagents
  , metaDuration :: Maybe Int           -- Milliseconds
  }
```

### 3.4 Logging Interface

```haskell
-- | Append entry to audit log
writeAuditLog :: AuditLogEntry -> IO ()

-- | Query logs by various criteria
data LogQuery = LogQuery
  { queryAgentId :: Maybe AgentId
  , queryUserId :: Maybe UserId
  , queryTimeRange :: Maybe (UTCTime, UTCTime)
  , queryEventTypes :: Maybe [AuditEventType]
  , querySessionId :: Maybe SessionId
  , queryLimit :: Int
  }

queryAuditLogs :: LogQuery -> IO [AuditLogEntry]

-- | Get recent logs for debugging
getRecentLogs :: AgentId -> Int -> IO [AuditLogEntry]

-- | Search logs by content
searchLogs :: Text -> IO [AuditLogEntry]
```
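
Since every field of `LogQuery` is optional, a query amounts to a conjunction of the filters that are present. The sketch below illustrates that reading over an in-memory list; `Entry` and `Query` are cut-down stand-ins, timestamps are epoch seconds instead of `UTCTime`, and a real implementation would stream entries from the JSONL files instead.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)

data EventType = ToolCall | ToolResult | ErrorOccurred
  deriving (Show, Eq)

-- | Cut-down AuditLogEntry (assumption: epoch seconds for the timestamp).
data Entry = Entry
  { entryTime :: Int
  , entryAgent :: Text
  , entryType :: EventType
  } deriving (Show, Eq)

-- | Cut-down LogQuery; Nothing means "do not filter on this field".
data Query = Query
  { queryAgentId :: Maybe Text
  , queryTimeRange :: Maybe (Int, Int) -- inclusive (from, to)
  , queryEventTypes :: Maybe [EventType]
  , queryLimit :: Int
  }

-- | An entry matches when it passes every filter that is set.
matchesQuery :: Query -> Entry -> Bool
matchesQuery q e =
  and
    [ maybe True (== entryAgent e) (queryAgentId q)
    , maybe True (\(from, to) -> from <= entryTime e && entryTime e <= to) (queryTimeRange q)
    , maybe True (entryType e `elem`) (queryEventTypes q)
    ]

-- | Apply the filters, then cap the result at queryLimit entries.
runQuery :: Query -> [Entry] -> [Entry]
runQuery q = take (queryLimit q) . filter (matchesQuery q)
```

Because `take` is lazy, the scan stops as soon as the limit is reached, which matters once the input is a large log rather than a short list.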

### 3.5 Tools for Querying Logs

**For Ben (CLI):**

```bash
# View recent Ava logs
jr agent logs --last 100

# View specific subagent trace by ID
jr agent logs S-7f3a2b

# Search for errors
jr agent logs --type error --since "1 hour ago"

# Follow live logs
jr agent logs -f

# Quick lookup with standard tools
tail -f $AVA_DATA_ROOT/logs/ava/$(date +%Y-%m-%d).jsonl
jq 'select(.eventType == "ErrorOccurred")' $AVA_DATA_ROOT/logs/ava/*.jsonl
jq . $AVA_DATA_ROOT/logs/subagents/S-7f3a2b.jsonl
```

**For Ava (Agent Tool):**

```haskell
-- | Tool for Ava to query her own logs
readAvaLogsTool :: Engine.Tool
readAvaLogsTool = Engine.Tool
  { toolName = "read_ava_logs"
  , toolDescription = 
      "Read Ava's audit logs or subagent traces. "
      <> "Use to diagnose issues, review past conversations, or inspect subagent runs."
  , toolJsonSchema = ...
  , toolExecute = executeReadLogs
  }

-- Parameters:
-- { "subagent_id": "S-7f3a2b" }           -- Read specific subagent trace
-- { "last_n": 50 }                         -- Last N entries from today's log
-- { "search": "error", "since": "1h" }     -- Search with time filter
```

This allows Ava to self-diagnose: "Let me check my logs for that subagent run..."
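
The `since` parameter above uses a compact duration such as `"1h"`. The design does not specify that format, so the following is only a sketch of one possible parser, assuming a non-negative integer followed by a single unit letter (`s`, `m`, `h`, or `d`); `parseSince` is a hypothetical name.

```haskell
import Data.Char (isDigit)

-- | Parse a compact duration like "1h" or "30m" into seconds.
-- Returns Nothing for anything that is not <digits><unit>.
parseSince :: String -> Maybe Int
parseSince input =
  case span isDigit input of
    (digits@(_ : _), [unit]) -> (* read digits) <$> unitSeconds unit
    _ -> Nothing
  where
    unitSeconds 's' = Just 1
    unitSeconds 'm' = Just 60
    unitSeconds 'h' = Just 3600
    unitSeconds 'd' = Just 86400
    unitSeconds _ = Nothing
```

The tool would subtract the resulting number of seconds from the current time to get the lower bound of the time range. Phrases such as `"1 hour ago"` from the CLI examples are rejected by this sketch and would need their own parser.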

### 3.6 Automatic Logging Hook

Integrate into Engine callbacks so logging is automatic:

```haskell
auditingEngineConfig :: SessionId -> AgentId -> UserId -> EngineConfig
auditingEngineConfig session agent user = EngineConfig
  { engineOnActivity = \txt -> writeAuditLog $ mkActivityEntry session agent txt
  , engineOnToolCall = \name args -> writeAuditLog $ mkToolCallEntry session agent name args
  , engineOnToolResult = \name success output -> writeAuditLog $ mkToolResultEntry session agent name success output
  , engineOnCost = \tokens cents -> writeAuditLog $ mkCostEntry session agent tokens cents
  , engineOnError = \err -> writeAuditLog $ mkErrorEntry session agent err
  , ...
  }
```

## 4. Subagent Thinking Logs

Capture extended thinking for debugging:

```haskell
-- In Engine, when extended thinking is enabled:
onThinkingBlock :: Text -> IO ()
onThinkingBlock content = do
  ts <- getCurrentTime
  writeAuditLog $ AuditLogEntry
    { logTimestamp = ts
    , logEventType = ExtendedThinking
    , logContent = object ["thinking" .= content]
    , ...
    }
```

## 5. Implementation Plan

### Phase 1: Audit Logging (Foundation)
1. Create `Omni/Agent/AuditLog.hs` with types and writers
2. Integrate into Engine callbacks
3. Add CLI commands: `jr agent logs`
4. Migrate existing status logging to audit system

### Phase 2: Async Subagent Execution
1. Create `SubagentHandle` and `SubagentRunStatus`
2. Implement `spawnSubagentAsync`, `querySubagentStatus`
3. Add event queue for real-time updates
4. Update Ava integration for background polling

### Phase 3: User Confirmation
1. Add confirmation prompt generation
2. Implement two-phase spawn flow
3. Update Telegram handler for confirmation UX
4. Add timeout for pending confirmations

### Phase 4: CLI & Diagnostics
1. Full `jr agent logs` implementation with queries
2. Live log streaming (`-f` flag)
3. Subagent dashboard in status output
4. Health checks and metrics

## 6. Example Session with All Features

```
[14:05:22] User (ben): Research podcast transcription pricing

[14:05:23] Ava → User: I'll spawn a WebCrawler subagent to research competitor pricing.
         Estimated: 5-10 min, up to $0.50
         Proceed? [Yes/No]

[14:05:28] User (ben): yes

[14:05:29] Ava → User: 🚀 Spawning WebCrawler subagent (S-7f3a2b)...
[14:05:29] [AUDIT] SubagentSpawn S-7f3a2b role=WebCrawler user=ben session=sess-123

[14:05:30] [AUDIT/S-7f3a2b] ToolCall web_search {"query": "podcast transcription pricing 2024"}
[14:05:32] [AUDIT/S-7f3a2b] ToolResult web_search success=true

[14:06:00] Ava → User: ⏳ Research in progress (30s, reading otter.ai/pricing...)

[14:07:45] [AUDIT/S-7f3a2b] SubagentComplete status=success cost=$0.24 tokens=45000

[14:07:46] Ava → User: ✅ Research complete! Found 5 competitors...
         [structured findings with citations]

# Later debugging:
$ jr agent logs S-7f3a2b
[14:05:30] ToolCall web_search {"query": "podcast transcription pricing 2024"}
[14:05:32] ToolResult web_search (success, 5 results)
[14:05:35] Thinking: "Looking at search results, otter.ai and descript appear most relevant..."
[14:05:40] ToolCall read_webpages {"urls": ["https://otter.ai/pricing"]}
...
```

## 7. References

- Anthropic: [Effective Harnesses for Long-Running Agents](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents)
- Current: `Omni/Agent/Subagent.hs`, `Omni/Agent/Event.hs`
- Async: `Control.Concurrent.Async`, `Control.Concurrent.STM`