Omni/Agent/Subagent/DESIGN.md


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352

# Subagent System Design

**Status:** Draft  
**Goal:** Enable Ava (orchestrator) to spawn specialized subagents for parallel, token-intensive tasks.

## 1. Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                    Ava (Orchestrator)                       │
│  Model: claude-sonnet-4.5 (via OpenRouter)                  │
│  Role: Task decomposition, delegation, synthesis            │
│  Memory: Read/Write access                                  │
├─────────────────────────────────────────────────────────────┤
│  Tools: spawn_subagent, all existing Ava tools              │
└───────────────┬───────────────────────────────────────┬─────┘
                │                                       │
                ▼                                       ▼
┌───────────────────────────┐       ┌───────────────────────────┐
│  Subagent: WebCrawler     │       │  Subagent: CodeReviewer   │
│  Model: claude-haiku      │       │  Model: claude-opus       │
│  Tools: web_search,       │       │  Tools: read_file,        │
│         http_get,         │       │         search_codebase,  │
│         python_exec       │       │         run_bash          │
│  Memory: Read-only        │       │  Memory: Read-only        │
│  Limits: 600s, $0.50      │       │  Limits: 300s, $1.00      │
└───────────────────────────┘       └───────────────────────────┘
```

## 2. Key Design Decisions

### 2.1 Hierarchical (No Sub-Subagents)
- Subagents cannot spawn their own subagents
- Prevents runaway token consumption
- Keeps orchestrator in control

### 2.2 Memory Access
- **Orchestrator (Ava):** Full read/write to Memory system
- **Subagents:** Read-only access to memories
- Prevents conflicting memory writes from parallel agents

### 2.3 Model Selection by Role
| Role | Model | Rationale |
|------|-------|-----------|
| Orchestrator | claude-sonnet-4.5 | Balance of capability/cost |
| Deep reasoning | claude-opus | Complex analysis, architecture |
| Quick tasks | claude-haiku | Fast, cheap for simple lookups |
| Code tasks | claude-sonnet | Good code understanding |

### 2.4 Resource Limits (Guardrails)
Each subagent has strict limits:
- **Timeout:** Max wall-clock time (default: 600s)
- **Cost cap:** Max spend in cents (default: 50c)
- **Token cap:** Max total tokens (default: 100k)
- **Iteration cap:** Max agent loop iterations (default: 20)

### 2.5 Extended Thinking
- Configurable per-subagent
- Enabled for deep research tasks
- Disabled for quick lookups

## 3. Data Types

```haskell
-- | Subagent role determines toolset and model
data SubagentRole
  = WebCrawler        -- Deep web research
  | CodeReviewer      -- Code analysis, PR review
  | DataExtractor     -- Structured data extraction
  | Researcher        -- General research with web+docs
  | CustomRole Text   -- User-defined role
  deriving (Show, Eq, Generic)

-- | Configuration for spawning a subagent
data SubagentConfig = SubagentConfig
  { subagentRole :: SubagentRole
  , subagentTask :: Text           -- What to accomplish
  , subagentModel :: Maybe Text    -- Override default model
  , subagentTimeout :: Int         -- Seconds (default: 600)
  , subagentMaxCost :: Double      -- Cents (default: 50.0)
  , subagentMaxTokens :: Int       -- Default: 100000
  , subagentMaxIterations :: Int   -- Default: 20
  , subagentExtendedThinking :: Bool
  , subagentContext :: Maybe Text  -- Additional context from orchestrator
  } deriving (Show, Eq, Generic)

-- | Result returned by subagent to orchestrator
data SubagentResult = SubagentResult
  { subagentOutput :: Aeson.Value  -- Structured result
  , subagentSummary :: Text        -- Human-readable summary
  , subagentConfidence :: Double   -- 0.0-1.0 confidence score
  , subagentTokensUsed :: Int
  , subagentCostCents :: Double
  , subagentDuration :: Int        -- Seconds
  , subagentIterations :: Int
  , subagentStatus :: SubagentStatus
  } deriving (Show, Eq, Generic)

data SubagentStatus
  = SubagentSuccess
  | SubagentTimeout
  | SubagentCostExceeded
  | SubagentError Text
  deriving (Show, Eq, Generic)
```

## 4. Tool: spawn_subagent

This is the main interface for the orchestrator to spawn subagents.

```haskell
spawnSubagentTool :: Engine.Tool
spawnSubagentTool = Engine.Tool
  { toolName = "spawn_subagent"
  , toolDescription = 
      "Spawn a specialized subagent for a focused task. "
      <> "Use for tasks that benefit from deep exploration, parallel execution, "
      <> "or specialized tools. The subagent will iterate until task completion "
      <> "or resource limits are reached."
  , toolJsonSchema = ...
  , toolExecute = executeSpawnSubagent
  }
```

**Parameters:**
```json
{
  "role": "web_crawler | code_reviewer | data_extractor | researcher | custom",
  "task": "Research competitor pricing for podcast transcription services",
  "context": "We're building a pricing page and need market data",
  "model": "claude-haiku",
  "timeout": 600,
  "max_cost_cents": 50,
  "extended_thinking": false
}
```

**Response:**
```json
{
  "status": "success",
  "summary": "Found 5 competitors with pricing ranging from $0.10-$0.25/min",
  "output": {
    "competitors": [
      {"name": "Otter.ai", "pricing": "$0.12/min", "features": ["..."]},
      ...
    ]
  },
  "confidence": 0.85,
  "tokens_used": 45000,
  "cost_cents": 23.5,
  "duration_seconds": 180,
  "iterations": 8
}
```

## 5. Role-Specific Tool Sets

### 5.1 WebCrawler
```haskell
webCrawlerTools :: [Engine.Tool]
webCrawlerTools =
  [ webSearchTool      -- Search the web
  , webReaderTool      -- Fetch and parse web pages
  , pythonExecTool     -- Execute Python for data processing
  ]
```
**Use case:** Deep market research, competitive analysis, documentation gathering

### 5.2 CodeReviewer
```haskell
codeReviewerTools :: [Engine.Tool]
codeReviewerTools =
  [ readFileTool
  , searchCodebaseTool
  , searchAndReadTool
  , runBashTool        -- For running tests, linters
  ]
```
**Use case:** PR review, architecture analysis, test verification

### 5.3 DataExtractor
```haskell
dataExtractorTools :: [Engine.Tool]
dataExtractorTools =
  [ webReaderTool
  , pythonExecTool
  ]
```
**Use case:** Scraping structured data, parsing PDFs, extracting metrics

### 5.4 Researcher
```haskell
researcherTools :: [Engine.Tool]
researcherTools =
  [ webSearchTool
  , webReaderTool
  , readFileTool
  , searchCodebaseTool
  ]
```
**Use case:** General research combining web and local codebase

## 6. Subagent System Prompt Template

```
You are a specialized {ROLE} subagent working on a focused task.

## Your Task
{TASK}

## Context from Orchestrator
{CONTEXT}

## Your Capabilities
{TOOL_DESCRIPTIONS}

## Guidelines
1. Work iteratively: search → evaluate → refine → verify
2. Return structured data when possible (JSON objects)
3. Include confidence scores for your findings
4. If stuck, explain what you tried and what didn't work
5. Stop when you have sufficient information OR hit resource limits

## Output Format
When complete, provide:
1. A structured result (JSON) with the requested data
2. A brief summary of findings
3. Confidence score (0.0-1.0) indicating reliability
4. Any caveats or limitations
```

## 7. Orchestrator Delegation Logic

The orchestrator (Ava) should spawn subagents when:

1. **Deep research needed:** "Research all competitors in X market"
2. **Parallel tasks:** Multiple independent subtasks that can run concurrently
3. **Specialized tools:** Task requires tools the orchestrator shouldn't use directly
4. **Token-intensive:** Task would consume excessive tokens in main context

The orchestrator should NOT spawn subagents for:

1. **Simple queries:** Quick lookups, single tool calls
2. **Conversation continuation:** Multi-turn dialogue with user
3. **Memory writes:** Tasks that need to update Ava's memory

## 8. Execution Flow

```
1. Orchestrator calls spawn_subagent tool
2. Subagent module:
   a. Creates fresh agent config from SubagentConfig
   b. Selects model based on role (or override)
   c. Builds tool list for role
   d. Constructs system prompt
   e. Calls Engine.runAgentWithProvider
   f. Monitors resource usage
   g. Returns SubagentResult
3. Orchestrator receives structured result
4. Orchestrator synthesizes into response
```

## 9. Concurrency Model

Initial implementation: **Sequential** (one subagent at a time)

Future enhancement: **Parallel** spawning with:
- `async` library for concurrent execution
- Aggregate cost tracking across all subagents
- Combined timeout for parallel group

```haskell
-- Future: Parallel spawning
spawnParallel :: [SubagentConfig] -> IO [SubagentResult]
spawnParallel configs = mapConcurrently runSubagent configs
```

## 10. Status Reporting

Subagents report status back to the orchestrator via callbacks:

```haskell
data SubagentCallbacks = SubagentCallbacks
  { onSubagentStart :: Text -> IO ()           -- "Starting web research..."
  , onSubagentActivity :: Text -> IO ()        -- "Searching for X..."
  , onSubagentToolCall :: Text -> Text -> IO () -- Tool name, args
  , onSubagentComplete :: SubagentResult -> IO ()
  }
```

For Telegram, this appears as:
```
🔍 Subagent [WebCrawler]: Starting research...
🔍 Subagent [WebCrawler]: Searching "podcast transcription pricing"...
🔍 Subagent [WebCrawler]: Reading otter.ai/pricing...
✅ Subagent [WebCrawler]: Complete (180s, $0.24)
```

## 11. Implementation Plan

### Phase 1: Core Infrastructure
1. Create `Omni/Agent/Subagent.hs` with data types
2. Implement `runSubagent` function using existing Engine
3. Add `spawn_subagent` tool
4. Basic WebCrawler role with existing web tools

### Phase 2: Role Expansion
1. Add CodeReviewer role
2. Add DataExtractor role
3. Add Researcher role
4. Custom role support

### Phase 3: Advanced Features
1. Parallel subagent execution
2. Extended thinking integration
3. Cross-subagent context sharing
4. Cost aggregation and budgeting

## 12. Testing Strategy

```haskell
test :: Test.Tree
test = Test.group "Omni.Agent.Subagent"
  [ Test.unit "SubagentConfig JSON roundtrip" <| ...
  , Test.unit "role selects correct tools" <| ...
  , Test.unit "timeout terminates subagent" <| ...
  , Test.unit "cost limit stops execution" <| ...
  , Test.unit "WebCrawler role has web tools" <| ...
  ]
```

## 13. Cost Analysis

Based on Anthropic's research findings:
- Subagents use ~15× more tokens than single-agent
- But provide better results for complex tasks
- 80% of performance variance from token budget

**Budget recommendations:**
| Task Type | Subagent Budget | Expected Tokens |
|-----------|-----------------|-----------------|
| Quick lookup | $0.10 | ~10k |
| Standard research | $0.50 | ~50k |
| Deep analysis | $2.00 | ~200k |

## 14. References

- [Claude Agent SDK - Subagents](https://platform.claude.com/docs/en/agent-sdk/subagents)
- [Multi-Agent Research System](https://www.anthropic.com/engineering/multi-agent-research-system)
- [OpenAI Agents Python](https://openai.github.io/openai-agents-python/agents/)
- Existing: `Omni/Agent/Engine.hs`, `Omni/Agent/Provider.hs`, `Omni/Agent/Tools.hs`