| Age | Commit message (Collapse) | Author |
|
Use Gemini Flash to classify group messages before running the
full Sonnet agent. Skips casual banter to save tokens/cost.
- shouldEngageInGroup: yes/no classifier using gemini-2.0-flash
- Only runs for group chats, private chats skip the filter
- On classifier failure, defaults to engaging (fail-open)
|
|
- Remove mention-based filtering, bot sees all group messages
- Add response rules to system prompt for group chats:
- tool invocation = always respond
- direct question = respond
- factual correction = maybe respond
- casual banter = stay silent
- Empty response in group = intentional silence (no fallback msg)
- Add chat type context to system prompt
|
|
- Only respond in groups when @mentioned or replied to
- Add ChatType to TelegramMessage (private/group/supergroup/channel)
- Add getMe API call to fetch bot username on startup
- Add shouldRespondInGroup helper function
|
|
- Fix Provider.hs to strip leading whitespace from OpenRouter responses
- Fix FunctionCall parser to handle missing 'arguments' field
- Use eitherDecode for better error messages on parse failures
- Switch to claude-sonnet-4.5 for main agent
- Use gemini-2.0-flash for conversation summarization (cheaper)
- Add read_webpage tool for fetching and summarizing URLs
- Add tagsoup to Haskell deps (unused, kept for future)
|
|
Refactor Telegram.hs into submodules to reduce file size:
- Types.hs: data types, JSON parsing
- Media.hs: file downloads, image/voice analysis
- Reminders.hs: reminder loop, user chat persistence
Multimedia improvements:
- Vision uses third-person to avoid LLM confusion
- Better message framing for embedded descriptions
- Size validation (10MB images, 20MB voice)
- MIME type validation for voice messages
New features:
- Reply support: bot sees context when users reply
- Web search: default 5->10, max 10->20 results
- Guardrails: duplicate tool limit 3->10 for research
- Timezone: todos parse/display in Eastern time (ET)
|
|
- Add TelegramPhoto and TelegramVoice types
- Parse photo and voice fields from Telegram updates
- Download photos/voice via Telegram API
- Analyze images using Claude vision via OpenRouter
- Transcribe voice messages using Gemini audio via OpenRouter
- Wire multimedia processing into handleAuthorizedMessage
Photos are analyzed with user's caption as context.
Voice messages are transcribed and treated as text input.
|
|
Adds a background reminder loop that checks every 5 minutes for overdue
todos and sends Telegram notifications.
Changes:
- Add last_reminded_at column to todos table with auto-migration
- Add listTodosDueForReminder to find overdue, unreminded todos
- Add markReminderSent to update reminder timestamp
- Add user_chats table to map user_id -> chat_id for notifications
- Add recordUserChat called on each message to track chat IDs
- Add reminderLoop forked in runTelegramBot
- 24-hour anti-spam interval between reminders per todo
|
|
When the LLM returned empty content after executing tools, the agent
would complete with an empty message. Now both agent loops (LLM-based
and Provider-based) detect this case and inject a prompt asking the
LLM to provide a response to the user.
|
|
- Omni/Agent/Tools/Todos.hs: todo_add, todo_list, todo_complete, todo_delete
- Supports optional due dates in YYYY-MM-DD or YYYY-MM-DD HH:MM format
- Lists can filter by pending, all, or overdue
- Add todos table to Memory.hs schema
- Wire into Telegram bot
|
|
- Add sender_name column to conversation_messages table
- Migrate existing messages to set sender_name='bensima'
- Show sender names in conversation context (e.g., 'bensima: hello')
- Pass userName when saving user messages in Telegram bot
|
|
|
|
|
|
|
|
|
|
|
|
- Omni/Agent/Tools/Calendar.hs: calendar_list, calendar_add, calendar_search
- Wire into Telegram bot alongside other tools
- Integrates with local CalDAV via khal
|
|
- Omni/Agent/Tools/Pdf.hs: Extract text from PDFs using pdftotext
- Omni/Agent/Tools/Notes.hs: Quick notes CRUD with topics
- Add notes table schema to Memory.hs initMemoryDb
- Wire both tools into Telegram bot with logging callbacks
|
|
- Add Omni/Agent/Tools/WebSearch.hs with Kagi Search API integration
- webSearchTool for agents to search the web
- kagiSearch function for direct API access
- Load KAGI_API_KEY from environment
- Wire web search into Telegram bot tools
- Results formatted with title, URL, and snippet
Closes t-252
|
|
- Add tgAllowedUserIds field to TelegramConfig
- Load ALLOWED_TELEGRAM_USER_IDS from environment (comma-separated)
- Check isUserAllowed before processing messages
- Reject unauthorized users with friendly message
- Empty whitelist or '*' allows all users
- Add tests for whitelist behavior
|
|
- Add sendTypingAction to show typing indicator when processing
- Add conversation_messages and conversation_summaries tables
- Implement conversation history with token counting
- Auto-summarize when context exceeds threshold (3000 tokens)
- Save user/assistant messages for multi-turn context
- Add ConversationMessage, ConversationSummary, MessageRole types
Tasks created: t-252 (web search), t-253 (calendar), t-254 (PDF),
t-255 (knowledge graph), t-256 (notes)
|
|
- Set response timeout to polling timeout + 10s for long polling
- Remove Markdown parse_mode to avoid 400 errors on special chars
|
|
|
|
- Omni/Agent/Telegram.hs: Telegram API client with getUpdates/sendMessage
- Omni/Bot.hs: Standalone CLI for running the bot
- User identification via Memory.getOrCreateUserByTelegramId
- Memory-enhanced agent with remember/recall tools
- Run with: bot --token=XXX or TELEGRAM_BOT_TOKEN env var
|
|
- User management with Telegram ID identification
- Memory storage with Ollama embeddings (nomic-embed-text)
- Semantic similarity search via cosine similarity
- remember/recall tools for agents
- runAgentWithMemory wrapper for memory-enhanced agents
- Separate memory.db database for user privacy
|
|
- Create Omni/Agent/Provider.hs with unified Provider interface
- Support OpenRouter (cloud), Ollama (local), Amp (subprocess stub)
- Add runAgentWithProvider to Engine.hs for Provider-based execution
- Add EngineType to Core.hs (EngineOpenRouter, EngineOllama, EngineAmp)
- Add --engine flag to 'jr work' command
- Worker.hs dispatches to appropriate provider based on engine type
Usage:
jr work <task-id> # OpenRouter (default)
jr work <task-id> --engine=ollama # Local Ollama
jr work <task-id> --engine=amp # Amp CLI (stub)
|
|
Defines architecture for multi-agent system with:
- Provider abstraction (OpenRouter, Ollama, Amp backends)
- Shared memory system (sqlite-vss, multi-user, cross-agent)
- Tool registry for pluggable tool sets
- Evals framework for regression testing
- Telegram bot as first concrete agent
Tasks: t-247 through t-251
|
|
Add explicit guidance on:
- Reading files with large ranges (500+ lines) instead of many small chunks
- Using read_file directly when target file is known vs search_and_read
- Cost awareness: planning refactors, avoiding redundant reads
- Tool call limits for complex tasks
|
|
The task was being added to the prompt twice, once in the base prompt and once
in the user prompt.
|
|
Worked with Gemini and Opus to improve the system prompt with learnings from the
Amp prompt. Removed reference to Omni/Task/README.md because it is deprecated in
favor of `jr task`.
|
|
jr prompt <task-id> constructs and prints the full system prompt
that would be sent to the agent, including:
- Agent configuration (model, cost budget)
- Base instructions
- AGENTS.md content
- Relevant facts from knowledge base
- Retry/progress context if applicable
Useful for debugging agent behavior and token usage.
|
|
Timeline tool display:
- Grep/search: ✓ Grep pattern in filepath
- Read file: ✓ Read filepath @start-end
- Edit file: ✓ Edit filepath
- Bash: ϟ command (lightning bolt prompt)
- Tool results only shown for meaningful output
New search_and_read tool:
- Combines search + read in one operation
- Uses ripgrep --context for surrounding lines
- More efficient than separate search then read
Worker prompt updated to prefer search_and_read over
separate search + read_file calls
|
|
- Fix Worker.hs to use EngineError instead of tuple
- Fix Types.hs imports for LazyText.encodeUtf8 and dayOfWeek
- Remove duplicate SortOrder from Components.hs (import from Types.hs)
- Add orphan instance pragmas to Pages.hs and Partials.hs
- Clean up unused imports
|
|
The HTMX-refreshed AgentEventsPartial was missing:
- Cost/token summary in header
- Live toggle button
- Autoscroll toggle button
- Comment form
Now matches the full page renderUnifiedTimeline output.
|
|
Jr was completing tasks but then going into verification loops,
re-reading files and 'tracing through logic' after tests passed.
This burned ~4 cents of extra cost on t-221.
Made instructions more emphatic:
- 'STOP IMMEDIATELY' with explicit list of what NOT to do
- 'ANY further tool calls are wasted money'
- Repeated in BUILD SYSTEM NOTES section
Task-Id: t-227
|
|
Instructs the agent to:
- Use line ranges when reading large files (>500 lines)
- Use minimal context for edit_file old_str matching
- Re-read exact lines after failed edits
- Stop after 2-3 failed edits to reconsider approach
- Flag very large files (>2000 lines) for refactoring
Task-Id: t-225
|
|
Tracks 'old_str not found' errors from edit_file tool calls. After 5
consecutive failures, stops the agent to prevent burning tokens on
impossible edits.
This catches the pattern where the agent repeatedly tries to edit a
large file with incorrect old_str matches, which was the root cause
of t-222 exceeding its cost budget.
Task-Id: t-224
|
|
Cost limits by complexity level:
- Complexity 1: 50 cents
- Complexity 2: 100 cents
- Complexity 3: 200 cents (default)
- Complexity 4: 400 cents
- Complexity 5: 600 cents
This prevents low-complexity tasks from burning budget while allowing
complex tasks more room for iteration.
Task-Id: t-223
|
|
- Add clickable LIVE toggle button that pauses/resumes timeline polling
- Green pulsing when active, grey when paused
- Uses htmx:beforeRequest event to cancel requests when paused
- Increase duplicate tool call guardrail from 20 to 30
Task-Id: t-211
|
|
- Add formatToolCallSummary to extract key argument from JSON
- Shows run_bash command, file paths for read/edit/write, patterns for search
- Display summary inline in tool call header (e.g., run_bash: `ls -la`)
- Increase token guardrail from 1M to 2M to prevent premature stops
Task-Id: t-212
|
|
The limit of 5 was too aggressive - reading 5 different files while
exploring a codebase would trigger the guardrail. 20 allows for
legitimate exploration while still catching infinite loops.
|
|
- updateTaskStatusWithActor logs status_change events to agent_events
- Worker uses Junior actor for status changes - Jr review uses
System/Human actors appropriately - CLI task update uses Human actor
- Remove task_activity table schema (migrated to agent_events) -
addComment now inserts into agent_events with event_type='comment'
Task-Id: t-213
|
|
- Add 'actor' column to agent_events table (human/junior/system)
- Add System to CommentAuthor type (reused for actor) - Add SQL
FromField/ToField instances for CommentAuthor - Update insertAgentEvent
to accept actor parameter - Update all SELECT queries to include
actor column - Update Worker.hs to pass actor for all event types -
Guardrail events logged with System actor
Migration: ALTER TABLE adds column with default 'junior' for existing
rows.
Task-Id: t-213.1
|
|
Implement runtime guardrails in Engine.hs: - Cost budget limit (default
200 cents) - Token budget limit (default 1M tokens) - Duplicate tool
call detection (same tool called N times) - Test failure counting
(bild --test failures)
Add database-backed progress tracking: - Checkpoint events stored in
agent_events table - Progress summary retrieved on retry attempts -
Improved prompts emphasizing efficiency and autonomous operation
Worker.hs improvements: - Uses guardrails configuration - Reports
guardrail violations via callbacks - Better prompt structure for
autonomous operation
Task-Id: t-203
|
|
Perfect! All changes are in place and working correctly. Let me
create a
I have successfully implemented the improvements to Jr Worker
agent stru
1. **Progress File Tracking**
- Added `readProgressFile` function to read
`_/llm/${taskId}-progress - Added `buildProgressPrompt` function
to include progress context in - Modified `runWithEngine` to load
and include progress at the start
2. **Incremental Workflow Enforcement**
- Updated base prompt to explicitly instruct: "Pick ONE specific
chan - Added "INCREMENTAL WORKFLOW (IMPORTANT)" section with clear
guidanc - Added instruction to write progress after each change -
Emphasized that tasks may be run multiple times to complete all cha
3. **Explicit Verification**
- Maintained existing requirement to run `bild --test` before
complet - Added instruction to save progress only after tests pass -
Clarified that code must be left in clean, testable state
4. **Avoid Redundant Testing**
- Updated BUILD SYSTEM NOTES to clarify running `bild --test`
on name - Added explicit instruction not to re-run tests unless
more changes - Explained that bild handles dependencies transitively
- `bild --test Omni/Agent/Worker.hs` - **PASSED** ✓ - `lint
Omni/Agent/Worker.hs` - **NO ISSUES** ✓
- `_/llm/t-203-progress.md` - Progress file documenting this
implementat - `_/llm/t-203-implementation-summary.md` - Detailed
summary of changes
The implementation follows industry best practices from Anthropic,
OpenA - Reduced token usage through focused, incremental changes
- Better code quality with isolated, tested changes - Improved
reliability with progress tracking across sessions - Clear workflow
preventing "declaring victory" too early
Task-Id: t-203
|
|
Perfect! The changes are exactly what we need. The diff shows that I've
1. ✅ `bild --test` on a namespace tests all its dependencies -
no need t 2. ✅ Don't re-run tests after they already passed 3. ✅
Lint runs automatically via git hooks, so manual runs are optional
I have successfully completed task t-202. The changes made to
`Omni/Agen
1. **Explaining dependency handling**: Added explicit note that
`bild --
2. **Preventing redundant testing**: Added instruction to not
re-run tes
3. **Clarifying lint automation**: Explained that lint runs
automaticall
All tests pass (`bild --test Omni/Agent/Worker.hs`) and all lint checks
Task-Id: t-202
|
|
Perfect! Everything passes. Let me create a final summary of what
was do
I successfully fixed the agent event content double-encoding issue
in th
- Agent events showed raw JSON with escaped quotes and literal
`\n` in t - Example: ToolResult displayed `{"output":"Replaced 1
occurrence(s)","s - Assistant messages showed literal `\n` instead
of newlines
In `Omni/Agent/Worker.hs`, the `logEvent` helper function was
wrapping t
Modified `Omni/Agent/Worker.hs` to distinguish between text and
structur
1. **Created two helper functions** (lines 250-256):
- `logEventText`: Stores text content as-is without JSON encoding -
`logEventJson`: JSON-encodes structured data (for Cost events)
2. **Updated all event logging calls** to use the appropriate function:
- `engineOnAssistant`: Uses `logEventText` with plain message text -
`engineOnToolCall`: Uses `logEventText` with plain tool call descri
- `engineOnToolResult`: Uses `logEventText` with plain output text -
`engineOnError`: Uses `logEventText` with plain error message -
`engineOnComplete`: Uses `logEventText` with empty string -
`engineOnCost`: Uses `logEventJson` for structured JSON (preserves
3. **No changes to Web.hs** were needed - the rendering functions
alread
✅ `bild --test Omni/Jr/Web.hs` - PASSED ✅ `lint
Omni/Agent/Worker.hs` - PASSED ✅ `lint Omni/Jr/Web.hs` - PASSED
The fix is complete and ready for commit. Agent events will now display
Task-Id: t-200
|
|
Perfect! All tests pass for the affected modules. Now let me verify the
I've successfully implemented the fix for cost reporting as specified
in
- Added `usageCost :: Maybe Double` field to the `Usage` data type -
Updated `FromJSON` instance to parse the optional `cost` field from th
- Modified `ChatCompletionRequest` ToJSON instance to include
`"usage": - This enables OpenRouter to return actual cost information
in the respo
- Updated the `runAgent` loop to use actual cost from the API response
w - Falls back to `estimateCost` when actual cost is not provided -
Converts from dollars to cents (multiplies by 100) since OpenRouter re
- The `engineOnCost` callback already uses `Double` for cost (not
`Int`) - The `estimateCost` function already returns `Double`,
avoiding integer - The `AgentResult` type already uses `Double` for
`resultTotalCost`
All tests pass successfully: - ✅ `Omni/Agent/Engine.hs` - All 14
tests pass, including new tests for - ✅ `Omni/Agent/Worker.hs` -
Builds successfully - ✅ `Omni/Agent.hs` - All combined tests pass -
✅ All files pass lint checks (ormolu + hlint)
The implementation correctly addresses all points in the task
descriptio 1. ✅ Parses actual cost from OpenRouter API response
2. ✅ Enables usage accounting in requests 3. ✅ Uses Double for
cost to avoid rounding issues 4. ✅ Falls back to estimation when
actual cost is unavailable
The previous error with `bild --test .` was due to `.` not being
a valid
Task-Id: t-197.8
|
|
Perfect! The changes have been successfully implemented. Let me
summariz
I've successfully updated the `buildBasePrompt` function in
`Omni/Agent/
1. **Line 320**: Changed "including hlint suggestions" → "including
lint 2. **Line 324**: Changed "if hlint finds issues" → "if lint
finds issues 3. **Line 325**: Changed "You must fix hlint suggestions
like:" → "You m 4. **Removed lines 326-328**: Deleted the specific
hlint suggestion exam
- 'Use list comprehension' -> use [x | cond] instead of if/else -
'Avoid lambda' -> use function composition - 'Redundant bracket'
-> remove unnecessary parens
- Ran `bild --test Omni/Agent/Worker.hs` ✓ PASSED with no errors
The prompt now correctly references the `lint` command instead of
`hlint
Task-Id: t-199
|
|
I have successfully completed task t-197.8 to fix cost reporting
by pars
**Omni/Agent/Engine.hs:** 1. Added `usageCost :: Maybe Double`
field to the `Usage` type to captur 2. Updated `FromJSON` instance to
parse the optional `"cost"` field 3. Modified `ChatCompletionRequest`
ToJSON instance to include `"usage": 4. Changed cost types from `Int`
to `Double` throughout (engineOnCost ca 5. Updated `estimateCost`
to use floating-point division instead of inte 6. Modified `runAgent`
to use actual cost from API when available, conve 7. Added new test
case for parsing usage with cost field
**Omni/Agent/Worker.hs:** 1. Updated `runWithEngine` signature to
return `Double` for cost 2. Changed `totalCostRef` from `IORef Int`
to `IORef Double` 3. Added rounding when storing cost in DB metrics
to maintain backward c
✅ **All tests pass:** - Omni/Agent/Engine.hs - 16 unit tests pass
- Omni/Agent/Worker.hs - Builds successfully - Omni/Agent.hs - All
integration tests pass - Omni/Jr.hs - All 12 tests pass
✅ **All lint checks pass:** - No hlint issues - No ormolu formatting
issues
The implementation correctly handles OpenRouter's cost format
(credits w
Task-Id: t-197.8
|
|
run_bash tool now checks if the working directory exists before
executing. Previously invalid cwd caused system-level chdir error.
Now returns clean tool error the agent can understand and react to.
|