path: root/Omni/Agent
Age | Commit message | Author
6 days | telegram: fix audio transcription model and prompt order | Ben Sima
- Switch from gemini-2.0-flash-001 to gemini-2.5-flash
- Put audio content before text prompt (model was ignoring audio)
- Strengthen prompt to return only transcription
6 days | telegram: unified message queue with async/scheduled sends | Ben Sima
- Add Messages.hs with scheduled_messages table and dispatcher loop
- All outbound messages now go through the queue (1s polling)
- Disable streaming responses, use runAgentWithProvider instead
- Add send_message tool for delayed messages (up to 30 days)
- Add list_pending_messages and cancel_message tools
- Reminders now queue messages instead of sending directly
- Exponential backoff retry (max 5 attempts) for failed sends
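The retry policy named in the last bullet can be sketched as follows. This is an illustrative Python sketch, not the repository's Haskell code. The 1-second base delay, the doubling factor, and the function name `next_retry_delay` are assumptions; only the limit of 5 attempts comes from the commit message.

```python
from typing import Optional

MAX_ATTEMPTS = 5          # stated in the commit message
BASE_DELAY_SECONDS = 1.0  # assumed; the commit does not give the base delay


def next_retry_delay(failed_attempts: int) -> Optional[float]:
    """Return the seconds to wait before the next send, or None to give up.

    failed_attempts is the number of send attempts that have already failed.
    """
    if failed_attempts >= MAX_ATTEMPTS:
        return None
    # The delay doubles after every failure: 1s, 2s, 4s, 8s.
    exponent = max(failed_attempts - 1, 0)
    return BASE_DELAY_SECONDS * (2 ** exponent)
```

A dispatcher loop could store the returned delay as the message's next scheduled send time and mark the message as failed once `None` comes back.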
6 days | Fix Telegram streaming markdown parse errors | Ben Sima
Amp-Thread-ID: https://ampcode.com/threads/T-019b1894-b431-777d-aba3-65a51e720ef2
Co-authored-by: Amp <amp@ampcode.com>
6 days | Add ISO 8601 timestamps to conversation context messages | Ben Sima
6 days | Add knowledge graph with typed relations to Memory module | Ben Sima
- Add RelationType with 6 relation types
- Add MemoryLink type and memory_links table
- Add graph functions: linkMemories, getMemoryLinks, queryGraph
- Add link_memories and query_graph agent tools
- Wire up graph tools to Telegram bot
- Include memory ID in recall results for linking
- Fix streaming usage parsing for cost tracking
Closes t-255
Amp-Thread-ID: https://ampcode.com/threads/T-019b181f-d6cd-70de-8857-c445baef7508
Co-authored-by: Amp <amp@ampcode.com>
6 days | feat: only allow whitelisted users to add bot to groups | Ben Sima
When the bot is added to a group, check if the user who added it is in the whitelist. If not, send a message explaining and leave the group immediately. This prevents unauthorized users from bypassing DM access controls by adding the bot to a group.
6 days | feat: allow all users in group chats, whitelist only for DMs | Ben Sima
6 days | feat: enable Markdown rendering in Telegram messages | Ben Sima
Add parse_mode=Markdown to sendMessage and editMessage API calls
6 days | fix: accumulate streaming tool call arguments across SSE chunks | Ben Sima
OpenAI's SSE streaming sends tool calls incrementally - the first chunk has the id and function name, subsequent chunks contain argument fragments. Previously each chunk was treated as a complete tool call, causing invalid JSON arguments.
- Add ToolCallDelta type with index for partial tool call data
- Add StreamToolCallDelta chunk type
- Track tool calls by index in IntMap accumulator
- Merge argument fragments across chunks via mergeToolCallDelta
- Build final ToolCall objects from accumulator when stream ends
- Handle new StreamToolCallDelta in Engine.hs pattern match
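The accumulation described in this commit can be sketched in Python as follows. This is an illustration, not the repository's Haskell implementation. The field names and the helpers `merge_delta` and `build_tool_calls` are hypothetical stand-ins for `ToolCallDelta`, `mergeToolCallDelta`, and the `IntMap` accumulator.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class ToolCallDelta:
    """One streamed fragment; only the first fragment carries id and name."""
    index: int
    id: Optional[str] = None
    name: Optional[str] = None
    arguments: str = ""


@dataclass
class ToolCall:
    id: str
    name: str
    arguments: str  # complete JSON text once the stream has ended


def merge_delta(acc: Dict[int, ToolCallDelta], delta: ToolCallDelta) -> None:
    """Fold one fragment into the accumulator, keyed by tool call index."""
    prev = acc.get(delta.index)
    if prev is None:
        acc[delta.index] = ToolCallDelta(
            delta.index, delta.id, delta.name, delta.arguments
        )
        return
    prev.id = prev.id or delta.id
    prev.name = prev.name or delta.name
    prev.arguments += delta.arguments  # append the argument fragment


def build_tool_calls(deltas: Iterable[ToolCallDelta]) -> List[ToolCall]:
    """Merge all fragments, then emit complete tool calls in index order."""
    acc: Dict[int, ToolCallDelta] = {}
    for delta in deltas:
        merge_delta(acc, delta)
    return [
        ToolCall(acc[i].id or "", acc[i].name or "", acc[i].arguments)
        for i in sorted(acc)
    ]
```

The arguments string is only parsed as JSON after the stream ends, because any single fragment may be syntactically incomplete.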
6 days | telegram: add conversation context to group pre-filter | Ben Sima
Pre-filter now sees last 5 messages so it can detect when user is continuing a conversation with Ava, even without explicit mention.
- Fetch recent messages before shouldEngageInGroup
- Update classifier prompt to understand Ava context
- Handle follow-up messages to bot's previous responses
6 days | fix: correct cost estimation formulas | Ben Sima
- Update to Dec 2024 OpenRouter pricing
- Use blended input/output rates
- Add gemini-flash, claude-sonnet-4.5 specific rates
- Fix math: was off by ~30x for Claude models
6 days | telegram: add cheap pre-filter for group messages | Ben Sima
Use Gemini Flash to classify group messages before running the full Sonnet agent. Skips casual banter to save tokens/cost.
- shouldEngageInGroup: yes/no classifier using gemini-2.0-flash
- Only runs for group chats, private chats skip the filter
- On classifier failure, defaults to engaging (fail-open)
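The fail-open behaviour can be sketched in Python as follows. This is an illustration rather than the actual `shouldEngageInGroup` code. The `classify` callback stands in for the Gemini Flash call, and treating any answer other than an explicit "no" as "engage" is an assumption.

```python
from typing import Callable


def should_engage(chat_type: str, message: str,
                  classify: Callable[[str], str]) -> bool:
    """Decide whether the full agent should run for this message."""
    if chat_type == "private":
        return True  # private chats skip the pre-filter entirely
    try:
        answer = classify(message)
    except Exception:
        # Fail-open: if the cheap classifier breaks, engage anyway
        # rather than silently dropping a message.
        return True
    # Assumption: only an explicit "no" suppresses the response.
    return not answer.strip().lower().startswith("no")
```

Failing open trades some token cost for never ignoring a user because of a classifier outage.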
6 days | telegram: intelligent group response (LLM decides when to speak) | Ben Sima
- Remove mention-based filtering, bot sees all group messages
- Add response rules to system prompt for group chats:
  - tool invocation = always respond
  - direct question = respond
  - factual correction = maybe respond
  - casual banter = stay silent
- Empty response in group = intentional silence (no fallback msg)
- Add chat type context to system prompt
6 days | telegram: add group chat support | Ben Sima
- Only respond in groups when @mentioned or replied to
- Add ChatType to TelegramMessage (private/group/supergroup/channel)
- Add getMe API call to fetch bot username on startup
- Add shouldRespondInGroup helper function
6 days | telegram: fix parsing, add webpage reader, use gemini | Ben Sima
- Fix Provider.hs to strip leading whitespace from OpenRouter responses
- Fix FunctionCall parser to handle missing 'arguments' field
- Use eitherDecode for better error messages on parse failures
- Switch to claude-sonnet-4.5 for main agent
- Use gemini-2.0-flash for conversation summarization (cheaper)
- Add read_webpage tool for fetching and summarizing URLs
- Add tagsoup to Haskell deps (unused, kept for future)
6 days | telegram bot: refactor + multimedia + reply support | Ben Sima
Refactor Telegram.hs into submodules to reduce file size:
- Types.hs: data types, JSON parsing
- Media.hs: file downloads, image/voice analysis
- Reminders.hs: reminder loop, user chat persistence
Multimedia improvements:
- Vision uses third-person to avoid LLM confusion
- Better message framing for embedded descriptions
- Size validation (10MB images, 20MB voice)
- MIME type validation for voice messages
New features:
- Reply support: bot sees context when users reply
- Web search: default 5->10, max 10->20 results
- Guardrails: duplicate tool limit 3->10 for research
- Timezone: todos parse/display in Eastern time (ET)
6 days | feat: add image and voice message support for Telegram bot | Ben Sima
- Add TelegramPhoto and TelegramVoice types
- Parse photo and voice fields from Telegram updates
- Download photos/voice via Telegram API
- Analyze images using Claude vision via OpenRouter
- Transcribe voice messages using Gemini audio via OpenRouter
- Wire multimedia processing into handleAuthorizedMessage
Photos are analyzed with user's caption as context. Voice messages are transcribed and treated as text input.
6 days | feat: add reminder service for todos | Ben Sima
Adds a background reminder loop that checks every 5 minutes for overdue todos and sends Telegram notifications.
Changes:
- Add last_reminded_at column to todos table with auto-migration
- Add listTodosDueForReminder to find overdue, unreminded todos
- Add markReminderSent to update reminder timestamp
- Add user_chats table to map user_id -> chat_id for notifications
- Add recordUserChat called on each message to track chat IDs
- Add reminderLoop forked in runTelegramBot
- 24-hour anti-spam interval between reminders per todo
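The selection rule behind `listTodosDueForReminder` can be sketched in Python as a single predicate. This is an assumption-laden illustration, not the repository's query. The function name and parameters are hypothetical; only the overdue check and the 24-hour interval come from the commit message.

```python
from datetime import datetime, timedelta
from typing import Optional

REMINDER_INTERVAL = timedelta(hours=24)  # anti-spam interval from the commit


def is_due_for_reminder(due_at: datetime, completed: bool,
                        last_reminded_at: Optional[datetime],
                        now: datetime) -> bool:
    """True if an overdue, unfinished todo should trigger a reminder now."""
    if completed or due_at > now:
        return False  # finished or not yet overdue
    if last_reminded_at is None:
        return True   # overdue and never reminded
    # Already reminded: wait a full interval before reminding again.
    return now - last_reminded_at >= REMINDER_INTERVAL
```

With a 5-minute polling loop, the `last_reminded_at` check is what keeps an overdue todo from producing a notification on every pass.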
6 days | fix: prompt for text response when agent returns empty after tool calls | Ben Sima
When the LLM returned empty content after executing tools, the agent would complete with an empty message. Now both agent loops (LLM-based and Provider-based) detect this case and inject a prompt asking the LLM to provide a response to the user.
6 days | Add todo tools with due dates | Ben Sima
- Omni/Agent/Tools/Todos.hs: todo_add, todo_list, todo_complete, todo_delete
- Supports optional due dates in YYYY-MM-DD or YYYY-MM-DD HH:MM format
- Lists can filter by pending, all, or overdue
- Add todos table to Memory.hs schema
- Wire into Telegram bot
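Accepting both due-date formats can be sketched in Python as follows. This illustrates the idea only and is not the code in Todos.hs. The function name `parse_due_date` and the choice to read a date-only value as midnight are assumptions.

```python
from datetime import datetime


def parse_due_date(text: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM' or 'YYYY-MM-DD' into a naive datetime."""
    cleaned = text.strip()
    # Try the more specific format first.
    for fmt in ("%Y-%m-%d %H:%M", "%Y-%m-%d"):
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue
    raise ValueError(f"unsupported due date format: {text!r}")
```

A date-only input comes back with the time set to 00:00, which is an assumption about how "due on a day" should be interpreted.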
6 days | Add sender_name to conversation messages for group chat support | Ben Sima
- Add sender_name column to conversation_messages table
- Migrate existing messages to set sender_name='bensima'
- Show sender names in conversation context (e.g., 'bensima: hello')
- Pass userName when saving user messages in Telegram bot
6 days | Add current user name to Telegram bot system prompt | Ben Sima
6 days | Show calendar name in events and add optional calendar filter | Ben Sima
6 days | Instruct bot to always include text response after tool calls | Ben Sima
6 days | Add current date/time to Telegram bot system prompt | Ben Sima
6 days | Filter calendar to BenSimaShared and Kate only | Ben Sima
6 days | Add calendar tools using khal CLI | Ben Sima
- Omni/Agent/Tools/Calendar.hs: calendar_list, calendar_add, calendar_search
- Wire into Telegram bot alongside other tools
- Integrates with local CalDAV via khal
6 days | Add PDF and Notes tools to Telegram bot | Ben Sima
- Omni/Agent/Tools/Pdf.hs: Extract text from PDFs using pdftotext
- Omni/Agent/Tools/Notes.hs: Quick notes CRUD with topics
- Add notes table schema to Memory.hs initMemoryDb
- Wire both tools into Telegram bot with logging callbacks
7 days | Telegram bot: Kagi web search tool | Ben Sima
- Add Omni/Agent/Tools/WebSearch.hs with Kagi Search API integration
- webSearchTool for agents to search the web
- kagiSearch function for direct API access
- Load KAGI_API_KEY from environment
- Wire web search into Telegram bot tools
- Results formatted with title, URL, and snippet
Closes t-252
7 days | Telegram bot: user whitelist access control | Ben Sima
- Add tgAllowedUserIds field to TelegramConfig
- Load ALLOWED_TELEGRAM_USER_IDS from environment (comma-separated)
- Check isUserAllowed before processing messages
- Reject unauthorized users with friendly message
- Empty whitelist or '*' allows all users
- Add tests for whitelist behavior
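The whitelist rules listed above can be sketched in Python as follows. This is an illustration, not the repository's `isUserAllowed`. Representing "allow everyone" as `None` and the helper name `parse_allowed_user_ids` are assumptions.

```python
from typing import FrozenSet, Optional


def parse_allowed_user_ids(raw: str) -> Optional[FrozenSet[int]]:
    """Parse a comma-separated ID list; None means every user is allowed."""
    entries = [part.strip() for part in raw.split(",") if part.strip()]
    if not entries or "*" in entries:
        return None  # an empty whitelist or '*' allows all users
    return frozenset(int(entry) for entry in entries)


def is_user_allowed(allowed: Optional[FrozenSet[int]], user_id: int) -> bool:
    """Check a Telegram user ID against the parsed whitelist."""
    return allowed is None or user_id in allowed
```

In use, the raw string would come from the `ALLOWED_TELEGRAM_USER_IDS` environment variable, for example via `os.environ.get("ALLOWED_TELEGRAM_USER_IDS", "")`.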
7 days | Telegram bot: conversation history and summaries | Ben Sima
- Add sendTypingAction to show typing indicator when processing
- Add conversation_messages and conversation_summaries tables
- Implement conversation history with token counting
- Auto-summarize when context exceeds threshold (3000 tokens)
- Save user/assistant messages for multi-turn context
- Add ConversationMessage, ConversationSummary, MessageRole types
Tasks created: t-252 (web search), t-253 (calendar), t-254 (PDF), t-255 (knowledge graph), t-256 (notes)
7 days | Fix telegram bot timeout and sendMessage 400 error | Ben Sima
- Set response timeout to polling timeout + 10s for long polling
- Remove Markdown parse_mode to avoid 400 errors on special chars
7 days | Merge telegram bot system prompt with user's preferred style | Ben Sima
7 days | Add Telegram bot agent (t-251) | Ben Sima
- Omni/Agent/Telegram.hs: Telegram API client with getUpdates/sendMessage
- Omni/Bot.hs: Standalone CLI for running the bot
- User identification via Memory.getOrCreateUserByTelegramId
- Memory-enhanced agent with remember/recall tools
- Run with: bot --token=XXX or TELEGRAM_BOT_TOKEN env var
7 days | Add cross-agent memory system (t-248) | Ben Sima
- User management with Telegram ID identification
- Memory storage with Ollama embeddings (nomic-embed-text)
- Semantic similarity search via cosine similarity
- remember/recall tools for agents
- runAgentWithMemory wrapper for memory-enhanced agents
- Separate memory.db database for user privacy
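The similarity measure named in this commit is standard and can be sketched in plain Python as follows. This is not the repository's Haskell code. Returning 0.0 for a zero-length vector is an assumption made to avoid division by zero.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention for a zero vector
    return dot / (norm_a * norm_b)
```

A recall step would embed the query, score every stored memory with this function, and return the highest-scoring entries.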
7 days | t-247: Add Provider abstraction for multi-backend LLM support | Ben Sima
- Create Omni/Agent/Provider.hs with unified Provider interface
- Support OpenRouter (cloud), Ollama (local), Amp (subprocess stub)
- Add runAgentWithProvider to Engine.hs for Provider-based execution
- Add EngineType to Core.hs (EngineOpenRouter, EngineOllama, EngineAmp)
- Add --engine flag to 'jr work' command
- Worker.hs dispatches to appropriate provider based on engine type
Usage:
  jr work <task-id>                   # OpenRouter (default)
  jr work <task-id> --engine=ollama   # Local Ollama
  jr work <task-id> --engine=amp      # Amp CLI (stub)
7 days | Add Omni/Agent/PLAN.md - agent infrastructure roadmap | Ben Sima
Defines architecture for multi-agent system with:
- Provider abstraction (OpenRouter, Ollama, Amp backends)
- Shared memory system (sqlite-vss, multi-user, cross-agent)
- Tool registry for pluggable tool sets
- Evals framework for regression testing
- Telegram bot as first concrete agent
Tasks: t-247 through t-251
2025-12-02 | Improve agent system prompt for better token efficiency | Ben Sima
Add explicit guidance on:
- Reading files with large ranges (500+ lines) instead of many small chunks
- Using read_file directly when target file is known vs search_and_read
- Cost awareness: planning refactors, avoiding redundant reads
- Tool call limits for complex tasks
2025-12-02 | Remove duplicate prompt content | Ben Sima
The task was being added to the prompt twice, once in the base prompt and once in the user prompt.
2025-12-02 | System prompt improvements | Ben Sima
Worked with Gemini and Opus to improve the system prompt with learnings from the Amp prompt. Removed reference to Omni/Task/README.md because it is deprecated in favor of `jr task`.
2025-12-02 | jr: add 'prompt' command to inspect agent system prompt | Ben Sima
jr prompt <task-id> constructs and prints the full system prompt that would be sent to the agent, including:
- Agent configuration (model, cost budget)
- Base instructions
- AGENTS.md content
- Relevant facts from knowledge base
- Retry/progress context if applicable
Useful for debugging agent behavior and token usage.
2025-12-01 | Compact amp-style timeline rendering and targeted file reading | Ben Sima
Timeline tool display:
- Grep/search: ✓ Grep pattern in filepath
- Read file: ✓ Read filepath @start-end
- Edit file: ✓ Edit filepath
- Bash: ϟ command (lightning bolt prompt)
- Tool results only shown for meaningful output
New search_and_read tool:
- Combines search + read in one operation
- Uses ripgrep --context for surrounding lines
- More efficient than separate search then read
Worker prompt updated to prefer search_and_read over separate search + read_file calls
2025-12-01 | Fix build errors in Jr modules | Ben Sima
- Fix Worker.hs to use EngineError instead of tuple
- Fix Types.hs imports for LazyText.encodeUtf8 and dayOfWeek
- Remove duplicate SortOrder from Components.hs (import from Types.hs)
- Add orphan instance pragmas to Pages.hs and Partials.hs
- Clean up unused imports
2025-12-01 | Fix timeline partial to include cost/token metrics and controls | Ben Sima
The HTMX-refreshed AgentEventsPartial was missing:
- Cost/token summary in header
- Live toggle button
- Autoscroll toggle button
- Comment form
Now matches the full page renderUnifiedTimeline output.
2025-12-01 | Strengthen prompt to stop immediately after tests pass | Ben Sima
Jr was completing tasks but then going into verification loops, re-reading files and 'tracing through logic' after tests passed. This burned ~4 cents of extra cost on t-221.
Made instructions more emphatic:
- 'STOP IMMEDIATELY' with explicit list of what NOT to do
- 'ANY further tool calls are wasted money'
- Repeated in BUILD SYSTEM NOTES section
Task-Id: t-227
2025-12-01 | Add prompt guidance for large file editing | Ben Sima
Instructs the agent to:
- Use line ranges when reading large files (>500 lines)
- Use minimal context for edit_file old_str matching
- Re-read exact lines after failed edits
- Stop after 2-3 failed edits to reconsider approach
- Flag very large files (>2000 lines) for refactoring
Task-Id: t-225
2025-12-01 | Add guardrail for repeated edit_file failures | Ben Sima
Tracks 'old_str not found' errors from edit_file tool calls. After 5 consecutive failures, stops the agent to prevent burning tokens on impossible edits.
This catches the pattern where the agent repeatedly tries to edit a large file with incorrect old_str matches, which was the root cause of t-222 exceeding its cost budget.
Task-Id: t-224
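The consecutive-failure counter can be sketched in Python as follows. This is a hypothetical illustration, not the repository's guardrail code. The class name `EditFailureGuard` is invented. Resetting the counter on a successful edit and ignoring other tools are assumptions. The limit of 5 and the error text come from the commit message.

```python
MAX_CONSECUTIVE_EDIT_FAILURES = 5  # from the commit message


class EditFailureGuard:
    """Counts consecutive 'old_str not found' results from edit_file."""

    def __init__(self, limit: int = MAX_CONSECUTIVE_EDIT_FAILURES) -> None:
        self.limit = limit
        self.consecutive = 0

    def record(self, tool_name: str, result: str) -> bool:
        """Record one tool result; return True when the agent should stop."""
        if tool_name != "edit_file":
            # Assumption: other tools neither count toward nor reset the streak.
            return False
        if "old_str not found" in result:
            self.consecutive += 1
        else:
            self.consecutive = 0  # a successful edit breaks the streak
        return self.consecutive >= self.limit
```

The agent loop would call `record` after every tool result and abort the run on the first `True`.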
2025-12-01 | Scale cost guardrail by task complexity | Ben Sima
Cost limits by complexity level:
- Complexity 1: 50 cents
- Complexity 2: 100 cents
- Complexity 3: 200 cents (default)
- Complexity 4: 400 cents
- Complexity 5: 600 cents
This prevents low-complexity tasks from burning budget while allowing complex tasks more room for iteration.
Task-Id: t-223
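The limits above amount to a lookup table with a default, which can be sketched in Python as follows. The cent values are taken from the commit message. The function name `cost_limit_cents` and the fallback to the complexity-3 limit for missing or unknown values are assumptions.

```python
from typing import Optional

# Cost limits in cents per complexity level, as listed in the commit message.
COST_LIMIT_CENTS = {1: 50, 2: 100, 3: 200, 4: 400, 5: 600}
DEFAULT_COMPLEXITY = 3


def cost_limit_cents(complexity: Optional[int]) -> int:
    """Return the cost guardrail for a task, defaulting to complexity 3."""
    return COST_LIMIT_CENTS.get(complexity, COST_LIMIT_CENTS[DEFAULT_COMPLEXITY])
```

A worker would compare accumulated spend against this value after each model call and stop once the limit is exceeded.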
2025-12-01 | Clicking LIVE label toggles live updates on/off | Ben Sima
- Add clickable LIVE toggle button that pauses/resumes timeline polling
- Green pulsing when active, grey when paused
- Uses htmx:beforeRequest event to cancel requests when paused
- Increase duplicate tool call guardrail from 20 to 30
Task-Id: t-211
2025-12-01 | Show tool call arguments inline instead of JSON blob | Ben Sima
- Add formatToolCallSummary to extract key argument from JSON
- Shows run_bash command, file paths for read/edit/write, patterns for search
- Display summary inline in tool call header (e.g., run_bash: `ls -la`)
- Increase token guardrail from 1M to 2M to prevent premature stops
Task-Id: t-212