# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Overview

OpenRabbit is an enterprise-grade AI code review system for Gitea (and GitHub). It provides automated PR review, issue triage, interactive chat, and codebase analysis through a collection of specialized AI agents.

## Commands

### Development

```bash
# Run tests
pytest tests/ -v

# Run specific test file
pytest tests/test_ai_review.py -v

# Install dependencies
pip install -r tools/ai-review/requirements.txt

# Run a PR review locally
cd tools/ai-review
python main.py pr owner/repo 123

# Run issue triage
python main.py issue owner/repo 456

# Test chat functionality
python main.py chat owner/repo "How does authentication work?"

# Run with custom config
python main.py pr owner/repo 123 --config /path/to/config.yml
```

### Testing Workflows

```bash
# Validate workflow YAML syntax
python -c "import yaml; yaml.safe_load(open('.github/workflows/ai-review.yml'))"

# Test security scanner
python -c "from security.security_scanner import SecurityScanner; s = SecurityScanner(); print(list(s.scan_content('password = \"secret123\"', 'test.py')))"
```

## Architecture

### Agent System

The codebase uses an **agent-based architecture** where specialized agents handle different types of events:

1. **BaseAgent** (`agents/base_agent.py`) - Abstract base class providing:
   - Gitea API client integration
   - LLM client integration with rate limiting
   - Common comment management (upsert, find AI comments)
   - Prompt loading from the `prompts/` directory
   - Standard execution flow with error handling
2. **Specialized Agents** - Each agent implements:
   - `can_handle(event_type, event_data)` - Determines if the agent should process the event
   - `execute(context)` - Main execution logic
   - Returns `AgentResult` with success status, message, data, and actions taken
   - **PRAgent** - Reviews pull requests with inline comments and security scanning
   - **IssueAgent** - Triages issues and responds to @ai-bot commands
   - **CodebaseAgent** - Analyzes entire codebase health and tech debt
   - **ChatAgent** - Interactive assistant with tool calling (search_codebase, read_file, search_web)

3. **Dispatcher** (`dispatcher.py`) - Routes events to appropriate agents:
   - Registers agents at startup
   - Determines which agents can handle each event
   - Executes agents (supports concurrent execution)
   - Returns aggregated results

### Multi-Provider LLM Client

The `LLMClient` (`clients/llm_client.py`) provides a unified interface for multiple LLM providers:

- **OpenAI** - Primary provider (gpt-4.1-mini default)
- **OpenRouter** - Multi-provider access (claude-3.5-sonnet)
- **Ollama** - Self-hosted models (codellama:13b)

Key features:

- Tool/function calling support via `call_with_tools(messages, tools)`
- JSON response parsing with fallback extraction
- Provider-specific configuration via `config.yml`

### Platform Abstraction

The `GiteaClient` (`clients/gitea_client.py`) provides a unified REST API client for **Gitea** (also compatible with the GitHub API):

- Issue operations (create, update, list, get, comments, labels)
- PR operations (get, diff, files, reviews)
- Repository operations (get repo, file contents, branches)

Environment variables:

- `AI_REVIEW_API_URL` - API base URL (e.g., `https://api.github.com` or `https://gitea.example.com/api/v1`)
- `AI_REVIEW_TOKEN` - Authentication token

### Security Scanner

The `SecurityScanner` (`security/security_scanner.py`) uses **pattern-based detection** with 17 built-in rules covering:

- OWASP Top 10 categories (A01-A10)
- Common vulnerabilities (SQL injection,
XSS, hardcoded secrets, weak crypto)
- Returns `SecurityFinding` objects with severity (HIGH/MEDIUM/LOW), CWE references, and recommendations

Can scan:

- File content via `scan_content(content, filename)`
- Git diffs via `scan_diff(diff)` - only scans added lines

### Chat Agent Tool Calling

The `ChatAgent` implements an **iterative tool calling loop**:

1. Send the user message + system prompt to the LLM with available tools
2. If the LLM returns tool calls, execute each tool and append results to the conversation
3. Repeat until the LLM returns a final response (max 5 iterations)

Available tools:

- `search_codebase` - Searches repository files and code patterns
- `read_file` - Reads specific file contents (truncated at 8KB)
- `search_web` - Queries a SearXNG instance (requires `SEARXNG_URL`)

## Configuration

### Primary Config File: `tools/ai-review/config.yml`

Critical settings:

```yaml
provider: openai  # openai | openrouter | ollama

model:
  openai: gpt-4.1-mini
  openrouter: anthropic/claude-3.5-sonnet
  ollama: codellama:13b

interaction:
  mention_prefix: "@codebot"  # Bot trigger name - update workflows too!
  commands:
    - explain    # Explain what the issue is about
    - suggest    # Suggest solutions or next steps
    - security   # Security analysis
    - summarize  # Summarize the issue
    - triage     # Full triage with labeling

review:
  fail_on_severity: HIGH  # Fail CI if HIGH severity issues found
  max_diff_lines: 800     # Skip review if diff too large

agents:
  chat:
    max_iterations: 5  # Tool calling loop limit
```

**Important**: When changing `mention_prefix`, also update all workflow files in `.gitea/workflows/`:

- `ai-comment-reply.yml`
- `ai-chat.yml`
- `ai-issue-triage.yml`

Look for `if: contains(github.event.comment.body, '@codebot')` and update it to your new bot name.
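As a sketch of that rename (`@newbot` here is a placeholder name, not a value from this repository), the substitution in each workflow condition looks like:

```shell
# Rewrite the mention prefix in a sample workflow condition.
# In the repo you would apply the same substitution with `sed -i`
# over each file in .gitea/workflows/.
echo "if: contains(github.event.comment.body, '@codebot')" \
  | sed "s/@codebot/@newbot/"
```

With GNU sed, running the same `s/.../.../` command via `sed -i` edits the workflow files in place.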
Current bot name: `@codebot`

### Environment Variables

Required:

- `AI_REVIEW_API_URL` - Platform API URL
- `AI_REVIEW_TOKEN` - Bot authentication token
- `OPENAI_API_KEY` - OpenAI API key (or provider-specific key)

Optional:

- `SEARXNG_URL` - SearXNG instance for web search
- `OPENROUTER_API_KEY` - OpenRouter API key
- `OLLAMA_HOST` - Ollama server URL

## Workflow Architecture

Workflows are located in `.gitea/workflows/`:

- **ai-review.yml** / **enterprise-ai-review.yml** - Triggered on PR open/sync
- **ai-issue-triage.yml** - Triggered on `@codebot triage` mention in issue comments
- **ai-comment-reply.yml** - Triggered on issue comments with @bot mentions
- **ai-chat.yml** - Triggered on issue comments for chat (non-command mentions)
- **ai-codebase-review.yml** - Scheduled weekly analysis

**Note**: Issue triage is now **opt-in** via the `@codebot triage` command, not automatic on issue creation.

Key workflow pattern:

1. Checkout repository
2. Setup Python 3.11
3. Install dependencies (`pip install requests pyyaml`)
4. Set environment variables
5. Run `python main.py` with the event type and target (see the Development commands above)

## Prompt Templates

Prompts are stored in `tools/ai-review/prompts/` as Markdown files:

- `base.md` - Base instructions for all reviews
- `issue_triage.md` - Issue classification template
- `issue_response.md` - Issue response template

**Important**: JSON examples in prompts must use **double curly braces** (`{{` and `}}`) to escape Python's `.format()` method. This is tested in `tests/test_ai_review.py::TestPromptFormatting`.
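A minimal illustration of why the doubling is needed (the template text here is hypothetical, not taken from `prompts/`):

```python
# Hypothetical prompt template: the literal JSON braces are doubled
# so str.format() treats them as text rather than placeholders.
template = 'Respond with JSON like {{"label": "bug"}}. Issue title: {title}'

# format() fills {title} and collapses {{ }} into single literal braces.
print(template.format(title="Login fails"))
# → Respond with JSON like {"label": "bug"}. Issue title: Login fails
```

Without the doubling, `.format()` would try to resolve `"label"` as a replacement field and raise a `KeyError`.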
## Code Patterns

### Creating a New Agent

```python
from agents.base_agent import BaseAgent, AgentContext, AgentResult

class MyAgent(BaseAgent):
    def can_handle(self, event_type: str, event_data: dict) -> bool:
        # Check if agent is enabled in config
        if not self.config.get("agents", {}).get("my_agent", {}).get("enabled", True):
            return False
        return event_type == "my_event_type"

    def execute(self, context: AgentContext) -> AgentResult:
        # Load prompt template
        prompt = self.load_prompt("my_prompt")
        formatted = prompt.format(data=context.event_data.get("field"))

        # Call LLM with rate limiting
        response = self.call_llm(formatted)

        # Post comment to issue/PR
        self.upsert_comment(
            context.owner, context.repo, issue_index, response.content
        )

        return AgentResult(
            success=True,
            message="Agent completed",
            actions_taken=["Posted comment"]
        )
```

### Calling LLM with Tools

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Search for authentication code"}
]

tools = [{
    "type": "function",
    "function": {
        "name": "search_code",
        "description": "Search codebase",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"]
        }
    }
}]

response = self.llm.call_with_tools(messages, tools=tools)

if response.tool_calls:
    for tc in response.tool_calls:
        result = execute_tool(tc.name, tc.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": tc.id,
            "content": result
        })
```

### Adding Security Rules

Edit `tools/ai-review/security/security_scanner.py` or create `security/security_rules.yml`:

```yaml
rules:
  - id: SEC018
    name: Custom Rule Name
    pattern: 'regex_pattern_here'
    severity: HIGH  # HIGH, MEDIUM, LOW
    category: A03:2021 Injection
    cwe: CWE-XXX
    description: What this detects
    recommendation: How to fix it
```

## Testing

The test suite (`tests/test_ai_review.py`) covers:

1. **Prompt Formatting** - Ensures prompts don't have unescaped `{}` that break `.format()`
2. **Module Imports** - Verifies all modules can be imported
3. **Security Scanner** - Tests pattern detection and false positive rate
4. **Agent Context** - Tests dataclass creation and validation
5. **Metrics** - Tests enterprise metrics collection

Run specific test classes:

```bash
pytest tests/test_ai_review.py::TestPromptFormatting -v
pytest tests/test_ai_review.py::TestSecurityScanner -v
```

## Common Development Tasks

### Adding a New Command to @codebot

1. Add the command to `config.yml` under `interaction.commands`
2. Add a handler method in `IssueAgent` (e.g., `_command_yourcommand()`)
3. Update `_handle_command()` to route the command to your handler
4. Update README.md with command documentation
5. Add tests in `tests/test_ai_review.py`

Example commands:

- `@codebot help` - Show all available commands with examples
- `@codebot triage` - Full issue triage with labeling
- `@codebot explain` - Explain the issue
- `@codebot suggest` - Suggest solutions
- `@codebot setup-labels` - Automatic label setup (built-in, not in config)

### Changing the Bot Name

1. Edit `config.yml`: `interaction.mention_prefix: "@newname"`
2. Update all Gitea workflow files in `.gitea/workflows/` (search for `contains(github.event.comment.body`)
3. Update README.md and documentation

### Supporting a New LLM Provider

1. Create a provider class in `clients/llm_client.py` inheriting from `BaseLLMProvider`
2. Implement `call()` and optionally `call_with_tools()`
3. Register it in the `LLMClient.PROVIDERS` dict
4. Add model config to `config.yml`
5. Document it in README.md

## Repository Labels

### Automatic Label Setup (Recommended)

Use the `@codebot setup-labels` command to automatically configure labels.
This command:

**For repositories with existing labels:**

- Detects naming patterns: `Kind/Bug`, `Priority - High`, `type: bug`
- Maps existing labels to the OpenRabbit schema using aliases
- Creates only the missing labels, following the detected pattern
- Zero duplicate labels

**For fresh repositories:**

- Creates OpenRabbit's default label set
- Uses standard naming: `type:`, `priority:`, status labels

**Example with existing `Kind/` and `Priority -` labels:**

```
@codebot setup-labels

✅ Found 18 existing labels with pattern: prefix_slash

Proposed Mapping:

| OpenRabbit Expected | Your Existing Label | Status    |
|---------------------|---------------------|-----------|
| type: bug           | Kind/Bug            | ✅ Map    |
| type: feature       | Kind/Feature        | ✅ Map    |
| priority: high      | Priority - High     | ✅ Map    |
| ai-reviewed         | (missing)           | ⚠️ Create |

✅ Created Kind/Question
✅ Created Status - AI Reviewed

Setup Complete! Auto-labeling will use your existing label schema.
```

### Manual Label Setup

The system expects these labels to exist in repositories for auto-labeling:

- `priority: critical`, `priority: high`, `priority: medium`, `priority: low`
- `type: bug`, `type: feature`, `type: question`, `type: documentation`, `type: security`, `type: testing`
- `ai-approved`, `ai-changes-required`, `ai-reviewed`

Labels are mapped in `config.yml` under the `labels` section.

### Label Configuration Format

Labels support two formats for backwards compatibility:

**New format (with colors and aliases):**

```yaml
labels:
  type:
    bug:
      name: "type: bug"
      color: "d73a4a"  # Red
      description: "Something isn't working"
      aliases: ["Kind/Bug", "bug", "Type: Bug"]  # For auto-detection
```

**Old format (strings only):**

```yaml
labels:
  type:
    bug: "type: bug"  # Still works, uses default blue color
```

### Label Pattern Detection

The `setup-labels` command detects these patterns (configured in `label_patterns`):

1. **prefix_slash**: `Kind/Bug`, `Type/Feature`, `Category/X`
2. **prefix_dash**: `Priority - High`, `Status - Blocked`
3. **colon**: `type: bug`, `priority: high`

When creating missing labels, the bot follows the detected pattern to maintain consistency.
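The detection step can be sketched as follows. This is an illustrative sketch, not the shipped `setup-labels` implementation; the regexes and the `detect_pattern` helper are assumptions based on the pattern names above.

```python
import re

# Illustrative regexes for the three naming patterns described above.
PATTERNS = {
    "prefix_slash": re.compile(r"^[A-Za-z]+/\S"),   # e.g. Kind/Bug
    "prefix_dash": re.compile(r"^[A-Za-z]+ - \S"),  # e.g. Priority - High
    "colon": re.compile(r"^[a-z]+: \S"),            # e.g. type: bug
}

def detect_pattern(labels):
    """Return the pattern name matching the most existing labels, or None."""
    counts = {name: sum(bool(rx.match(label)) for label in labels)
              for name, rx in PATTERNS.items()}
    best = max(counts, key=counts.get)
    return best if counts[best] else None

print(detect_pattern(["Kind/Bug", "Kind/Feature", "Priority - High"]))
# → prefix_slash
```

Taking the majority pattern rather than the first match is one way to keep a few stray labels from overriding the repository's dominant convention.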