# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Overview

OpenRabbit is an enterprise-grade AI code review system for Gitea (and GitHub). It provides automated PR review, issue triage, interactive chat, and codebase analysis through a collection of specialized AI agents.

## Commands

### Development

```bash
# Run tests
pytest tests/ -v

# Run a specific test file
pytest tests/test_ai_review.py -v

# Install dependencies
pip install -r tools/ai-review/requirements.txt

# Run a PR review locally
cd tools/ai-review
python main.py pr owner/repo 123

# Run issue triage
python main.py issue owner/repo 456

# Test chat functionality
python main.py chat owner/repo "How does authentication work?"

# Run with a custom config
python main.py pr owner/repo 123 --config /path/to/config.yml
```

### Testing Workflows

```bash
# Validate workflow YAML syntax
python -c "import yaml; yaml.safe_load(open('.github/workflows/ai-review.yml'))"

# Test the security scanner
python -c "from security.security_scanner import SecurityScanner; s = SecurityScanner(); print(list(s.scan_content('password = \"secret123\"', 'test.py')))"
```

## Architecture

### Agent System

The codebase uses an **agent-based architecture** in which specialized agents handle different types of events:

1. **BaseAgent** (`agents/base_agent.py`) - Abstract base class providing:
   - Gitea API client integration
   - LLM client integration with rate limiting
   - Common comment management (upsert, find AI comments)
   - Prompt loading from the `prompts/` directory
   - A standard execution flow with error handling

2. **Specialized Agents** - Each agent implements:
   - `can_handle(event_type, event_data)` - Determines whether the agent should process the event
   - `execute(context)` - Main execution logic
   - Returns an `AgentResult` with success status, message, data, and actions taken

   The specialized agents are:
   - **PRAgent** - Reviews pull requests with inline comments and security scanning
   - **IssueAgent** - Triages issues and responds to bot mention commands (e.g., `@codebot triage`)
   - **CodebaseAgent** - Analyzes overall codebase health and tech debt
   - **ChatAgent** - Interactive assistant with tool calling (search_codebase, read_file, search_web)

3. **Dispatcher** (`dispatcher.py`) - Routes events to the appropriate agents:
   - Registers agents at startup
   - Determines which agents can handle each event
   - Executes agents (supports concurrent execution)
   - Returns aggregated results

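The routing flow above can be sketched as follows. This is a self-contained illustration only: `EchoAgent` and the dict-based context are invented stand-ins, not the real `Dispatcher` or `BaseAgent` API.

```python
# Minimal sketch of the dispatcher pattern described above.
class EchoAgent:
    def can_handle(self, event_type, event_data):
        return event_type == "issue_comment"

    def execute(self, context):
        return {"success": True, "message": f"handled {context['event_type']}"}

class Dispatcher:
    def __init__(self):
        self.agents = []

    def register(self, agent):
        self.agents.append(agent)

    def dispatch(self, event_type, event_data):
        # Run every agent that claims the event and aggregate the results
        context = {"event_type": event_type, "event_data": event_data}
        return [a.execute(context) for a in self.agents
                if a.can_handle(event_type, event_data)]

d = Dispatcher()
d.register(EchoAgent())
results = d.dispatch("issue_comment", {"body": "@codebot help"})
```
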
### Multi-Provider LLM Client

The `LLMClient` (`clients/llm_client.py`) provides a unified interface for multiple LLM providers:

- **OpenAI** - Primary provider (default model: gpt-4.1-mini)
- **OpenRouter** - Multi-provider access (claude-3.5-sonnet)
- **Ollama** - Self-hosted models (codellama:13b)

Key features:

- Tool/function calling support via `call_with_tools(messages, tools)`
- JSON response parsing with fallback extraction
- Provider-specific configuration via `config.yml`

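The provider-registry pattern behind that unified interface can be sketched as below. Class and function names here are illustrative stand-ins, not the actual `clients/llm_client.py` code.

```python
# Illustrative sketch: one class per provider, selected by the config key.
class BaseLLMProvider:
    def call(self, messages):
        raise NotImplementedError

class OpenAIProvider(BaseLLMProvider):
    def call(self, messages):
        # A real provider would call the OpenAI API here
        return f"openai:{messages[-1]['content']}"

class OllamaProvider(BaseLLMProvider):
    def call(self, messages):
        return f"ollama:{messages[-1]['content']}"

PROVIDERS = {"openai": OpenAIProvider, "ollama": OllamaProvider}

def make_client(config):
    # Pick the provider class named by config.yml's `provider` setting
    return PROVIDERS[config["provider"]]()

client = make_client({"provider": "openai"})
answer = client.call([{"role": "user", "content": "hi"}])  # "openai:hi"
```
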
### Platform Abstraction

The `GiteaClient` (`clients/gitea_client.py`) provides a unified REST API client for **Gitea** (also compatible with the GitHub API):

- Issue operations (create, update, list, get, comments, labels)
- PR operations (get, diff, files, reviews)
- Repository operations (get repo, file contents, branches)

Environment variables:

- `AI_REVIEW_API_URL` - API base URL (e.g., `https://api.github.com` or `https://gitea.example.com/api/v1`)
- `AI_REVIEW_TOKEN` - Authentication token

### Security Scanner

The `SecurityScanner` (`security/security_scanner.py`) uses **pattern-based detection** with 17 built-in rules covering:

- OWASP Top 10 categories (A01-A10)
- Common vulnerabilities (SQL injection, XSS, hardcoded secrets, weak crypto)

Findings are returned as `SecurityFinding` objects with a severity (HIGH/MEDIUM/LOW), CWE references, and recommendations.

It can scan:

- File content via `scan_content(content, filename)`
- Git diffs via `scan_diff(diff)` - only added lines are scanned

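As a rough illustration of diff scanning restricted to added lines, here is a simplified stand-in with a single hardcoded-secret pattern (not the scanner's real rules or return type):

```python
import re

# Simplified stand-in for one hardcoded-secret rule; the real scanner ships
# 17 rules with severities, CWE references, and recommendations.
SECRET_RE = re.compile(r'password\s*=\s*["\']\w+["\']', re.IGNORECASE)

def scan_diff(diff: str):
    findings = []
    for line in diff.splitlines():
        # Only lines added by the diff ("+..." but not the "+++" file header)
        if line.startswith("+") and not line.startswith("+++"):
            if SECRET_RE.search(line[1:]):
                findings.append(line[1:].strip())
    return findings

diff = """+++ b/app.py
+password = "secret123"
-password = os.environ["PW"]
 context_line = 1"""
```
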
### Chat Agent Tool Calling

The `ChatAgent` implements an **iterative tool-calling loop**:

1. Send the user message + system prompt to the LLM with the available tools
2. If the LLM returns tool calls, execute each tool and append the results to the conversation
3. Repeat until the LLM returns a final response (max 5 iterations)

Available tools:

- `search_codebase` - Searches repository files and code patterns
- `read_file` - Reads a specific file's contents (truncated at 8 KB)
- `search_web` - Queries a SearXNG instance (requires `SEARXNG_URL`)

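The loop above can be sketched like this. It uses a stubbed LLM and tool runner (`fake_llm` and `run_tool` are invented for illustration and are not the real `ChatAgent` or `LLMClient` API):

```python
# Stub LLM: requests one tool call, then answers once a tool result is present.
def fake_llm(messages):
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_calls": [{"id": "1", "name": "read_file",
                                "arguments": {"path": "auth.py"}}]}
    return {"content": "auth.py handles login."}

def run_tool(name, arguments):
    return f"<contents of {arguments['path']}>"

def chat(user_message, max_iterations=5):
    messages = [{"role": "system", "content": "You are a code assistant."},
                {"role": "user", "content": user_message}]
    for _ in range(max_iterations):
        response = fake_llm(messages)
        if not response.get("tool_calls"):
            return response["content"]  # final answer, loop ends
        for tc in response["tool_calls"]:
            messages.append({"role": "tool", "tool_call_id": tc["id"],
                             "content": run_tool(tc["name"], tc["arguments"])})
    return "(iteration limit reached)"
```
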
## Configuration

### Primary Config File: `tools/ai-review/config.yml`

Critical settings:

```yaml
provider: openai  # openai | openrouter | ollama

model:
  openai: gpt-4.1-mini
  openrouter: anthropic/claude-3.5-sonnet
  ollama: codellama:13b

interaction:
  mention_prefix: "@codebot"  # Bot trigger name - update workflows too!
  commands:
    - explain    # Explain what the issue is about
    - suggest    # Suggest solutions or next steps
    - security   # Security analysis
    - summarize  # Summarize the issue
    - triage     # Full triage with labeling

review:
  fail_on_severity: HIGH  # Fail CI if HIGH-severity issues are found
  max_diff_lines: 800     # Skip review if the diff is too large

agents:
  chat:
    max_iterations: 5  # Tool-calling loop limit
```

**Important**: When changing `mention_prefix`, also update all workflow files in `.gitea/workflows/`:

- `ai-comment-reply.yml`
- `ai-chat.yml`
- `ai-issue-triage.yml`

Look for `if: contains(github.event.comment.body, '@codebot')` and update it to your new bot name.

Current bot name: `@codebot`

### Environment Variables

Required:

- `AI_REVIEW_API_URL` - Platform API URL
- `AI_REVIEW_TOKEN` - Bot authentication token
- `OPENAI_API_KEY` - OpenAI API key (or the provider-specific key)

Optional:

- `SEARXNG_URL` - SearXNG instance for web search
- `OPENROUTER_API_KEY` - OpenRouter API key
- `OLLAMA_HOST` - Ollama server URL

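For a local run, they might be exported like this (all values below are placeholders, not real endpoints or keys):

```shell
# Placeholder values - substitute your own instance URL and tokens
export AI_REVIEW_API_URL="https://gitea.example.com/api/v1"
export AI_REVIEW_TOKEN="<bot-token>"
export OPENAI_API_KEY="<openai-api-key>"

# Optional
export SEARXNG_URL="https://searx.example.com"
```
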
## Workflow Architecture

Workflows are located in `.gitea/workflows/`:

- **ai-review.yml** / **enterprise-ai-review.yml** - Triggered on PR open/sync
- **ai-issue-triage.yml** - Triggered on a `@codebot triage` mention in issue comments
- **ai-comment-reply.yml** - Triggered on issue comments with @bot mentions
- **ai-chat.yml** - Triggered on issue comments for chat (non-command mentions)
- **ai-codebase-review.yml** - Scheduled weekly analysis

**Note**: Issue triage is now **opt-in** via the `@codebot triage` command, not automatic on issue creation.

Key workflow pattern:

1. Checkout the repository
2. Set up Python 3.11
3. Install dependencies (`pip install requests pyyaml`)
4. Set environment variables
5. Run `python main.py <command> <args>`

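Put together, a workflow following this pattern might look like the sketch below. Event names, action versions, and secret names are illustrative; check the existing workflow files for the exact syntax used in this repository.

```yaml
name: AI Review
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  ai-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install requests pyyaml
      - name: Run review
        env:
          AI_REVIEW_API_URL: ${{ secrets.AI_REVIEW_API_URL }}
          AI_REVIEW_TOKEN: ${{ secrets.AI_REVIEW_TOKEN }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
          cd tools/ai-review
          python main.py pr ${{ github.repository }} ${{ github.event.pull_request.number }}
```
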
## Prompt Templates

Prompts are stored in `tools/ai-review/prompts/` as Markdown files:

- `base.md` - Base instructions for all reviews
- `issue_triage.md` - Issue classification template
- `issue_response.md` - Issue response template

**Important**: JSON examples in prompts must use **double curly braces** (`{{` and `}}`) to escape Python's `.format()` method. This is tested in `tests/test_ai_review.py::TestPromptFormatting`.

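For example, with a hypothetical prompt snippet, the escaping works like this: doubled braces come out as literal braces, while single-brace fields are substituted.

```python
# {{ and }} are emitted literally; {title} is a real placeholder
template = 'Classify the issue titled "{title}". Respond as JSON: {{"type": "bug"}}'
result = template.format(title="Login fails")
# → Classify the issue titled "Login fails". Respond as JSON: {"type": "bug"}
```
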
## Code Patterns

### Creating a New Agent

```python
from agents.base_agent import BaseAgent, AgentContext, AgentResult

class MyAgent(BaseAgent):
    def can_handle(self, event_type: str, event_data: dict) -> bool:
        # Check whether this agent is enabled in config
        if not self.config.get("agents", {}).get("my_agent", {}).get("enabled", True):
            return False
        return event_type == "my_event_type"

    def execute(self, context: AgentContext) -> AgentResult:
        # Load and fill the prompt template
        prompt = self.load_prompt("my_prompt")
        formatted = prompt.format(data=context.event_data.get("field"))

        # Call the LLM with rate limiting
        response = self.call_llm(formatted)

        # Post a comment to the issue/PR
        # (index taken from the event payload; the shape may vary by event type)
        issue_index = context.event_data.get("issue", {}).get("number")
        self.upsert_comment(
            context.owner,
            context.repo,
            issue_index,
            response.content
        )

        return AgentResult(
            success=True,
            message="Agent completed",
            actions_taken=["Posted comment"]
        )
```

### Calling LLM with Tools

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Search for authentication code"}
]

tools = [{
    "type": "function",
    "function": {
        "name": "search_code",
        "description": "Search codebase",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"]
        }
    }
}]

response = self.llm.call_with_tools(messages, tools=tools)

if response.tool_calls:
    # Most chat APIs expect the assistant's tool-call message to precede
    # the tool results in the conversation
    messages.append({"role": "assistant", "tool_calls": response.tool_calls})
    for tc in response.tool_calls:
        result = execute_tool(tc.name, tc.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": tc.id,
            "content": result
        })
```

### Adding Security Rules

Edit `tools/ai-review/security/security_scanner.py` or create `security/security_rules.yml`:

```yaml
rules:
  - id: SEC018
    name: Custom Rule Name
    pattern: 'regex_pattern_here'
    severity: HIGH  # HIGH, MEDIUM, LOW
    category: A03:2021 Injection
    cwe: CWE-XXX
    description: What this detects
    recommendation: How to fix it
```

## Testing

The test suite (`tests/test_ai_review.py`) covers:

1. **Prompt Formatting** - Ensures prompts don't contain unescaped `{}` that would break `.format()`
2. **Module Imports** - Verifies all modules can be imported
3. **Security Scanner** - Tests pattern detection and the false-positive rate
4. **Agent Context** - Tests dataclass creation and validation
5. **Metrics** - Tests enterprise metrics collection

Run specific test classes:

```bash
pytest tests/test_ai_review.py::TestPromptFormatting -v
pytest tests/test_ai_review.py::TestSecurityScanner -v
```

## Common Development Tasks

### Adding a New Command to @codebot

1. Add the command to `config.yml` under `interaction.commands`
2. Add a handler method in `IssueAgent` (e.g., `_command_yourcommand()`)
3. Update `_handle_command()` to route the command to your handler
4. Update README.md with the command's documentation
5. Add tests in `tests/test_ai_review.py`

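The handler/routing shape from steps 2-3 might look like this stripped-down stub. It is not the real `IssueAgent`; `_command_wordcount` is a hypothetical new command invented for illustration.

```python
# Simplified stand-in for IssueAgent command routing; only the shape matters.
class IssueAgent:
    def _command_explain(self, issue):
        return f"Explaining issue #{issue['number']}"

    def _command_wordcount(self, issue):
        # Hypothetical new command added per the steps above
        return f"Body has {len(issue['body'].split())} words"

    def _handle_command(self, command, issue):
        # Route "<command>" to the matching _command_<command> method
        handler = getattr(self, f"_command_{command}", None)
        if handler is None:
            return f"Unknown command: {command}"
        return handler(issue)

agent = IssueAgent()
reply = agent._handle_command("wordcount", {"number": 7, "body": "login is broken"})
```
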
Example commands:

- `@codebot help` - Show all available commands with examples
- `@codebot triage` - Full issue triage with labeling
- `@codebot explain` - Explain the issue
- `@codebot suggest` - Suggest solutions
- `@codebot setup-labels` - Automatic label setup (built-in, not in config)

### Changing the Bot Name

1. Edit `config.yml`: `interaction.mention_prefix: "@newname"`
2. Update all Gitea workflow files in `.gitea/workflows/` (search for `contains(github.event.comment.body`)
3. Update README.md and the documentation

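Steps 1-2 can be batched with `sed`. The snippet below first demonstrates the substitution on a sample workflow line; the in-repo command is shown commented out so you can review it before running (it assumes the bot name appears literally in those files):

```shell
# Dry-run the substitution on a sample workflow line
echo "if: contains(github.event.comment.body, '@codebot')" | sed 's/@codebot/@newname/g'

# Then, from the repository root:
# sed -i 's/@codebot/@newname/g' tools/ai-review/config.yml .gitea/workflows/*.yml
# grep -rn '@codebot' tools/ai-review/config.yml .gitea/workflows/ || echo "all renamed"
```
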

### Supporting a New LLM Provider

1. Create a provider class in `clients/llm_client.py` inheriting from `BaseLLMProvider`
2. Implement `call()` and, optionally, `call_with_tools()`
3. Register it in the `LLMClient.PROVIDERS` dict
4. Add its model config to `config.yml`
5. Document it in README.md

## Repository Labels

### Automatic Label Setup (Recommended)

Use the `@codebot setup-labels` command to configure labels automatically. This command:

**For repositories with existing labels:**

- Detects naming patterns such as `Kind/Bug`, `Priority - High`, `type: bug`
- Maps existing labels to the OpenRabbit schema using aliases
- Creates only the missing labels, following the detected pattern
- Creates zero duplicate labels

**For fresh repositories:**

- Creates OpenRabbit's default label set
- Uses standard naming: `type:`, `priority:`, status labels

**Example with existing `Kind/` and `Priority -` labels:**

```
@codebot setup-labels

✅ Found 18 existing labels with pattern: prefix_slash

Proposed Mapping:

| OpenRabbit Expected | Your Existing Label | Status |
|---------------------|---------------------|--------|
| type: bug | Kind/Bug | ✅ Map |
| type: feature | Kind/Feature | ✅ Map |
| priority: high | Priority - High | ✅ Map |
| ai-reviewed | (missing) | ⚠️ Create |

✅ Created Kind/Question
✅ Created Status - AI Reviewed

Setup Complete! Auto-labeling will use your existing label schema.
```


### Manual Label Setup

The system expects these labels to exist in repositories for auto-labeling:

- `priority: critical`, `priority: high`, `priority: medium`, `priority: low`
- `type: bug`, `type: feature`, `type: question`, `type: documentation`, `type: security`, `type: testing`
- `ai-approved`, `ai-changes-required`, `ai-reviewed`

Labels are mapped in `config.yml` under the `labels` section.

### Label Configuration Format

Labels support two formats for backwards compatibility:

**New format (with colors and aliases):**

```yaml
labels:
  type:
    bug:
      name: "type: bug"
      color: "d73a4a"  # Red
      description: "Something isn't working"
      aliases: ["Kind/Bug", "bug", "Type: Bug"]  # For auto-detection
```

**Old format (strings only):**

```yaml
labels:
  type:
    bug: "type: bug"  # Still works; uses the default blue color
```

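A loader supporting both formats might normalize entries like this sketch. The default color value is an assumption (the text above only says old-format labels get a default blue), and the function is not the real loader.

```python
DEFAULT_COLOR = "0052cc"  # assumed default; the doc only says "default blue"

def normalize_label(entry):
    """Accept either the old string form or the new dict form."""
    if isinstance(entry, str):
        # Old format: the string is the label name
        return {"name": entry, "color": DEFAULT_COLOR, "description": "", "aliases": []}
    # New format: dict with optional color/description/aliases
    return {"name": entry["name"],
            "color": entry.get("color", DEFAULT_COLOR),
            "description": entry.get("description", ""),
            "aliases": entry.get("aliases", [])}

old = normalize_label("type: bug")
new = normalize_label({"name": "type: bug", "color": "d73a4a", "aliases": ["Kind/Bug"]})
```
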
### Label Pattern Detection

The `setup-labels` command detects these patterns (configured in `label_patterns`):

1. **prefix_slash**: `Kind/Bug`, `Type/Feature`, `Category/X`
2. **prefix_dash**: `Priority - High`, `Status - Blocked`
3. **colon**: `type: bug`, `priority: high`

When creating missing labels, the bot follows the detected pattern to maintain consistency.
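
The detection could be sketched with one regex per pattern, picking whichever matches the most existing labels (illustrative only, not the actual implementation):

```python
import re

# One regex per naming pattern listed above
PATTERNS = {
    "prefix_slash": re.compile(r"^[A-Za-z]+/\S"),    # Kind/Bug
    "prefix_dash":  re.compile(r"^[A-Za-z]+ - \S"),  # Priority - High
    "colon":        re.compile(r"^[a-z]+: \S"),      # type: bug
}

def detect_pattern(labels):
    # Count matches per pattern and pick the best; None if nothing matches
    counts = {name: sum(bool(rx.match(label) is not None) for label in labels)
              for name, rx in PATTERNS.items()}
    best = max(counts, key=counts.get)
    return best if counts[best] else None

labels = ["Kind/Bug", "Kind/Feature", "Priority - High"]
```
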
|