Hiddenden/openrabbit


2025-12-28 14:10:04 +00:00


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Overview

OpenRabbit is an enterprise-grade AI code review system for Gitea (and GitHub). It provides automated PR review, issue triage, interactive chat, and codebase analysis through a collection of specialized AI agents.

Commands

Development

# Run tests
pytest tests/ -v

# Run specific test file
pytest tests/test_ai_review.py -v

# Install dependencies
pip install -r tools/ai-review/requirements.txt

# Run a PR review locally
cd tools/ai-review
python main.py pr owner/repo 123

# Run issue triage
python main.py issue owner/repo 456

# Test chat functionality
python main.py chat owner/repo "How does authentication work?"

# Run with custom config
python main.py pr owner/repo 123 --config /path/to/config.yml

Testing Workflows

# Validate workflow YAML syntax (run from the repository root)
python -c "import yaml; yaml.safe_load(open('.gitea/workflows/ai-review.yml'))"

# Test security scanner (run from tools/ai-review so the security package is importable)
python -c "from security.security_scanner import SecurityScanner; s = SecurityScanner(); print(list(s.scan_content('password = \"secret123\"', 'test.py')))"

Architecture

Agent System

The codebase uses an agent-based architecture where specialized agents handle different types of events:

BaseAgent (agents/base_agent.py) - Abstract base class providing:
- Gitea API client integration
- LLM client integration with rate limiting
- Common comment management (upsert, find AI comments)
- Prompt loading from prompts/ directory
- Standard execution flow with error handling

Specialized Agents - Each agent implements:
- can_handle(event_type, event_data) - Determines if agent should process the event
- execute(context) - Main execution logic; returns an AgentResult with success status, message, data, and actions taken

The specialized agents are:
- PRAgent - Reviews pull requests with inline comments and security scanning
- IssueAgent - Triages issues and responds to @codebot commands
- CodebaseAgent - Analyzes entire codebase health and tech debt
- ChatAgent - Interactive assistant with tool calling (search_codebase, read_file, search_web)

Dispatcher (dispatcher.py) - Routes events to appropriate agents:
- Registers agents at startup
- Determines which agents can handle each event
- Executes agents (supports concurrent execution)
- Returns aggregated results
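
The routing described above can be sketched in a few lines. This is a simplified illustration, not the code in dispatcher.py: the AgentResult and Dispatcher classes here are minimal stand-ins, EchoAgent is hypothetical, the context is a plain dict rather than an AgentContext, and execution is sequential rather than concurrent.

```python
from dataclasses import dataclass, field


@dataclass
class AgentResult:
    """Minimal stand-in for the result type described above."""
    success: bool
    message: str
    data: dict = field(default_factory=dict)
    actions_taken: list = field(default_factory=list)


class EchoAgent:
    """Hypothetical agent that only handles 'issue_comment' events."""

    def can_handle(self, event_type: str, event_data: dict) -> bool:
        return event_type == "issue_comment"

    def execute(self, context: dict) -> AgentResult:
        body = context["event_data"].get("comment", {}).get("body", "")
        return AgentResult(True, f"echoed: {body}", actions_taken=["echo"])


class Dispatcher:
    """Simplified dispatcher: register agents, route events, collect results."""

    def __init__(self):
        self.agents = []

    def register(self, agent) -> None:
        self.agents.append(agent)

    def dispatch(self, event_type: str, event_data: dict) -> list:
        context = {"event_type": event_type, "event_data": event_data}
        results = []
        for agent in self.agents:
            if not agent.can_handle(event_type, event_data):
                continue
            try:
                results.append(agent.execute(context))
            except Exception as exc:
                # One failing agent should not prevent the others from running.
                results.append(AgentResult(False, f"{type(agent).__name__} failed: {exc}"))
        return results


dispatcher = Dispatcher()
dispatcher.register(EchoAgent())
results = dispatcher.dispatch("issue_comment", {"comment": {"body": "@codebot explain"}})
print(results[0].message)
```

An event type that no registered agent accepts simply yields an empty result list.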

Multi-Provider LLM Client

The LLMClient (clients/llm_client.py) provides a unified interface for multiple LLM providers:

OpenAI - Primary provider (gpt-4.1-mini default)
OpenRouter - Multi-provider access (claude-3.5-sonnet)
Ollama - Self-hosted models (codellama:13b)

Key features:

Tool/function calling support via call_with_tools(messages, tools)
JSON response parsing with fallback extraction
Provider-specific configuration via config.yml
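
"JSON response parsing with fallback extraction" could work along the following lines. This is a sketch of one common approach, not the logic in clients/llm_client.py; the function name parse_json_response and the three-step order are assumptions.

```python
import json
import re

FENCE = "`" * 3  # Markdown code fence, built here to keep this example readable
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_json_response(text: str):
    """Return JSON parsed from an LLM reply, or None if nothing parses."""
    # 1. The whole reply may already be valid JSON.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # 2. Look inside a fenced code block, optionally tagged "json".
    fenced = FENCED_BLOCK.search(text)
    if fenced:
        try:
            return json.loads(fenced.group(1))
        except json.JSONDecodeError:
            pass

    # 3. Fall back to the span from the first "{" to the last "}".
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(text[start:end + 1])
        except json.JSONDecodeError:
            pass
    return None


print(parse_json_response('Sure! {"severity": "HIGH"} Hope that helps.'))
```

Returning None rather than raising leaves the caller free to retry the request or fall back to the raw text.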

Platform Abstraction

The GiteaClient (clients/gitea_client.py) provides a unified REST API client for Gitea (also compatible with GitHub API):

Issue operations (create, update, list, get, comments, labels)
PR operations (get, diff, files, reviews)
Repository operations (get repo, file contents, branches)

Environment variables:

AI_REVIEW_API_URL - API base URL (e.g., https://api.github.com or https://gitea.example.com/api/v1)
AI_REVIEW_TOKEN - Authentication token
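
As a rough illustration of how these two variables might be consumed, the sketch below reads them and builds request defaults. The helper name load_api_settings is hypothetical, and the exact header GiteaClient sends is an assumption; Gitea's API accepts an Authorization header of the form "token <value>".

```python
import os


def load_api_settings(env=os.environ) -> dict:
    """Read AI_REVIEW_API_URL and AI_REVIEW_TOKEN and build request defaults."""
    base_url = env.get("AI_REVIEW_API_URL", "").rstrip("/")
    token = env.get("AI_REVIEW_TOKEN", "")
    if not base_url or not token:
        raise RuntimeError("AI_REVIEW_API_URL and AI_REVIEW_TOKEN must both be set")
    return {
        "base_url": base_url,
        # Assumption: token-style Authorization header, as accepted by Gitea.
        "headers": {"Authorization": f"token {token}", "Accept": "application/json"},
    }


settings = load_api_settings({
    "AI_REVIEW_API_URL": "https://gitea.example.com/api/v1/",
    "AI_REVIEW_TOKEN": "example-token",
})
print(settings["base_url"])
```

Failing fast on a missing variable gives a clearer error than a later 401 response from the API.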

Security Scanner

The SecurityScanner (security/security_scanner.py) uses pattern-based detection with 17 built-in rules covering:

OWASP Top 10 categories (A01-A10)
Common vulnerabilities (SQL injection, XSS, hardcoded secrets, weak crypto)

Findings are returned as SecurityFinding objects with severity (HIGH/MEDIUM/LOW), CWE references, and recommendations.

Can scan:

File content via scan_content(content, filename)
Git diffs via scan_diff(diff) - only scans added lines
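
To make "only scans added lines" concrete, here is a minimal sketch of pattern-based diff scanning. It is not the SecurityScanner implementation: the Finding dataclass, the single DEMO001 rule, and its regex are invented for illustration. The parser is deliberately naive and treats any line starting with "+++ " as a file header.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    rule_id: str
    severity: str
    filename: str
    line: int
    text: str


# One illustrative rule; the ID and pattern are invented for this sketch.
RULES = [
    ("DEMO001", "HIGH", re.compile(r"""(?i)(password|secret)\s*=\s*['"][^'"]+['"]""")),
]
HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def scan_diff(diff: str):
    """Yield findings for added lines only, tracking new-file line numbers."""
    filename, new_line = "", 0
    for raw in diff.splitlines():
        if raw.startswith("+++ "):
            filename = raw[4:].removeprefix("b/")  # new-file header
        elif raw.startswith("--- "):
            continue  # old-file header
        elif (m := HUNK.match(raw)):
            new_line = int(m.group(1))  # first new-file line of this hunk
        elif raw.startswith("+"):
            for rule_id, severity, pattern in RULES:
                if pattern.search(raw[1:]):
                    yield Finding(rule_id, severity, filename, new_line, raw[1:].strip())
            new_line += 1
        elif raw.startswith(" "):
            new_line += 1  # context line advances the new-file counter
        # Lines starting with "-" are removals: not scanned, counter unchanged.


diff = "\n".join([
    "diff --git a/app.py b/app.py",
    "--- a/app.py",
    "+++ b/app.py",
    "@@ -1,3 +1,4 @@",
    " import os",
    '-password = "old123"',
    '+password = "secret123"',
    "+debug = True",
    ' print("hi")',
])
findings = list(scan_diff(diff))
print(findings)
```

The removed line containing "old123" is ignored, so a review only flags what the PR introduces.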

Chat Agent Tool Calling

The ChatAgent implements an iterative tool calling loop:

Send user message + system prompt to LLM with available tools
If LLM returns tool calls, execute each tool and append results to conversation
Repeat until LLM returns a final response (max 5 iterations)

Available tools:

search_codebase - Searches repository files and code patterns
read_file - Reads specific file contents (truncated at 8KB)
search_web - Queries SearXNG instance (requires SEARXNG_URL)
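
The loop described above can be sketched as follows. This is an illustration under stated assumptions, not ChatAgent's code: call_llm is a stand-in callable returning a plain dict, run_chat is a hypothetical name, and tool handlers are ordinary Python functions keyed by tool name.

```python
MAX_ITERATIONS = 5  # mirrors agents.chat.max_iterations in config.yml


def run_chat(call_llm, tools: dict, user_message: str,
             system_prompt: str = "You are a helpful assistant") -> str:
    """Iterate until the model stops requesting tools or the limit is reached.

    call_llm(messages) is a stand-in returning a dict shaped like
    {"content": str, "tool_calls": [{"id": ..., "name": ..., "arguments": {...}}]}.
    """
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message},
    ]
    for _ in range(MAX_ITERATIONS):
        reply = call_llm(messages)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            return reply["content"]  # final answer, no more tools requested
        # Record the assistant turn, then one tool message per requested call.
        messages.append({"role": "assistant", "content": reply.get("content") or "",
                         "tool_calls": tool_calls})
        for call in tool_calls:
            handler = tools.get(call["name"])
            result = handler(**call["arguments"]) if handler else f"Unknown tool: {call['name']}"
            messages.append({"role": "tool", "tool_call_id": call["id"], "content": str(result)})
    return "Stopped after reaching the tool-call iteration limit."


def fake_llm(messages):
    """Scripted model: asks for read_file once, then answers from the tool result."""
    if messages[-1]["role"] == "tool":
        return {"content": "Found: " + messages[-1]["content"], "tool_calls": []}
    return {"content": "", "tool_calls": [
        {"id": "call-1", "name": "read_file", "arguments": {"path": "auth.py"}},
    ]}


answer = run_chat(fake_llm, {"read_file": lambda path: f"<contents of {path}>"}, "How does auth work?")
print(answer)
```

The iteration cap keeps a model that keeps requesting tools from looping indefinitely.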

Configuration

Primary Config File: `tools/ai-review/config.yml`

Critical settings:

provider: openai  # openai | openrouter | ollama

model:
  openai: gpt-4.1-mini
  openrouter: anthropic/claude-3.5-sonnet
  ollama: codellama:13b

interaction:
  mention_prefix: "@codebot"  # Bot trigger name - update workflows too!
  commands:
    - explain      # Explain what the issue is about
    - suggest      # Suggest solutions or next steps
    - security     # Security analysis
    - summarize    # Summarize the issue
    - triage       # Full triage with labeling
  
review:
  fail_on_severity: HIGH  # Fail CI if HIGH severity issues found
  max_diff_lines: 800     # Skip review if diff too large
  
agents:
  chat:
    max_iterations: 5  # Tool calling loop limit

Important: When changing mention_prefix, also update all workflow files in .gitea/workflows/:

ai-comment-reply.yml
ai-chat.yml
ai-issue-triage.yml

Look for: if: contains(github.event.comment.body, '@codebot') and update to your new bot name.

Current bot name: @codebot
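
The two review settings in the configuration above could be applied roughly as follows. The helper names are hypothetical. Treating fail_on_severity as "this level or higher" and max_diff_lines as a strict upper bound are both assumptions about the actual behavior.

```python
SEVERITY_ORDER = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}


def should_skip_review(diff: str, max_diff_lines: int = 800) -> bool:
    """True when the diff has more lines than review.max_diff_lines."""
    return len(diff.splitlines()) > max_diff_lines


def should_fail_ci(severities, fail_on_severity: str = "HIGH") -> bool:
    """True when any finding is at or above review.fail_on_severity (assumed semantics)."""
    threshold = SEVERITY_ORDER[fail_on_severity]
    return any(SEVERITY_ORDER[s] >= threshold for s in severities)


print(should_fail_ci(["LOW", "MEDIUM"]))          # nothing reaches HIGH
print(should_fail_ci(["LOW", "HIGH"]))            # one HIGH finding
print(should_skip_review("line\n" * 801))         # 801 lines exceeds the default 800
```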

Environment Variables

Required:

AI_REVIEW_API_URL - Platform API URL
AI_REVIEW_TOKEN - Bot authentication token
OPENAI_API_KEY - OpenAI API key (or provider-specific key)

Optional:

SEARXNG_URL - SearXNG instance for web search
OPENROUTER_API_KEY - OpenRouter API key
OLLAMA_HOST - Ollama server URL

Workflow Architecture

Workflows are located in .gitea/workflows/:

ai-review.yml / enterprise-ai-review.yml - Triggered on PR open/sync
ai-issue-triage.yml - Triggered on @codebot triage mention in issue comments
ai-comment-reply.yml - Triggered on issue comments with @codebot mentions
ai-chat.yml - Triggered on issue comments for chat (non-command mentions)
ai-codebase-review.yml - Scheduled weekly analysis

Note: Issue triage is now opt-in via @codebot triage command, not automatic on issue creation.

Key workflow pattern:

Checkout repository
Setup Python 3.11
Install dependencies (pip install requests pyyaml)
Set environment variables
Run python main.py <command> <args>

Prompt Templates

Prompts are stored in tools/ai-review/prompts/ as Markdown files:

base.md - Base instructions for all reviews
issue_triage.md - Issue classification template
issue_response.md - Issue response template

Important: JSON examples in prompts must use double curly braces ({{ and }}) so that Python's str.format() emits literal braces instead of treating them as replacement fields. This is tested in tests/test_ai_review.py::TestPromptFormatting.
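
A short demonstration of the escaping rule, using a made-up template rather than one of the repository's prompt files:

```python
# Hypothetical prompt template: {title} is a real placeholder,
# while the JSON example uses doubled braces so it survives .format().
template = (
    "Classify the issue titled {title}.\n"
    "Respond with JSON like:\n"
    '{{"type": "bug", "priority": "high"}}\n'
)

prompt = template.format(title="Login fails on Safari")
print(prompt)  # the doubled braces are rendered as single literal braces

# With single braces, str.format() parses the JSON as a replacement field and fails.
broken = 'Respond with JSON like: {"type": "bug"}'
try:
    broken.format(title="x")
    failed = False
except (KeyError, ValueError, IndexError) as exc:
    failed = True
    print(f"unescaped braces fail with {type(exc).__name__}")
```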

Code Patterns

Creating a New Agent

from agents.base_agent import BaseAgent, AgentContext, AgentResult

class MyAgent(BaseAgent):
    def can_handle(self, event_type: str, event_data: dict) -> bool:
        # Check if agent is enabled in config
        if not self.config.get("agents", {}).get("my_agent", {}).get("enabled", True):
            return False
        return event_type == "my_event_type"
    
    def execute(self, context: AgentContext) -> AgentResult:
        # Load prompt template
        prompt = self.load_prompt("my_prompt")
        formatted = prompt.format(data=context.event_data.get("field"))
        
        # Call LLM with rate limiting
        response = self.call_llm(formatted)
        
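        # Post comment to issue/PR
        # Assumption: the event payload carries the issue number under
        # event_data["issue"]["number"]; adjust for your event type.
        issue_index = context.event_data.get("issue", {}).get("number")
        self.upsert_comment(
            context.owner,
            context.repo,
            issue_index,
            response.content
        )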
        
        return AgentResult(
            success=True,
            message="Agent completed",
            actions_taken=["Posted comment"]
        )

Calling LLM with Tools

messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Search for authentication code"}
]

tools = [{
    "type": "function",
    "function": {
        "name": "search_code",
        "description": "Search codebase",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"]
        }
    }
}]

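# Inside an agent method, where self.llm is assumed to be the LLMClient instance
response = self.llm.call_with_tools(messages, tools=tools)

if response.tool_calls:
    for tc in response.tool_calls:
        # execute_tool is a placeholder for your own tool dispatch function
        result = execute_tool(tc.name, tc.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": tc.id,
            "content": result
        })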

Adding Security Rules

Edit tools/ai-review/security/security_scanner.py or create security/security_rules.yml:

rules:
  - id: SEC018
    name: Custom Rule Name
    pattern: 'regex_pattern_here'
    severity: HIGH  # HIGH, MEDIUM, LOW
    category: A03:2021 Injection
    cwe: CWE-XXX
    description: What this detects
    recommendation: How to fix it
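
Before adding a rule, it can help to check that the entry is well-formed and that its regex compiles. The sketch below works on a plain dict with the same fields as the YAML above. The compile_rule helper, the choice of required keys, and the example pattern are assumptions and are not taken from security_scanner.py.

```python
import re

REQUIRED_KEYS = {"id", "name", "pattern", "severity"}  # assumed minimum set
SEVERITIES = {"HIGH", "MEDIUM", "LOW"}


def compile_rule(rule: dict) -> dict:
    """Validate one rule entry and attach a compiled regex under 'regex'."""
    missing = REQUIRED_KEYS - rule.keys()
    if missing:
        raise ValueError(f"rule is missing keys: {sorted(missing)}")
    if rule["severity"] not in SEVERITIES:
        raise ValueError(f"invalid severity: {rule['severity']}")
    # re.compile raises re.error on an invalid pattern, surfacing typos early.
    return {**rule, "regex": re.compile(rule["pattern"])}


rule = compile_rule({
    "id": "SEC018",
    "name": "Use of eval",           # illustrative rule, not a built-in one
    "pattern": r"\beval\s*\(",
    "severity": "HIGH",
})
print(bool(rule["regex"].search("value = eval(user_input)")))
```

Testing the pattern against a few lines that should not match is a cheap way to keep the false positive rate down.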

Testing

The test suite (tests/test_ai_review.py) covers:

Prompt Formatting - Ensures prompts don't have unescaped {} that break .format()
Module Imports - Verifies all modules can be imported
Security Scanner - Tests pattern detection and false positive rate
Agent Context - Tests dataclass creation and validation
Metrics - Tests enterprise metrics collection

Run specific test classes:

pytest tests/test_ai_review.py::TestPromptFormatting -v
pytest tests/test_ai_review.py::TestSecurityScanner -v

Common Development Tasks

Adding a New Command to @codebot

Add command to config.yml under interaction.commands
Add handler method in IssueAgent (e.g., _command_yourcommand())
Update _handle_command() to route the command to your handler
Update README.md with command documentation
Add tests in tests/test_ai_review.py

Example commands:

@codebot triage - Full issue triage with labeling
@codebot explain - Explain the issue
@codebot suggest - Suggest solutions
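
For orientation, command routing might look something like the sketch below. It is not IssueAgent's _handle_command(): parse_command is a hypothetical helper, and treating any unrecognized mention as chat is an assumption based on the workflow descriptions.

```python
MENTION_PREFIX = "@codebot"  # interaction.mention_prefix
COMMANDS = ["explain", "suggest", "security", "summarize", "triage"]  # interaction.commands


def parse_command(comment_body: str, prefix: str = MENTION_PREFIX, commands=COMMANDS):
    """Return (command, rest) for a known command, ("chat", text) for any
    other mention, or None when the bot is not mentioned at all."""
    idx = comment_body.find(prefix)
    if idx == -1:
        return None
    after = comment_body[idx + len(prefix):].strip()
    parts = after.split(None, 1)  # split on any whitespace, including newlines
    first = parts[0].lower() if parts else ""
    rest = parts[1].strip() if len(parts) > 1 else ""
    if first in commands:
        return first, rest
    return "chat", after


print(parse_command("@codebot triage"))
print(parse_command("Hey @codebot explain this please"))
print(parse_command("@codebot how does auth work?"))
```

A new command then only needs an entry in the commands list plus a handler that the returned name maps to.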

Changing the Bot Name

Edit config.yml: interaction.mention_prefix: "@newname"
Update all Gitea workflow files in .gitea/workflows/ (search for contains(github.event.comment.body, '@codebot'))
Update README.md and documentation

Supporting a New LLM Provider

Create provider class in clients/llm_client.py inheriting from BaseLLMProvider
Implement call() and optionally call_with_tools()
Register in LLMClient.PROVIDERS dict
Add model config to config.yml
Document in README.md
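
The registration pattern in these steps can be sketched as follows. The class names BaseLLMProvider and LLMClient and the PROVIDERS dict come from the steps above, but their bodies and method signatures here are simplified stand-ins, and EchoProvider is a hypothetical provider used only for illustration.

```python
from abc import ABC, abstractmethod


class BaseLLMProvider(ABC):
    """Stand-in for the base class named above; the real signatures may differ."""

    @abstractmethod
    def call(self, prompt: str, **kwargs) -> str:
        ...

    def call_with_tools(self, messages: list, tools: list):
        # Optional capability: providers without tool support keep this default.
        raise NotImplementedError(f"{type(self).__name__} does not support tool calling")


class EchoProvider(BaseLLMProvider):
    """Hypothetical provider that returns the prompt, tagged with the model name."""

    def __init__(self, model: str = "echo-1"):
        self.model = model

    def call(self, prompt: str, **kwargs) -> str:
        return f"[{self.model}] {prompt}"


class LLMClient:
    """Simplified client that picks a provider class from a registry."""

    PROVIDERS = {"echo": EchoProvider}  # step 3: register the new provider here

    def __init__(self, config: dict):
        name = config["provider"]
        if name not in self.PROVIDERS:
            raise ValueError(f"Unknown provider: {name}")
        model = config.get("model", {}).get(name)  # step 4: per-provider model config
        provider_cls = self.PROVIDERS[name]
        self.provider = provider_cls(model) if model else provider_cls()

    def call(self, prompt: str, **kwargs) -> str:
        return self.provider.call(prompt, **kwargs)


client = LLMClient({"provider": "echo", "model": {"echo": "echo-2"}})
print(client.call("hello"))
```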

Repository Labels

The system expects these labels to exist in repositories for auto-labeling:

priority: high, priority: medium, priority: low
type: bug, type: feature, type: question, type: documentation
ai-approved, ai-changes-required, ai-reviewed

Labels are mapped in config.yml under the labels section.
