openrabbit/README.md
Latte 7cc5d26948
Add AI_PROVIDER and AI_MODEL support
2026-03-01 19:56:14 +01:00

# OpenRabbit
Enterprise-grade AI code review system for **Gitea** and **GitHub** with automated PR review, issue triage, interactive chat, and codebase analysis.
---
## Features
| Feature | Description |
|---------|-------------|
| **PR Review** | Inline comments, security scanning, severity-based CI failure |
| **PR Summaries** | Auto-generate comprehensive PR summaries with change analysis and impact assessment |
| **Issue Triage** | On-demand classification, labeling, priority assignment via `@codebot triage` |
| **Chat** | Interactive AI chat with codebase search and web search tools |
| **@codebot Commands** | `@codebot summarize`, `changelog`, `explain-diff`, `explain`, `suggest`, `triage`, `review-again` in comments |
| **Codebase Analysis** | Health scores, tech debt tracking, weekly reports |
| **Security Scanner** | 17 OWASP-aligned rules + SAST integration (Bandit, Semgrep, Trivy) |
| **Dependency Scanning** | Vulnerability detection for Python and JavaScript dependencies |
| **Test Coverage** | AI-powered test suggestions for untested code |
| **Architecture Compliance** | Layer separation enforcement, circular dependency detection |
| **Notifications** | Slack/Discord alerts for security findings and reviews |
| **Compliance** | Audit trail, CODEOWNERS enforcement, regulatory support |
| **Multi-Provider LLM** | OpenAI, Anthropic Claude, Azure OpenAI, Google Gemini, Ollama |
| **Enterprise Ready** | Audit logging, metrics, Prometheus export |
| **Gitea Native** | Built for Gitea workflows and API (also works with GitHub) |
---
## 📦 Installation
**Quick Setup (5 minutes):**
```bash
# Clone OpenRabbit
git clone https://github.com/YourOrg/openrabbit.git
cd openrabbit
# Run interactive setup wizard
./setup.sh
```
The wizard will generate workflow files, create configuration, and guide you through the remaining steps.
**📖 See [INSTALL.md](INSTALL.md) for:**
- Detailed installation instructions
- Manual setup guide
- Platform-specific differences (Gitea vs GitHub)
- Troubleshooting common issues
---
## Quick Start
### 1. Set Repository/Organization Secrets
```
AI_PROVIDER - LLM provider: openai | openrouter | ollama | anthropic | azure | gemini
AI_MODEL - Model to use for the active provider (e.g. gpt-4.1-mini, claude-3-5-sonnet-20241022)
OPENAI_API_KEY - OpenAI API key (or use OpenRouter/Ollama)
SEARXNG_URL - (Optional) SearXNG instance URL for web search
```
**For Gitea:**
```
AI_REVIEW_TOKEN - Bot token with repo + issue permissions
```
**For GitHub:**
The built-in `GITHUB_TOKEN` is used automatically.
### 2. Add Workflows to Repository
Gitea workflows are located in `.gitea/workflows/`; the GitHub equivalents are in `.github/workflows/`.
#### Gitea PR Review Workflow
```yaml
# .gitea/workflows/enterprise-ai-review.yml
name: AI PR Review
on: [pull_request]
jobs:
  ai-review:
    runs-on: ubuntu-latest
    steps:
      # Check out the repository under review with full history
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      # Check out OpenRabbit itself into .ai-review
      - uses: actions/checkout@v4
        with:
          repository: YourOrg/OpenRabbit
          path: .ai-review
          token: ${{ secrets.AI_REVIEW_TOKEN }}
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install requests pyyaml
      - name: Run AI Review
        env:
          AI_REVIEW_TOKEN: ${{ secrets.AI_REVIEW_TOKEN }}
          AI_REVIEW_REPO: ${{ gitea.repository }}
          AI_REVIEW_API_URL: https://your-gitea.example.com/api/v1
          AI_PROVIDER: ${{ secrets.AI_PROVIDER }}
          AI_MODEL: ${{ secrets.AI_MODEL }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
          cd .ai-review/tools/ai-review
          python main.py pr ${{ gitea.repository }} ${{ gitea.event.pull_request.number }}
```
See `.gitea/workflows/` for all workflow examples.
### 3. Create Labels
**Option A: Automatic Setup (Recommended)**
Create an issue and comment:
```
@codebot setup-labels
```
The bot will automatically:
- Detect your existing label schema (e.g., `Kind/Bug`, `Priority - High`)
- Map existing labels to OpenRabbit's auto-labeling system
- Create only the missing labels you need
- Follow your repository's naming convention
**Option B: Manual Setup**
Create these labels in your repository for auto-labeling:
- `priority: critical`, `priority: high`, `priority: medium`, `priority: low`
- `type: bug`, `type: feature`, `type: question`, `type: documentation`
- `ai-approved`, `ai-changes-required`, `ai-reviewed`
---
## Project Structure
```
tools/ai-review/
├── agents/                         # Agent implementations
│   ├── base_agent.py               # Abstract base agent
│   ├── issue_agent.py              # Issue triage & @codebot commands
│   ├── pr_agent.py                 # PR review with security scan
│   ├── codebase_agent.py           # Codebase health analysis
│   ├── chat_agent.py               # Interactive chat with tool calling
│   ├── dependency_agent.py         # Dependency vulnerability scanning
│   ├── test_coverage_agent.py      # Test coverage analysis
│   └── architecture_agent.py       # Architecture compliance checking
├── clients/                        # API clients
│   ├── gitea_client.py             # Gitea REST API wrapper
│   ├── llm_client.py               # Multi-provider LLM client with tool support
│   └── providers/                  # Additional LLM providers
│       ├── anthropic_provider.py   # Direct Anthropic Claude API
│       ├── azure_provider.py       # Azure OpenAI Service
│       └── gemini_provider.py      # Google Gemini API
├── security/                       # Security scanning
│   ├── security_scanner.py         # 17 OWASP-aligned rules
│   └── sast_scanner.py             # Bandit, Semgrep, Trivy integration
├── notifications/                  # Alerting system
│   └── notifier.py                 # Slack, Discord, webhook notifications
├── compliance/                     # Compliance & audit
│   ├── audit_trail.py              # Audit logging with integrity verification
│   └── codeowners.py               # CODEOWNERS enforcement
├── utils/                          # Utility functions
│   ├── ignore_patterns.py          # .ai-reviewignore support
│   └── webhook_sanitizer.py        # Input validation
├── enterprise/                     # Enterprise features
│   ├── audit_logger.py             # JSONL audit logging
│   └── metrics.py                  # Prometheus-compatible metrics
├── prompts/                        # AI prompt templates
├── main.py                         # CLI entry point
└── config.yml                      # Configuration

.github/workflows/                  # GitHub Actions workflows
├── ai-review.yml                   # PR review workflow
├── ai-issue-triage.yml             # Issue triage workflow
├── ai-codebase-review.yml          # Codebase analysis
├── ai-comment-reply.yml            # @codebot command responses
└── ai-chat.yml                     # Interactive AI chat

.gitea/workflows/                   # Gitea Actions workflows
├── enterprise-ai-review.yml
├── ai-issue-triage.yml
├── ai-codebase-review.yml
├── ai-comment-reply.yml
└── ai-chat.yml
```
---
## CLI Commands
```bash
# Review a pull request
python main.py pr owner/repo 123
# Triage an issue
python main.py issue owner/repo 456
# Respond to @codebot command
python main.py comment owner/repo 456 "@codebot explain"
# Analyze codebase
python main.py codebase owner/repo
# Chat with the AI bot
python main.py chat owner/repo "How does authentication work?"
python main.py chat owner/repo "Find all API endpoints" --issue 789
```
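The subcommands above could be parsed along the following lines. This is only a sketch of an `argparse` layout that accepts the same invocations; the function name `build_parser` and the argument names are assumptions, and the actual `main.py` may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser accepting the invocations shown above (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    # pr / issue: repository plus a PR or issue number
    for name in ("pr", "issue"):
        cmd = sub.add_parser(name)
        cmd.add_argument("repo")
        cmd.add_argument("number", type=int)

    # comment: repository, issue number, and the comment text
    comment = sub.add_parser("comment")
    comment.add_argument("repo")
    comment.add_argument("number", type=int)
    comment.add_argument("body")

    # codebase: repository only
    sub.add_parser("codebase").add_argument("repo")

    # chat: repository, message, and an optional issue to reply in
    chat = sub.add_parser("chat")
    chat.add_argument("repo")
    chat.add_argument("message")
    chat.add_argument("--issue", type=int, default=None)
    return parser


args = build_parser().parse_args(
    ["chat", "owner/repo", "Find all API endpoints", "--issue", "789"]
)
print(args.command, args.repo, args.issue)
```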
---
## @codebot Commands
### Issue Commands
In any issue comment:
| Command | Description |
|---------|-------------|
| `@codebot help` | **Help:** Show all available commands with examples |
| `@codebot setup-labels` | **Setup:** Automatically create/map repository labels for auto-labeling |
| `@codebot triage` | Full issue triage with auto-labeling and analysis |
| `@codebot summarize` | Summarize the issue in 2-3 sentences |
| `@codebot explain` | Explain what the issue is about |
| `@codebot suggest` | Suggest solutions or next steps |
| `@codebot check-deps` | Scan dependencies for security vulnerabilities |
| `@codebot suggest-tests` | Suggest test cases for changed code |
| `@codebot refactor-suggest` | Suggest refactoring opportunities |
| `@codebot architecture` | Check architecture compliance (alias: `arch-check`) |
| `@codebot` (any question) | Chat with AI using codebase/web search tools |
### Pull Request Commands
In any PR comment:
| Command | Description |
|---------|-------------|
| `@codebot summarize` | Generate a comprehensive PR summary with changes, files affected, and impact |
| `@codebot changelog` | Generate Keep a Changelog format entries ready for CHANGELOG.md |
| `@codebot explain-diff` | Explain code changes in plain language for non-technical stakeholders |
| `@codebot review-again` | Re-run AI code review on current PR state without new commits |
#### PR Summary (`@codebot summarize`)
**Features:**
- 📋 Generates structured summary of PR changes
- ✨ Categorizes change type (Feature/Bugfix/Refactor/Documentation/Testing)
- 📝 Lists what was added, modified, and removed
- 📁 Shows all affected files with descriptions
- 🎯 Assesses impact scope (small/medium/large)
- 🤖 Runs automatically on PRs with empty descriptions
**When to use:**
- When a PR lacks a description
- To quickly understand what changed
- For standardized PR documentation
- Before reviewing complex PRs
**Example output:**
```markdown
## 📋 Pull Request Summary
This PR implements automatic PR summary generation...
**Type:** ✨ Feature
## Changes
✅ Added:
- PR summary generation in PRAgent
- Auto-summary for empty PR descriptions
📝 Modified:
- Updated config.yml with new settings
## Files Affected
- `tools/ai-review/prompts/pr_summary.md` - New prompt template
- 📝 `tools/ai-review/agents/pr_agent.py` - Added summary methods
## Impact
🟡 **Scope:** Medium
Adds new feature without affecting existing functionality
```
#### Changelog Generator (`@codebot changelog`)
**Features:**
- 📋 Generates Keep a Changelog format entries
- 🏷️ Categorizes changes (Added/Changed/Fixed/Removed/Security)
- ⚠️ Detects breaking changes automatically
- 📊 Includes technical details (files, LOC, components)
- 📝 Ready to copy-paste into CHANGELOG.md
**When to use:**
- Preparing release notes
- Maintaining CHANGELOG.md
- Customer-facing announcements
- Version documentation
**Example output:**
```markdown
## 📋 Changelog for PR #123
### ✨ Added
- User authentication system with JWT tokens
- Password reset functionality via email
### 🔄 Changed
- Updated database schema for user table
- Refactored login endpoint for better error handling
### 🐛 Fixed
- Session timeout bug causing premature logouts
- Security vulnerability in password validation
### 🔒 Security
- Fixed XSS vulnerability in user input validation
---
### ⚠️ BREAKING CHANGES
- **Removed legacy API endpoint /api/v1/old - migrate to /api/v2**
---
### 📊 Technical Details
- **Files changed:** 15
- **Lines:** +450 / -120
- **Main components:** auth/, api/users/, database/
```
#### Diff Explainer (`@codebot explain-diff`)
**Features:**
- 📖 Translates technical changes into plain language
- 🎯 Perfect for non-technical stakeholders (PMs, designers)
- 🔍 File-by-file breakdown with "what" and "why"
- 🏗️ Architecture impact analysis
- ⚠️ Breaking change detection
- 📊 Technical summary for reference
**When to use:**
- New team members reviewing complex PRs
- Non-technical reviewers who need to understand changes
- Documenting architectural decisions
- Learning from others' code
**Example output:**
```markdown
## 📖 Code Changes Explained (PR #123)
### 🎯 Overview
This PR adds user authentication using secure tokens that expire after 24 hours, enabling users to log in securely without storing passwords in the application.
### 🔍 What Changed
#### `auth/jwt.py` (new)
**What changed:** Creates secure tokens for logged-in users
**Why it matters:** Enables the app to remember who you are without constantly asking for your password
#### 📝 `api/users.py` (modified)
**What changed:** Added a login page where users can sign in
**Why it matters:** Users can now create accounts and access their personal data
---
### 🏗️ Architecture Impact
Introduces a security layer across the entire application, ensuring only authenticated users can access protected features.
**New dependencies:**
- PyJWT (for creating secure tokens)
- bcrypt (for password encryption)
**Affected components:**
- API (all endpoints now check authentication)
- Database (added user credentials storage)
---
### ⚠️ Breaking Changes
- **All API endpoints now require authentication - existing scripts need to be updated**
---
### 📊 Technical Summary
- **Files changed:** 5
- **Lines:** +200 / -10
- **Components:** auth/, api/
```
#### Review Again (`@codebot review-again`)
**Features:**
- ✅ Shows diff from previous review (resolved/new/changed issues)
- 🏷️ Updates labels based on new severity
- ⚡ No need for empty commits to trigger review
- 🔧 Respects latest `.ai-review.yml` configuration
**When to use:**
- After addressing review feedback in comments
- When AI flagged a false positive and you explained it
- After updating `.ai-review.yml` security rules
- To re-evaluate severity after code clarification
**Example:**
```
The hardcoded string at line 45 is a public API URL, not a secret.
@codebot review-again
```
**New to OpenRabbit?** Just type `@codebot help` in any issue to see all available commands!
### Label Setup Command
The `@codebot setup-labels` command intelligently detects your existing label schema and sets up auto-labeling:
**For repositories with existing labels (e.g., `Kind/Bug`, `Priority - High`):**
- Detects your naming pattern (prefix/slash, prefix-dash, or colon-style)
- Maps your existing labels to OpenRabbit's schema
- Creates only missing labels following your pattern
- Zero duplicate labels created
**For fresh repositories:**
- Creates OpenRabbit's default label set
- Uses `type:`, `priority:`, and status labels
**Example output:**
```
@codebot setup-labels
✅ Found 18 existing labels with pattern: prefix_slash
Detected Categories:
- Kind (7 labels): Bug, Feature, Documentation, Security, Testing
- Priority (4 labels): Critical, High, Medium, Low
Proposed Mapping:
| OpenRabbit Expected | Your Existing Label | Status |
|---------------------|---------------------|--------|
| type: bug | Kind/Bug | ✅ Map |
| priority: high | Priority - High | ✅ Map |
| ai-reviewed | (missing) | ⚠️ Create |
✅ Created Kind/Question (#cc317c)
✅ Created Status - AI Reviewed (#1d76db)
Setup Complete! Auto-labeling will use your existing label schema.
```
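To illustrate the naming-pattern detection described above, here is a minimal sketch that classifies labels with regular expressions and picks the most common style. The name `prefix_slash` comes from the example output; the names `prefix_dash` and `colon`, the regexes, and the function `detect_label_pattern` are assumptions for illustration, not the bot's actual implementation.

```python
import re
from collections import Counter

# One regex per naming style; checked in this order for each label.
PATTERNS = {
    "prefix_slash": re.compile(r"^[^/:\s]+/\S"),  # e.g. Kind/Bug
    "prefix_dash": re.compile(r"^\S+ - \S"),      # e.g. Priority - High
    "colon": re.compile(r"^[^:\s]+: ?\S"),        # e.g. type: bug
}


def detect_label_pattern(labels):
    """Return the most common naming style among labels, or None if none match."""
    counts = Counter()
    for label in labels:
        for name, regex in PATTERNS.items():
            if regex.match(label):
                counts[name] += 1
                break  # count each label once
    return counts.most_common(1)[0][0] if counts else None


existing = ["Kind/Bug", "Kind/Feature", "Priority - High", "ai-reviewed"]
print(detect_label_pattern(existing))
```

Labels that match no style, such as `ai-reviewed`, are simply left out of the vote.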
---
## Interactive Chat
The chat agent is an interactive AI assistant with tool-calling capabilities:
**Tools Available:**
- `search_codebase` - Search repository files and code
- `read_file` - Read specific files
- `search_web` - Search the web via SearXNG
**Example:**
```
@codebot How do I configure rate limiting in this project?
```
The bot will search the codebase, read relevant files, and provide a comprehensive answer.
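The tool-calling loop can be pictured as a registry that maps tool names to functions and dispatches each call requested by the model. The sketch below assumes a call shape of `{"name": ..., "arguments": "<JSON string>"}` and uses an in-memory dictionary (`FAKE_REPO`) in place of a real repository; the function bodies and `dispatch` are hypothetical, and `search_web` is omitted because it needs a SearXNG instance.

```python
import json

# In-memory stand-in for a repository; a real agent would query the Gitea API.
FAKE_REPO = {
    "config/rate_limit.yml": "requests_per_minute: 60\n",
    "README.md": "See config/rate_limit.yml for rate limiting.\n",
}


def search_codebase(query: str) -> list:
    """Return paths whose content mentions the query (case-insensitive)."""
    needle = query.lower()
    return sorted(p for p, text in FAKE_REPO.items() if needle in text.lower())


def read_file(path: str) -> str:
    """Return file content, or a short message if the path is unknown."""
    return FAKE_REPO.get(path, f"File not found: {path}")


TOOLS = {"search_codebase": search_codebase, "read_file": read_file}


def dispatch(tool_call: dict) -> str:
    """Run one tool call and return a JSON string to feed back to the model."""
    func = TOOLS.get(tool_call["name"])
    if func is None:
        return json.dumps({"error": f"unknown tool {tool_call['name']}"})
    kwargs = json.loads(tool_call["arguments"])
    return json.dumps({"result": func(**kwargs)})


print(dispatch({"name": "search_codebase", "arguments": '{"query": "rate limiting"}'}))
print(dispatch({"name": "read_file", "arguments": '{"path": "config/rate_limit.yml"}'}))
```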
---
## Configuration
Edit `tools/ai-review/config.yml`:
```yaml
# Set via AI_PROVIDER secret, or hardcode here as fallback
provider: openai # openai | openrouter | ollama | anthropic | azure | gemini
# Set via AI_MODEL secret, or hardcode per provider here
model:
  openai: gpt-4.1-mini
  openrouter: anthropic/claude-3.5-sonnet
  ollama: codellama:13b
agents:
  issue:
    enabled: true
    auto_label: true
  pr:
    enabled: true
    inline_comments: true
    security_scan: true
  codebase:
    enabled: true
  chat:
    enabled: true
    searxng_url: "" # Or set SEARXNG_URL env var
interaction:
  respond_to_mentions: true
  mention_prefix: "@codebot" # Customize your bot name here!
  commands:
    - summarize
    - explain
    - suggest
```
---
## Customizing the Bot Name
The default bot name is `@codebot`. To change it:
**Step 1:** Edit `tools/ai-review/config.yml`:
```yaml
interaction:
  mention_prefix: "@yourname" # e.g., "@assistant", "@reviewer", etc.
```
**Step 2:** Update all workflow files in `.gitea/workflows/`:
- `ai-comment-reply.yml`
- `ai-chat.yml`
- `ai-issue-triage.yml`
Look for and update:
```yaml
if: contains(github.event.comment.body, '@codebot')
```
Change `@codebot` to your new bot name.
**Step 3 (CRITICAL):** Update bot username to prevent infinite loops:
In all three workflow files, find:
```yaml
github.event.comment.user.login != 'Bartender'
```
Replace `'Bartender'` with your bot's Gitea username. This prevents the bot from triggering itself when it posts comments containing `@codebot`, which would cause infinite loops and 10+ duplicate workflow runs.
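The two workflow conditions amount to the following check, shown here as a small Python sketch. The function name `should_respond` and its parameters are hypothetical, and the sketch uses exact, case-sensitive matching for simplicity.

```python
def should_respond(comment_body: str, author: str,
                   mention_prefix: str = "@codebot",
                   bot_username: str = "Bartender") -> bool:
    """Decide whether a comment should trigger the bot."""
    # Guard first: never react to the bot's own comments, even if they
    # contain the mention prefix. This is what breaks the loop.
    if author == bot_username:
        return False
    return mention_prefix in comment_body


print(should_respond("@codebot explain", "alice"))                     # a user asks
print(should_respond("As requested by @codebot help", "Bartender"))    # bot's own reply
```

Without the author check, the second call would return `True` and every bot reply that quotes the mention would start another run.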
---
## Security Scanning
17 rules aligned with the OWASP Top 10, including:
| Category | Examples |
|----------|----------|
| Injection | SQL injection, command injection, XSS |
| Access Control | Hardcoded secrets, private keys |
| Crypto Failures | Weak hashing (MD5/SHA1), insecure random |
| Misconfiguration | Debug mode, CORS wildcard, SSL bypass |
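
As a rough idea of how pattern-based rules like these can work, the sketch below scans source text line by line with three regular expressions. The rule IDs, severities, and regexes are illustrative assumptions and are not the 17 rules that ship in `security_scanner.py`.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    rule_id: str
    severity: str
    pattern: re.Pattern
    message: str


# Illustrative rules only; real rule sets need far more care with false positives.
RULES = [
    Rule("weak-hash", "medium",
         re.compile(r"hashlib\.(md5|sha1)\("), "Weak hash algorithm"),
    Rule("ssl-bypass", "high",
         re.compile(r"verify\s*=\s*False"), "TLS verification disabled"),
    Rule("hardcoded-secret", "high",
         re.compile(r"(?i)(api_key|password|secret)\s*=\s*['\"][^'\"]{8,}['\"]"),
         "Possible hardcoded secret"),
]


def scan(source: str):
    """Return (line number, rule id, severity) for every rule match."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for rule in RULES:
            if rule.pattern.search(line):
                findings.append((lineno, rule.rule_id, rule.severity))
    return findings


sample = (
    "import hashlib\n"
    "digest = hashlib.md5(data)\n"
    "requests.get(url, verify=False)\n"
    'api_key = "sk-1234567890abcdef"\n'
)
for finding in scan(sample):
    print(finding)
```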
---
## Documentation
| Document | Description |
|----------|-------------|
| [Getting Started](docs/getting-started.md) | Quick setup guide |
| [Configuration](docs/configuration.md) | All options explained |
| [Agents](docs/agents.md) | Agent documentation |
| [Security](docs/security.md) | Security rules reference |
| [Workflows](docs/workflows.md) | GitHub & Gitea workflow examples |
| [API Reference](docs/api-reference.md) | Client and agent APIs |
| [Enterprise](docs/enterprise.md) | Audit logging, metrics |
| [Troubleshooting](docs/troubleshooting.md) | Common issues |
---
## LLM Providers
| Provider | Model | Use Case |
|----------|-------|----------|
| OpenAI | gpt-4.1-mini | Fast, reliable, default |
| Anthropic | claude-3.5-sonnet | Direct Claude API access |
| Azure OpenAI | gpt-4 (deployment) | Enterprise Azure deployments |
| Google Gemini | gemini-1.5-pro | GCP customers, Vertex AI |
| OpenRouter | claude-3.5-sonnet | Multi-provider access |
| Ollama | codellama:13b | Self-hosted, private |
### Provider Configuration
The provider and model can be set via Gitea secrets so you don't need to edit `config.yml`:
| Secret | Description | Example |
|--------|-------------|---------|
| `AI_PROVIDER` | Which LLM provider to use | `openrouter` |
| `AI_MODEL` | Model for the active provider | `google/gemini-2.0-flash` |
The `config.yml` values are used as fallback when secrets are not set.
```yaml
# In config.yml (fallback defaults)
provider: openai # openai | anthropic | azure | gemini | openrouter | ollama
# Azure OpenAI
azure:
  endpoint: "" # Set via AZURE_OPENAI_ENDPOINT env var
  deployment: "gpt-4"
  api_version: "2024-02-15-preview"
# Google Gemini (Vertex AI)
gemini:
  project: "" # Set via GOOGLE_CLOUD_PROJECT env var
  region: "us-central1"
```
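The fallback order described above (environment first, then `config.yml`) can be sketched as follows. The function `resolve_llm_settings` and the shape of the `config` dictionary are assumptions based on the configuration example in this README, not the project's actual loader.

```python
import os

# Mirrors the provider/model section of config.yml shown in this README.
DEFAULT_CONFIG = {
    "provider": "openai",
    "model": {
        "openai": "gpt-4.1-mini",
        "openrouter": "anthropic/claude-3.5-sonnet",
        "ollama": "codellama:13b",
    },
}


def resolve_llm_settings(config, env=None):
    """Return (provider, model), preferring AI_PROVIDER / AI_MODEL from env."""
    env = os.environ if env is None else env
    # `or` treats an empty string like a missing value, so a variable that
    # is defined but blank still falls back to config.yml.
    provider = env.get("AI_PROVIDER") or config["provider"]
    model = env.get("AI_MODEL") or config.get("model", {}).get(provider)
    return provider, model


print(resolve_llm_settings(DEFAULT_CONFIG, env={}))
print(resolve_llm_settings(DEFAULT_CONFIG, env={"AI_PROVIDER": "openrouter"}))
print(resolve_llm_settings(
    DEFAULT_CONFIG,
    env={"AI_PROVIDER": "openrouter", "AI_MODEL": "google/gemini-2.0-flash"},
))
```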
### Environment Variables
| Variable | Provider | Description |
|----------|----------|-------------|
| `AI_PROVIDER` | All | Override the active provider (e.g. `openrouter`) |
| `AI_MODEL` | All | Override the model for the active provider |
| `OPENAI_API_KEY` | OpenAI | API key |
| `ANTHROPIC_API_KEY` | Anthropic | API key |
| `AZURE_OPENAI_ENDPOINT` | Azure | Service endpoint URL |
| `AZURE_OPENAI_API_KEY` | Azure | API key |
| `AZURE_OPENAI_DEPLOYMENT` | Azure | Deployment name |
| `GOOGLE_API_KEY` | Gemini | API key (public API) |
| `GOOGLE_CLOUD_PROJECT` | Vertex AI | GCP project ID |
| `OPENROUTER_API_KEY` | OpenRouter | API key |
| `OLLAMA_HOST` | Ollama | Server URL (default: localhost:11434) |
---
## Enterprise Features
- **Audit Logging**: JSONL logs with integrity checksums and daily rotation
- **Compliance**: HIPAA, SOC2, PCI-DSS, GDPR support with configurable rules
- **CODEOWNERS Enforcement**: Validate approvals against CODEOWNERS file
- **Notifications**: Slack/Discord webhooks for critical findings
- **SAST Integration**: Bandit, Semgrep, Trivy for advanced security scanning
- **Metrics**: Prometheus-compatible export
- **Rate Limiting**: Configurable request limits and timeouts
- **Custom Security Rules**: Define your own patterns via YAML
- **Tool Calling**: LLM function calling for interactive chat
- **Ignore Patterns**: `.ai-reviewignore` for excluding files from review
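
As an illustration of the ignore-pattern feature, the sketch below filters changed files against a pattern list. It assumes `.ai-reviewignore` holds one shell-style glob per line with `#` comments, which this README does not specify; the helper names are hypothetical, and `fnmatchcase` is a simplification of gitignore-style matching.

```python
from fnmatch import fnmatchcase


def parse_ignore_file(text: str) -> list:
    """Return the non-empty, non-comment lines of an ignore file."""
    stripped = (line.strip() for line in text.splitlines())
    return [line for line in stripped if line and not line.startswith("#")]


def is_ignored(path: str, patterns: list) -> bool:
    """True if the path matches any glob; note that `*` also matches `/` here."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


patterns = parse_ignore_file("# generated assets\n*.min.js\nvendor/*\n\n")
changed = ["static/app.min.js", "vendor/lib/x.py", "src/main.py"]
print([path for path in changed if not is_ignored(path, patterns)])
```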
### Notifications Configuration
```yaml
# In config.yml
notifications:
  enabled: true
  threshold: "warning" # info | warning | error | critical
  slack:
    enabled: true
    webhook_url: "" # Set via SLACK_WEBHOOK_URL env var
    channel: "#code-review"
  discord:
    enabled: true
    webhook_url: "" # Set via DISCORD_WEBHOOK_URL env var
```
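One plausible reading of `threshold` is "notify for this severity and anything more severe". The sketch below encodes that interpretation; the ordering follows the comment in the config above, while the function `should_notify` is hypothetical.

```python
# Least to most severe, matching the order listed in the config comment.
SEVERITY_ORDER = ["info", "warning", "error", "critical"]


def should_notify(severity: str, threshold: str = "warning") -> bool:
    """True if severity is at or above the configured threshold."""
    return SEVERITY_ORDER.index(severity) >= SEVERITY_ORDER.index(threshold)


for level in SEVERITY_ORDER:
    print(level, should_notify(level, threshold="warning"))
```

An unknown severity raises `ValueError` from `list.index`, which surfaces configuration typos early rather than silently dropping alerts.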
### Compliance Configuration
```yaml
compliance:
  enabled: true
  audit:
    enabled: true
    log_file: "audit.log"
    retention_days: 90
  codeowners:
    enabled: true
    require_approval: true
```
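To show what "JSONL logs with integrity checksums" can mean in practice, here is a sketch of a hash-chained log in which each line's checksum covers its event and the previous line's checksum. The record layout (`event`, `checksum`) and the chaining scheme are assumptions for illustration; the format written by `audit_trail.py` may differ.

```python
import hashlib
import json


def _checksum(event: dict, prev: str) -> str:
    """SHA-256 over the canonical event JSON plus the previous checksum."""
    payload = json.dumps(event, sort_keys=True) + prev
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(lines: list, event: dict) -> None:
    """Append one JSONL record chained to the last record in lines."""
    prev = json.loads(lines[-1])["checksum"] if lines else ""
    record = {"event": event, "checksum": _checksum(event, prev)}
    lines.append(json.dumps(record, sort_keys=True))


def verify(lines: list) -> bool:
    """Recompute every checksum in order; any edit breaks the chain."""
    prev = ""
    for line in lines:
        record = json.loads(line)
        if record["checksum"] != _checksum(record["event"], prev):
            return False
        prev = record["checksum"]
    return True


log = []
append_entry(log, {"action": "pr_review", "repo": "owner/repo", "pr": 123})
append_entry(log, {"action": "label_added", "repo": "owner/repo", "label": "ai-reviewed"})
print(verify(log))
```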
---
## License
MIT