remove documentation that is no longer needed

2026-01-16 11:15:51 +00:00
parent e8d28225e0
commit b24ae0dcda
8 changed files with 0 additions and 1517 deletions

docs/CLAUDE.md (791 lines)

@@ -0,0 +1,791 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Overview
OpenRabbit is an enterprise-grade AI code review system for Gitea (and GitHub). It provides automated PR review, issue triage, interactive chat, and codebase analysis through a collection of specialized AI agents.
## Commands
### Development
```bash
# Run tests
pytest tests/ -v
# Run specific test file
pytest tests/test_ai_review.py -v
# Install dependencies
pip install -r tools/ai-review/requirements.txt
# Run a PR review locally
cd tools/ai-review
python main.py pr owner/repo 123
# Run issue triage
python main.py issue owner/repo 456
# Test chat functionality
python main.py chat owner/repo "How does authentication work?"
# Run with custom config
python main.py pr owner/repo 123 --config /path/to/config.yml
```
### Testing Workflows
```bash
# Validate workflow YAML syntax
python -c "import yaml; yaml.safe_load(open('.github/workflows/ai-review.yml'))"
# Test security scanner
python -c "from security.security_scanner import SecurityScanner; s = SecurityScanner(); print(list(s.scan_content('password = \"secret123\"', 'test.py')))"
# Test webhook sanitization
cd tools/ai-review
python -c "from utils.webhook_sanitizer import sanitize_webhook_data; print(sanitize_webhook_data({'user': {'email': 'test@example.com'}}))"
# Test safe dispatch
python utils/safe_dispatch.py issue_comment owner/repo '{"action": "created", "issue": {"number": 1}, "comment": {"body": "test"}}'
```
## Architecture
### Agent System
The codebase uses an **agent-based architecture** where specialized agents handle different types of events:
1. **BaseAgent** (`agents/base_agent.py`) - Abstract base class providing:
- Gitea API client integration
- LLM client integration with rate limiting
- Common comment management (upsert, find AI comments)
- Prompt loading from `prompts/` directory
- Standard execution flow with error handling
2. **Specialized Agents** - Each agent implements:
- `can_handle(event_type, event_data)` - Determines if agent should process the event
- `execute(context)` - Main execution logic
- Returns `AgentResult` with success status, message, data, and actions taken
**Core Agents:**
- **PRAgent** - Reviews pull requests with inline comments and security scanning
- **IssueAgent** - Triages issues and responds to @codebot commands
- **CodebaseAgent** - Analyzes entire codebase health and tech debt
- **ChatAgent** - Interactive assistant with tool calling (search_codebase, read_file, search_web)
**Specialized Agents:**
- **DependencyAgent** - Scans dependencies for security vulnerabilities (Python, JavaScript)
- **TestCoverageAgent** - Analyzes code for test coverage gaps and suggests test cases
- **ArchitectureAgent** - Enforces layer separation and detects architecture violations
3. **Dispatcher** (`dispatcher.py`) - Routes events to appropriate agents:
- Registers agents at startup
- Determines which agents can handle each event
- Executes agents (supports concurrent execution)
- Returns aggregated results
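The routing flow above can be sketched as follows. This is a minimal illustration, not the actual `dispatcher.py` source; the `Dispatcher` internals and the `EchoAgent` stand-in are hypothetical, though the `can_handle`/`execute` interface matches the agent contract described above:

```python
class Dispatcher:
    """Minimal sketch: route an event to every registered agent that can handle it."""

    def __init__(self):
        self.agents = []

    def register(self, agent):
        # Agents are registered once at startup
        self.agents.append(agent)

    def dispatch(self, event_type, event_data):
        # Ask each agent whether it handles this event, execute the ones that do,
        # and return the aggregated results
        results = []
        for agent in self.agents:
            if agent.can_handle(event_type, event_data):
                results.append(agent.execute(event_data))
        return results


class EchoAgent:
    """Hypothetical stand-in agent used only for this sketch."""

    def can_handle(self, event_type, event_data):
        return event_type == "issue_comment"

    def execute(self, context):
        return {"success": True, "agent": "echo"}


d = Dispatcher()
d.register(EchoAgent())
print(d.dispatch("issue_comment", {}))  # one matching agent
print(d.dispatch("push", {}))           # no agent matches
```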
### Multi-Provider LLM Client
The `LLMClient` (`clients/llm_client.py`) provides a unified interface for multiple LLM providers:
**Core Providers (in llm_client.py):**
- **OpenAI** - Primary provider (gpt-4.1-mini default)
- **OpenRouter** - Multi-provider access (claude-3.5-sonnet)
- **Ollama** - Self-hosted models (codellama:13b)
**Additional Providers (in clients/providers/):**
- **AnthropicProvider** - Direct Anthropic Claude API (claude-3.5-sonnet)
- **AzureOpenAIProvider** - Azure OpenAI Service with API key auth
- **AzureOpenAIWithAADProvider** - Azure OpenAI with Azure AD authentication
- **GeminiProvider** - Google Gemini API (public)
- **VertexAIGeminiProvider** - Google Vertex AI Gemini (enterprise GCP)
Key features:
- Tool/function calling support via `call_with_tools(messages, tools)`
- JSON response parsing with fallback extraction
- Provider-specific configuration via `config.yml`
- Configurable timeouts per provider
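The "JSON response parsing with fallback extraction" feature can be illustrated with a sketch. The helper name `extract_json` is hypothetical; the real client's fallback logic may differ, but the idea is the same: try strict parsing first, then salvage a JSON object from a chatty LLM reply:

```python
import json
import re


def extract_json(text):
    """Sketch: strict parse first; on failure, pull the first {...} span out of
    a response that wraps JSON in prose or markdown fences."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", text, re.DOTALL)
        if match:
            return json.loads(match.group(0))
        raise


print(extract_json('Here you go:\n```json\n{"severity": "HIGH"}\n```'))
```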
### Platform Abstraction
The `GiteaClient` (`clients/gitea_client.py`) provides a unified REST API client for **Gitea** (also compatible with GitHub API):
- Issue operations (create, update, list, get, comments, labels)
- PR operations (get, diff, files, reviews)
- Repository operations (get repo, file contents, branches)
Environment variables:
- `AI_REVIEW_API_URL` - API base URL (e.g., `https://api.github.com` or `https://gitea.example.com/api/v1`)
- `AI_REVIEW_TOKEN` - Authentication token
### Security Scanner
The `SecurityScanner` (`security/security_scanner.py`) uses **pattern-based detection** with 17 built-in rules covering:
- OWASP Top 10 categories (A01-A10)
- Common vulnerabilities (SQL injection, XSS, hardcoded secrets, weak crypto)
- Returns `SecurityFinding` objects with severity (HIGH/MEDIUM/LOW), CWE references, and recommendations
Can scan:
- File content via `scan_content(content, filename)`
- Git diffs via `scan_diff(diff)` - only scans added lines
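The "only scans added lines" behavior boils down to filtering for `+` lines (excluding the `+++` file header) before matching rules. A minimal sketch, using one hardcoded-password pattern as a stand-in for the scanner's rule set:

```python
import re

# Stand-in for one of the scanner's 17 rules (hardcoded password)
HARDCODED_PASSWORD = re.compile(r'password\s*=\s*["\'][^"\']+["\']')


def scan_diff_added_lines(diff):
    """Sketch: scan only the lines a diff adds ('+' prefix, excluding '+++' headers)."""
    findings = []
    for line in diff.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            code = line[1:]
            if HARDCODED_PASSWORD.search(code):
                findings.append(code.strip())
    return findings


diff = """--- a/app.py
+++ b/app.py
@@ -1,2 +1,3 @@
 import os
+password = "secret123"
-old_line = 1
"""
print(scan_diff_added_lines(diff))  # only the added line is flagged
```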
### Chat Agent Tool Calling
The `ChatAgent` implements an **iterative tool calling loop**:
1. Send user message + system prompt to LLM with available tools
2. If LLM returns tool calls, execute each tool and append results to conversation
3. Repeat until LLM returns a final response (max 5 iterations)
Available tools:
- `search_codebase` - Searches repository files and code patterns
- `read_file` - Reads specific file contents (truncated at 8KB)
- `search_web` - Queries SearXNG instance (requires `SEARXNG_URL`)
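The iterative loop can be sketched as below. This is a simplified illustration with dict-based messages and a fake LLM; the real `ChatAgent` uses the provider's tool-call message format, but the control flow (call, run tools, append results, repeat up to the iteration cap) is the same:

```python
MAX_ITERATIONS = 5  # mirrors agents.chat.max_iterations


def chat_loop(llm_call, execute_tool, messages):
    """Sketch: call the LLM, execute any requested tools, feed results back,
    stop on a final answer or after MAX_ITERATIONS."""
    for _ in range(MAX_ITERATIONS):
        response = llm_call(messages)
        if not response.get("tool_calls"):
            return response["content"]  # final answer
        for tc in response["tool_calls"]:
            messages.append({
                "role": "tool",
                "content": execute_tool(tc["name"], tc["args"]),
            })
    return "Max iterations reached"


# Fake LLM for the sketch: asks for one tool call, then answers.
state = {"calls": 0}

def fake_llm(messages):
    state["calls"] += 1
    if state["calls"] == 1:
        return {"tool_calls": [{"name": "read_file", "args": {"path": "auth.py"}}]}
    return {"content": "auth.py handles login", "tool_calls": None}


print(chat_loop(fake_llm, lambda name, args: f"<{name} output>", []))
```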
## Configuration
### Primary Config File: `tools/ai-review/config.yml`
Critical settings:
```yaml
provider: openai  # openai | openrouter | ollama
model:
  openai: gpt-4.1-mini
  openrouter: anthropic/claude-3.5-sonnet
  ollama: codellama:13b
interaction:
  mention_prefix: "@codebot"  # Bot trigger name - update workflows too!
  commands:
    - explain       # Explain what the issue is about
    - suggest       # Suggest solutions or next steps
    - security      # Security analysis
    - summarize     # Summarize the issue
    - triage        # Full triage with labeling
    - review-again  # Re-run PR review (PR comments only)
review:
  fail_on_severity: HIGH  # Fail CI if HIGH severity issues found
  max_diff_lines: 800     # Skip review if diff too large
agents:
  chat:
    max_iterations: 5  # Tool calling loop limit
```
**Important**: When changing `mention_prefix`, also update all workflow files in `.gitea/workflows/`:
- `ai-comment-reply.yml`
- `ai-chat.yml`
- `ai-issue-triage.yml`
Look for: `if: contains(github.event.comment.body, '@codebot')` and update to your new bot name.
Current bot name: `@codebot`
### Environment Variables
Required:
- `AI_REVIEW_API_URL` - Platform API URL
- `AI_REVIEW_TOKEN` - Bot authentication token
- `OPENAI_API_KEY` - OpenAI API key (or provider-specific key)
Optional:
- `SEARXNG_URL` - SearXNG instance for web search
- `OPENROUTER_API_KEY` - OpenRouter API key
- `OLLAMA_HOST` - Ollama server URL
## Workflow Architecture
Workflows are located in `.gitea/workflows/` and are **mutually exclusive** to prevent duplicate runs:
- **enterprise-ai-review.yml** - Triggered on PR open/sync
- **ai-issue-triage.yml** - Triggered ONLY on `@codebot triage` in comments
- **ai-comment-reply.yml** - Triggered on specific commands: `help`, `explain`, `suggest`, `security`, `summarize`, `changelog`, `explain-diff`, `review-again`, `setup-labels`
- **ai-chat.yml** - Triggered on `@codebot` mentions that are NOT specific commands (free-form questions)
- **ai-codebase-review.yml** - Scheduled weekly analysis
**Workflow Routing Logic:**
1. If comment contains `@codebot triage` → ai-issue-triage.yml only
2. If comment contains specific command (e.g., `@codebot help`) → ai-comment-reply.yml only
3. If comment contains `@codebot <question>` (no command) → ai-chat.yml only
This prevents the issue where all three workflows would trigger on every `@codebot` mention, causing massive duplication.
**CRITICAL: Bot Self-Trigger Prevention**
All workflows include `github.event.comment.user.login != 'Bartender'` to prevent infinite loops. Without this check:
- Bot posts comment mentioning `@codebot`
- Workflow triggers, bot posts another comment with `@codebot`
- Triggers again infinitely → 10+ duplicate runs
**If you change the bot username**, update all three workflow files:
- `.gitea/workflows/ai-comment-reply.yml`
- `.gitea/workflows/ai-chat.yml`
- `.gitea/workflows/ai-issue-triage.yml`
Look for: `github.event.comment.user.login != 'Bartender'` and replace `'Bartender'` with your bot's username.
**Note**: Issue triage is now **opt-in** via `@codebot triage` command, not automatic on issue creation.
Key workflow pattern:
1. Checkout repository
2. Setup Python 3.11
3. Install dependencies (`pip install requests pyyaml`)
4. Set environment variables
5. Run `python main.py <command> <args>`
## Prompt Templates
Prompts are stored in `tools/ai-review/prompts/` as Markdown files:
- `base.md` - Base instructions for all reviews
- `pr_summary.md` - PR summary generation template
- `changelog.md` - Keep a Changelog format generation template
- `explain_diff.md` - Plain-language diff explanation template
- `issue_triage.md` - Issue classification template
- `issue_response.md` - Issue response template
**Important**: JSON examples in prompts must use **double curly braces** (`{{` and `}}`) to escape Python's `.format()` method. This is tested in `tests/test_ai_review.py::TestPromptFormatting`.
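A quick illustration of the escaping rule (the template text here is hypothetical, but the `{{`/`}}` behavior is standard Python `str.format`):

```python
# {issue} is a real placeholder; {{"label": "bug"}} survives formatting as literal JSON.
template = 'Classify this issue: {issue}\nRespond as JSON: {{"label": "bug"}}'
print(template.format(issue="App crashes on start"))
# -> Classify this issue: App crashes on start
#    Respond as JSON: {"label": "bug"}
```

An unescaped `{"label": "bug"}` in the same template would raise `KeyError: '"label"'` at format time, which is exactly what the prompt-formatting tests catch.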
## Code Patterns
### Creating a New Agent
```python
from agents.base_agent import BaseAgent, AgentContext, AgentResult

class MyAgent(BaseAgent):
    def can_handle(self, event_type: str, event_data: dict) -> bool:
        # Check if agent is enabled in config
        if not self.config.get("agents", {}).get("my_agent", {}).get("enabled", True):
            return False
        return event_type == "my_event_type"

    def execute(self, context: AgentContext) -> AgentResult:
        # Load prompt template
        prompt = self.load_prompt("my_prompt")
        formatted = prompt.format(data=context.event_data.get("field"))
        # Call LLM with rate limiting
        response = self.call_llm(formatted)
        # Post comment to issue/PR
        self.upsert_comment(
            context.owner,
            context.repo,
            issue_index,
            response.content,
        )
        return AgentResult(
            success=True,
            message="Agent completed",
            actions_taken=["Posted comment"],
        )
```
### Calling LLM with Tools
```python
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Search for authentication code"},
]
tools = [{
    "type": "function",
    "function": {
        "name": "search_code",
        "description": "Search codebase",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]
response = self.llm.call_with_tools(messages, tools=tools)
if response.tool_calls:
    for tc in response.tool_calls:
        result = execute_tool(tc.name, tc.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": tc.id,
            "content": result,
        })
```
### Adding Security Rules
Edit `tools/ai-review/security/security_scanner.py` or create `security/security_rules.yml`:
```yaml
rules:
  - id: SEC018
    name: Custom Rule Name
    pattern: 'regex_pattern_here'
    severity: HIGH  # HIGH, MEDIUM, LOW
    category: "A03:2021 Injection"
    cwe: CWE-XXX
    description: What this detects
    recommendation: How to fix it
```
## Security Best Practices
**CRITICAL**: Always follow these security guidelines when modifying workflows or handling webhook data.
### Workflow Security Rules
1. **Never pass full webhook data to environment variables**
```yaml
# ❌ NEVER DO THIS
env:
  EVENT_DATA: ${{ toJSON(github.event) }}  # Exposes emails, tokens, etc.

# ✅ ALWAYS DO THIS
run: |
  EVENT_DATA=$(cat <<EOF
  {
    "issue": {"number": ${{ github.event.issue.number }}},
    "comment": {"body": $(echo '${{ github.event.comment.body }}' | jq -Rs .)}
  }
  EOF
  )
  python utils/safe_dispatch.py issue_comment "$REPO" "$EVENT_DATA"
```
2. **Always validate repository format**
```bash
# Validate before use
if ! echo "$REPO" | grep -qE '^[a-zA-Z0-9_-]+/[a-zA-Z0-9_-]+$'; then
  echo "Error: Invalid repository format"
  exit 1
fi
```
3. **Use safe_dispatch.py for webhook processing**
```bash
# Instead of inline Python with os.environ, use:
python utils/safe_dispatch.py issue_comment owner/repo "$EVENT_JSON"
```
### Input Validation
Always use `webhook_sanitizer.py` utilities:
```python
from utils.webhook_sanitizer import (
    sanitize_webhook_data,        # Remove sensitive fields
    validate_repository_format,   # Validate owner/repo format
    extract_minimal_context,      # Extract only necessary fields
)

# Validate repository input
owner, repo = validate_repository_format(repo_string)  # Raises ValueError if invalid

# Sanitize webhook data
sanitized = sanitize_webhook_data(raw_event_data)

# Extract minimal context (reduces attack surface)
minimal = extract_minimal_context(event_type, sanitized)
```
### Pre-commit Security Scanning
Install pre-commit hooks to catch security issues before commit:
```bash
# Install pre-commit
pip install pre-commit
# Install hooks
pre-commit install
# Run manually
pre-commit run --all-files
```
The hooks will:
- Scan Python files for security vulnerabilities
- Validate workflow files for security anti-patterns
- Detect hardcoded secrets
- Run security scanner on code changes
### Security Resources
- **SECURITY.md** - Complete security guidelines and best practices
- **tools/ai-review/utils/webhook_sanitizer.py** - Input validation utilities
- **tools/ai-review/utils/safe_dispatch.py** - Safe webhook dispatch wrapper
- **.pre-commit-config.yaml** - Pre-commit hook configuration
## Testing
The test suite covers:
1. **Prompt Formatting** (`tests/test_ai_review.py`) - Ensures prompts don't have unescaped `{}` that break `.format()`
2. **Module Imports** - Verifies all modules can be imported
3. **Security Scanner** - Tests pattern detection and false positive rate
4. **Agent Context** - Tests dataclass creation and validation
5. **Security Utilities** (`tests/test_security_utils.py`) - Tests webhook sanitization, validation, and safe dispatch
6. **Safe Dispatch** (`tests/test_safe_dispatch.py`) - Tests secure event dispatching
7. **Metrics** - Tests enterprise metrics collection
Run specific test classes:
```bash
pytest tests/test_ai_review.py::TestPromptFormatting -v
pytest tests/test_ai_review.py::TestSecurityScanner -v
```
## Common Development Tasks
### PR Summary Generation
The PR summary feature automatically generates comprehensive summaries for pull requests.
**Key Features:**
- Auto-generates summary for PRs with empty descriptions
- Can be manually triggered with `@codebot summarize` in PR comments
- Analyzes diff to extract key changes, files affected, and impact
- Categorizes change type (Feature/Bugfix/Refactor/Documentation/Testing)
- Posts as comment or updates PR description (configurable)
**Implementation Details:**
1. **Auto-Summary on PR Open** - `PRAgent.execute()`:
- Checks if PR body is empty and `auto_summary.enabled` is true
- Calls `_generate_pr_summary()` automatically
- Continues with normal PR review after posting summary
2. **Manual Trigger** - `@codebot summarize` in PR comments:
- `PRAgent.can_handle()` detects `summarize` command in PR comments
- Routes to `_handle_summarize_command()`
- Generates and posts summary on demand
3. **Summary Generation** - `_generate_pr_summary()`:
- Fetches PR diff using `_get_diff()`
- Loads `prompts/pr_summary.md` template
- Calls LLM with diff to analyze changes
- Returns structured JSON with summary data
- Formats using `_format_pr_summary()`
- Posts as comment or updates description based on config
4. **Configuration** - `config.yml`:
```yaml
agents:
  pr:
    auto_summary:
      enabled: true          # Auto-generate for empty PRs
      post_as_comment: true  # true = comment, false = update description
```
**Summary Structure:**
- Brief 2-3 sentence overview
- Change type categorization (Feature/Bugfix/Refactor/etc)
- Key changes (Added/Modified/Removed)
- Files affected with descriptions
- Impact assessment (scope: small/medium/large)
**Common Use Cases:**
- Developers who forget to write PR descriptions
- Quick understanding of complex changes
- Standardized documentation format
- Pre-review context for reviewers
### PR Changelog Generation
The `@codebot changelog` command generates Keep a Changelog format entries from PR diffs.
**Key Features:**
- Generates structured changelog entries following Keep a Changelog format
- Categorizes changes: Added/Changed/Deprecated/Removed/Fixed/Security
- Automatically detects breaking changes
- Includes technical details (files changed, LOC, components)
- Output is ready to copy-paste into CHANGELOG.md
**Implementation Details:**
1. **Command Handler** - `PRAgent._handle_changelog_command()`:
- Triggered by `@codebot changelog` in PR comments
- Fetches PR title, description, and diff
- Loads `prompts/changelog.md` template
- Formats prompt with PR context
2. **LLM Analysis** - Generates structured JSON:
```json
{
  "changelog": {
    "added": ["New features"],
    "changed": ["Changes to existing functionality"],
    "fixed": ["Bug fixes"],
    "security": ["Security fixes"]
  },
  "breaking_changes": ["Breaking changes"],
  "technical_details": {
    "files_changed": 15,
    "insertions": 450,
    "deletions": 120,
    "main_components": ["auth/", "api/"]
  }
}
```
3. **Formatting** - `_format_changelog()`:
- Converts JSON to Keep a Changelog markdown format
- Uses emojis for visual categorization (✨ Added, 🔄 Changed, 🐛 Fixed)
- Highlights breaking changes prominently
- Includes technical summary at the end
- Omits empty sections for clean output
4. **Prompt Engineering** - `prompts/changelog.md`:
- User-focused language (not developer jargon)
- Filters noise (formatting, typos, minor refactoring)
- Groups related changes
- Active voice, concise entries
- Maximum 100 characters per entry
**Common Use Cases:**
- Preparing release notes
- Maintaining CHANGELOG.md
- Customer-facing announcements
- Version documentation
**Workflow Safety:**
- Only triggers on PR comments (not issue comments)
- Included in ai-comment-reply.yml workflow conditions
- Excluded from ai-chat.yml to prevent duplicate runs
- No automatic triggering - manual command only
### Code Diff Explainer
The `@codebot explain-diff` command translates technical code changes into plain language for non-technical stakeholders.
**Key Features:**
- Plain-language explanations without jargon
- File-by-file breakdown with "what" and "why" context
- Architecture impact analysis
- Breaking change detection
- Perfect for PMs, designers, and new team members
**Implementation Details:**
1. **Command Handler** - `PRAgent._handle_explain_diff_command()`:
- Triggered by `@codebot explain-diff` in PR comments
- Fetches PR title, description, and full diff
- Loads `prompts/explain_diff.md` template
- Formats prompt with PR context
2. **LLM Analysis** - Generates plain-language JSON:
```json
{
  "overview": "High-level summary in everyday language",
  "key_changes": [
    {
      "file": "path/to/file.py",
      "status": "new|modified|deleted",
      "explanation": "What changed (no jargon)",
      "why_it_matters": "Business/user impact"
    }
  ],
  "architecture_impact": {
    "description": "System-level effects explained simply",
    "new_dependencies": ["External libraries added"],
    "affected_components": ["System parts impacted"]
  },
  "breaking_changes": ["User-facing breaking changes"],
  "technical_details": { /* Stats for reference */ }
}
```
3. **Formatting** - `_format_diff_explanation()`:
- Converts JSON to readable markdown
- Uses emojis for visual categorization (🆕 new, 📝 modified, 🗑️ deleted)
- Highlights breaking changes prominently
- Includes technical summary for developers
- Omits empty sections for clean output
4. **Prompt Engineering** - `prompts/explain_diff.md`:
- **Avoids jargon**: "API" → "connection point between systems"
- **Explains why**: Not just what changed, but why it matters
- **Uses analogies**: "Caching" → "memory system for faster loading"
- **Focus on impact**: Who is affected and how
- **Groups changes**: Combines related files into themes
- **Translates concepts**: Technical terms → everyday language
**Plain Language Rules:**
- ❌ "Refactored authentication middleware" → ✅ "Updated login system for better security"
- ❌ "Implemented Redis caching" → ✅ "Added memory to make pages load 10x faster"
- ❌ "Database migration" → ✅ "Updated how data is stored"
**Common Use Cases:**
- New team members understanding large PRs
- Non-technical reviewers (PMs, designers) reviewing features
- Documenting architectural decisions
- Learning from other developers' code
**Workflow Safety:**
- Only triggers on PR comments (not issue comments)
- Included in ai-comment-reply.yml workflow conditions
- Excluded from ai-chat.yml to prevent duplicate runs
- No automatic triggering - manual command only
### Review-Again Command Implementation
The `@codebot review-again` command allows manual re-triggering of PR reviews without new commits.
**Key Features:**
- Detects `@codebot review-again` in PR comments (not issue comments)
- Compares new review with previous review to show resolved/new issues
- Updates existing AI review comment instead of creating duplicates
- Updates PR labels based on new severity assessment
**Implementation Details:**
1. **PRAgent.can_handle()** - Handles `issue_comment` events on PRs containing "review-again"
2. **PRAgent._handle_review_again()** - Main handler that:
- Fetches previous review comment
- Re-runs full PR review (security scan + AI analysis)
- Compares findings using `_compare_reviews()`
- Generates diff report with `_format_review_update()`
- Updates comment and labels
3. **Review Comparison** - Uses finding keys (file:line:description) to match issues:
- **Resolved**: Issues in previous but not in current review
- **New**: Issues in current but not in previous review
- **Still Present**: Issues in both reviews
- **Severity Changed**: Same issue with different severity
4. **Workflow Integration** - `.gitea/workflows/ai-comment-reply.yml`:
- Detects if comment is on PR or issue
- Uses `dispatch` command for PRs to route to PRAgent
- Preserves backward compatibility with issue commands
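The comparison in step 3 can be sketched as set operations over finding keys. This is an illustrative version, not the actual `_compare_reviews()` source; severity-change tracking is omitted for brevity:

```python
def compare_reviews(previous, current):
    """Sketch: match findings by a file:line:description key and bucket them."""
    prev_keys = {f"{f['file']}:{f['line']}:{f['description']}" for f in previous}
    curr_keys = {f"{f['file']}:{f['line']}:{f['description']}" for f in current}
    return {
        "resolved": sorted(prev_keys - curr_keys),       # in previous only
        "new": sorted(curr_keys - prev_keys),            # in current only
        "still_present": sorted(prev_keys & curr_keys),  # in both
    }


previous = [{"file": "app.py", "line": 10, "description": "SQL injection"}]
current = [{"file": "app.py", "line": 22, "description": "XSS"}]
print(compare_reviews(previous, current))
```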
**Usage:**
```bash
# In a PR comment:
@codebot review-again
```
**Common Use Cases:**
- Re-evaluate after explaining false positives in comments
- Test new `.ai-review.yml` configuration
- Update severity after code clarification
- Faster iteration without empty commits
### Adding a New Command to @codebot
1. Add command to `config.yml` under `interaction.commands`
2. Add handler method in `IssueAgent` (e.g., `_command_yourcommand()`)
3. Update `_handle_command()` to route the command to your handler
4. Update README.md with command documentation
5. Add tests in `tests/test_ai_review.py`
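Steps 2 and 3 can be sketched like this. The handler/router method names follow the convention described above; the bodies and the dispatch-table shape are hypothetical, not the actual `IssueAgent` source:

```python
class IssueAgent:
    """Sketch of command routing for a new @codebot command."""

    def _command_explain(self, context):
        return "explanation"

    def _command_yourcommand(self, context):
        # Step 2: implement the new handler
        return "custom output"

    def _handle_command(self, command, context):
        # Step 3: route the parsed command to its handler
        handlers = {
            "explain": self._command_explain,
            "yourcommand": self._command_yourcommand,
        }
        handler = handlers.get(command)
        return handler(context) if handler else f"Unknown command: {command}"


agent = IssueAgent()
print(agent._handle_command("yourcommand", {}))
```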
Example commands:
- `@codebot help` - Show all available commands with examples
- `@codebot triage` - Full issue triage with labeling
- `@codebot explain` - Explain the issue
- `@codebot suggest` - Suggest solutions
- `@codebot summarize` - Generate PR summary or issue summary (works on both)
- `@codebot changelog` - Generate Keep a Changelog format entries (PR comments only)
- `@codebot explain-diff` - Explain code changes in plain language (PR comments only)
- `@codebot setup-labels` - Automatic label setup (built-in, not in config)
- `@codebot review-again` - Re-run PR review without new commits (PR comments only)
### Changing the Bot Name
1. Edit `config.yml`: `interaction.mention_prefix: "@newname"`
2. Update all Gitea workflow files in `.gitea/workflows/` (search for `contains(github.event.comment.body`)
3. Update README.md and documentation
### Supporting a New LLM Provider
1. Create provider class in `clients/llm_client.py` inheriting from `BaseLLMProvider`
2. Implement `call()` and optionally `call_with_tools()`
3. Register in `LLMClient.PROVIDERS` dict
4. Add model config to `config.yml`
5. Document in README.md
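Steps 1-3 can be sketched as follows. The `PROVIDERS` registry matches the description above; the class bodies are a minimal hypothetical illustration, not the actual `clients/llm_client.py` code:

```python
class BaseLLMProvider:
    def call(self, messages):
        raise NotImplementedError


class MyProvider(BaseLLMProvider):
    """Steps 1-2: subclass BaseLLMProvider and implement call()."""

    def call(self, messages):
        # A real provider would hit its API here
        return {"content": "stub response"}


class LLMClient:
    PROVIDERS = {}  # provider name -> provider class

    def __init__(self, provider):
        self.provider = self.PROVIDERS[provider]()

    def call(self, messages):
        return self.provider.call(messages)


# Step 3: register the new provider
LLMClient.PROVIDERS["myprovider"] = MyProvider

client = LLMClient("myprovider")
print(client.call([{"role": "user", "content": "hi"}]))
```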
## Repository Labels
### Automatic Label Setup (Recommended)
Use the `@codebot setup-labels` command to automatically configure labels. This command:
**For repositories with existing labels:**
- Detects naming patterns: `Kind/Bug`, `Priority - High`, `type: bug`
- Maps existing labels to OpenRabbit schema using aliases
- Creates only missing labels following detected pattern
- Zero duplicate labels
**For fresh repositories:**
- Creates OpenRabbit's default label set
- Uses standard naming: `type:`, `priority:`, status labels
**Example with existing `Kind/` and `Priority -` labels:**
```
@codebot setup-labels
✅ Found 18 existing labels with pattern: prefix_slash
Proposed Mapping:
| OpenRabbit Expected | Your Existing Label | Status |
|---------------------|---------------------|--------|
| type: bug | Kind/Bug | ✅ Map |
| type: feature | Kind/Feature | ✅ Map |
| priority: high | Priority - High | ✅ Map |
| ai-reviewed | (missing) | ⚠️ Create |
✅ Created Kind/Question
✅ Created Status - AI Reviewed
Setup Complete! Auto-labeling will use your existing label schema.
```
### Manual Label Setup
The system expects these labels to exist in repositories for auto-labeling:
- `priority: critical`, `priority: high`, `priority: medium`, `priority: low`
- `type: bug`, `type: feature`, `type: question`, `type: documentation`, `type: security`, `type: testing`
- `ai-approved`, `ai-changes-required`, `ai-reviewed`
Labels are mapped in `config.yml` under the `labels` section.
### Label Configuration Format
Labels support two formats for backwards compatibility:
**New format (with colors and aliases):**
```yaml
labels:
  type:
    bug:
      name: "type: bug"
      color: "d73a4a"  # Red
      description: "Something isn't working"
      aliases: ["Kind/Bug", "bug", "Type: Bug"]  # For auto-detection
```
**Old format (strings only):**
```yaml
labels:
  type:
    bug: "type: bug"  # Still works, uses default blue color
```
### Label Pattern Detection
The `setup-labels` command detects these patterns (configured in `label_patterns`):
1. **prefix_slash**: `Kind/Bug`, `Type/Feature`, `Category/X`
2. **prefix_dash**: `Priority - High`, `Status - Blocked`
3. **colon**: `type: bug`, `priority: high`
When creating missing labels, the bot follows the detected pattern to maintain consistency.
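Pattern detection can be sketched as matching each existing label against the three patterns and picking the most common hit. The regexes below are illustrative approximations of the `label_patterns` configuration, not the actual implementation:

```python
import re

# Illustrative approximations of the three documented patterns
PATTERNS = {
    "prefix_slash": re.compile(r"^[A-Za-z]+/"),    # Kind/Bug
    "prefix_dash": re.compile(r"^[A-Za-z]+ - "),   # Priority - High
    "colon": re.compile(r"^[a-z]+: "),             # type: bug
}


def detect_label_pattern(labels):
    """Sketch: pick the pattern matching the most existing labels, or None."""
    counts = {name: sum(1 for l in labels if rx.match(l)) for name, rx in PATTERNS.items()}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


print(detect_label_pattern(["Kind/Bug", "Kind/Feature", "Priority - High"]))  # prefix_slash
```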

docs/SECURITY.md (419 lines)

@@ -0,0 +1,419 @@
# Security Guidelines for OpenRabbit
This document outlines security best practices and requirements for OpenRabbit development.
## Table of Contents
1. [Workflow Security](#workflow-security)
2. [Webhook Data Handling](#webhook-data-handling)
3. [Input Validation](#input-validation)
4. [Secret Management](#secret-management)
5. [Security Scanning](#security-scanning)
6. [Reporting Vulnerabilities](#reporting-vulnerabilities)
---
## Workflow Security
### Principle: Minimize Data Exposure
**Problem:** GitHub Actions/Gitea Actions can expose sensitive data through:
- Environment variables visible in logs
- Debug output
- Error messages
- Process listings
**Solution:** Use minimal data in workflows and sanitize all inputs.
### ❌ Bad: Exposing Full Webhook Data
```yaml
# NEVER DO THIS - exposes all user data, emails, tokens
env:
  EVENT_JSON: ${{ toJSON(github.event) }}
run: |
  python process.py "$EVENT_JSON"
```
**Why this is dangerous:**
- Full webhook payloads can contain user emails, private repo URLs, installation tokens
- Data appears in workflow logs if debug mode is enabled
- Environment variables can be dumped by malicious code
- Violates principle of least privilege
### ✅ Good: Minimal Data Extraction
```yaml
# SAFE: Only extract necessary fields
run: |
  EVENT_DATA=$(cat <<EOF
  {
    "issue": {
      "number": ${{ github.event.issue.number }}
    },
    "comment": {
      "body": $(echo '${{ github.event.comment.body }}' | jq -Rs .)
    }
  }
  EOF
  )
  python utils/safe_dispatch.py issue_comment "$REPO" "$EVENT_DATA"
```
**Why this is safe:**
- Only includes necessary fields (number, body)
- Agents fetch full data from API with proper auth
- Reduces attack surface
- Follows data minimization principle
### Input Validation Requirements
All workflow inputs MUST be validated before use:
1. **Repository Format**
```bash
# Validate owner/repo format
if ! echo "$REPO" | grep -qE '^[a-zA-Z0-9_-]+/[a-zA-Z0-9_-]+$'; then
  echo "Error: Invalid repository format"
  exit 1
fi
```
2. **Numeric Inputs**
```bash
# Validate issue/PR numbers are numeric
if ! [[ "$ISSUE_NUMBER" =~ ^[0-9]+$ ]]; then
  echo "Error: Invalid issue number"
  exit 1
fi
```
3. **String Sanitization**
```bash
# Use jq for JSON string escaping
BODY=$(echo "$RAW_BODY" | jq -Rs .)
```
### Boolean Comparison
```bash
# ❌ WRONG: comparing the raw object field as if it were a boolean string
if [ "${{ gitea.event.issue.pull_request }}" = "true" ]; then

# ✅ CORRECT: evaluate the null check in the workflow expression, then string-compare
IS_PR="${{ gitea.event.issue.pull_request != null }}"
if [ "$IS_PR" = "true" ]; then
```
---
## Webhook Data Handling
### Using the Sanitization Utilities
Always use `utils/webhook_sanitizer.py` when handling webhook data:
```python
from utils.webhook_sanitizer import (
    sanitize_webhook_data,
    validate_repository_format,
    extract_minimal_context,
)

# Sanitize data before logging or storing
sanitized = sanitize_webhook_data(raw_event_data)

# Extract only necessary fields
minimal = extract_minimal_context(event_type, sanitized)

# Validate repository input
owner, repo = validate_repository_format(repo_string)
```
### Sensitive Fields (Automatically Redacted)
The sanitizer removes these fields:
- `email`, `private_email`, `email_addresses`
- `token`, `access_token`, `refresh_token`, `api_key`
- `secret`, `password`, `private_key`, `ssh_key`
- `phone`, `address`, `ssn`, `credit_card`
- `installation_id`, `node_id`
### Large Field Truncation
These fields are truncated to prevent log flooding:
- `body`: 500 characters
- `description`: 500 characters
- `message`: 500 characters
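Together, redaction and truncation amount to a recursive filter over the payload. A minimal sketch assuming a small subset of the field lists above (the real `sanitize_webhook_data` covers the full lists and may behave differently in detail):

```python
# Illustrative subsets of the documented field lists
SENSITIVE_FIELDS = {"email", "token", "password", "api_key"}
TRUNCATE_FIELDS = {"body": 500, "description": 500, "message": 500}


def sanitize(data):
    """Sketch: drop sensitive keys and truncate large fields, recursively."""
    clean = {}
    for key, value in data.items():
        if key in SENSITIVE_FIELDS:
            continue  # redact entirely
        if isinstance(value, dict):
            clean[key] = sanitize(value)
        elif key in TRUNCATE_FIELDS and isinstance(value, str):
            clean[key] = value[:TRUNCATE_FIELDS[key]]
        else:
            clean[key] = value
    return clean


event = {"user": {"login": "alice", "email": "a@example.com"}, "body": "x" * 1000}
print(sanitize(event))  # email dropped, body truncated to 500 chars
```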
---
## Input Validation
### Repository Name Validation
```python
from utils.webhook_sanitizer import validate_repository_format

try:
    owner, repo = validate_repository_format(user_input)
except ValueError as e:
    logger.error(f"Invalid repository: {e}")
    return
```
**Checks performed:**
- Format is `owner/repo`
- No path traversal (`..`)
- No shell injection characters (`;`, `|`, `&`, `` ` ``, etc.)
- Non-empty owner and repo name
### Event Data Size Limits
```python
# Maximum event size: 10MB
MAX_EVENT_SIZE = 10 * 1024 * 1024

if len(event_json) > MAX_EVENT_SIZE:
    raise ValueError("Event data too large")
```
### JSON Validation
```python
try:
    data = json.loads(event_json)
except json.JSONDecodeError as e:
    raise ValueError(f"Invalid JSON: {e}")

if not isinstance(data, dict):
    raise ValueError("Event data must be a JSON object")
```
---
## Secret Management
### Environment Variables
Required secrets (set in CI/CD settings):
- `AI_REVIEW_TOKEN` - Gitea/GitHub API token (read/write access)
- `OPENAI_API_KEY` - OpenAI API key
- `OPENROUTER_API_KEY` - OpenRouter API key (optional)
- `OLLAMA_HOST` - Ollama server URL (optional)
### ❌ Never Commit Secrets
```python
# NEVER DO THIS
api_key = "sk-1234567890abcdef"  # ❌ Hardcoded secret

# NEVER DO THIS
config = {
    "openai_key": "sk-1234567890abcdef"  # ❌ Secret in config
}
```
### ✅ Always Use Environment Variables
```python
import os

# CORRECT
api_key = os.environ.get("OPENAI_API_KEY")
if not api_key:
    raise ValueError("OPENAI_API_KEY not set")
```
### Secret Scanning
The security scanner checks for:
- Hardcoded API keys (pattern: `sk-[a-zA-Z0-9]{32,}`)
- AWS keys (`AKIA[0-9A-Z]{16}`)
- Private keys (`-----BEGIN.*PRIVATE KEY-----`)
- Passwords in code (`password\s*=\s*["'][^"']+["']`)
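The four patterns above can be wired into a tiny checker like this. The `find_secrets` helper is an illustration built directly from the documented regexes, not the scanner's actual implementation:

```python
import re

# The documented secret patterns, verbatim
SECRET_PATTERNS = {
    "openai_key": re.compile(r"sk-[a-zA-Z0-9]{32,}"),
    "aws_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key": re.compile(r"-----BEGIN.*PRIVATE KEY-----"),
    "password": re.compile(r'password\s*=\s*["\'][^"\']+["\']'),
}


def find_secrets(content):
    """Return the names of all patterns that match the given content."""
    return [name for name, rx in SECRET_PATTERNS.items() if rx.search(content)]


print(find_secrets('password = "hunter2"\nkey = "AKIAABCDEFGHIJKLMNOP"'))
```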
---
## Security Scanning
### Automated Scanning
All code is scanned for vulnerabilities:
1. **PR Reviews** - Automatic security scan on every PR
2. **Pre-commit Hooks** - Local scanning before commit
3. **Pattern-based Detection** - 17 built-in security rules
### Running Manual Scans
```bash
# Scan a specific file
python -c "
from security.security_scanner import SecurityScanner
s = SecurityScanner()
with open('myfile.py') as f:
findings = s.scan_content(f.read(), 'myfile.py')
for finding in findings:
    print(f'{finding.severity}: {finding.description}')
"
# Scan a git diff
git diff | python tools/ai-review/security/scan_diff.py
```
### Security Rule Categories
- **A01: Broken Access Control** - Missing auth, insecure file operations
- **A02: Cryptographic Failures** - Weak crypto, hardcoded secrets
- **A03: Injection** - SQL injection, command injection, XSS
- **A06: Vulnerable Components** - Insecure imports
- **A07: Authentication Failures** - Weak auth mechanisms
- **A09: Logging Failures** - Security logging issues
### Severity Levels
- **HIGH**: Critical vulnerabilities requiring immediate fix
- SQL injection, command injection, hardcoded secrets
- **MEDIUM**: Important issues requiring attention
- Missing input validation, weak crypto, XSS
- **LOW**: Best practice violations
- TODO comments with security keywords, eval() usage
### CI Failure Threshold
Configure in `config.yml`:
```yaml
review:
fail_on_severity: HIGH # Fail CI if HIGH severity found
```
---
## Webhook Signature Validation
### Future GitHub Integration
When accepting webhooks directly (not through Gitea Actions):
```python
from utils.webhook_sanitizer import validate_webhook_signature
# Validate webhook is from GitHub
signature = request.headers.get("X-Hub-Signature-256")
payload = request.get_data(as_text=True)
secret = os.environ["WEBHOOK_SECRET"]
if not validate_webhook_signature(payload, signature, secret):
return "Unauthorized", 401
```
**Important:** Always validate webhook signatures to prevent:
- Replay attacks
- Forged webhook events
- Unauthorized access
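A hypothetical implementation of `validate_webhook_signature`, following GitHub's `X-Hub-Signature-256` scheme (HMAC-SHA256 hex digest prefixed with `sha256=`); the actual helper in `webhook_sanitizer.py` may differ:

```python
import hashlib
import hmac

def validate_webhook_signature(payload: str, signature: str, secret: str) -> bool:
    """Check that signature is the HMAC-SHA256 of payload under secret."""
    if not signature or not signature.startswith("sha256="):
        return False
    expected = "sha256=" + hmac.new(
        secret.encode(), payload.encode(), hashlib.sha256
    ).hexdigest()
    # Constant-time comparison prevents timing attacks
    return hmac.compare_digest(expected, signature)
```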
---
## Reporting Vulnerabilities
### Security Issues
If you discover a security vulnerability:
1. **DO NOT** create a public issue
2. Email security contact: [maintainer email]
3. Include:
- Description of the vulnerability
- Steps to reproduce
- Potential impact
- Suggested fix (if available)
### Response Timeline
- **Acknowledgment**: Within 48 hours
- **Initial Assessment**: Within 1 week
- **Fix Development**: Depends on severity
- HIGH: Within 1 week
- MEDIUM: Within 2 weeks
- LOW: Next release cycle
---
## Security Checklist for Contributors
Before submitting a PR:
- [ ] No secrets in code or config files
- [ ] All user inputs are validated
- [ ] No SQL injection vulnerabilities
- [ ] No command injection vulnerabilities
- [ ] No XSS vulnerabilities
- [ ] Sensitive data is sanitized before logging
- [ ] Environment variables are not exposed in workflows
- [ ] Repository format validation is used
- [ ] Error messages don't leak sensitive info
- [ ] Security scanner passes (no HIGH severity)
---
## Security Tools
### Webhook Sanitizer
Location: `tools/ai-review/utils/webhook_sanitizer.py`
Functions:
- `sanitize_webhook_data(data)` - Remove sensitive fields
- `extract_minimal_context(event_type, data)` - Minimal payload
- `validate_repository_format(repo)` - Validate owner/repo
- `validate_webhook_signature(payload, sig, secret)` - Verify webhook
### Safe Dispatch Utility
Location: `tools/ai-review/utils/safe_dispatch.py`
Usage:
```bash
python utils/safe_dispatch.py issue_comment owner/repo '{"action": "created", ...}'
```
Features:
- Input validation
- Size limits (10MB max)
- Automatic sanitization
- Comprehensive error handling
### Security Scanner
Location: `tools/ai-review/security/security_scanner.py`
Features:
- 17 built-in security rules
- OWASP Top 10 coverage
- CWE references
- Severity classification
- Pattern-based detection
---
## Best Practices Summary
1. **Minimize Data**: Only pass necessary data to workflows
2. **Validate Inputs**: Always validate external inputs
3. **Sanitize Outputs**: Remove sensitive data before logging
4. **Use Utilities**: Leverage `webhook_sanitizer.py` and `safe_dispatch.py`
5. **Scan Code**: Run security scanner before committing
6. **Rotate Secrets**: Regularly rotate API keys and tokens
7. **Review Changes**: Manual security review for sensitive changes
8. **Test Security**: Add tests for security-critical code
---
## Updates and Maintenance
This security policy is reviewed quarterly and updated as needed.
**Last Updated**: 2025-12-28
**Next Review**: 2026-03-28

# Feature Ideas & Roadmap
This document outlines recommended feature additions for OpenRabbit, ordered by value/effort ratio.
---
## Quick Reference
| Feature | Value | Effort | Time Estimate | Status |
|---------|-------|--------|---------------|--------|
| [@codebot help Command](#1-codebot-help-command) | HIGH | LOW | 1-2 hours | ⭐ Recommended |
| [Automatic Label Creator](#2-automatic-label-creator) | HIGH | MEDIUM | 2-3 hours | Planned |
| [PR Changelog Generator](#3-pr-changelog-generator) | MEDIUM | MEDIUM | 3-4 hours | Planned |
| [Code Diff Explainer](#4-code-diff-explainer) | MEDIUM-HIGH | MEDIUM | 2-3 hours | Planned |
| [Smart Test Suggestions](#5-smart-test-suggestions) | HIGH | HIGH | 5-6 hours | Planned |
| [@codebot review-again](#6-codebot-review-again) | MEDIUM | LOW | 1-2 hours | Planned |
| [Dependency Update Advisor](#7-dependency-update-advisor) | VERY HIGH | HIGH | 6-8 hours | Planned |
---
## 1. @codebot help Command
**⭐ HIGHEST PRIORITY - Quick Win**
### Problem
Users have no way to discover what commands are available. They don't know what the bot can do without reading documentation.
### Solution
Add a `@codebot help` command that lists all available commands with descriptions and examples.
### Implementation
- Add `help` to `config.yml` commands list
- Add `_command_help()` method to IssueAgent
- Format response with all commands + descriptions
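A minimal sketch of what `_command_help()` could look like; the command registry here is assembled from the descriptions in this document, not from the actual `config.yml`:

```python
# Hypothetical command registry; the real list would come from config.yml.
COMMANDS = {
    "triage": "Full issue triage with auto-labeling and priority assignment",
    "summarize": "Generate 2-3 sentence summary",
    "explain": "Detailed explanation of the issue",
    "suggest": "Solution suggestions or next steps",
    "help": "Show this message",
}

def command_help() -> str:
    """Render the help response as a Markdown comment body."""
    lines = ["**Available @codebot Commands:**"]
    lines += [f"- `@codebot {name}` - {desc}" for name, desc in COMMANDS.items()]
    return "\n".join(lines)
```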
### Example Output
```markdown
@username
**Available @codebot Commands:**
**Issue Triage & Analysis:**
- `@codebot triage` - Full issue triage with auto-labeling and priority assignment
- `@codebot summarize` - Generate 2-3 sentence summary
- `@codebot explain` - Detailed explanation of the issue
- `@codebot suggest` - Solution suggestions or next steps
**Interactive Chat:**
- `@codebot [question]` - Ask questions about the codebase
**Codebase Analysis:**
- `@codebot codebase` - Trigger full codebase health analysis
**Utility:**
- `@codebot help` - Show this message
**Examples:**
- `@codebot explain` - Get detailed explanation
- `@codebot how does authentication work?` - Chat about codebase
```
### Impact
- Immediate UX improvement
- Reduces support burden
- Makes all future commands discoverable
- Foundation for growth
### Files to Modify
- `/tools/ai-review/config.yml`
- `/tools/ai-review/agents/issue_agent.py`
---
## 2. Automatic Label Creator
### Problem
Major setup pain point: users must manually create 10+ labels (`priority: high`, `type: bug`, etc.). Bot silently fails to apply labels if they don't exist.
### Solution
Add `@codebot setup-labels` command that:
1. Checks which required labels are missing
2. Creates them with proper colors
3. Or provides CLI commands for manual creation
### Implementation
- Add `setup-labels` command
- Query repository labels via Gitea API
- Compare against required labels in config
- Auto-create missing labels or show creation commands
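The compare step can be sketched as a pure function; the label names and colors below are taken from the example output, and the surrounding Gitea API calls (`list_labels`, `create_label`) are assumed:

```python
# Hypothetical required-label set; the real one would live in config.yml.
REQUIRED_LABELS = {
    "priority: high": "#d73a4a",
    "priority: medium": "#fbca04",
    "type: bug": "#d73a4a",
}

def missing_labels(existing_names):
    """Return {name: color} for required labels the repository lacks."""
    return {name: color for name, color in REQUIRED_LABELS.items()
            if name not in existing_names}
```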
### Example Output
```markdown
@username
**Label Setup Analysis:**
**Missing Labels:**
- `priority: high` (color: #d73a4a)
- `priority: medium` (color: #fbca04)
- `type: bug` (color: #d73a4a)
**Creating labels...**
✅ Created `priority: high`
✅ Created `priority: medium`
✅ Created `type: bug`
All required labels are now set up!
```
### Impact
- Removes major setup friction
- Ensures auto-labeling works immediately
- Better onboarding experience
### Files to Modify
- `/tools/ai-review/config.yml`
- `/tools/ai-review/agents/issue_agent.py`
- `/tools/ai-review/clients/gitea_client.py` (add create_label method)
---
## 3. PR Changelog Generator
### Problem
Developers spend time writing release notes and changelogs. Bot already analyzes PR content.
### Solution
Add `@codebot changelog` command that generates human-readable changelog from PR.
### Implementation
- Add `changelog` command for PRs
- Analyze PR diff + commit messages
- Generate bullet-point summary
- Format for CHANGELOG.md
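One way to sketch the commit-message analysis is a conventional-commit prefix mapping; this is an assumption about the approach (the real feature would likely lean on the LLM rather than fixed prefixes):

```python
def categorize_commits(messages):
    """Sort 'prefix: subject' commit messages into changelog sections."""
    sections = {"Added": [], "Changed": [], "Fixed": []}
    prefixes = {"feat": "Added", "refactor": "Changed", "fix": "Fixed"}
    for msg in messages:
        prefix = msg.split(":", 1)[0].strip().lower()
        section = prefixes.get(prefix)
        if section:
            # Keep only the subject after the prefix
            sections[section].append(msg.split(":", 1)[1].strip())
    return sections
```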
### Example Output
```markdown
@username
**Changelog for PR #123:**
### Added
- User authentication system with JWT tokens
- Password reset functionality via email
### Changed
- Updated database schema for user table
- Refactored login endpoint for better error handling
### Fixed
- Session timeout bug causing premature logouts
- Security vulnerability in password validation
### Technical Details
- 15 files changed, 450 insertions, 120 deletions
- Main components: auth/, api/users/, database/
```
### Impact
- Saves time on release documentation
- Consistent changelog format
- Can copy-paste into CHANGELOG.md
### Files to Modify
- `/tools/ai-review/config.yml`
- `/tools/ai-review/agents/pr_agent.py`
---
## 4. Code Diff Explainer
### Problem
Complex PRs are hard to understand, especially for:
- Non-expert reviewers
- New team members
- Documentation purposes
### Solution
Add `@codebot explain-diff` command that breaks down what changed in simple terms.
### Implementation
- Add `explain-diff` command for PRs
- Analyze PR diff section by section
- Generate plain-language explanations
- Identify architectural changes
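The section-by-section analysis starts by splitting the diff into files; a small sketch of that first step (the per-section LLM explanation is out of scope here):

```python
import re

def changed_files(diff_text):
    """Return file paths from 'diff --git a/... b/...' headers in a unified diff."""
    return re.findall(r"^diff --git a/(\S+) b/\S+$", diff_text, flags=re.M)
```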
### Example Output
```markdown
@username
**PR Explanation:**
**Overview:**
This PR adds user authentication to the application using JWT tokens.
**Key Changes:**
1. **auth/jwt.py** (new file)
- Creates JSON Web Tokens for authenticated users
- Validates tokens on protected endpoints
- Handles token expiration (24 hour lifetime)
2. **api/users.py** (modified)
- Added `/login` endpoint that accepts username/password
- Returns JWT token on successful authentication
- Added `/logout` endpoint to invalidate tokens
3. **database/schema.sql** (modified)
- Added `users` table with columns: id, username, password_hash, email
- Passwords are hashed using bcrypt (secure)
**Architecture Impact:**
- Introduces authentication layer across all API endpoints
- Adds dependency on PyJWT library
- Requires database migration to create users table
```
### Impact
- Makes code review accessible
- Great for learning/onboarding
- Documentation generation
### Files to Modify
- `/tools/ai-review/config.yml`
- `/tools/ai-review/agents/pr_agent.py`
- `/tools/ai-review/prompts/` (add explain_diff.md)
---
## 5. Smart Test Suggestions
### Problem
Test coverage is critical but developers often miss edge cases or forget to update tests.
### Solution
Add `@codebot suggest-tests` command that:
1. Analyzes changed functions/classes
2. Identifies what needs testing
3. Suggests specific test cases
### Implementation
- Add `suggest-tests` command for PRs
- Parse changed code to identify functions
- Use LLM to suggest test scenarios
- Could integrate with coverage reports
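For Python sources, the "parse changed code to identify functions" step can use the standard `ast` module; a minimal sketch (the LLM suggestion step is omitted):

```python
import ast

def changed_functions(source):
    """Return names of all functions and methods defined in a source blob."""
    tree = ast.parse(source)
    return [node.name for node in ast.walk(tree)
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))]
```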
### Example Output
```markdown
@username
**Test Suggestions for PR #123:**
### auth/jwt.py - `create_token()` function
**Recommended Test Cases:**
1. ✅ Valid user creates token successfully
2. ⚠️ **Missing:** Token expiration after 24 hours
3. ⚠️ **Missing:** Invalid user ID handling
4. ⚠️ **Missing:** Token creation with special characters in username
### api/users.py - `/login` endpoint
**Recommended Test Cases:**
1. ✅ Successful login with correct credentials
2. ⚠️ **Missing:** Login with wrong password
3. ⚠️ **Missing:** Login with non-existent user
4. ⚠️ **Missing:** SQL injection attempt in username field
5. ⚠️ **Missing:** Rate limiting after failed attempts
**Coverage Impact:**
- Current coverage: ~60%
- With suggested tests: ~85%
```
### Impact
- Improves test coverage
- Catches edge cases
- Reduces production bugs
### Files to Modify
- `/tools/ai-review/config.yml`
- `/tools/ai-review/agents/pr_agent.py`
- `/tools/ai-review/prompts/` (add test_suggestions.md)
---
## 6. @codebot review-again
### Problem
Current workflow: developer fixes issues → pushes commit → bot auto-reviews. Sometimes developers want re-review without creating new commits (e.g., after only changing comments).
### Solution
Add `@codebot review-again` command that re-runs PR review on current state.
### Implementation
- Add `review-again` command for PRs
- Re-run PR agent on current diff
- Update existing review comment
- Compare with previous review (show what changed)
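The "compare with previous review" step could reduce each review to a set of finding descriptions and diff them; a hypothetical sketch:

```python
def diff_reviews(previous, current):
    """Split findings into (fixed, remaining, new) relative to the last review."""
    fixed = sorted(set(previous) - set(current))      # gone since last review
    remaining = sorted(set(previous) & set(current))  # still present
    new = sorted(set(current) - set(previous))        # introduced since
    return fixed, remaining, new
```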
### Example Output
```markdown
@username
**Re-review Complete:**
**Previous Review:** 5 issues (2 HIGH, 3 MEDIUM)
**Current Review:** 1 issue (1 MEDIUM)
✅ Fixed: SQL injection in login endpoint
✅ Fixed: Hardcoded JWT secret
⚠️ Remaining: Missing error handling in password reset
**Status:** Changes Required → Approved (pending fix)
```
### Impact
- Smoother PR workflow
- No unnecessary commits
- Faster feedback loop
### Files to Modify
- `/tools/ai-review/config.yml`
- `/tools/ai-review/agents/pr_agent.py`
---
## 7. Dependency Update Advisor
### Problem
Security vulnerabilities often exist in dependencies. Keeping dependencies up-to-date is critical but tedious.
### Solution
Add `@codebot check-deps` command that:
1. Parses requirements.txt, package.json, etc.
2. Checks for outdated packages
3. Warns about CVEs
4. Suggests upgrade commands
### Implementation
- Add `check-deps` command
- Support multiple package formats (pip, npm, cargo, go)
- Integrate with vulnerability databases (CVE, npm audit)
- Generate upgrade instructions
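The pip half of this can be sketched as two small functions; the latest-version data is hard-coded here for illustration, whereas the real feature would query the PyPI JSON API:

```python
def parse_requirements(text):
    """Parse 'pkg==version' lines into a dict, skipping comments and blanks."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins

def outdated(pins, latest):
    """Return {pkg: (current, latest)} for packages behind their latest release."""
    return {p: (v, latest[p]) for p, v in pins.items()
            if p in latest and latest[p] != v}
```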
### Example Output
````markdown
@username
**Dependency Analysis:**
### Outdated Packages (5)
| Package | Current | Latest | Severity |
|---------|---------|--------|----------|
| requests | 2.28.0 | 2.31.0 | 🔴 HIGH - CVE-2023-32681 |
| django | 3.2.0 | 4.2.8 | 🟡 MEDIUM - Multiple CVEs |
| flask | 2.0.0 | 3.0.0 | 🟢 LOW - New features |
| pyyaml | 5.4.1 | 6.0.1 | 🔴 HIGH - CVE-2022-38752 |
| sqlalchemy | 1.4.0 | 2.0.23 | 🟢 LOW - Performance improvements |
### Recommended Actions
**Immediate (Security Vulnerabilities):**
```bash
pip install --upgrade requests==2.31.0
pip install --upgrade pyyaml==6.0.1
pip install --upgrade django==4.2.8
```
**Optional (Feature Updates):**
```bash
pip install --upgrade flask==3.0.0
pip install --upgrade sqlalchemy==2.0.23
```
### Breaking Changes to Review
- **Django 4.x:** Requires Python 3.8+, check compatibility
- **Flask 3.x:** Async support added, review async patterns
- **SQLAlchemy 2.x:** ORM API changes, review queries
### Resources
- [requests CVE-2023-32681](https://nvd.nist.gov/vuln/detail/CVE-2023-32681)
- [pyyaml CVE-2022-38752](https://nvd.nist.gov/vuln/detail/CVE-2022-38752)
````
### Impact
- Critical for security
- Keeps projects up-to-date
- Prevents technical debt
- Reduces manual checking
### Files to Modify
- `/tools/ai-review/config.yml`
- `/tools/ai-review/agents/issue_agent.py`
- Add new module: `/tools/ai-review/dependency_checker.py`
### External APIs Needed
- PyPI JSON API for Python packages
- npm registry API for JavaScript
- NVD (National Vulnerability Database) for CVEs
- Or use `pip-audit`, `npm audit` CLI tools
---
## Implementation Priority
### Phase 1: Quick Wins (1-3 hours total)
1. `@codebot help` command
2. `@codebot review-again` command
### Phase 2: High Impact (5-8 hours total)
3. Automatic Label Creator
4. Code Diff Explainer
### Phase 3: Strategic Features (10-15 hours total)
5. Smart Test Suggestions
6. PR Changelog Generator
7. Dependency Update Advisor
---
## Contributing
Have an idea for a new feature? Please:
1. Check if it's already listed here
2. Consider value/effort ratio
3. Open an issue describing:
- Problem it solves
- Proposed solution
- Expected impact
- Example use case
---
## See Also
- [future_roadmap.md](future_roadmap.md) - Long-term vision (SAST, RAG, etc.)
- [configuration.md](configuration.md) - How to configure existing features
- [agents.md](agents.md) - Current agent capabilities

# Future Features Roadmap
This document outlines the strategic plan for evolving the AI Code Review system. These features are proposed for future implementation to enhance security coverage, context awareness, and user interaction.
---
## Phase 1: Advanced Security Scanning
Expand the current 17-rule regex scanner with dedicated industry-standard tools for **Static Application Security Testing (SAST)** and **Software Composition Analysis (SCA)**.
### Proposed Integrations
| Tool | Type | Purpose | Implementation Plan |
|------|------|---------|---------------------|
| **Bandit** | SAST | Analyze Python code for common vulnerability patterns (e.g., `exec`, weak crypto). | Run `bandit -r . -f json` and parse results into the review report. |
| **Semgrep** | SAST | Polyglot scanning with custom rule support. | Integrate `semgrep --config=p/security-audit` for broader language support (JS, Go, Java). |
| **Safety** | SCA | Check installed dependencies against known vulnerability databases. | Run `safety check --json` during CI to flag vulnerable packages in `requirements.txt`. |
| **Trivy** | SCA/Container | Scan container images (Dockerfiles) and filesystem. | Add a workflow step to run Trivy for container-based projects. |
**Impact:** Significantly reduces false negatives and covers dependency-chain risks (supply chain security).
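The Bandit row above ("run `bandit -r . -f json` and parse results") could be wrapped like this; a sketch that assumes `bandit` is installed and on `PATH`:

```python
import json
import subprocess

def parse_bandit_report(raw):
    """Extract the 'results' list from Bandit's JSON output."""
    report = json.loads(raw or "{}")
    return report.get("results", [])

def run_bandit(path="."):
    """Run bandit recursively over path and return its findings."""
    proc = subprocess.run(
        ["bandit", "-r", path, "-f", "json"],
        capture_output=True, text=True,
    )
    return parse_bandit_report(proc.stdout)
```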
---
## Phase 2: "Chat with Codebase" (RAG)
Move beyond single-file context by implementing **Retrieval-Augmented Generation (RAG)**. This allows the AI to answer questions like *"Where is authentication handled?"* by searching the entire codebase semantically.
### Architecture
1. **Vector Database:**
* **ChromaDB** or **Qdrant**: Lightweight, open-source choices for storing code embeddings.
2. **Embeddings Model:**
* **OpenAI `text-embedding-3-small`** or **FastEmbed**: To convert code chunks (functions/classes) into vectors.
3. **Workflow:**
* **Index:** Run a nightly job to parse the codebase -> chunk it -> embed it -> store in Vector DB.
* **Query:** When `@ai-bot` receives a question, convert the question to a vector -> search Vector DB -> inject relevant snippets into the LLM prompt.
**Impact:** Enables high-accuracy architectural advice and deep-dive explanations spanning multiple files.
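The index/query loop can be shown end-to-end with a toy embedding; `embed()` here is a character-frequency stand-in (a real deployment would use ChromaDB/Qdrant and a proper embeddings model):

```python
import math

def embed(text):
    # Toy stand-in: 26-dim letter-frequency vector (real systems use an LLM embedding)
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - 97] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(index, question, top_k=2):
    """index: list of (snippet, vector). Return top_k snippets by similarity."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(item[1], q), reverse=True)
    return [snippet for snippet, _ in ranked[:top_k]]
```

The retrieved snippets would then be injected into the LLM prompt alongside the user's question.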
---
## Phase 3: Interactive Code Repair
Transform the bot from a passive reviewer into an active collaborator.
### Features
* **`@ai-bot apply <suggestion_id>`**:
* The bot generates a secure `git patch` for a specific recommendation.
* The system commits the patch directly to the PR branch.
* **Refactoring Assistance**:
* Command: `@ai-bot refactor this function to use dependency injection`.
* Bot proposes the changed code block and offers to commit it.
**Risk Mitigation:**
* Require human approval (comment reply) before any commit is pushed.
* Run tests automatically after bot commits.
---
## Phase 4: Enterprise Dashboard
Provide a high-level view of engineering health across the organization.
### Metrics to Visualize
* **Security Health:** Trend of High/Critical issues over time.
* **Code Quality:** Technical debt accumulation vs. reduction rate.
* **Review Velocity:** Average time to AI review vs. Human review.
* **Bot Usage:** Most frequent commands and value-add interactions.
### Tech Stack
* **Prometheus** (already implemented) + **Grafana**: For time-series tracking.
* **Streamlit** / **Next.js**: For a custom management console to configure rules and view logs.
---
## Strategic Recommendations
1. **Immediate Win:** Implement **Bandit** integration. It is low-effort (Python library) and high-value (detects real vulnerabilities).
2. **High Impact:** **Safety** dependency scanning. Vulnerable dependencies are the #1 attack vector for modern apps.
3. **Long Term:** Work on **Vector DB** integration only after the core review logic is flawless, as it introduces significant infrastructure complexity.

# Security Scanning
The security scanner detects vulnerabilities aligned with OWASP Top 10.
## Supported Rules
### A01:2021 Broken Access Control
| Rule | Severity | Description |
|------|----------|-------------|
| SEC001 | HIGH | Hardcoded credentials (passwords, API keys) |
| SEC002 | HIGH | Exposed private keys |
### A02:2021 Cryptographic Failures
| Rule | Severity | Description |
|------|----------|-------------|
| SEC003 | MEDIUM | Weak hash algorithms (MD5, SHA1) |
| SEC004 | MEDIUM | Non-cryptographic random for security |
### A03:2021 Injection
| Rule | Severity | Description |
|------|----------|-------------|
| SEC005 | HIGH | SQL injection via string formatting |
| SEC006 | HIGH | Command injection in subprocess |
| SEC007 | HIGH | eval() usage |
| SEC008 | MEDIUM | XSS via innerHTML |
### A04:2021 Insecure Design
| Rule | Severity | Description |
|------|----------|-------------|
| SEC009 | MEDIUM | Debug mode enabled |
### A05:2021 Security Misconfiguration
| Rule | Severity | Description |
|------|----------|-------------|
| SEC010 | MEDIUM | CORS wildcard (*) |
| SEC011 | HIGH | SSL verification disabled |
### A07:2021 Authentication Failures
| Rule | Severity | Description |
|------|----------|-------------|
| SEC012 | HIGH | Hardcoded JWT secrets |
### A08:2021 Integrity Failures
| Rule | Severity | Description |
|------|----------|-------------|
| SEC013 | MEDIUM | Pickle deserialization |
### A09:2021 Logging Failures
| Rule | Severity | Description |
|------|----------|-------------|
| SEC014 | MEDIUM | Logging sensitive data |
### A10:2021 Server-Side Request Forgery
| Rule | Severity | Description |
|------|----------|-------------|
| SEC015 | MEDIUM | SSRF via dynamic URLs |
### Additional Rules
| Rule | Severity | Description |
|------|----------|-------------|
| SEC016 | LOW | Hardcoded IP addresses |
| SEC017 | MEDIUM | Security-related TODO/FIXME |
## Usage
### In PR Reviews
Security scanning runs automatically during PR review:
```yaml
agents:
pr:
security_scan: true
```
### Standalone
```python
from security import SecurityScanner
scanner = SecurityScanner()
# Scan file content
for finding in scanner.scan_content(code, "file.py"):
print(f"[{finding.severity}] {finding.rule_name}")
print(f" Line {finding.line}: {finding.code_snippet}")
print(f" {finding.description}")
# Scan git diff
for finding in scanner.scan_diff(diff):
print(f"{finding.file}:{finding.line} - {finding.rule_name}")
```
### Get Summary
```python
findings = list(scanner.scan_content(code, "file.py"))
summary = scanner.get_summary(findings)
print(f"Total: {summary['total']}")
print(f"HIGH: {summary['by_severity']['HIGH']}")
print(f"Categories: {summary['by_category']}")
```
## Custom Rules
Create `security/security_rules.yml`:
```yaml
rules:
- id: "CUSTOM001"
name: "Custom Pattern"
pattern: "dangerous_function\\s*\\("
severity: "HIGH"
category: "Custom"
cwe: "CWE-xxx"
description: "Usage of dangerous function detected"
recommendation: "Use safe_function() instead"
```
Load custom rules:
```python
scanner = SecurityScanner(rules_file="security/custom_rules.yml")
```
## CI Integration
Fail CI on HIGH severity findings:
```yaml
security:
fail_on_high: true
```
Or in code:
```python
findings = list(scanner.scan_diff(diff))
high_count = sum(1 for f in findings if f.severity == "HIGH")
if high_count > 0:
sys.exit(1)
```
## CWE References
All rules include CWE (Common Weakness Enumeration) references:
- [CWE-78](https://cwe.mitre.org/data/definitions/78.html): OS Command Injection
- [CWE-79](https://cwe.mitre.org/data/definitions/79.html): XSS
- [CWE-89](https://cwe.mitre.org/data/definitions/89.html): SQL Injection
- [CWE-798](https://cwe.mitre.org/data/definitions/798.html): Hardcoded Credentials