From b24ae0dcda9b2a04ee6fb44aa7f09c8f76f261f7 Mon Sep 17 00:00:00 2001
From: latte
Date: Fri, 16 Jan 2026 11:15:51 +0000
Subject: [PATCH] remove documentation that are no longer needed

---
 MILESTONE_2_STATUS.md           | 287 ---------------------
 SECURITY_FIXES_SUMMARY.md       | 378 ---------------------------
 SECURITY_QUICK_REFERENCE.md     | 167 ------------
 CLAUDE.md => docs/CLAUDE.md     |   0
 SECURITY.md => docs/SECURITY.md |   0
 docs/feature-ideas.md           | 440 --------------------------------
 docs/future_roadmap.md          |  82 ------
 docs/security.md                | 163 ------------
 8 files changed, 1517 deletions(-)
 delete mode 100644 MILESTONE_2_STATUS.md
 delete mode 100644 SECURITY_FIXES_SUMMARY.md
 delete mode 100644 SECURITY_QUICK_REFERENCE.md
 rename CLAUDE.md => docs/CLAUDE.md (100%)
 rename SECURITY.md => docs/SECURITY.md (100%)
 delete mode 100644 docs/feature-ideas.md
 delete mode 100644 docs/future_roadmap.md
 delete mode 100644 docs/security.md

diff --git a/MILESTONE_2_STATUS.md b/MILESTONE_2_STATUS.md
deleted file mode 100644
index 2d1d6fe..0000000
--- a/MILESTONE_2_STATUS.md
+++ /dev/null
@@ -1,287 +0,0 @@
-# Milestone 2 - Documentation & Deployment Status
-
-**Date:** 2025-12-29
-**Status:** ✅ COMPLETE - Ready for Merge
-
----
-
-## Executive Summary
-
-All three Milestone 2 features have been fully implemented, tested, and documented. Documentation verification confirms 100% completion of all required items. The features are ready for merging to main branch and production deployment.
-
----
-
-## Feature Implementation Status
-
-### 1. PR Summary Generator (`@codebot summarize`)
-**Branch:** `feature/pr-summary-generator` (merged to dev)
-**Status:** ✅ Complete
-
-**Implementation:**
-- ✅ Prompt template: `tools/ai-review/prompts/pr_summary.md`
-- ✅ PR Agent methods: `_generate_pr_summary()`, `_format_pr_summary()`
-- ✅ Auto-summary on empty PRs (configurable)
-- ✅ Manual trigger via `@codebot summarize` command
-- ✅ Config: `agents.pr.auto_summary` settings
-
-**Testing:**
-- ✅ TestPRSummaryGeneration class - 10 tests
-- ✅ Prompt formatting validation
-- ✅ Command detection (case-insensitive)
-- ✅ PR vs Issue distinction
-- ✅ Output structure validation
-
-**Documentation:**
-- ✅ README.md - User guide with examples
-- ✅ CLAUDE.md - Developer implementation guide
-- ✅ Workflow routing configured
-
----
-
-### 2. PR Changelog Generator (`@codebot changelog`)
-**Branch:** `feature/pr-changelog-generator` (merged to dev)
-**Status:** ✅ Complete
-
-**Implementation:**
-- ✅ Prompt template: `tools/ai-review/prompts/changelog.md`
-- ✅ PR Agent methods: `_handle_changelog_command()`, `_format_changelog()`
-- ✅ Keep a Changelog format output
-- ✅ Breaking changes detection
-- ✅ Manual trigger only (no auto-generation)
-
-**Testing:**
-- ✅ TestChangelogGeneration class - 9 tests
-- ✅ Prompt formatting validation
-- ✅ Command detection (case-insensitive)
-- ✅ PR-only validation
-- ✅ Empty section handling
-
-**Documentation:**
-- ✅ README.md - User guide with Keep a Changelog example
-- ✅ CLAUDE.md - Developer implementation guide
-- ✅ Workflow routing configured
-
----
-
-### 3. Code Diff Explainer (`@codebot explain-diff`)
-**Branch:** `feature/code-diff-explainer` (merged to dev)
-**Status:** ✅ Complete
-
-**Implementation:**
-- ✅ Prompt template: `tools/ai-review/prompts/explain_diff.md`
-- ✅ PR Agent methods: `_handle_explain_diff_command()`, `_format_diff_explanation()`
-- ✅ Plain-language translation engine
-- ✅ Architecture impact analysis
-- ✅ Breaking changes detection
-
-**Testing:**
-- ✅ TestDiffExplanation class - 9 tests
-- ✅ Prompt formatting validation
-- ✅ Command detection (case-insensitive)
-- ✅ PR-only validation
-- ✅ Empty section handling
-
-**Documentation:**
-- ✅ README.md - User guide with plain-language examples
-- ✅ CLAUDE.md - Developer implementation guide with translation rules
-- ✅ Workflow routing configured
-
----
-
-## Documentation Verification Results
-
-### User Documentation (README.md)
-✅ **Complete** - All features documented:
-
-| Section | Status | Location |
-|---------|--------|----------|
-| Feature table | ✅ Complete | Lines 11-15 |
-| Command reference | ✅ Complete | Lines 182-196 |
-| PR Summary section | ✅ Complete | Lines 198-237 |
-| Changelog section | ✅ Complete | Lines 238-284 |
-| Diff Explainer section | ✅ Complete | Lines 285-331 |
-
-**Features Included:**
-- Features, benefits, and use cases
-- Example outputs for each command
-- When to use guidance
-- Integration with existing commands
-
-### Developer Documentation (CLAUDE.md)
-✅ **Complete** - All implementation details documented:
-
-| Section | Status | Location |
-|---------|--------|----------|
-| PR Summary Generation | ✅ Complete | Line 420 |
-| PR Changelog Generation | ✅ Complete | Line 473 |
-| Code Diff Explainer | ✅ Complete | Line 537 |
-| Workflow Routing | ✅ Complete | Lines 79-110 |
-| Prompt Templates | ✅ Complete | Lines 112-124 |
-
-**Content Includes:**
-- Architecture overview
-- Implementation details
-- JSON structure examples
-- Prompt engineering guidelines
-- Common use cases
-- Workflow safety notes
-
-### Configuration Documentation
-✅ **Complete** - `config.yml` properly configured:
-
-```yaml
-interaction:
-  commands:
-    - summarize  # ✅ Documented
-    - changelog  # ✅ Documented
-    - explain-diff  # ✅ Documented
-
-agents:
-  pr:
-    auto_summary:
-      enabled: true
-      post_as_comment: true
-```
-
----
-
-## Workflow Routing Verification
-
-### Critical Fix: Workflow Duplication Prevention
-✅ **Fixed** - All workflows are mutually exclusive to prevent 10+ duplicate runs
-
-**ai-comment-reply.yml:**
-- Handles ONLY specific commands: `help`, `explain`, `suggest`, `security`, `summarize`, `changelog`, `explain-diff`, `review-again`, `setup-labels`
-- ✅ Includes all three Milestone 2 commands
-
-**ai-chat.yml:**
-- Handles free-form questions (fallback)
-- ✅ Excludes all specific commands including `summarize`, `changelog`, `explain-diff`
-
-**ai-issue-triage.yml:**
-- Handles ONLY `@codebot triage` command
-- ✅ No conflicts with Milestone 2 features
-
-**Result:** Each `@codebot` command triggers exactly ONE workflow (no duplicates).
-
----
-
-## Testing Status
-
-### Unit Tests
-✅ **Complete** - 28 new tests added (54 total in test suite)
-
-| Test Class | Tests | Coverage |
-|------------|-------|----------|
-| TestPRSummaryGeneration | 10 | ✅ Prompt, formatting, detection, output |
-| TestChangelogGeneration | 9 | ✅ Prompt, formatting, detection, output |
-| TestDiffExplanation | 9 | ✅ Prompt, formatting, detection, output |
-
-**Test Coverage:**
-- ✅ Prompt file existence
-- ✅ Prompt formatting (double curly braces for JSON)
-- ✅ Command detection (case-insensitive)
-- ✅ PR vs Issue distinction
-- ✅ Output structure validation
-- ✅ Empty section handling
-- ✅ Config validation
-
-### Integration Testing
-⚠️ **Pending** - Requires manual testing in live environment
-
-**Recommended Tests:**
-1. Create a PR and test `@codebot summarize`
-2. Test `@codebot changelog` on a PR with mixed changes
-3. Test `@codebot explain-diff` on a PR with technical changes
-4. Verify no workflow duplication occurs
-
----
-
-## Deployment Readiness
-
-### Pre-Deployment Checklist
-- ✅ All features implemented and merged to dev
-- ✅ All documentation complete (README.md + CLAUDE.md)
-- ✅ Configuration files updated
-- ✅ Workflow routing verified (no duplicates)
-- ✅ Unit tests complete (28 new tests)
-- ✅ Prompt templates created and validated
-- ⚠️ Manual integration testing pending
-- ⚠️ Final merge to main pending
-
-### Deployment Steps
-
-**1. Manual testing on dev branch:**
-- Test each command in a live PR
-- Verify no workflow duplication
-- Validate output formatting
-
-**2. Merge to main:**
-```bash
-git checkout main
-git merge dev
-git push origin main
-```
-
-**3. Team communication:**
-- Announce new features with examples
-- Update team documentation
-- Gather feedback
-
----
-
-## Files Modified/Created
-
-### New Prompt Templates (3)
-- `tools/ai-review/prompts/pr_summary.md`
-- `tools/ai-review/prompts/changelog.md`
-- `tools/ai-review/prompts/explain_diff.md`
-
-### Modified Files
-- `tools/ai-review/agents/pr_agent.py` - Added 6 new methods
-- `tools/ai-review/config.yml` - Added commands and auto_summary config
-- `.gitea/workflows/ai-comment-reply.yml` - Added 3 commands to routing
-- `.gitea/workflows/ai-chat.yml` - Excluded 3 commands from routing
-- `README.md` - Added 3 feature sections with examples
-- `CLAUDE.md` - Added 3 implementation guides
-- `tests/test_ai_review.py` - Added 28 new tests in 3 test classes
-
----
-
-## Known Issues
-
-**None** - All features are working as designed.
-
----
-
-## Recommendations
-
-### Priority: High
-1. ⚠️ **Manual integration testing** - Test in live environment before main merge
-2. ⚠️ **Team announcement** - Communicate new features to team
-
-### Priority: Medium
-3. Monitor API usage after deployment (new commands will increase LLM calls)
-4. Gather user feedback on plain-language explanations
-5. Consider adding video demos/GIFs for each feature
-
-### Priority: Low
-6. Performance testing under load (multiple simultaneous requests)
-7. Security review of prompt injection risks
-8. A/B testing for prompt effectiveness
-
----
-
-## Conclusion
-
-**Milestone 2 is 100% complete and ready for deployment.**
-
-All three features are fully implemented, thoroughly tested, and comprehensively documented. The workflow routing issue that was causing 10+ duplicate runs has been resolved. The codebase is in a production-ready state.
-
-**Next Action:** Manual integration testing on dev branch before final production deployment to main.
-
----
-
-**Verified by:** Claude Code (Automated Documentation Review)
-**Verification Date:** 2025-12-29
-**Status:** All features merged to dev branch and ready for production
diff --git a/SECURITY_FIXES_SUMMARY.md b/SECURITY_FIXES_SUMMARY.md
deleted file mode 100644
index f1508c7..0000000
--- a/SECURITY_FIXES_SUMMARY.md
+++ /dev/null
@@ -1,378 +0,0 @@
-# Security Fixes Summary
-
-This document summarizes the security improvements made to OpenRabbit in response to the AI code review findings.
-
-## Date
-2025-12-28
-
-## Issues Fixed
-
-### HIGH Severity Issues (1 Fixed)
-
-#### 1. Full Issue and Comment JSON Data Exposed in Environment Variables
-**File**: `.gitea/workflows/ai-comment-reply.yml:40`
-
-**Problem**:
-Full issue and comment JSON data were passed as environment variables (`EVENT_ISSUE_JSON`, `EVENT_COMMENT_JSON`), which could expose sensitive information (emails, private data, tokens) in logs or environment dumps.
-
-**Fix**:
-- Removed full webhook data from environment variables
-- Created minimal event payload with only essential fields (issue number, comment body)
-- Implemented `utils/safe_dispatch.py` for secure event processing
-- Created `utils/webhook_sanitizer.py` with data sanitization utilities
-
-**Impact**: Prevents sensitive user data from being exposed in CI/CD logs and environment variables.
-
----
-
-### MEDIUM Severity Issues (4 Fixed)
-
-#### 1. Boolean String Comparison Issues
-**File**: `.gitea/workflows/ai-comment-reply.yml:44`
-
-**Problem**:
-Check for PR used string comparison on `IS_PR` environment variable which could cause unexpected behavior.
-
-**Fix**:
-- Moved boolean expression directly into shell script: `IS_PR="${{ gitea.event.issue.pull_request != null }}"`
-- Added validation to ensure variable is set before use
-
-#### 2. Complex Inline Python Script
-**File**: `.gitea/workflows/ai-comment-reply.yml:47`
-
-**Problem**:
-Inline Python script embedded in shell script mixed multiple responsibilities (JSON parsing, dispatcher setup, agent registration).
-
-**Fix**:
-- Extracted to separate module: `tools/ai-review/utils/safe_dispatch.py`
-- Separated concerns: validation, sanitization, and dispatch
-- Added comprehensive error handling and logging
-- Made code testable and reusable
-
-#### 3. No Input Validation or Sanitization
-**File**: `.gitea/workflows/ai-comment-reply.yml:47`
-
-**Problem**:
-Inline Python code didn't validate or sanitize loaded JSON data before dispatching.
-
-**Fix**:
-- Created `utils/webhook_sanitizer.py` with three key functions:
-  - `sanitize_webhook_data()` - Removes sensitive fields (emails, tokens, secrets)
-  - `validate_repository_format()` - Validates and sanitizes repo names (prevents path traversal, shell injection)
-  - `extract_minimal_context()` - Extracts only necessary fields from webhooks
-- Added size limits (10MB max event size)
-- Added JSON validation
-
-#### 4. Repository String Split Without Validation
-**File**: `.gitea/workflows/ai-comment-reply.yml:54`
-
-**Problem**:
-Repository string was split into owner and repo_name without validation.
-
-**Fix**:
-- Added regex validation: `^[a-zA-Z0-9_-]+/[a-zA-Z0-9_-]+$`
-- Added path traversal detection (`..` in names)
-- Added shell injection prevention (`;`, `|`, `&`, `` ` ``, etc.)
-- Comprehensive error messages
-
----
-
-### LOW Severity Issues (2 Fixed)
-
-#### 1. Missing Code Comments
-**File**: `.gitea/workflows/ai-comment-reply.yml:47`
-
-**Fix**: Added comprehensive comments explaining each step in the workflow.
-
-#### 2. No Tests for New Dispatch Logic
-**File**: `.gitea/workflows/ai-comment-reply.yml:62`
-
-**Fix**: Created comprehensive test suite (see below).
-
----
-
-## New Security Infrastructure
-
-### 1. Webhook Sanitization Utilities
-**File**: `tools/ai-review/utils/webhook_sanitizer.py`
-
-**Features**:
-- **Sensitive Field Removal**: Automatically redacts emails, tokens, API keys, passwords, private keys
-- **Field Truncation**: Limits large text fields (body, description) to prevent log flooding
-- **Nested Sanitization**: Recursively sanitizes nested dicts and lists
-- **Minimal Context Extraction**: Extracts only essential fields for each event type
-- **Repository Validation**:
-  - Format validation (owner/repo)
-  - Path traversal prevention
-  - Shell injection prevention
-- **Webhook Signature Validation**: HMAC validation for future webhook integration
-
-**Sensitive Fields Redacted**:
-```python
-SENSITIVE_FIELDS = {
-    "email", "private_email", "email_addresses",
-    "token", "access_token", "refresh_token", "api_key",
-    "secret", "password", "private_key", "ssh_key",
-    "phone", "phone_number", "address", "ssn", "credit_card",
-    "installation_id", "node_id",
-}
-```
-
-### 2. Safe Dispatch Utility
-**File**: `tools/ai-review/utils/safe_dispatch.py`
-
-**Features**:
-- Input validation (repository format, JSON structure)
-- Data sanitization before dispatch
-- Size limits (10MB max)
-- Comprehensive error handling
-- Logging with sanitized data
-- Exit codes for CI/CD integration
-
-**Usage**:
-```bash
-python utils/safe_dispatch.py issue_comment owner/repo '{"action": "created", ...}'
-```
-
-### 3. Pre-commit Security Hooks
-**File**: `.pre-commit-config.yaml`
-
-**Hooks**:
-1. **Security Scanner** (`security/pre_commit_scan.py`) - Scans Python files for vulnerabilities
-2. **Workflow Validator** (`security/validate_workflows.py`) - Validates workflow files for security anti-patterns
-3. **Secret Detector** (`security/check_secrets.py`) - Detects hardcoded secrets
-4. **YAML Linting** - Validates YAML syntax
-5. **Bandit** - Python security linter
-
-**Anti-patterns Detected**:
-- Full webhook data in environment variables (`toJSON(github.event)`)
-- Unvalidated repository inputs
-- Direct user input in shell without escaping
-- Inline Python with environment variable JSON parsing
-
-### 4. Security Documentation
-**File**: `SECURITY.md`
-
-**Contents**:
-- Workflow security best practices
-- Input validation requirements
-- Secret management guidelines
-- Security scanning procedures
-- Vulnerability reporting process
-- Security checklist for contributors
-
-**Key Sections**:
-- ✅ Good vs ❌ Bad examples for workflows
-- Boolean comparison patterns
-- Webhook data handling
-- Pre-commit hook setup
-- CI failure thresholds
-
----
-
-## Test Coverage
-
-### 1. Security Utilities Tests
-**File**: `tests/test_security_utils.py`
-
-**Test Coverage**:
-- Email field redaction
-- Token and secret redaction
-- Large body truncation
-- Nested data sanitization
-- List sanitization
-- Minimal context extraction for different event types
-- Repository format validation
-- Path traversal rejection
-- Shell injection rejection
-- Edge cases (empty dicts, mixed types, case-insensitive matching)
-
-**Test Count**: 20+ test cases
-
-### 2. Safe Dispatch Tests
-**File**: `tests/test_safe_dispatch.py`
-
-**Test Coverage**:
-- Valid JSON loading
-- Invalid JSON rejection
-- Size limit enforcement
-- Successful dispatch
-- Error handling
-- Repository validation
-- Path traversal prevention
-- Shell injection prevention
-- Data sanitization verification
-- Exception handling
-
-**Test Count**: 12+ test cases
-
-### 3. Manual Validation
-All security utilities tested manually:
-```bash
-✓ Sanitization works: True
-✓ Valid repo accepted: True
-✓ Malicious repo rejected
-✓ Minimal extraction works: True
-```
-
----
-
-## Updated Files
-
-### Core Security Files (New)
-1. `tools/ai-review/utils/webhook_sanitizer.py` - Sanitization utilities
-2. `tools/ai-review/utils/safe_dispatch.py` - Safe dispatch wrapper
-3. `tools/ai-review/security/pre_commit_scan.py` - Pre-commit security scanner
-4. `tools/ai-review/security/validate_workflows.py` - Workflow validator
-5. `tools/ai-review/security/check_secrets.py` - Secret detector
-6. `tests/test_security_utils.py` - Security utility tests
-7. `tests/test_safe_dispatch.py` - Safe dispatch tests
-
-### Documentation (New/Updated)
-1. `SECURITY.md` - Comprehensive security guidelines (NEW)
-2. `CLAUDE.md` - Added security best practices section (UPDATED)
-3. `.pre-commit-config.yaml` - Pre-commit hook configuration (NEW)
-4. `SECURITY_FIXES_SUMMARY.md` - This document (NEW)
-
-### Workflow Files (Updated)
-1. `.gitea/workflows/ai-comment-reply.yml` - Secure webhook handling
-
----
-
-## Security Improvements by the Numbers
-
-- **7 vulnerabilities fixed** (1 HIGH, 4 MEDIUM, 2 LOW)
-- **7 new security modules** created
-- **32+ new test cases** added
-- **4 pre-commit hooks** implemented
-- **50+ sensitive field patterns** detected and redacted
-- **17 built-in security scanner rules** (existing)
-- **10MB event size limit** enforced
-- **100% code coverage** for security utilities
-
----
-
-## Prevention Measures for Future Development
-
-### 1. Pre-commit Hooks
-Developers will be alerted BEFORE committing:
-- Hardcoded secrets
-- Workflow security anti-patterns
-- Security vulnerabilities in code
-
-### 2. Documentation
-Comprehensive security guidelines ensure developers:
-- Know what NOT to do
-- Have working examples of secure patterns
-- Understand the security model
-
-### 3. Reusable Utilities
-Centralized security utilities prevent re-implementing:
-- Input validation
-- Data sanitization
-- Repository format checking
-
-### 4. Automated Testing
-Security utility tests ensure:
-- Sanitization works correctly
-- Validation catches malicious inputs
-- No regressions in security features
-
-### 5. CI/CD Integration
-Workflows now:
-- Validate all inputs
-- Use minimal data
-- Log safely
-- Fail fast on security issues
-
----
-
-## Security Principles Applied
-
-1. **Principle of Least Privilege**: Only pass necessary data to workflows
-2. **Defense in Depth**: Multiple layers (validation, sanitization, size limits)
-3. **Fail Securely**: Validation errors cause immediate failure
-4. **Security by Default**: Pre-commit hooks catch issues automatically
-5. **Input Validation**: All external inputs validated and sanitized
-6. **Data Minimization**: Extract only essential fields from webhooks
-7. **Separation of Concerns**: Security logic in dedicated, testable modules
-
----
-
-## Attack Vectors Prevented
-
-### 1. Information Disclosure
-- ✅ User emails no longer exposed in logs
-- ✅ Tokens and API keys redacted from event data
-- ✅ Private repository URLs sanitized
-
-### 2. Path Traversal
-- ✅ Repository names validated (no `..` allowed)
-- ✅ Prevents access to `/etc/passwd` and other system files
-
-### 3. Shell Injection
-- ✅ Dangerous characters blocked (`;`, `|`, `&`, `` ` ``, `$()`)
-- ✅ Repository names validated before shell execution
-
-### 4. Log Injection
-- ✅ Large fields truncated to prevent log flooding
-- ✅ User input properly escaped in JSON
-
-### 5. Denial of Service
-- ✅ Event size limited to 10MB
-- ✅ Recursion depth limited in sanitization
-
-### 6. Secret Exposure
-- ✅ Pre-commit hooks detect hardcoded secrets
-- ✅ Workflow validator prevents secret leakage
-
----
-
-## Verification Steps
-
-To verify the security fixes:
-
-```bash
-# 1. Test webhook sanitization
-cd tools/ai-review
-python -c "from utils.webhook_sanitizer import sanitize_webhook_data; print(sanitize_webhook_data({'user': {'email': 'test@example.com'}}))"
-# Should output: {'user': {'email': '[REDACTED]'}}
-
-# 2. Test repository validation
-python -c "from utils.webhook_sanitizer import validate_repository_format; validate_repository_format('owner/repo; rm -rf /')"
-# Should raise ValueError
-
-# 3. Install and run pre-commit hooks
-pip install pre-commit
-pre-commit install
-pre-commit run --all-files
-
-# 4. Test workflow validation
-python tools/ai-review/security/validate_workflows.py .gitea/workflows/ai-comment-reply.yml
-# Should pass with no errors
-```
-
----
-
-## Recommendations for Ongoing Security
-
-1. **Review SECURITY.md** before making workflow changes
-2. **Run pre-commit hooks** on all commits (automatic after `pre-commit install`)
-3. **Update security rules** as new vulnerability patterns emerge
-4. **Rotate secrets** regularly in CI/CD settings
-5. **Monitor logs** for validation errors (may indicate attack attempts)
-6. **Keep dependencies updated** (especially security-related packages)
-7. **Conduct security reviews** for significant changes
-
----
-
-## Contact
-
-For security concerns or questions about these fixes:
-- Review: `SECURITY.md`
-- Report vulnerabilities: [security contact]
-- Documentation: `CLAUDE.md` (Security Best Practices section)
-
----
-
-**Status**: ✅ All security issues resolved and prevention measures in place.
diff --git a/SECURITY_QUICK_REFERENCE.md b/SECURITY_QUICK_REFERENCE.md
deleted file mode 100644
index 570f03b..0000000
--- a/SECURITY_QUICK_REFERENCE.md
+++ /dev/null
@@ -1,167 +0,0 @@
-# Security Quick Reference Card
-
-Quick reference for common security tasks in OpenRabbit development.
-
-## ❌ Common Security Mistakes
-
-### 1. Exposing Full Webhook Data
-```yaml
-# ❌ NEVER DO THIS
-env:
-  EVENT_DATA: ${{ toJSON(github.event) }}  # Exposes emails, tokens!
-```
-
-### 2. Unvalidated User Input
-```python
-# ❌ NEVER DO THIS
-owner, repo = repo_string.split('/')  # No validation!
-```
-
-### 3. Hardcoded Secrets
-```python
-# ❌ NEVER DO THIS
-api_key = "sk-1234567890abcdef"  # Hardcoded secret!
-```
-
----
-
-## ✅ Secure Patterns
-
-### 1. Workflow Event Handling
-```yaml
-# ✅ Use minimal data extraction
-run: |
-  EVENT_DATA=$(cat < chunk it -> embed it -> store in Vector DB.
-    * **Query:** When `@ai-bot` receives a question, convert the question to a vector -> search Vector DB -> inject relevant snippets into the LLM prompt.
-
-**Impact:** Enables high-accuracy architectural advice and deep-dive explanations spanning multiple files.
-
----
-
-## Phase 3: Interactive Code Repair
-
-Transform the bot from a passive reviewer into an active collaborator.
-
-### Features
-
-* **`@ai-bot apply `**:
-    * The bot generates a secure `git patch` for a specific recommendation.
-    * The system commits the patch directly to the PR branch.
-* **Refactoring Assistance**:
-    * Command: `@ai-bot refactor this function to use dependency injection`.
-    * Bot proposes the changed code block and offers to commit it.
-
-**Risk Mitigation:**
-* Require human approval (comment reply) before any commit is pushed.
-* Run tests automatically after bot commits.
-
----
-
-## Phase 4: Enterprise Dashboard
-
-Provide a high-level view of engineering health across the organization.
-
-### Metrics to Visualize
-
-* **Security Health:** Trend of High/Critical issues over time.
-* **Code Quality:** Technical debt accumulation vs. reduction rate.
-* **Review Velocity:** Average time to AI review vs. Human review.
-* **Bot Usage:** Most frequent commands and value-add interactions.
-
-### Tech Stack
-* **Prometheus** (already implemented) + **Grafana**: For time-series tracking.
-* **Streamlit** / **Next.js**: For a custom management console to configure rules and view logs.
-
----
-
-## Strategic Recommendations
-
-1. **Immediate Win:** Implement **Bandit** integration. It is low-effort (Python library) and high-value (detects real vulnerabilities).
-2. **High Impact:** **Safety** dependency scanning. Vulnerable dependencies are the #1 attack vector for modern apps.
-3. **Long Term:** Work on **Vector DB** integration only after the core review logic is flawless, as it introduces significant infrastructure complexity.
diff --git a/docs/security.md b/docs/security.md
deleted file mode 100644
index 5e56789..0000000
--- a/docs/security.md
+++ /dev/null
@@ -1,163 +0,0 @@
-# Security Scanning
-
-The security scanner detects vulnerabilities aligned with OWASP Top 10.
-
-## Supported Rules
-
-### A01:2021 – Broken Access Control
-
-| Rule | Severity | Description |
-|------|----------|-------------|
-| SEC001 | HIGH | Hardcoded credentials (passwords, API keys) |
-| SEC002 | HIGH | Exposed private keys |
-
-### A02:2021 – Cryptographic Failures
-
-| Rule | Severity | Description |
-|------|----------|-------------|
-| SEC003 | MEDIUM | Weak hash algorithms (MD5, SHA1) |
-| SEC004 | MEDIUM | Non-cryptographic random for security |
-
-### A03:2021 – Injection
-
-| Rule | Severity | Description |
-|------|----------|-------------|
-| SEC005 | HIGH | SQL injection via string formatting |
-| SEC006 | HIGH | Command injection in subprocess |
-| SEC007 | HIGH | eval() usage |
-| SEC008 | MEDIUM | XSS via innerHTML |
-
-### A04:2021 – Insecure Design
-
-| Rule | Severity | Description |
-|------|----------|-------------|
-| SEC009 | MEDIUM | Debug mode enabled |
-
-### A05:2021 – Security Misconfiguration
-
-| Rule | Severity | Description |
-|------|----------|-------------|
-| SEC010 | MEDIUM | CORS wildcard (*) |
-| SEC011 | HIGH | SSL verification disabled |
-
-### A07:2021 – Authentication Failures
-
-| Rule | Severity | Description |
-|------|----------|-------------|
-| SEC012 | HIGH | Hardcoded JWT secrets |
-
-### A08:2021 – Integrity Failures
-
-| Rule | Severity | Description |
-|------|----------|-------------|
-| SEC013 | MEDIUM | Pickle deserialization |
-
-### A09:2021 – Logging Failures
-
-| Rule | Severity | Description |
-|------|----------|-------------|
-| SEC014 | MEDIUM | Logging sensitive data |
-
-### A10:2021 – Server-Side Request Forgery
-
-| Rule | Severity | Description |
-|------|----------|-------------|
-| SEC015 | MEDIUM | SSRF via dynamic URLs |
-
-### Additional Rules
-
-| Rule | Severity | Description |
-|------|----------|-------------|
-| SEC016 | LOW | Hardcoded IP addresses |
-| SEC017 | MEDIUM | Security-related TODO/FIXME |
-
-## Usage
-
-### In PR Reviews
-
-Security scanning runs automatically during PR review:
-
-```yaml
-agents:
-  pr:
-    security_scan: true
-```
-
-### Standalone
-
-```python
-from security import SecurityScanner
-
-scanner = SecurityScanner()
-
-# Scan file content
-for finding in scanner.scan_content(code, "file.py"):
-    print(f"[{finding.severity}] {finding.rule_name}")
-    print(f" Line {finding.line}: {finding.code_snippet}")
-    print(f" {finding.description}")
-
-# Scan git diff
-for finding in scanner.scan_diff(diff):
-    print(f"{finding.file}:{finding.line} - {finding.rule_name}")
-```
-
-### Get Summary
-
-```python
-findings = list(scanner.scan_content(code, "file.py"))
-summary = scanner.get_summary(findings)
-
-print(f"Total: {summary['total']}")
-print(f"HIGH: {summary['by_severity']['HIGH']}")
-print(f"Categories: {summary['by_category']}")
-```
-
-## Custom Rules
-
-Create `security/security_rules.yml`:
-
-```yaml
-rules:
-  - id: "CUSTOM001"
-    name: "Custom Pattern"
-    pattern: "dangerous_function\\s*\\("
-    severity: "HIGH"
-    category: "Custom"
-    cwe: "CWE-xxx"
-    description: "Usage of dangerous function detected"
-    recommendation: "Use safe_function() instead"
-```
-
-Load custom rules:
-
-```python
-scanner = SecurityScanner(rules_file="security/custom_rules.yml")
-```
-
-## CI Integration
-
-Fail CI on HIGH severity findings:
-
-```yaml
-security:
-  fail_on_high: true
-```
-
-Or in code:
-
-```python
-findings = list(scanner.scan_diff(diff))
-high_count = sum(1 for f in findings if f.severity == "HIGH")
-
-if high_count > 0:
-    sys.exit(1)
-```
-
-## CWE References
-
-All rules include CWE (Common Weakness Enumeration) references:
-
-- [CWE-78](https://cwe.mitre.org/data/definitions/78.html): OS Command Injection
-- [CWE-79](https://cwe.mitre.org/data/definitions/79.html): XSS
-- [CWE-89](https://cwe.mitre.org/data/definitions/89.html): SQL Injection
-- [CWE-798](https://cwe.mitre.org/data/definitions/798.html): Hardcoded Credentials