# AegisGitea MCP - Architecture Documentation

---

## System Overview

```
┌─────────────────────────────────────────────────────────────────────┐
│                          ChatGPT Business                           │
│                      (AI Assistant Interface)                       │
│                                                                     │
│  User: "Show me the files in my-repo"                               │
└────────────────────────────┬────────────────────────────────────────┘
                             │ HTTPS (MCP over SSE)
                             │ Tool: get_file_tree(owner, repo)
                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    Reverse Proxy (Traefik/Nginx)                    │
│                           TLS Termination                           │
└────────────────────────────┬────────────────────────────────────────┘
                             │ HTTP
                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                   AegisGitea MCP Server (Docker)                    │
│                                                                     │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                      FastAPI Application                      │  │
│  │                                                               │  │
│  │  Endpoints:                                                   │  │
│  │    - GET  /health         (Health check)                      │  │
│  │    - GET  /mcp/tools      (List available tools)              │  │
│  │    - POST /mcp/tool/call  (Execute tool)                      │  │
│  │    - GET  /mcp/sse        (Server-sent events)                │  │
│  └───────────────────────┬───────────────────────────────────────┘  │
│                          │                                          │
│  ┌───────────────────────┴───────────────────────────────────────┐  │
│  │                     MCP Protocol Handler                      │  │
│  │  - Tool validation                                            │  │
│  │  - Request/response mapping                                   │  │
│  │  - Correlation ID management                                  │  │
│  └───────────────────────┬───────────────────────────────────────┘  │
│                          │                                          │
│  ┌───────────────────────┴───────────────────────────────────────┐  │
│  │                     Tool Implementations                      │  │
│  │                                                               │  │
│  │  - list_repositories()         - get_repository_info()        │  │
│  │  - get_file_tree()             - get_file_contents()          │  │
│  └───────────────────────┬───────────────────────────────────────┘  │
│                          │                                          │
│  ┌────────────┬──────────┴────────┬──────────────────────────────┐  │
│  │            │                   │                              │  │
│  │  ┌─────────▼─────────┐   ┌─────▼──────┐   ┌────────────────┐  │  │
│  │  │   Gitea Client    │   │   Config   │   │  Audit Logger  │  │  │
│  │  │ - Auth            │   │   Manager  │   │ - Structured   │  │  │
│  │  │ - API calls       │   │ - Env vars │   │ - JSON logs    │  │  │
│  │  │ - Error handling  │   │ - Defaults │   │ - Correlation  │  │  │
│  │  └─────────┬─────────┘   └────────────┘   └───────┬────────┘  │  │
│  │            │                                      │           │  │
│  └────────────┼──────────────────────────────────────┼───────────┘  │
│               │                                      │              │
└───────────────┼──────────────────────────────────────┼──────────────┘
                │ Gitea API                            │
                │ (Authorization: token XXX)           │ Audit Logs
                ▼                                      ▼
┌─────────────────────────────────────┐  ┌──────────────────────────┐
│           Gitea Instance            │  │    Persistent Volume     │
│          (Self-hosted VCS)          │  │   /var/log/aegis-mcp/    │
│                                     │  │        audit.log         │
│  Repositories:                      │  └──────────────────────────┘
│  ┌───────────────────────────────┐  │
│  │ org/repo-1   (bot has access) │  │
│  │ org/repo-2   (bot has access) │  │
│  │ org/private  (NO ACCESS)      │  │
│  └───────────────────────────────┘  │
│                                     │
│  Bot User: aegis-bot                │
│  Permissions: Read-only             │
└─────────────────────────────────────┘
```

---

## Component Responsibilities

### 1. ChatGPT (External)

**Responsibility**: Initiate explicit tool calls based on user requests

- Receives MCP tool definitions
- Constructs tool call requests
- Presents results to user
- Human-in-the-loop decision making

### 2. Reverse Proxy

**Responsibility**: TLS termination and routing

- Terminates HTTPS connections
- Routes to MCP server container
- Handles SSL certificates
- Optional: IP filtering, rate limiting

### 3. AegisGitea MCP Server (Core)

**Responsibility**: MCP protocol implementation and policy enforcement

#### 3a. FastAPI Application

- HTTP server with async support
- Server-Sent Events endpoint
- Health and status endpoints
- Request routing

#### 3b. MCP Protocol Handler

- Tool definition management
- Request validation
- Response formatting
- Correlation ID tracking

#### 3c. Tool Implementations

- Repository discovery
- File tree navigation
- File content retrieval
- Bounded, single-purpose operations

#### 3d. Gitea Client

- Async HTTP client for Gitea API
- Bot user authentication
- Error handling and retries
- Response parsing

#### 3e. Config Manager

- Environment variable loading
- Validation with Pydantic
- Default values
- Type safety

#### 3f. Audit Logger

- Structured JSON logging
- Correlation ID tracking
- Timestamp (UTC)
- Append-only logs

### 4. Gitea Instance

**Responsibility**: Authorization and data storage

- Source of truth for permissions
- Repository data storage
- Bot user management
- Access control enforcement

### 5. Persistent Volume

**Responsibility**: Audit log storage

- Durable storage for audit logs
- Survives container restarts
- Accessible for review/analysis

---

## Data Flow: Tool Invocation

```
1. User Request
   ├─> "Show me files in org/my-repo"
   └─> ChatGPT decides to call: get_file_tree(owner="org", repo="my-repo")

2. MCP Request
   ├─> POST /mcp/tool/call
   ├─> Body: {"tool": "get_file_tree", "arguments": {"owner": "org", "repo": "my-repo"}}
   └─> Generate correlation_id: uuid4()

3. Audit Log (Entry)
   ├─> Log: tool_invocation
   ├─> tool_name: "get_file_tree"
   ├─> repository: "org/my-repo"
   └─> status: "pending"

4. Gitea API Call
   ├─> GET /api/v1/repos/org/my-repo/git/trees/main
   ├─> Header: Authorization: token XXX
   └─> Response: {"tree": [...files...]}

5. Authorization Check
   ├─> 200 OK → Bot has access
   ├─> 403 Forbidden → Log access_denied, raise error
   └─> 404 Not Found → Repository doesn't exist or no access

6. Response Processing
   ├─> Extract file tree
   ├─> Transform to simplified format
   └─> Apply size/count limits

7. Audit Log (Success)
   ├─> Log: tool_invocation
   ├─> status: "success"
   └─> params: {"count": 42}

8. MCP Response
   ├─> 200 OK
   ├─> Body: {"success": true, "result": {...files...}}
   └─> correlation_id: same as request

9. ChatGPT Processing
   ├─> Receive file tree data
   ├─> Format for user presentation
   └─> "Here are the files in org/my-repo: ..."
```

---

## Security Boundaries

```
┌───────────────────────────────────────────────────────────────┐
│                       Trust Boundary 1                        │
│                    (Internet ↔ MCP Server)                    │
│                                                               │
│  Controls:                                                    │
│  - HTTPS/TLS encryption                                       │
│  - Reverse proxy authentication (optional)                    │
│  - Rate limiting                                              │
│  - Firewall rules                                             │
└───────────────────────────────────────────────────────────────┘

┌───────────────────────────────────────────────────────────────┐
│                       Trust Boundary 2                        │
│                   (MCP Server ↔ Gitea API)                    │
│                                                               │
│  Controls:                                                    │
│  - Bot user token authentication                              │
│  - Gitea's access control (authoritative)                     │
│  - API request timeouts                                       │
│  - Input validation                                           │
└───────────────────────────────────────────────────────────────┘

┌───────────────────────────────────────────────────────────────┐
│                       Trust Boundary 3                        │
│                   (Container ↔ Host System)                   │
│                                                               │
│  Controls:                                                    │
│  - Non-root container user                                    │
│  - Resource limits (CPU, memory)                              │
│  - No new privileges                                          │
│  - Read-only filesystem (where possible)                      │
└───────────────────────────────────────────────────────────────┘
```

---

## Authorization Flow

```
┌──────────────────────────────────────────────────────────────────┐
│  AI requests access to "org/private-repo"                        │
└────────────────────────┬─────────────────────────────────────────┘
                         │
                         ▼
         ┌───────────────────────────────────────┐
         │  MCP Server: Forward to Gitea API     │
         │  with bot user token                  │
         └───────────────┬───────────────────────┘
                         │
                         ▼
         ┌───────────────────────────────────────┐
         │  Gitea: Check bot user permissions    │
         │  for "org/private-repo"               │
         └───────────────┬───────────────────────┘
                         │
                 ┌───────┴────────┐
                 │                │
        Bot is collaborator?      │
                 │                │
        ┌────────▼─────┐   ┌──────▼──────┐
        │ YES          │   │ NO          │
        │ (Read access)│   │ (No access) │
        └────────┬─────┘   └──────┬──────┘
                 │                │
                 ▼                ▼
        ┌───────────────┐  ┌─────────────────┐
        │ Return data   │  │ Return 403      │
        │ Log: success  │  │ Log: denied     │
        └───────────────┘  └─────────────────┘
```

**Key Insight**: The MCP server never makes authorization decisions - it only forwards requests and respects Gitea's response.

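
As a rough illustration of this pass-through rule, the sketch below maps Gitea's HTTP status to an outcome and an audit event without adding any policy of its own. The names `Outcome`, `Decision`, and `interpret_gitea_status` are hypothetical and not taken from the AegisGitea codebase. The event names `tool_invocation` and `access_denied` follow the Data Flow section; `security_event` and the catch-all upstream error are assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    AUTH_ERROR = "auth_error"          # bot token rejected by Gitea
    ACCESS_DENIED = "access_denied"    # bot lacks permission on the repo
    NOT_FOUND = "not_found"            # repo missing, or hidden from the bot
    UPSTREAM_ERROR = "upstream_error"  # anything else Gitea returned


@dataclass(frozen=True)
class Decision:
    outcome: Outcome
    audit_event: str   # event name written to the audit log
    return_data: bool  # whether the Gitea payload may be passed on


def interpret_gitea_status(status_code: int) -> Decision:
    """Translate Gitea's answer into an outcome; Gitea stays authoritative."""
    if status_code == 200:
        return Decision(Outcome.SUCCESS, "tool_invocation", True)
    if status_code == 401:
        # Assumed event name for the high-severity security log.
        return Decision(Outcome.AUTH_ERROR, "security_event", False)
    if status_code == 403:
        return Decision(Outcome.ACCESS_DENIED, "access_denied", False)
    if status_code == 404:
        return Decision(Outcome.NOT_FOUND, "tool_invocation", False)
    # Any other status is reported as an upstream failure (assumption).
    return Decision(Outcome.UPSTREAM_ERROR, "tool_invocation", False)


if __name__ == "__main__":
    for code in (200, 401, 403, 404, 500):
        decision = interpret_gitea_status(code)
        print(code, decision.outcome.value, decision.audit_event)
```

Because the function only reads the status code and never inspects the repository name or the caller, it cannot grant access that Gitea refused, which is the property the diagram above relies on.
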

---

## Failure Modes & Handling

### 1. Gitea Unavailable

- **Detection**: HTTP connection error
- **Response**: Return error to ChatGPT
- **Logging**: Log connection failure
- **Recovery**: Automatic retry on next request

### 2. Invalid Bot Token

- **Detection**: 401 Unauthorized from Gitea
- **Response**: Log security event, return auth error
- **Logging**: High-severity security log
- **Recovery**: Operator must rotate token

### 3. Bot Lacks Permission

- **Detection**: 403 Forbidden from Gitea
- **Response**: Return authorization error
- **Logging**: Access denied event
- **Recovery**: Grant permission in Gitea UI

### 4. File Too Large

- **Detection**: File size exceeds MAX_FILE_SIZE_BYTES
- **Response**: Return size limit error
- **Logging**: Security event (potential abuse)
- **Recovery**: Increase limit or reject request

### 5. Network Timeout

- **Detection**: Request exceeds REQUEST_TIMEOUT_SECONDS
- **Response**: Return timeout error
- **Logging**: Log timeout event
- **Recovery**: Automatic retry possible

### 6. Rate Limit Exceeded

- **Detection**: Too many requests per minute
- **Response**: Return 429 Too Many Requests
- **Logging**: Log rate limit event
- **Recovery**: Wait and retry

---

## Scaling Considerations

### Vertical Scaling (Single Instance)

- **Current**: 128-512 MB RAM, minimal CPU
- **Bottleneck**: Gitea API response time
- **Max throughput**: ~100-200 requests/second

### Horizontal Scaling (Multiple Instances)

- **Stateless design**: Each instance independent
- **Load balancing**: Standard HTTP load balancer
- **Shared state**: None (all state in Gitea)
- **Audit logs**: Each instance writes to own log (or use centralized logging)

### Performance Optimization (Future)

- Add Redis caching layer
- Implement connection pooling
- Use HTTP/2 for Gitea API
- Batch multiple file reads

---

## Observability

### Metrics to Monitor

1. **Request rate**: Requests per minute
2. **Error rate**: Failed requests / total requests
3. **Response time**: P50, P95, P99 latency
4. **Gitea API health**: Success rate to Gitea
5. **Auth failures**: 401/403 responses

### Logs to Track

1. **Audit logs**: Every tool invocation
2. **Access denied**: Permission violations
3. **Security events**: Rate limits, size limits
4. **Errors**: Exceptions and failures

### Alerts to Configure

1. **High error rate**: > 5% errors
2. **Auth failures**: Any 401 responses
3. **Gitea unreachable**: Connection failures
4. **Disk space**: Audit logs filling disk

---

## Future Enhancements

### Phase 2: Extended Context

```
New Tools:
├── get_commits(owner, repo, limit)
├── get_commit_diff(owner, repo, sha)
├── list_issues(owner, repo)
├── get_issue(owner, repo, number)
├── list_pull_requests(owner, repo)
└── get_pull_request(owner, repo, number)
```

### Phase 3: Advanced Features

```
Capabilities:
├── Caching layer (Redis)
├── Webhook support for real-time updates
├── OAuth2 flow instead of static tokens
├── Per-client rate limiting
├── Multi-tenant support (multiple bot users)
└── GraphQL API for more efficient queries
```

---

## Deployment Patterns

### Pattern 1: Single Homelab Instance

```
[Homelab Server]
├── Gitea container
├── AegisGitea MCP container
└── Caddy reverse proxy
    └── Exposes HTTPS endpoint
```

### Pattern 2: Kubernetes Deployment

```
[Kubernetes Cluster]
├── Namespace: aegis-mcp
├── Deployment: aegis-mcp (3 replicas)
├── Service: ClusterIP
├── Ingress: HTTPS with cert-manager
└── PersistentVolume: Audit logs
```

### Pattern 3: Cloud Deployment

```
[AWS/GCP/Azure]
├── Container service (ECS/Cloud Run/ACI)
├── Load balancer (ALB/Cloud Load Balancing)
├── Secrets manager (Secrets Manager/Secret Manager/Key Vault)
└── Log aggregation (CloudWatch/Cloud Logging/Monitor)
```

---

## Testing Strategy

### Unit Tests

- Configuration loading
- Gitea client methods
- Tool implementations
- Audit logging

### Integration Tests

- Full MCP protocol flow
- Gitea API interactions (mocked)
- Error handling paths

### End-to-End Tests

- Real Gitea instance
- Real bot user
- Real tool invocations

---

## Conclusion

This architecture prioritizes:

1. **Security**: Read-only, auditable, fail-safe
2. **Simplicity**: Straightforward data flow
3. **Maintainability**: Clear separation of concerns
4. **Observability**: Comprehensive logging

The design is intentionally boring and predictable - perfect for a security-critical system.