2026-01-29 19:53:36 +01:00
parent 1bda2013bb
commit a9708b33e2
27 changed files with 3745 additions and 4 deletions

ARCHITECTURE.md Normal file
# AegisGitea MCP - Architecture Documentation
---
## System Overview
```
┌─────────────────────────────────────────────────────────────────────┐
│ ChatGPT Business │
│ (AI Assistant Interface) │
│ │
│ User: "Show me the files in my-repo" │
└────────────────────────────┬────────────────────────────────────────┘
│ HTTPS (MCP over SSE)
│ Tool: get_file_tree(owner, repo)
┌─────────────────────────────────────────────────────────────────────┐
│ Reverse Proxy (Traefik/Nginx) │
│ TLS Termination │
└────────────────────────────┬────────────────────────────────────────┘
│ HTTP
┌─────────────────────────────────────────────────────────────────────┐
│ AegisGitea MCP Server (Docker) │
│ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ FastAPI Application │ │
│ │ │ │
│ │ Endpoints: │ │
│ │ - GET /health (Health check) │ │
│ │ - GET /mcp/tools (List available tools) │ │
│ │ - POST /mcp/tool/call (Execute tool) │ │
│ │ - GET /mcp/sse (Server-sent events) │ │
│ └───────────────────────┬───────────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────┴───────────────────────────────────────┐ │
│ │ MCP Protocol Handler │ │
│ │ - Tool validation │ │
│ │ - Request/response mapping │ │
│ │ - Correlation ID management │ │
│ └───────────────────────┬───────────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────┴───────────────────────────────────────┐ │
│ │ Tool Implementations │ │
│ │ │ │
│ │ - list_repositories() - get_repository_info() │ │
│ │ - get_file_tree() - get_file_contents() │ │
│ └───────────────────────┬───────────────────────────────────────┘ │
│ │ │
│ ┌──────────────┬────────┴────────┬─────────────────────────────┐ │
│ │ │ │ │ │
│ │ ┌───────────▼───────┐ ┌─────▼──────┐ ┌────────────────┐ │ │
│ │ │ Gitea Client │ │ Config │ │ Audit Logger │ │ │
│ │ │ - Auth │ │ Manager │ │ - Structured │ │ │
│ │ │ - API calls │ │ - Env vars│ │ - JSON logs │ │ │
│ │ │ - Error handling│ │ - Defaults│ │ - Correlation │ │ │
│ │ └───────────┬───────┘ └────────────┘ └────────┬───────┘ │ │
│ │ │ │ │ │
│ └──────────────┼────────────────────────────────────┼─────────┘ │
│ │ │ │
└─────────────────┼────────────────────────────────────┼───────────┘
│ Gitea API │
│ (Authorization: token XXX) │ Audit Logs
▼ ▼
┌─────────────────────────────────────┐ ┌──────────────────────────┐
│ Gitea Instance │ │ Persistent Volume │
│ (Self-hosted VCS) │ │ /var/log/aegis-mcp/ │
│ │ │ audit.log │
│ Repositories: │ └──────────────────────────┘
│ ┌─────────────────────────────┐ │
│ │ org/repo-1 (bot has access)│ │
│ │ org/repo-2 (bot has access)│ │
│ │ org/private (NO ACCESS) │ │
│ └─────────────────────────────┘ │
│ │
│ Bot User: aegis-bot │
│ Permissions: Read-only │
└─────────────────────────────────────┘
```
---
## Component Responsibilities
### 1. ChatGPT (External)
**Responsibility**: Initiate explicit tool calls based on user requests
- Receives MCP tool definitions
- Constructs tool call requests
- Presents results to user
- Human-in-the-loop decision making
### 2. Reverse Proxy
**Responsibility**: TLS termination and routing
- Terminates HTTPS connections
- Routes to MCP server container
- Manages TLS certificates
- Optional: IP filtering, rate limiting
### 3. AegisGitea MCP Server (Core)
**Responsibility**: MCP protocol implementation and policy enforcement
#### 3a. FastAPI Application
- HTTP server with async support
- Server-Sent Events endpoint
- Health and status endpoints
- Request routing
#### 3b. MCP Protocol Handler
- Tool definition management
- Request validation
- Response formatting
- Correlation ID tracking
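The handler's validation and correlation-ID duties can be illustrated with a minimal, dependency-free sketch. The tool names come from the system overview; the `ToolSpec` type, the `validate_tool_call` function, and the required-argument lists (including `filepath`) are assumptions made for illustration, not the project's actual code.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical tool definition: a name plus its required argument names."""
    name: str
    required_args: tuple[str, ...] = ()


# Tool names are taken from the overview diagram; argument lists are assumed.
TOOLS = {
    spec.name: spec
    for spec in (
        ToolSpec("list_repositories"),
        ToolSpec("get_repository_info", ("owner", "repo")),
        ToolSpec("get_file_tree", ("owner", "repo")),
        ToolSpec("get_file_contents", ("owner", "repo", "filepath")),
    )
}


class ToolValidationError(ValueError):
    """Raised when a request names an unknown tool or omits an argument."""


def validate_tool_call(body: dict) -> tuple[ToolSpec, dict, str]:
    """Validate a request body and attach a fresh correlation ID."""
    tool_name = body.get("tool")
    spec = TOOLS.get(tool_name) if isinstance(tool_name, str) else None
    if spec is None:
        raise ToolValidationError(f"unknown tool: {tool_name!r}")
    arguments = body.get("arguments") or {}
    missing = [name for name in spec.required_args if name not in arguments]
    if missing:
        raise ToolValidationError(f"missing arguments: {', '.join(missing)}")
    # One UUID4 per request, reused in every log line and in the response.
    return spec, arguments, str(uuid.uuid4())
```

Rejecting unknown tools before any Gitea call keeps the set of reachable operations fixed to the registry.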
#### 3c. Tool Implementations
- Repository discovery
- File tree navigation
- File content retrieval
- Bounded, single-purpose operations
#### 3d. Gitea Client
- Async HTTP client for Gitea API
- Bot user authentication
- Error handling and retries
- Response parsing
#### 3e. Config Manager
- Environment variable loading
- Validation with Pydantic
- Default values
- Type safety
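As a rough illustration of what the config manager does, the sketch below loads and validates settings using only the standard library. It deliberately swaps Pydantic for a plain dataclass so that it runs without dependencies. `MAX_FILE_SIZE_BYTES` and `REQUEST_TIMEOUT_SECONDS` are named elsewhere in this document; `GITEA_URL`, `GITEA_TOKEN`, and both default values are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed defaults; the real values are not stated in this document.
DEFAULT_MAX_FILE_SIZE_BYTES = 1_048_576
DEFAULT_REQUEST_TIMEOUT_SECONDS = 30.0


@dataclass(frozen=True)
class Settings:
    gitea_url: str        # hypothetical variable name: GITEA_URL
    gitea_token: str      # hypothetical variable name: GITEA_TOKEN
    max_file_size_bytes: int = DEFAULT_MAX_FILE_SIZE_BYTES
    request_timeout_seconds: float = DEFAULT_REQUEST_TIMEOUT_SECONDS


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, failing fast on bad input."""
    env = os.environ if env is None else env
    missing = [key for key in ("GITEA_URL", "GITEA_TOKEN") if not env.get(key)]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")
    settings = Settings(
        gitea_url=env["GITEA_URL"].rstrip("/"),
        gitea_token=env["GITEA_TOKEN"],
        max_file_size_bytes=int(
            env.get("MAX_FILE_SIZE_BYTES", DEFAULT_MAX_FILE_SIZE_BYTES)
        ),
        request_timeout_seconds=float(
            env.get("REQUEST_TIMEOUT_SECONDS", DEFAULT_REQUEST_TIMEOUT_SECONDS)
        ),
    )
    if settings.max_file_size_bytes <= 0 or settings.request_timeout_seconds <= 0:
        raise ValueError("limits must be positive")
    return settings
```

Passing a mapping instead of reading `os.environ` directly makes the loader easy to exercise in unit tests.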
#### 3f. Audit Logger
- Structured JSON logging
- Correlation ID tracking
- Timestamp (UTC)
- Append-only logs
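One possible shape for an audit record is a single JSON object per line, sketched below. The field names `timestamp`, `event`, and `correlation_id` and both function names are illustrative assumptions; only the JSON format, the UTC timestamps, the correlation ID, and the append-only behaviour come from the list above.

```python
import json
from datetime import datetime, timezone
from typing import Any, TextIO


def format_audit_event(event: str, correlation_id: str, **fields: Any) -> str:
    """Render one audit record as a single-line JSON string."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # always UTC
        "event": event,
        "correlation_id": correlation_id,
        **fields,
    }
    return json.dumps(record, sort_keys=True)


def write_audit_event(
    stream: TextIO, event: str, correlation_id: str, **fields: Any
) -> None:
    """Append one record to a stream and flush it immediately."""
    stream.write(format_audit_event(event, correlation_id, **fields) + "\n")
    stream.flush()
```

Opening the log file with `open(path, "a", encoding="utf-8")` keeps writes append-only from the application's point of view; the path `/var/log/aegis-mcp/audit.log` shown in the overview would be the natural target.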
### 4. Gitea Instance
**Responsibility**: Authorization and data storage
- Source of truth for permissions
- Repository data storage
- Bot user management
- Access control enforcement
### 5. Persistent Volume
**Responsibility**: Audit log storage
- Durable storage for audit logs
- Survives container restarts
- Accessible for review/analysis
---
## Data Flow: Tool Invocation
```
1. User Request
├─> "Show me files in org/my-repo"
└─> ChatGPT decides to call: get_file_tree(owner="org", repo="my-repo")
2. MCP Request
├─> POST /mcp/tool/call
├─> Body: {"tool": "get_file_tree", "arguments": {"owner": "org", "repo": "my-repo"}}
└─> Generate correlation_id: uuid4()
3. Audit Log (Entry)
├─> Log: tool_invocation
├─> tool_name: "get_file_tree"
├─> repository: "org/my-repo"
└─> status: "pending"
4. Gitea API Call
├─> GET /api/v1/repos/org/my-repo/git/trees/main
├─> Header: Authorization: token XXX
└─> Response: {"tree": [...files...]}
5. Authorization Check (based on the status of Gitea's response)
├─> 200 OK → Bot has access
├─> 403 Forbidden → Log access_denied, raise error
└─> 404 Not Found → Repository doesn't exist or bot has no access
6. Response Processing
├─> Extract file tree
├─> Transform to simplified format
└─> Apply size/count limits
7. Audit Log (Success)
├─> Log: tool_invocation
├─> status: "success"
└─> params: {"count": 42}
8. MCP Response
├─> 200 OK
├─> Body: {"success": true, "result": {...files...}}
└─> correlation_id: same as request
9. ChatGPT Processing
├─> Receive file tree data
├─> Format for user presentation
└─> "Here are the files in org/my-repo: ..."
```
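Steps 6 and 8 of the flow above (response processing and the MCP response body) could look roughly like the sketch below. The `simplify_tree` name, the `MAX_TREE_ENTRIES` cap, and the exact output keys are assumptions; the input is assumed to be a Gitea git-trees payload whose entries carry `path`, `type`, and `size`.

```python
from typing import Any

MAX_TREE_ENTRIES = 1000  # hypothetical cap; the real limit is not documented here


def simplify_tree(
    payload: dict[str, Any], max_entries: int = MAX_TREE_ENTRIES
) -> dict[str, Any]:
    """Reduce a tree payload to a bounded list of small entries."""
    entries = payload.get("tree", [])
    files = [
        {
            "path": entry["path"],
            "type": entry.get("type", "blob"),
            "size": entry.get("size", 0),
        }
        for entry in entries[:max_entries]  # apply the count limit
    ]
    return {
        "success": True,
        "result": {
            "files": files,
            "count": len(files),
            # Tell the caller when entries were dropped by the limit.
            "truncated": len(entries) > max_entries,
        },
    }
```

Returning an explicit `truncated` flag lets the assistant tell the user that the listing is incomplete instead of silently presenting a partial tree.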
---
## Security Boundaries
```
┌───────────────────────────────────────────────────────────────┐
│ Trust Boundary 1 │
│ (Internet ↔ MCP Server) │
│ │
│ Controls: │
│ - HTTPS/TLS encryption │
│ - Reverse proxy authentication (optional) │
│ - Rate limiting │
│ - Firewall rules │
└───────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────┐
│ Trust Boundary 2 │
│ (MCP Server ↔ Gitea API) │
│ │
│ Controls: │
│ - Bot user token authentication │
│ - Gitea's access control (authoritative) │
│ - API request timeouts │
│ - Input validation │
└───────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────┐
│ Trust Boundary 3 │
│ (Container ↔ Host System) │
│ │
│ Controls: │
│ - Non-root container user │
│ - Resource limits (CPU, memory) │
│ - No new privileges │
│ - Read-only filesystem (where possible) │
└───────────────────────────────────────────────────────────────┘
```
---
## Authorization Flow
```
┌──────────────────────────────────────────────────────────────────┐
│ AI requests access to "org/private-repo" │
└────────────────────────┬─────────────────────────────────────────┘
┌───────────────────────────────────────┐
│ MCP Server: Forward to Gitea API │
│ with bot user token │
└───────────────┬───────────────────────┘
┌───────────────────────────────────────┐
│ Gitea: Check bot user permissions │
│ for "org/private-repo" │
└───────────────┬───────────────────────┘
┌───────┴────────┐
│ │
Bot is collaborator? │
│ │
┌────────▼─────┐ ┌──────▼──────┐
│ YES │ │ NO │
│ (Read access)│ │ (No access) │
└────────┬─────┘ └──────┬──────┘
│ │
▼ ▼
┌───────────────┐ ┌─────────────────┐
│ Return data │ │ Return 403 │
│ Log: success │ │ Log: denied │
└───────────────┘ └─────────────────┘
```
**Key Insight**: The MCP server never makes authorization decisions - it only forwards requests and respects Gitea's response.
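In code, this principle means the server only classifies Gitea's HTTP status and never consults a permission table of its own. The sketch below is one way to express that; the `Outcome` enum and the `classify_gitea_status` function are hypothetical names.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    AUTH_ERROR = "auth_error"          # bot token rejected
    ACCESS_DENIED = "access_denied"    # bot lacks permission
    NOT_FOUND = "not_found"            # missing repository or hidden from the bot
    UPSTREAM_ERROR = "upstream_error"  # anything else from Gitea


def classify_gitea_status(status_code: int) -> Outcome:
    """Translate Gitea's verdict into an outcome without re-deciding it."""
    if 200 <= status_code < 300:
        return Outcome.SUCCESS
    if status_code == 401:
        return Outcome.AUTH_ERROR
    if status_code == 403:
        return Outcome.ACCESS_DENIED
    if status_code == 404:
        return Outcome.NOT_FOUND
    return Outcome.UPSTREAM_ERROR
```

Because the function takes only a status code, there is no place for local allow or deny rules to creep in.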
---
## Failure Modes & Handling
### 1. Gitea Unavailable
- **Detection**: HTTP connection error
- **Response**: Return error to ChatGPT
- **Logging**: Log connection failure
- **Recovery**: No operator action needed; the next request attempts a fresh connection
### 2. Invalid Bot Token
- **Detection**: 401 Unauthorized from Gitea
- **Response**: Log security event, return auth error
- **Logging**: High-severity security log
- **Recovery**: Operator must rotate token
### 3. Bot Lacks Permission
- **Detection**: 403 Forbidden from Gitea (a 404 Not Found can also indicate missing access)
- **Response**: Return authorization error
- **Logging**: Access denied event
- **Recovery**: Grant permission in Gitea UI
### 4. File Too Large
- **Detection**: File size exceeds MAX_FILE_SIZE_BYTES
- **Response**: Return size limit error
- **Logging**: Security event (potential abuse)
- **Recovery**: Raise MAX_FILE_SIZE_BYTES if the limit is too strict; otherwise the request stays rejected
### 5. Network Timeout
- **Detection**: Request exceeds REQUEST_TIMEOUT_SECONDS
- **Response**: Return timeout error
- **Logging**: Log timeout event
- **Recovery**: Automatic retry possible
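Since every tool is a read-only request, retrying after a timeout is safe. A minimal sketch of such a retry with exponential backoff follows; the `call_with_retry` helper, its default values, and the choice of `TimeoutError` and `ConnectionError` as retryable errors are assumptions, and a real HTTP client would raise its own exception types.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on timeouts with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except (TimeoutError, ConnectionError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise AssertionError("unreachable")
```

The `sleep` parameter exists so that tests can record the delays instead of actually waiting.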
### 6. Rate Limit Exceeded
- **Detection**: Too many requests per minute
- **Response**: Return 429 Too Many Requests
- **Logging**: Log rate limit event
- **Recovery**: Wait and retry
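A per-minute limit like the one described here can be enforced with a sliding window over recent request timestamps. The sketch below is one such approach under assumed names (`SlidingWindowRateLimiter`, `allow`); the document does not say which algorithm the server uses.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowRateLimiter:
    """Allow at most max_requests within any window_seconds interval."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_requests = max_requests
        self._window_seconds = window_seconds
        self._clock = clock
        self._timestamps = deque()  # admission times, oldest first

    def allow(self) -> bool:
        """Return True and record the request, or False if over the limit."""
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self._window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self._max_requests:
            return False  # caller should answer 429 Too Many Requests
        self._timestamps.append(now)
        return True
```

This limiter keeps its state in process memory, so with several instances each one enforces the limit independently.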
---
## Scaling Considerations
### Vertical Scaling (Single Instance)
- **Current**: 128-512 MB RAM, minimal CPU
- **Bottleneck**: Gitea API response time
- **Max throughput**: ~100-200 requests/second (rough estimate; bounded by Gitea API response time)
### Horizontal Scaling (Multiple Instances)
- **Stateless design**: Each instance independent
- **Load balancing**: Standard HTTP load balancer
- **Shared state**: None (all state in Gitea)
- **Audit logs**: Each instance writes to own log (or use centralized logging)
### Performance Optimization (Future)
- Add Redis caching layer
- Implement connection pooling
- Use HTTP/2 for Gitea API
- Batch multiple file reads
---
## Observability
### Metrics to Monitor
1. **Request rate**: Requests per minute
2. **Error rate**: Failed requests / total requests
3. **Response time**: P50, P95, P99 latency
4. **Gitea API health**: Success rate to Gitea
5. **Auth failures**: 401/403 responses
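The error rate and latency percentiles listed above can be computed from raw samples with the standard library, as in the sketch below. The `summarize_requests` function and its output keys are illustrative assumptions; a production deployment would more likely export these figures through a metrics system.

```python
import statistics


def summarize_requests(latencies_ms: list[float], failures: int) -> dict[str, float]:
    """Compute error rate and P50/P95/P99 latency from one sample window."""
    total = len(latencies_ms)
    if total < 2:
        raise ValueError("need at least two samples to compute percentiles")
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "error_rate": failures / total,
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
    }
```

Comparing `error_rate` against 0.05 would correspond to a 5% error-rate alert.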
### Logs to Track
1. **Audit logs**: Every tool invocation
2. **Access denied**: Permission violations
3. **Security events**: Rate limits, size limits
4. **Errors**: Exceptions and failures
### Alerts to Configure
1. **High error rate**: > 5% errors
2. **Auth failures**: Any 401 response
3. **Gitea unreachable**: Connection failures
4. **Disk space**: Audit logs filling disk
---
## Future Enhancements
### Phase 2: Extended Context
```
New Tools:
├── get_commits(owner, repo, limit)
├── get_commit_diff(owner, repo, sha)
├── list_issues(owner, repo)
├── get_issue(owner, repo, number)
├── list_pull_requests(owner, repo)
└── get_pull_request(owner, repo, number)
```
### Phase 3: Advanced Features
```
Capabilities:
├── Caching layer (Redis)
├── Webhook support for real-time updates
├── OAuth2 flow instead of static tokens
├── Per-client rate limiting
├── Multi-tenant support (multiple bot users)
└── GraphQL API for more efficient queries
```
---
## Deployment Patterns
### Pattern 1: Single Homelab Instance
```
[Homelab Server]
├── Gitea container
├── AegisGitea MCP container
└── Caddy reverse proxy
└── Exposes HTTPS endpoint
```
### Pattern 2: Kubernetes Deployment
```
[Kubernetes Cluster]
├── Namespace: aegis-mcp
├── Deployment: aegis-mcp (3 replicas)
├── Service: ClusterIP
├── Ingress: HTTPS with cert-manager
└── PersistentVolume: Audit logs
```
### Pattern 3: Cloud Deployment
```
[AWS/GCP/Azure]
├── Container service (ECS/Cloud Run/ACI)
├── Load balancer (ALB/Cloud Load Balancing/Application Gateway)
├── Secrets manager (Secrets Manager/Secret Manager/Key Vault)
└── Log aggregation (CloudWatch/Cloud Logging/Azure Monitor)
```
---
## Testing Strategy
### Unit Tests
- Configuration loading
- Gitea client methods
- Tool implementations
- Audit logging
### Integration Tests
- Full MCP protocol flow
- Gitea API interactions (mocked)
- Error handling paths
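An integration test with a mocked Gitea API might be structured like the sketch below. `FakeGiteaClient`, its `get_tree` method, and this simplified `get_file_tree` are stand-ins invented for the example; they are not the project's real client or tool code.

```python
import unittest


class FakeGiteaClient:
    """Test double that returns a canned (status, payload) pair."""

    def __init__(self, status, payload=None):
        self.status = status
        self.payload = payload or {}
        self.calls = []

    def get_tree(self, owner, repo, ref="main"):
        self.calls.append((owner, repo, ref))
        return self.status, self.payload


def get_file_tree(client, owner, repo):
    """Simplified stand-in for the tool under test."""
    status, payload = client.get_tree(owner, repo)
    if status == 200:
        return {"success": True, "result": [e["path"] for e in payload["tree"]]}
    errors = {403: "access_denied", 404: "not_found"}
    return {"success": False, "error": errors.get(status, "upstream_error")}


class GetFileTreeTests(unittest.TestCase):
    def test_returns_paths_when_gitea_allows_access(self):
        client = FakeGiteaClient(200, {"tree": [{"path": "README.md"}]})
        result = get_file_tree(client, "org", "my-repo")
        self.assertEqual(result, {"success": True, "result": ["README.md"]})
        self.assertEqual(client.calls, [("org", "my-repo", "main")])

    def test_reports_denial_when_gitea_returns_403(self):
        client = FakeGiteaClient(403)
        result = get_file_tree(client, "org", "private")
        self.assertEqual(result, {"success": False, "error": "access_denied"})
```

Saved as a test module, this runs with `python -m unittest`. Keeping the fake at the client boundary means the error-handling paths can be exercised without a running Gitea instance.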
### End-to-End Tests
- Real Gitea instance
- Real bot user
- Real tool invocations
---
## Conclusion
This architecture prioritizes:
1. **Security**: Read-only, auditable, fail-safe
2. **Simplicity**: Straightforward data flow
3. **Maintainability**: Clear separation of concerns
4. **Observability**: Comprehensive logging
The design is intentionally boring and predictable - perfect for a security-critical system.