# AegisGitea MCP - Architecture Documentation

---

## System Overview

```
┌─────────────────────────────────────────────────────────────────────┐
│                          ChatGPT Business                           │
│                      (AI Assistant Interface)                       │
│                                                                     │
│  User: "Show me the files in my-repo"                               │
└────────────────────────────┬────────────────────────────────────────┘
                             │ HTTPS (MCP over SSE)
                             │ Tool: get_file_tree(owner, repo)
                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    Reverse Proxy (Traefik/Nginx)                    │
│                           TLS Termination                           │
└────────────────────────────┬────────────────────────────────────────┘
                             │ HTTP
                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                   AegisGitea MCP Server (Docker)                    │
│                                                                     │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                      FastAPI Application                      │  │
│  │                                                               │  │
│  │  Endpoints:                                                   │  │
│  │  - GET  /health          (Health check)                       │  │
│  │  - GET  /mcp/tools       (List available tools)               │  │
│  │  - POST /mcp/tool/call   (Execute tool)                       │  │
│  │  - GET  /mcp/sse         (Server-sent events)                 │  │
│  └───────────────────────┬───────────────────────────────────────┘  │
│                          │                                          │
│  ┌───────────────────────┴───────────────────────────────────────┐  │
│  │                     MCP Protocol Handler                      │  │
│  │  - Tool validation                                            │  │
│  │  - Request/response mapping                                   │  │
│  │  - Correlation ID management                                  │  │
│  └───────────────────────┬───────────────────────────────────────┘  │
│                          │                                          │
│  ┌───────────────────────┴───────────────────────────────────────┐  │
│  │                     Tool Implementations                      │  │
│  │                                                               │  │
│  │  - list_repositories()      - get_repository_info()           │  │
│  │  - get_file_tree()          - get_file_contents()             │  │
│  └───────────────────────┬───────────────────────────────────────┘  │
│                          │                                          │
│  ┌──────────────┬────────┴────────┬──────────────────────────────┐  │
│  │              │                 │                              │  │
│  │  ┌───────────▼───────┐   ┌─────▼──────┐  ┌────────────────┐   │  │
│  │  │  Gitea Client     │   │  Config    │  │  Audit Logger  │   │  │
│  │  │  - Auth           │   │  Manager   │  │  - Structured  │   │  │
│  │  │  - API calls      │   │  - Env vars│  │  - JSON logs   │   │  │
│  │  │  - Error handling │   │  - Defaults│  │  - Correlation │   │  │
│  │  └───────────┬───────┘   └────────────┘  └────────┬───────┘   │  │
│  │              │                                    │           │  │
│  └──────────────┼────────────────────────────────────┼───────────┘  │
│                 │                                    │              │
└─────────────────┼────────────────────────────────────┼──────────────┘
                  │ Gitea API                          │
                  │ (Authorization: token XXX)         │ Audit Logs
                  ▼                                    ▼
┌─────────────────────────────────────┐    ┌──────────────────────────┐
│           Gitea Instance            │    │    Persistent Volume     │
│          (Self-hosted VCS)          │    │   /var/log/aegis-mcp/    │
│                                     │    │        audit.log         │
│  Repositories:                      │    └──────────────────────────┘
│  ┌─────────────────────────────┐    │
│  │  org/repo-1 (bot has access)│    │
│  │  org/repo-2 (bot has access)│    │
│  │  org/private (NO ACCESS)    │    │
│  └─────────────────────────────┘    │
│                                     │
│  Bot User: aegis-bot                │
│  Permissions: Read-only             │
└─────────────────────────────────────┘
```

---

## Component Responsibilities

### 1. ChatGPT (External)
**Responsibility**: Initiate explicit tool calls based on user requests

- Receives MCP tool definitions
- Constructs tool call requests
- Presents results to user
- Human-in-the-loop decision making

### 2. Reverse Proxy
**Responsibility**: TLS termination and routing

- Terminates HTTPS connections
- Routes to MCP server container
- Handles SSL certificates
- Optional: IP filtering, rate limiting

### 3. AegisGitea MCP Server (Core)
**Responsibility**: MCP protocol implementation and policy enforcement

#### 3a. FastAPI Application
- HTTP server with async support
- Server-Sent Events endpoint
- Health and status endpoints
- Request routing

#### 3b. MCP Protocol Handler
- Tool definition management
- Request validation
- Response formatting
- Correlation ID tracking

#### 3c. Tool Implementations
- Repository discovery
- File tree navigation
- File content retrieval
- Bounded, single-purpose operations

#### 3d. Gitea Client
- Async HTTP client for Gitea API
- Bot user authentication
- Error handling and retries
- Response parsing

#### 3e. Config Manager
- Environment variable loading
- Validation with Pydantic
- Default values
- Type safety
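
The Config Manager's load-validate-default cycle can be sketched as follows. This is an illustration only: it uses the standard library so that it runs without dependencies, whereas the real component validates with Pydantic. The variable names `GITEA_URL` and `GITEA_TOKEN` and both default values are assumptions; `MAX_FILE_SIZE_BYTES` and `REQUEST_TIMEOUT_SECONDS` are the limits named under Failure Modes.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    gitea_url: str
    gitea_token: str
    max_file_size_bytes: int = 1_048_576   # assumed default
    request_timeout_seconds: float = 30.0  # assumed default


def load_settings(env=None) -> Settings:
    """Build typed settings from environment variables, failing fast on bad input."""
    env = os.environ if env is None else env
    missing = [key for key in ("GITEA_URL", "GITEA_TOKEN") if not env.get(key)]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")
    settings = Settings(
        gitea_url=env["GITEA_URL"].rstrip("/"),
        gitea_token=env["GITEA_TOKEN"],
        # int()/float() raise ValueError on malformed values.
        max_file_size_bytes=int(
            env.get("MAX_FILE_SIZE_BYTES", Settings.max_file_size_bytes)
        ),
        request_timeout_seconds=float(
            env.get("REQUEST_TIMEOUT_SECONDS", Settings.request_timeout_seconds)
        ),
    )
    if settings.max_file_size_bytes <= 0 or settings.request_timeout_seconds <= 0:
        raise ValueError("limits must be positive")
    return settings
```

Passing a plain dict as `env` keeps the function easy to unit test without touching the process environment.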

#### 3f. Audit Logger
- Structured JSON logging
- Correlation ID tracking
- Timestamp (UTC)
- Append-only logs
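
A minimal sketch of what one audit entry could look like, assuming a JSON-lines file. The function names `audit_entry` and `append_audit` and the exact field names are hypothetical; only the properties listed above (structured JSON, correlation ID, UTC timestamp, append-only) come from this document.

```python
import json
import uuid
from datetime import datetime, timezone


def audit_entry(event: str, correlation_id: str, **fields) -> str:
    """Serialize one audit event as a single JSON line with a UTC timestamp."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event,
        "correlation_id": correlation_id,
        **fields,
    }
    return json.dumps(record, sort_keys=True)


def append_audit(path: str, line: str) -> None:
    """Append one entry; mode "a" never rewrites existing entries."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")


if __name__ == "__main__":
    cid = str(uuid.uuid4())
    print(audit_entry("tool_invocation", cid, tool_name="get_file_tree", status="pending"))
```

One JSON object per line keeps the log greppable and lets every entry for a request be found by its correlation ID.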

### 4. Gitea Instance
**Responsibility**: Authorization and data storage

- Source of truth for permissions
- Repository data storage
- Bot user management
- Access control enforcement

### 5. Persistent Volume
**Responsibility**: Audit log storage

- Durable storage for audit logs
- Survives container restarts
- Accessible for review/analysis

---

## Data Flow: Tool Invocation

```
1. User Request
   ├─> "Show me files in org/my-repo"
   └─> ChatGPT decides to call: get_file_tree(owner="org", repo="my-repo")

2. MCP Request
   ├─> POST /mcp/tool/call
   ├─> Body: {"tool": "get_file_tree", "arguments": {"owner": "org", "repo": "my-repo"}}
   └─> Generate correlation_id: uuid4()

3. Audit Log (Entry)
   ├─> Log: tool_invocation
   ├─> tool_name: "get_file_tree"
   ├─> repository: "org/my-repo"
   └─> status: "pending"

4. Gitea API Call
   ├─> GET /api/v1/repos/org/my-repo/git/trees/main
   ├─> Header: Authorization: token XXX
   └─> Response: {"tree": [...files...]}

5. Authorization Check
   ├─> 200 OK → Bot has access
   ├─> 403 Forbidden → Log access_denied, raise error
   └─> 404 Not Found → Repository doesn't exist or no access

6. Response Processing
   ├─> Extract file tree
   ├─> Transform to simplified format
   └─> Apply size/count limits

7. Audit Log (Success)
   ├─> Log: tool_invocation
   ├─> status: "success"
   └─> params: {"count": 42}

8. MCP Response
   ├─> 200 OK
   ├─> Body: {"success": true, "result": {...files...}}
   └─> correlation_id: same as request

9. ChatGPT Processing
   ├─> Receive file tree data
   ├─> Format for user presentation
   └─> "Here are the files in org/my-repo: ..."
```
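
Steps 2 through 8 can be sketched as one function. This is a hedged illustration, not the server's actual handler: `fetch_tree` and `audit` are injected callables standing in for the Gitea Client and the Audit Logger, `MAX_ENTRIES` and the `tool_error` event name are assumptions, and the sketch returns an error payload where the flow above says "raise error".

```python
import uuid
from typing import Callable

MAX_ENTRIES = 1000  # assumed count limit for step 6


def handle_tool_call(body: dict, fetch_tree: Callable, audit: Callable) -> dict:
    correlation_id = str(uuid.uuid4())  # step 2
    args = body["arguments"]
    repository = f'{args["owner"]}/{args["repo"]}'
    # Step 3: record the invocation before anything is sent to Gitea.
    audit("tool_invocation", correlation_id, tool_name=body["tool"],
          repository=repository, status="pending")
    # Step 4: fetch_tree returns (http_status, parsed_json).
    status, payload = fetch_tree(args["owner"], args["repo"])
    # Step 5: Gitea's status code is the authorization decision.
    if status != 200:
        event = "access_denied" if status == 403 else "tool_error"
        audit(event, correlation_id, repository=repository, status="error")
        return {"success": False, "error": f"gitea returned {status}",
                "correlation_id": correlation_id}
    # Step 6: simplify each entry and cap the number returned.
    files = [{"path": e["path"], "type": e["type"]} for e in payload["tree"]]
    files = files[:MAX_ENTRIES]
    # Step 7: log success, then step 8: echo the same correlation ID.
    audit("tool_invocation", correlation_id, status="success",
          params={"count": len(files)})
    return {"success": True, "result": {"files": files},
            "correlation_id": correlation_id}
```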

---

## Security Boundaries

```
┌───────────────────────────────────────────────────────────────┐
│                       Trust Boundary 1                        │
│                    (Internet ↔ MCP Server)                    │
│                                                               │
│  Controls:                                                    │
│  - HTTPS/TLS encryption                                       │
│  - Reverse proxy authentication (optional)                    │
│  - Rate limiting                                              │
│  - Firewall rules                                             │
└───────────────────────────────────────────────────────────────┘

┌───────────────────────────────────────────────────────────────┐
│                       Trust Boundary 2                        │
│                   (MCP Server ↔ Gitea API)                    │
│                                                               │
│  Controls:                                                    │
│  - Bot user token authentication                              │
│  - Gitea's access control (authoritative)                     │
│  - API request timeouts                                       │
│  - Input validation                                           │
└───────────────────────────────────────────────────────────────┘

┌───────────────────────────────────────────────────────────────┐
│                       Trust Boundary 3                        │
│                   (Container ↔ Host System)                   │
│                                                               │
│  Controls:                                                    │
│  - Non-root container user                                    │
│  - Resource limits (CPU, memory)                              │
│  - No new privileges                                          │
│  - Read-only filesystem (where possible)                      │
└───────────────────────────────────────────────────────────────┘
```
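
The "Input validation" control at Trust Boundary 2 matters because `owner` and `repo` arrive from the AI client and end up in a Gitea URL path. The sketch below shows one way to reject path-traversal attempts before any request is built. The allowed-character pattern and the 100-character cap are assumptions made for illustration, not Gitea's documented naming rules, and `validate_repo_ref` is a hypothetical helper.

```python
import re

# Assumed pattern: starts with a letter or digit, then letters, digits, '.', '_' or '-'.
_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")
_MAX_LEN = 100  # assumed upper bound


def validate_repo_ref(owner: str, repo: str) -> str:
    """Return the API path for owner/repo, or raise ValueError on unsafe input."""
    for label, value in (("owner", owner), ("repo", repo)):
        if len(value) > _MAX_LEN or ".." in value or not _NAME.fullmatch(value):
            raise ValueError(f"invalid {label}: {value!r}")
    return f"/api/v1/repos/{owner}/{repo}"
```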

---

## Authorization Flow

```
┌──────────────────────────────────────────────────────────────────┐
│  AI requests access to "org/private-repo"                        │
└────────────────────────┬─────────────────────────────────────────┘
                         │
                         ▼
         ┌───────────────────────────────────────┐
         │  MCP Server: Forward to Gitea API     │
         │  with bot user token                  │
         └───────────────┬───────────────────────┘
                         │
                         ▼
         ┌───────────────────────────────────────┐
         │  Gitea: Check bot user permissions    │
         │  for "org/private-repo"               │
         └───────────────┬───────────────────────┘
                         │
                 ┌───────┴────────┐
                 │                │
            Bot is collaborator?  │
                 │                │
        ┌────────▼─────┐   ┌──────▼──────┐
        │     YES      │   │     NO      │
        │ (Read access)│   │ (No access) │
        └────────┬─────┘   └──────┬──────┘
                 │                │
                 ▼                ▼
        ┌───────────────┐   ┌─────────────────┐
        │  Return data  │   │  Return 403     │
        │  Log: success │   │  Log: denied    │
        └───────────────┘   └─────────────────┘
```

**Key Insight**: The MCP server never makes authorization decisions - it only forwards requests and respects Gitea's response.
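
In code, "respecting Gitea's response" reduces to a pure translation from HTTP status to an audit event and a client-facing message, with no permission logic of its own. The sketch below assumes the event names shown (`auth_failure`, `not_found`, `upstream_error` are hypothetical; `access_denied` appears in the data flow above).

```python
from typing import Optional, Tuple


def outcome_for_status(status: int) -> Tuple[str, Optional[str]]:
    """Translate Gitea's HTTP status into (audit_event, client_message)."""
    if 200 <= status < 300:
        return ("success", None)
    if status == 401:
        return ("auth_failure", "bot token rejected by Gitea")
    if status == 403:
        return ("access_denied", "bot user lacks access to this repository")
    if status == 404:
        return ("not_found", "repository does not exist or is not visible to the bot")
    return ("upstream_error", f"unexpected Gitea status {status}")
```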

---

## Failure Modes & Handling

### 1. Gitea Unavailable
- **Detection**: HTTP connection error
- **Response**: Return error to ChatGPT
- **Logging**: Log connection failure
- **Recovery**: Automatic retry on next request

### 2. Invalid Bot Token
- **Detection**: 401 Unauthorized from Gitea
- **Response**: Log security event, return auth error
- **Logging**: High-severity security log
- **Recovery**: Operator must rotate token

### 3. Bot Lacks Permission
- **Detection**: 403 Forbidden from Gitea
- **Response**: Return authorization error
- **Logging**: Access denied event
- **Recovery**: Grant permission in Gitea UI

### 4. File Too Large
- **Detection**: File size exceeds MAX_FILE_SIZE_BYTES
- **Response**: Return size limit error
- **Logging**: Security event (potential abuse)
- **Recovery**: Increase limit or reject request
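
A sketch of the size check for failure mode 4, assuming the file entry is a dict with `path`, `size`, and base64-encoded `content` fields. The limit value, the `FileTooLargeError` class, and the `decode_file` helper are illustrative assumptions.

```python
import base64

MAX_FILE_SIZE_BYTES = 1_048_576  # assumed value; the real limit comes from configuration


class FileTooLargeError(Exception):
    """Raised when a file exceeds the configured size limit."""


def decode_file(entry: dict, limit: int = MAX_FILE_SIZE_BYTES) -> bytes:
    """Reject oversized files before decoding, then re-check the decoded size."""
    size = int(entry.get("size", 0))
    if size > limit:
        raise FileTooLargeError(f"{entry.get('path')}: {size} bytes exceeds limit {limit}")
    data = base64.b64decode(entry["content"])
    if len(data) > limit:  # do not rely on the reported size alone
        raise FileTooLargeError(f"{entry.get('path')}: decoded content exceeds limit {limit}")
    return data
```

Checking the reported size first avoids decoding a large payload that is going to be rejected anyway.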

### 5. Network Timeout
- **Detection**: Request exceeds REQUEST_TIMEOUT_SECONDS
- **Response**: Return timeout error
- **Logging**: Log timeout event
- **Recovery**: Automatic retry possible

### 6. Rate Limit Exceeded
- **Detection**: Too many requests per minute
- **Response**: Return 429 Too Many Requests
- **Logging**: Log rate limit event
- **Recovery**: Wait and retry
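
"Too many requests per minute" can be detected with a sliding-window counter. The class below is one possible approach, not the server's actual limiter: `SlidingWindowLimiter` is a hypothetical name, the window length is an assumption, and the clock is injectable so the behavior can be tested without sleeping.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds interval."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits = deque()  # timestamps of accepted requests, oldest first

    def allow(self) -> bool:
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._hits and now - self._hits[0] >= self.window_seconds:
            self._hits.popleft()
        if len(self._hits) >= self.max_requests:
            return False  # caller responds with 429 Too Many Requests
        self._hits.append(now)
        return True
```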

---

## Scaling Considerations

### Vertical Scaling (Single Instance)
- **Current**: 128-512 MB RAM, minimal CPU
- **Bottleneck**: Gitea API response time
- **Max throughput**: ~100-200 requests/second

### Horizontal Scaling (Multiple Instances)
- **Stateless design**: Each instance independent
- **Load balancing**: Standard HTTP load balancer
- **Shared state**: None (all state in Gitea)
- **Audit logs**: Each instance writes to own log (or use centralized logging)

### Performance Optimization (Future)
- Add Redis caching layer
- Implement connection pooling
- Use HTTP/2 for Gitea API
- Batch multiple file reads

---

## Observability

### Metrics to Monitor
1. **Request rate**: Requests per minute
2. **Error rate**: Failed requests / total requests
3. **Response time**: P50, P95, P99 latency
4. **Gitea API health**: Success rate to Gitea
5. **Auth failures**: 401/403 responses

### Logs to Track
1. **Audit logs**: Every tool invocation
2. **Access denied**: Permission violations
3. **Security events**: Rate limits, size limits
4. **Errors**: Exceptions and failures

### Alerts to Configure
1. **High error rate**: > 5% errors
2. **Auth failures**: Any 401 responses
3. **Gitea unreachable**: Connection failures
4. **Disk space**: Audit logs filling disk
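
The error-rate and latency-percentile metrics, together with the "> 5% errors" alert, can be computed from a batch of request records roughly as follows. This is a sketch under assumptions: `summarize` is a hypothetical helper, latencies are taken to be in milliseconds, and `statistics.quantiles` needs at least two samples on older Python versions.

```python
import statistics

ERROR_RATE_ALERT = 0.05  # the "> 5% errors" threshold listed above


def summarize(latencies_ms: list, failures: int) -> dict:
    """Compute error rate and P50/P95/P99 latency for one batch of requests."""
    total = len(latencies_ms)
    # n=100 yields 99 cut points; index 49 is the median, 94 is P95, 98 is P99.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    error_rate = failures / total
    return {
        "error_rate": error_rate,
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
        "alert_high_error_rate": error_rate > ERROR_RATE_ALERT,
    }
```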

---

## Future Enhancements

### Phase 2: Extended Context
```
New Tools:
├── get_commits(owner, repo, limit)
├── get_commit_diff(owner, repo, sha)
├── list_issues(owner, repo)
├── get_issue(owner, repo, number)
├── list_pull_requests(owner, repo)
└── get_pull_request(owner, repo, number)
```

### Phase 3: Advanced Features
```
Capabilities:
├── Caching layer (Redis)
├── Webhook support for real-time updates
├── OAuth2 flow instead of static tokens
├── Per-client rate limiting
├── Multi-tenant support (multiple bot users)
└── GraphQL API for more efficient queries
```

---

## Deployment Patterns

### Pattern 1: Single Homelab Instance
```
[Homelab Server]
├── Gitea container
├── AegisGitea MCP container
└── Caddy reverse proxy
    └── Exposes HTTPS endpoint
```

### Pattern 2: Kubernetes Deployment
```
[Kubernetes Cluster]
├── Namespace: aegis-mcp
├── Deployment: aegis-mcp (3 replicas)
├── Service: ClusterIP
├── Ingress: HTTPS with cert-manager
└── PersistentVolume: Audit logs
```

### Pattern 3: Cloud Deployment
```
[AWS/GCP/Azure]
├── Container service (ECS/Cloud Run/ACI)
├── Load balancer (ALB/Cloud Load Balancing)
├── Secrets manager (Secrets Manager/Secret Manager/Key Vault)
└── Log aggregation (CloudWatch/Cloud Logging/Monitor)
```

---

## Testing Strategy

### Unit Tests
- Configuration loading
- Gitea client methods
- Tool implementations
- Audit logging

### Integration Tests
- Full MCP protocol flow
- Gitea API interactions (mocked)
- Error handling paths
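
As an illustration of testing against a mocked Gitea API, the sketch below exercises a hypothetical async tool with `unittest.mock.AsyncMock`. The `get_file_tree` function and the `client.get_tree` method are stand-ins invented for this example, not the project's actual signatures.

```python
import asyncio
import unittest
from unittest.mock import AsyncMock


async def get_file_tree(client, owner: str, repo: str) -> list:
    # Hypothetical tool: client.get_tree stands in for the Gitea client method.
    data = await client.get_tree(owner, repo)
    return [entry["path"] for entry in data["tree"]]


class GetFileTreeTest(unittest.TestCase):
    def test_returns_paths_from_mocked_gitea(self):
        client = AsyncMock()
        client.get_tree.return_value = {
            "tree": [{"path": "README.md"}, {"path": "src/app.py"}]
        }
        paths = asyncio.run(get_file_tree(client, "org", "my-repo"))
        self.assertEqual(paths, ["README.md", "src/app.py"])
        client.get_tree.assert_awaited_once_with("org", "my-repo")

    def test_propagates_client_errors(self):
        # Error handling path: a connection failure surfaces to the caller.
        client = AsyncMock()
        client.get_tree.side_effect = ConnectionError("gitea unreachable")
        with self.assertRaises(ConnectionError):
            asyncio.run(get_file_tree(client, "org", "my-repo"))
```

Run it with `python -m unittest` on the containing module; no Gitea instance is needed because the client is replaced entirely.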

### End-to-End Tests
- Real Gitea instance
- Real bot user
- Real tool invocations

---

## Conclusion

This architecture prioritizes:
1. **Security**: Read-only, auditable, fail-safe
2. **Simplicity**: Straightforward data flow
3. **Maintainability**: Clear separation of concerns
4. **Observability**: Comprehensive logging

The design is intentionally boring and predictable - perfect for a security-critical system.