# AegisGitea MCP - Architecture Documentation

---

## System Overview

```
┌─────────────────────────────────────────────────────────────────────┐
│                         ChatGPT Business                            │
│                     (AI Assistant Interface)                        │
│                                                                     │
│  User: "Show me the files in my-repo"                              │
└────────────────────────────┬────────────────────────────────────────┘
                             │ HTTPS (MCP over SSE)
                             │ Tool: get_file_tree(owner, repo)
                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                      Reverse Proxy (Traefik/Nginx)                  │
│                        TLS Termination                              │
└────────────────────────────┬────────────────────────────────────────┘
                             │ HTTP
                             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    AegisGitea MCP Server (Docker)                   │
│                                                                     │
│  ┌───────────────────────────────────────────────────────────────┐ │
│  │                 FastAPI Application                           │ │
│  │                                                               │ │
│  │  Endpoints:                                                   │ │
│  │  - GET  /health              (Health check)                  │ │
│  │  - GET  /mcp/tools           (List available tools)          │ │
│  │  - POST /mcp/tool/call       (Execute tool)                  │ │
│  │  - GET  /mcp/sse             (Server-sent events)            │ │
│  └───────────────────────┬───────────────────────────────────────┘ │
│                          │                                         │
│  ┌───────────────────────┴───────────────────────────────────────┐ │
│  │              MCP Protocol Handler                             │ │
│  │  - Tool validation                                            │ │
│  │  - Request/response mapping                                   │ │
│  │  - Correlation ID management                                  │ │
│  └───────────────────────┬───────────────────────────────────────┘ │
│                          │                                         │
│  ┌───────────────────────┴───────────────────────────────────────┐ │
│  │                   Tool Implementations                        │ │
│  │                                                               │ │
│  │  - list_repositories()     - get_repository_info()           │ │
│  │  - get_file_tree()         - get_file_contents()             │ │
│  └───────────────────────┬───────────────────────────────────────┘ │
│                          │                                         │
│  ┌──────────────┬────────┴────────┬─────────────────────────────┐ │
│  │              │                 │                             │ │
│  │  ┌───────────▼───────┐  ┌─────▼──────┐  ┌────────────────┐ │ │
│  │  │  Gitea Client     │  │   Config   │  │ Audit Logger   │ │ │
│  │  │  - Auth          │  │  Manager   │  │ - Structured   │ │ │
│  │  │  - API calls     │  │  - Env vars│  │ - JSON logs    │ │ │
│  │  │  - Error handling│  │  - Defaults│  │ - Correlation  │ │ │
│  │  └───────────┬───────┘  └────────────┘  └────────┬───────┘ │ │
│  │              │                                    │         │ │
│  └──────────────┼────────────────────────────────────┼─────────┘ │
│                 │                                    │           │
└─────────────────┼────────────────────────────────────┼───────────┘
                  │ Gitea API                          │
                  │ (Authorization: token XXX)         │ Audit Logs
                  ▼                                    ▼
┌─────────────────────────────────────┐  ┌──────────────────────────┐
│         Gitea Instance              │  │   Persistent Volume      │
│     (Self-hosted VCS)               │  │   /var/log/aegis-mcp/    │
│                                     │  │   audit.log              │
│  Repositories:                      │  └──────────────────────────┘
│  ┌─────────────────────────────┐   │
│  │ org/repo-1  (bot has access)│   │
│  │ org/repo-2  (bot has access)│   │
│  │ org/private (NO ACCESS)     │   │
│  └─────────────────────────────┘   │
│                                     │
│  Bot User: aegis-bot                │
│  Permissions: Read-only             │
└─────────────────────────────────────┘
```

---

## Component Responsibilities

### 1. ChatGPT (External)
**Responsibility**: Initiate explicit tool calls based on user requests

- Receives MCP tool definitions
- Constructs tool call requests
- Presents results to user
- Human-in-the-loop decision making

### 2. Reverse Proxy
**Responsibility**: TLS termination and routing

- Terminates HTTPS connections
- Routes to MCP server container
- Handles SSL certificates
- Optional: IP filtering, rate limiting

### 3. AegisGitea MCP Server (Core)
**Responsibility**: MCP protocol implementation and policy enforcement

#### 3a. FastAPI Application
- HTTP server with async support
- Server-Sent Events endpoint
- Health and status endpoints
- Request routing

#### 3b. MCP Protocol Handler
- Tool definition management
- Request validation
- Response formatting
- Correlation ID tracking

#### 3c. Tool Implementations
- Repository discovery
- File tree navigation
- File content retrieval
- Bounded, single-purpose operations

#### 3d. Gitea Client
- Async HTTP client for Gitea API
- Bot user authentication
- Error handling and retries
- Response parsing
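
As a rough illustration of the authentication and API-call responsibilities above, the sketch below builds (but does not send) the file tree request. It uses the standard library rather than the async client the server actually uses; `GITEA_BASE_URL` and `build_tree_request` are hypothetical names, and the host is a placeholder.

```python
from urllib.parse import quote
from urllib.request import Request

GITEA_BASE_URL = "https://gitea.example.com"  # assumption: placeholder host


def build_tree_request(owner: str, repo: str, ref: str, token: str) -> Request:
    """Build (but do not send) the Gitea request for a repository's file tree."""
    # Percent-encode each path segment so caller input cannot alter the URL path.
    path = "/api/v1/repos/{}/{}/git/trees/{}".format(
        quote(owner, safe=""), quote(repo, safe=""), quote(ref, safe="")
    )
    return Request(
        GITEA_BASE_URL + path,
        headers={
            "Authorization": f"token {token}",  # bot user token
            "Accept": "application/json",
        },
        method="GET",
    )


req = build_tree_request("org", "my-repo", "main", "XXX")
print(req.get_method(), req.full_url)
```

Encoding each segment separately means a value such as `a/b` cannot introduce an extra path component in the upstream URL.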

#### 3e. Config Manager
- Environment variable loading
- Validation with Pydantic
- Default values
- Type safety
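
A minimal sketch of environment-based configuration with defaults and validation. It uses a standard-library dataclass as a stand-in for the Pydantic model mentioned above. `MAX_FILE_SIZE_BYTES` and `REQUEST_TIMEOUT_SECONDS` appear elsewhere in this document; `GITEA_URL`, `GITEA_TOKEN`, and all default values are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    gitea_url: str                          # hypothetical variable: GITEA_URL
    gitea_token: str                        # hypothetical variable: GITEA_TOKEN
    max_file_size_bytes: int = 1_048_576    # default value is an assumption
    request_timeout_seconds: float = 30.0   # default value is an assumption


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from environment variables, failing fast on bad input."""
    missing = [name for name in ("GITEA_URL", "GITEA_TOKEN") if not env.get(name)]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")

    settings = Settings(
        gitea_url=env["GITEA_URL"].rstrip("/"),
        gitea_token=env["GITEA_TOKEN"],
        max_file_size_bytes=int(env.get("MAX_FILE_SIZE_BYTES", 1_048_576)),
        request_timeout_seconds=float(env.get("REQUEST_TIMEOUT_SECONDS", 30.0)),
    )
    if settings.max_file_size_bytes <= 0 or settings.request_timeout_seconds <= 0:
        raise ValueError("limits must be positive")
    return settings
```

Passing the environment as a mapping keeps the loader easy to unit test without mutating `os.environ`.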

#### 3f. Audit Logger
- Structured JSON logging
- Correlation ID tracking
- Timestamp (UTC)
- Append-only logs
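
The audit logger properties above could be sketched as one JSON object per line, appended to a file. The field names mirror those used in the data flow section; the function name `write_audit_event` and the exact set of fields are assumptions, not the project's actual schema.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path


def write_audit_event(log_path: Path, correlation_id: str, tool_name: str,
                      repository: str, status: str) -> dict:
    """Append one JSON object per line; existing entries are never rewritten."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # always UTC
        "event": "tool_invocation",
        "correlation_id": correlation_id,
        "tool_name": tool_name,
        "repository": repository,
        "status": status,
    }
    # Mode "a" only ever appends to the end of the file.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event, sort_keys=True) + "\n")
    return event
```

A caller would generate one ID with `str(uuid.uuid4())` per request and pass it to both the "pending" and the "success" entry, so the two lines can be joined during review.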

### 4. Gitea Instance
**Responsibility**: Authorization and data storage

- Source of truth for permissions
- Repository data storage
- Bot user management
- Access control enforcement

### 5. Persistent Volume
**Responsibility**: Audit log storage

- Durable storage for audit logs
- Survives container restarts
- Accessible for review/analysis

---

## Data Flow: Tool Invocation

```
1. User Request
   ├─> "Show me files in org/my-repo"
   └─> ChatGPT decides to call: get_file_tree(owner="org", repo="my-repo")

2. MCP Request
   ├─> POST /mcp/tool/call
   ├─> Body: {"tool": "get_file_tree", "arguments": {"owner": "org", "repo": "my-repo"}}
   └─> Generate correlation_id: uuid4()

3. Audit Log (Entry)
   ├─> Log: tool_invocation
   ├─> tool_name: "get_file_tree"
   ├─> repository: "org/my-repo"
   └─> status: "pending"

4. Gitea API Call
   ├─> GET /api/v1/repos/org/my-repo/git/trees/main
   ├─> Header: Authorization: token XXX
   └─> Response: {"tree": [...files...]}

5. Authorization Check
   ├─> 200 OK → Bot has access
   ├─> 403 Forbidden → Log access_denied, raise error
   └─> 404 Not Found → Repository doesn't exist or no access

6. Response Processing
   ├─> Extract file tree
   ├─> Transform to simplified format
   └─> Apply size/count limits

7. Audit Log (Success)
   ├─> Log: tool_invocation
   ├─> status: "success"
   └─> params: {"count": 42}

8. MCP Response
   ├─> 200 OK
   ├─> Body: {"success": true, "result": {...files...}}
   └─> correlation_id: same as request

9. ChatGPT Processing
   ├─> Receive file tree data
   ├─> Format for user presentation
   └─> "Here are the files in org/my-repo: ..."
```
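
Steps 6 and 8 of the flow above can be sketched as follows. This is an illustration under stated assumptions: `MAX_TREE_ENTRIES`, `simplify_tree`, and `build_response` are hypothetical names, the count limit is invented, and the entry fields (`path`, `type`, `size`) are assumed to match what Gitea returns for a git tree.

```python
import uuid

MAX_TREE_ENTRIES = 1000  # assumption: the source does not name the count limit


def simplify_tree(gitea_tree: list, limit: int = MAX_TREE_ENTRIES) -> dict:
    """Keep only path/type/size per entry and cap the number of entries."""
    entries = [
        {"path": item["path"], "type": item["type"], "size": item.get("size", 0)}
        for item in gitea_tree[:limit]
    ]
    return {
        "files": entries,
        "count": len(entries),
        "truncated": len(gitea_tree) > limit,
    }


def build_response(correlation_id: str, gitea_tree: list,
                   limit: int = MAX_TREE_ENTRIES) -> dict:
    """Wrap the simplified tree in the success envelope from step 8."""
    return {
        "success": True,
        "result": simplify_tree(gitea_tree, limit),
        "correlation_id": correlation_id,
    }


correlation_id = str(uuid.uuid4())  # step 2: one ID per request
raw_tree = [
    {"path": "README.md", "type": "blob", "size": 120, "sha": "abc", "mode": "100644"},
    {"path": "src", "type": "tree", "sha": "def", "mode": "040000"},
    {"path": "src/main.py", "type": "blob", "size": 900, "sha": "123", "mode": "100644"},
]
response = build_response(correlation_id, raw_tree, limit=2)
```

Reporting `truncated` explicitly lets the assistant tell the user that the listing is incomplete instead of silently dropping entries.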

---

## Security Boundaries

```
┌───────────────────────────────────────────────────────────────┐
│                      Trust Boundary 1                         │
│                   (Internet ↔ MCP Server)                     │
│                                                               │
│  Controls:                                                    │
│  - HTTPS/TLS encryption                                       │
│  - Reverse proxy authentication (optional)                    │
│  - Rate limiting                                              │
│  - Firewall rules                                             │
└───────────────────────────────────────────────────────────────┘

┌───────────────────────────────────────────────────────────────┐
│                      Trust Boundary 2                         │
│                 (MCP Server ↔ Gitea API)                      │
│                                                               │
│  Controls:                                                    │
│  - Bot user token authentication                              │
│  - Gitea's access control (authoritative)                     │
│  - API request timeouts                                       │
│  - Input validation                                           │
└───────────────────────────────────────────────────────────────┘

┌───────────────────────────────────────────────────────────────┐
│                      Trust Boundary 3                         │
│              (Container ↔ Host System)                        │
│                                                               │
│  Controls:                                                    │
│  - Non-root container user                                    │
│  - Resource limits (CPU, memory)                              │
│  - No new privileges                                          │
│  - Read-only filesystem (where possible)                      │
└───────────────────────────────────────────────────────────────┘
```
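
The "Input validation" control in Trust Boundary 2 might look like the sketch below. The allow-list pattern is a deliberately conservative assumption and may be stricter than Gitea's own naming rules; `validate_name` is a hypothetical helper.

```python
import re

# Assumption: a conservative allow-list; Gitea's own naming rules may differ.
_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{0,99}")


def validate_name(value: str, field: str) -> str:
    """Reject owner/repo names that could alter the upstream URL path."""
    if not _NAME_RE.fullmatch(value) or ".." in value:
        raise ValueError(f"invalid {field}: {value!r}")
    return value


owner = validate_name("org", "owner")
repo = validate_name("my-repo", "repo")
```

Validating before the Gitea call means path separators, query characters, and `..` sequences never reach the upstream URL.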

---

## Authorization Flow

```
┌──────────────────────────────────────────────────────────────────┐
│  AI requests access to "org/private-repo"                       │
└────────────────────────┬─────────────────────────────────────────┘
                         │
                         ▼
         ┌───────────────────────────────────────┐
         │ MCP Server: Forward to Gitea API      │
         │ with bot user token                   │
         └───────────────┬───────────────────────┘
                         │
                         ▼
         ┌───────────────────────────────────────┐
         │ Gitea: Check bot user permissions    │
         │ for "org/private-repo"                │
         └───────────────┬───────────────────────┘
                         │
                 ┌───────┴────────┐
                 │                │
         Bot is collaborator?     │
                 │                │
        ┌────────▼─────┐   ┌──────▼──────┐
        │ YES          │   │ NO          │
        │ (Read access)│   │ (No access) │
        └────────┬─────┘   └──────┬──────┘
                 │                │
                 ▼                ▼
         ┌───────────────┐  ┌─────────────────┐
         │ Return data   │  │ Return 403      │
         │ Log: success  │  │ Log: denied     │
         └───────────────┘  └─────────────────┘
```

**Key Insight**: The MCP server never makes authorization decisions. It forwards each request with the bot user token and relays Gitea's response.
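
One way to express this pass-through behaviour is a function that only interprets Gitea's HTTP status and consults no local permission table. `ToolError`, its category strings, and `interpret_gitea_status` are hypothetical names chosen for this sketch.

```python
class ToolError(Exception):
    """Hypothetical error type carrying a category for the audit log."""

    def __init__(self, category: str, message: str) -> None:
        super().__init__(message)
        self.category = category


def interpret_gitea_status(status: int, repository: str) -> None:
    """Translate Gitea's HTTP status into an outcome; no local ACL is consulted."""
    if 200 <= status < 300:
        return  # Gitea granted access
    if status == 401:
        raise ToolError("auth_failed", "bot token rejected by Gitea")
    if status == 403:
        raise ToolError("access_denied", f"bot has no access to {repository}")
    if status == 404:
        raise ToolError(
            "not_found", f"{repository} does not exist or is not visible to the bot"
        )
    raise ToolError("upstream_error", f"unexpected Gitea status {status}")
```

Because the function has no inputs other than Gitea's answer, revoking the bot's access in Gitea takes effect on the next request without any change to the MCP server.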

---

## Failure Modes & Handling

### 1. Gitea Unavailable
- **Detection**: HTTP connection error
- **Response**: Return error to ChatGPT
- **Logging**: Log connection failure
- **Recovery**: No operator action needed; the next request attempts a fresh connection

### 2. Invalid Bot Token
- **Detection**: 401 Unauthorized from Gitea
- **Response**: Log security event, return auth error
- **Logging**: High-severity security log
- **Recovery**: Operator must rotate token

### 3. Bot Lacks Permission
- **Detection**: 403 Forbidden from Gitea
- **Response**: Return authorization error
- **Logging**: Access denied event
- **Recovery**: Grant permission in Gitea UI

### 4. File Too Large
- **Detection**: File size exceeds MAX_FILE_SIZE_BYTES
- **Response**: Return size limit error
- **Logging**: Security event (potential abuse)
- **Recovery**: Raise MAX_FILE_SIZE_BYTES if the file is legitimate; otherwise the request stays rejected
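
A minimal sketch of the size check, assuming the file size is known from metadata before the body is fetched. The 1 MiB value and the `FileTooLargeError` and `check_file_size` names are assumptions; only the `MAX_FILE_SIZE_BYTES` setting name comes from this document.

```python
MAX_FILE_SIZE_BYTES = 1_048_576  # assumption: 1 MiB; the document does not state the value


class FileTooLargeError(Exception):
    """Hypothetical error raised when a file exceeds the configured limit."""


def check_file_size(reported_size: int, limit: int = MAX_FILE_SIZE_BYTES) -> int:
    """Reject a file based on its reported size, before its content is returned."""
    if reported_size > limit:
        raise FileTooLargeError(
            f"file is {reported_size} bytes; limit is {limit} bytes"
        )
    return reported_size
```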

### 5. Network Timeout
- **Detection**: Request exceeds REQUEST_TIMEOUT_SECONDS
- **Response**: Return timeout error
- **Logging**: Log timeout event
- **Recovery**: Safe to retry, since every tool is read-only
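
The timeout behaviour could be sketched with `asyncio.wait_for`, as below. The 30-second default, the shape of the error result, and the `call_with_timeout` and `slow_gitea_call` names are assumptions; the slow coroutine stands in for a real Gitea request.

```python
import asyncio

REQUEST_TIMEOUT_SECONDS = 30.0  # assumption: the document does not state the value


async def call_with_timeout(operation, timeout: float = REQUEST_TIMEOUT_SECONDS) -> dict:
    """Await `operation()` and convert a timeout into an error result."""
    try:
        result = await asyncio.wait_for(operation(), timeout=timeout)
    except asyncio.TimeoutError:
        return {"success": False, "error": "timeout", "timeout_seconds": timeout}
    return {"success": True, "result": result}


async def slow_gitea_call() -> str:
    """Stand-in for a Gitea request that takes too long."""
    await asyncio.sleep(1.0)
    return "never returned"


outcome = asyncio.run(call_with_timeout(slow_gitea_call, timeout=0.05))
print(outcome)
```

`asyncio.wait_for` cancels the pending operation when the deadline passes, so a stalled upstream call does not keep running in the background.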

### 6. Rate Limit Exceeded
- **Detection**: Too many requests per minute
- **Response**: Return 429 Too Many Requests
- **Logging**: Log rate limit event
- **Recovery**: Wait and retry
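
A sliding-window limiter is one possible way to detect "too many requests per minute". The sketch below is an assumption about approach, not a description of the server's actual limiter; `SlidingWindowLimiter` is a hypothetical class, and the clock is injectable so the behaviour can be tested without waiting.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds`, tracked in memory."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._timestamps = deque()

    def allow(self) -> bool:
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return False  # caller should answer 429 Too Many Requests
        self._timestamps.append(now)
        return True
```

Because the counter lives in process memory, each replica in a multi-instance deployment would enforce its limit independently.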

---

## Scaling Considerations

### Vertical Scaling (Single Instance)
- **Current**: 128-512 MB RAM, minimal CPU
- **Bottleneck**: Gitea API response time
- **Max throughput**: ~100-200 requests/second (rough estimate; bounded by Gitea API latency)

### Horizontal Scaling (Multiple Instances)
- **Stateless design**: Each instance independent
- **Load balancing**: Standard HTTP load balancer
- **Shared state**: None (all state in Gitea)
- **Audit logs**: Each instance writes to its own log file (or ships to centralized logging)

### Performance Optimization (Future)
- Add Redis caching layer
- Implement connection pooling
- Use HTTP/2 for Gitea API
- Batch multiple file reads

---

## Observability

### Metrics to Monitor
1. **Request rate**: Requests per minute
2. **Error rate**: Failed requests / total requests
3. **Response time**: P50, P95, P99 latency
4. **Gitea API health**: Success rate to Gitea
5. **Auth failures**: 401/403 responses
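
To make the error rate and latency percentiles concrete, here is a small sketch using the nearest-rank percentile method. A real deployment would more likely export these through a metrics library; `percentile` and `summarize` are hypothetical helpers and the sample data is synthetic.

```python
import math


def percentile(samples: list, pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms: list, failed: int) -> dict:
    """Compute the error rate and P50/P95/P99 latency for one reporting window."""
    total = len(latencies_ms)
    return {
        "error_rate": failed / total,
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "p99_ms": percentile(latencies_ms, 99),
    }


# Synthetic window: 100 requests with latencies 1..100 ms, 3 of which failed.
summary = summarize(list(range(1, 101)), failed=3)
print(summary)
```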

### Logs to Track
1. **Audit logs**: Every tool invocation
2. **Access denied**: Permission violations
3. **Security events**: Rate limits, size limits
4. **Errors**: Exceptions and failures

### Alerts to Configure
1. **High error rate**: > 5% errors
2. **Auth failures**: Any 401 responses
3. **Gitea unreachable**: Connection failures
4. **Disk space**: Audit logs filling disk

---

## Future Enhancements

### Phase 2: Extended Context
```
New Tools:
├── get_commits(owner, repo, limit)
├── get_commit_diff(owner, repo, sha)
├── list_issues(owner, repo)
├── get_issue(owner, repo, number)
├── list_pull_requests(owner, repo)
└── get_pull_request(owner, repo, number)
```

### Phase 3: Advanced Features
```
Capabilities:
├── Caching layer (Redis)
├── Webhook support for real-time updates
├── OAuth2 flow instead of static tokens
├── Per-client rate limiting
├── Multi-tenant support (multiple bot users)
└── GraphQL API for more efficient queries
```

---

## Deployment Patterns

### Pattern 1: Single Homelab Instance
```
[Homelab Server]
├── Gitea container
├── AegisGitea MCP container
└── Caddy reverse proxy
    └── Exposes HTTPS endpoint
```

### Pattern 2: Kubernetes Deployment
```
[Kubernetes Cluster]
├── Namespace: aegis-mcp
├── Deployment: aegis-mcp (3 replicas)
├── Service: ClusterIP
├── Ingress: HTTPS with cert-manager
└── PersistentVolume: Audit logs
```

### Pattern 3: Cloud Deployment
```
[AWS/GCP/Azure]
├── Container service (ECS/Cloud Run/ACI)
├── Load balancer (ALB/Cloud Load Balancing/Application Gateway)
├── Secrets manager (AWS Secrets Manager/GCP Secret Manager/Azure Key Vault)
└── Log aggregation (CloudWatch/Cloud Logging/Azure Monitor)
```

---

## Testing Strategy

### Unit Tests
- Configuration loading
- Gitea client methods
- Tool implementations
- Audit logging

### Integration Tests
- Full MCP protocol flow
- Gitea API interactions (mocked)
- Error handling paths

### End-to-End Tests
- Real Gitea instance
- Real bot user
- Real tool invocations

---

## Conclusion

This architecture prioritizes:
1. **Security**: Read-only, auditable, fail-safe
2. **Simplicity**: Straightforward data flow
3. **Maintainability**: Clear separation of concerns
4. **Observability**: Comprehensive logging

The design is intentionally boring and predictable, which suits a security-critical system.