# AegisGitea MCP

**A private, security-first MCP server for controlled AI access to self-hosted Gitea**

---

## Overview

AegisGitea MCP is a Model Context Protocol (MCP) server that enables controlled, auditable, read-only AI access to a self-hosted Gitea environment.

The system lets ChatGPT (Business / Developer environment) inspect repositories, code, commits, issues, and pull requests **only through explicit MCP tool calls**. Access control is managed dynamically through the permissions of a dedicated bot user inside Gitea itself.

### Core Principles

- **Strong separation of concerns**: Clear boundaries between AI, MCP server, and Gitea
- **Least-privilege access**: Bot user has minimal necessary permissions
- **Full auditability**: Every AI action is logged with context
- **Dynamic authorization**: Access control via Gitea permissions (no redeployment needed)
- **Privacy-first**: Designed for homelab and private infrastructure

---

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│  ChatGPT (Business/Developer)                               │
│  - Initiates explicit MCP tool calls                        │
│  - Human-in-the-loop decision making                        │
└────────────────────┬────────────────────────────────────────┘
                     │ HTTPS (MCP over SSE)
                     ▼
┌─────────────────────────────────────────────────────────────┐
│  AegisGitea MCP Server (Python, Docker)                     │
│  - Implements MCP protocol                                  │
│  - Translates tool calls → Gitea API requests               │
│  - Enforces access, logging, and safety constraints         │
│  - Provides bounded, single-purpose tools                   │
└────────────────────┬────────────────────────────────────────┘
                     │ Gitea API (Bot User Token)
                     ▼
┌─────────────────────────────────────────────────────────────┐
│  Gitea Instance (Docker)                                    │
│  - Source of truth for authorization                        │
│  - Hosts dedicated read-only bot user                       │
│  - Determines AI-visible repositories dynamically           │
└─────────────────────────────────────────────────────────────┘
```
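
As a rough illustration of the middle layer, the sketch below maps one tool call onto one read-only Gitea API request. The tool names (`get_repository`, `get_file`), the `ToolCall` type, and `build_gitea_request` are hypothetical and not taken from this project. The `/api/v1` paths and the `Authorization: token ...` header follow Gitea's REST API as commonly documented, but should be checked against your instance's Swagger page.

```python
from dataclasses import dataclass
from urllib.parse import quote
from urllib.request import Request


@dataclass(frozen=True)
class ToolCall:
    """A hypothetical, already-parsed MCP tool call."""
    name: str
    arguments: dict


# Hypothetical tool names mapped to Gitea REST path templates.
ROUTES = {
    "get_repository": "/api/v1/repos/{owner}/{repo}",
    "get_file": "/api/v1/repos/{owner}/{repo}/contents/{path}",
}


def build_gitea_request(call: ToolCall, base_url: str, token: str) -> Request:
    """Translate one tool call into one GET request; unknown tools are rejected."""
    template = ROUTES.get(call.name)
    if template is None:
        raise ValueError(f"unknown tool: {call.name}")
    # Percent-encode every argument; keep "/" only inside file paths.
    parts = {
        key: quote(str(value), safe="/" if key == "path" else "")
        for key, value in call.arguments.items()
    }
    url = base_url.rstrip("/") + template.format(**parts)
    # Only GET is ever issued, so this layer cannot modify anything in Gitea.
    return Request(url, method="GET", headers={"Authorization": f"token {token}"})
```

The request object is only built here, not sent. A real server would also validate arguments and write an audit entry before sending it.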

### Trust Model

| Component | Responsibility |
|-----------|----------------|
| **Gitea** | Authorization (what the AI can see) |
| **MCP Server** | Policy enforcement (how the AI accesses data) |
| **ChatGPT** | Decision initiation (when the AI acts) |
| **Human** | Final decision authority (why the AI acts) |

---

## Features

### Phase 1 — Foundation (Current)

- MCP protocol handling with SSE lifecycle
- Secure Gitea API communication via bot user token
- Health and readiness endpoints
- ChatGPT MCP registration flow

### Phase 2 — Authorization & Data Access (Planned)

- Repository discovery based on bot user permissions
- File tree and content retrieval with size limits
- Dynamic access control (changes in Gitea apply instantly)

### Phase 3 — Audit & Hardening (Planned)

- Comprehensive audit logging (timestamp, tool, repo, path, correlation ID)
- Request correlation and tracing
- Input validation and rate limiting
- Defensive bounds on all operations
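
The rate limiting planned for Phase 3 could take several forms. The sketch below uses a token bucket, which is one common choice and not necessarily what this project will adopt. The class name, parameters, and injectable clock are assumptions made for illustration.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` calls, refilled at `rate` tokens per second."""

    def __init__(self, capacity: int, rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock  # injectable so tests can control time
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False when the call should be throttled."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A server might keep one bucket per client or per tool and reject a tool call whenever `allow()` returns `False`.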

### Phase 4 — Extended Context (Future)

- Commit history and diff inspection
- Issue and pull request visibility
- Full contextual understanding while maintaining read-only guarantees

---

## Authorization Model

### Bot User Strategy

A dedicated Gitea bot user represents "the AI":

- The MCP server authenticates as this user using a read-only token
- The bot user's repository permissions define AI visibility
- **No admin privileges**
- **No write permissions**
- **No implicit access**

This allows dynamic enable/disable of AI access **without restarting or reconfiguring** the MCP server.

**Example:**
```bash
# Grant AI access to a repository
git clone https://gitea.example.com/org/repo.git
cd repo
# Add bot user as collaborator with Read permission in Gitea UI

# Revoke AI access
# Remove bot user from repository in Gitea UI
```
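
The same grant and revoke steps can also be scripted instead of clicked. The sketch below builds, but does not send, requests against Gitea's collaborator endpoint (`PUT` or `DELETE` on `/api/v1/repos/{owner}/{repo}/collaborators/{user}`). Verify that endpoint against your instance's API documentation before relying on it. The function name and the bot user name `aegis-bot` are hypothetical. The token must belong to a repository admin, not to the bot user, and this is an operator task that the MCP server itself never performs.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def collaborator_request(base_url, owner, repo, bot_user, admin_token, grant=True):
    """Build (but do not send) the request that grants or revokes read access."""
    url = "{}/api/v1/repos/{}/{}/collaborators/{}".format(
        base_url.rstrip("/"),
        quote(owner, safe=""),
        quote(repo, safe=""),
        quote(bot_user, safe=""),
    )
    headers = {"Authorization": f"token {admin_token}"}
    if grant:
        # PUT adds the collaborator; "read" keeps the bot user read-only.
        headers["Content-Type"] = "application/json"
        body = json.dumps({"permission": "read"}).encode("utf-8")
        return Request(url, data=body, method="PUT", headers=headers)
    # DELETE removes the collaborator, which revokes the bot user's access.
    return Request(url, method="DELETE", headers=headers)


# Example (hypothetical names); send with urllib.request.urlopen(request).
request = collaborator_request(
    "https://gitea.example.com", "org", "repo", "aegis-bot", "ADMIN_TOKEN"
)
```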

---

## MCP Tool Design

All tools are:

- **Explicit**: Single-purpose, no hidden behavior
- **Deterministic**: The same input against the same repository state always produces the same output
- **Bounded**: Size limits, path constraints, no wildcards
- **Auditable**: Full logging of every invocation

### Tool Categories

1. **Repository Discovery**
   - List repositories visible to bot user
   - Get repository metadata

2. **File Operations**
   - Get file tree for a repository
   - Read file contents (with size limits)

3. **Commit History** (Phase 4)
   - List commits for a repository
   - Get commit details and diffs

4. **Issues & PRs** (Phase 4)
   - List issues and pull requests
   - Read issue/PR details and comments
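
To make "read file contents (with size limits)" concrete, here is a minimal sketch. It assumes the file arrives as a parsed JSON response from Gitea's contents endpoint with `type`, `size`, and base64-encoded `content` fields. The 64 KiB limit, `FileTooLarge`, and `decode_file` are hypothetical names and values; the project does not state an actual limit.

```python
import base64

MAX_FILE_BYTES = 64 * 1024  # hypothetical limit; the project does not specify a value


class FileTooLarge(Exception):
    """Raised when a file exceeds the configured size limit."""


def decode_file(contents: dict, max_bytes: int = MAX_FILE_BYTES) -> str:
    """Decode a contents-API response for a single file, enforcing a size limit."""
    if contents.get("type") != "file":
        raise ValueError("not a regular file")
    # Check the reported size first so oversized content is never decoded.
    if contents.get("size", 0) > max_bytes:
        raise FileTooLarge(f"{contents['size']} bytes exceeds limit of {max_bytes}")
    raw = base64.b64decode(contents.get("content") or "")
    # Re-check after decoding in case the reported size was wrong.
    if len(raw) > max_bytes:
        raise FileTooLarge(f"{len(raw)} bytes exceeds limit of {max_bytes}")
    return raw.decode("utf-8", errors="replace")
```

Raising an error rather than silently truncating keeps the audit trail accurate about what the AI actually saw.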

### Explicit Constraints

- No wildcard search tools
- No full-text indexing
- No recursive "read everything" operations
- No hidden or implicit data access
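
One way to enforce these constraints at the input boundary is to validate every requested path before it reaches the Gitea client. The sketch below is an assumption about how that might look, not the project's implementation. `validate_repo_path`, the forbidden-character set, and the length limit are hypothetical, and the check is deliberately conservative: it also rejects legitimate file names that contain brackets.

```python
from pathlib import PurePosixPath

FORBIDDEN_CHARS = set("*?[]")  # glob metacharacters; assumed sufficient for this sketch


def validate_repo_path(path: str, max_length: int = 512) -> str:
    """Return a normalized repository-relative path or raise ValueError."""
    if not path or len(path) > max_length:
        raise ValueError("path is empty or too long")
    if any(ch in FORBIDDEN_CHARS for ch in path):
        raise ValueError("wildcards are not allowed")
    if "\\" in path or "\x00" in path:
        raise ValueError("unsupported character in path")
    parsed = PurePosixPath(path)
    # Reject absolute paths and any attempt to climb out of the repository.
    if parsed.is_absolute() or ".." in parsed.parts:
        raise ValueError("path must stay inside the repository")
    # PurePosixPath collapses "." segments and repeated slashes.
    return str(parsed)
```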

---

## Audit & Observability

Every MCP tool invocation logs:

- **Timestamp** (UTC)
- **Tool name**
- **Repository identifier**
- **Target** (path / commit / issue)
- **Correlation ID**

Logs are:

- Append-only
- Human-readable JSON
- Machine-parseable
- Stored locally by default

**Audit Philosophy**: The system must answer "What exactly did the AI see, and when?" without ambiguity.
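
A minimal sketch of such a log writer follows, assuming one JSON object per line. The field keys and function names are illustrative rather than the project's actual schema. Opening the file in append mode keeps the application from rewriting earlier entries, but true append-only guarantees would still need filesystem or operating-system controls.

```python
import json
import uuid
from datetime import datetime, timezone


def audit_record(tool, repository, target, correlation_id=None):
    """Build one audit entry with the five fields listed above."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "repository": repository,
        "target": target,
        # Generate a correlation ID when the caller did not supply one.
        "correlation_id": correlation_id or str(uuid.uuid4()),
    }


def append_audit(path, record):
    """Append the record as a single JSON line to the audit log."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
```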

---

## Deployment

### Prerequisites

- Docker and Docker Compose
- Self-hosted Gitea instance
- Gitea bot user with read-only access token

### Quick Start

```bash
# Clone repository
git clone https://gitea.example.com/your-org/AegisGitea-MCP.git
cd AegisGitea-MCP

# Configure environment
cp .env.example .env
# Edit .env with your Gitea URL and bot token

# Start MCP server
docker-compose up -d

# Check logs
docker-compose logs -f aegis-mcp
```

### Environment Variables

| Variable | Description | Required |
|----------|-------------|----------|
| `GITEA_URL` | Base URL of Gitea instance | Yes |
| `GITEA_TOKEN` | Bot user access token | Yes |
| `MCP_HOST` | MCP server listen host | No (default: 0.0.0.0) |
| `MCP_PORT` | MCP server listen port | No (default: 8080) |
| `LOG_LEVEL` | Logging verbosity | No (default: INFO) |
| `AUDIT_LOG_PATH` | Audit log file path | No (default: /var/log/aegis-mcp/audit.log) |
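
The table above could be loaded roughly as follows. This is a sketch under the assumption that configuration is read once at startup; the `Settings` class and `load_settings` function are hypothetical names and are not taken from `config.py`.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    gitea_url: str
    gitea_token: str
    mcp_host: str
    mcp_port: int
    log_level: str
    audit_log_path: str


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (os.environ by default), applying the documented defaults."""
    env = os.environ if env is None else env
    # Fail fast when a required variable is absent or empty.
    missing = [name for name in ("GITEA_URL", "GITEA_TOKEN") if not env.get(name)]
    if missing:
        raise RuntimeError("missing required variables: " + ", ".join(missing))
    return Settings(
        gitea_url=env["GITEA_URL"].rstrip("/"),
        gitea_token=env["GITEA_TOKEN"],
        mcp_host=env.get("MCP_HOST", "0.0.0.0"),
        mcp_port=int(env.get("MCP_PORT", "8080")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        audit_log_path=env.get("AUDIT_LOG_PATH", "/var/log/aegis-mcp/audit.log"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to exercise in tests without touching the real environment.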

### Security Considerations

1. **Never expose the MCP server directly to the internet**: put it behind a reverse proxy that terminates TLS
2. **Rotate bot tokens regularly**
3. **Monitor audit logs** for unexpected access patterns
4. **Keep Docker images updated**
5. **Use a dedicated bot user** — never use a personal account token

---

## Development

### Setup

```bash
# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements-dev.txt

# Run tests
pytest tests/

# Run server locally
python -m aegis_gitea_mcp.server
```

### Project Structure

```
AegisGitea-MCP/
├── src/
│   └── aegis_gitea_mcp/
│       ├── __init__.py
│       ├── server.py           # MCP server entry point
│       ├── mcp_protocol.py     # MCP protocol implementation
│       ├── gitea_client.py     # Gitea API client
│       ├── audit.py            # Audit logging
│       ├── config.py           # Configuration management
│       └── tools/              # MCP tool implementations
│           ├── __init__.py
│           ├── repository.py   # Repository discovery tools
│           └── files.py        # File access tools
├── tests/
│   ├── test_mcp_protocol.py
│   ├── test_gitea_client.py
│   └── test_tools.py
├── docker/
│   ├── Dockerfile
│   └── docker-compose.yml
├── .env.example
├── pyproject.toml
├── requirements.txt
├── requirements-dev.txt
└── README.md
```

---

## Non-Goals

Explicitly **out of scope**:

- No write access to Gitea (no commits, comments, merges, edits)
- No autonomous or background scanning
- No global search or unrestricted crawling
- No public exposure of repositories or credentials
- No coupling to GitHub or external VCS platforms

---

## Roadmap

- [x] Project initialization and architecture design
- [ ] **Phase 1**: MCP server foundation and Gitea integration
- [ ] **Phase 2**: Repository discovery and file access tools
- [ ] **Phase 3**: Audit logging and security hardening
- [ ] **Phase 4**: Commit history, issues, and PR support

---

## Contributing

This project prioritizes security and privacy. Contributions should:

1. Maintain read-only guarantees
2. Add comprehensive audit logging for new tools
3. Include tests for authorization and boundary cases
4. Document security implications

---

## License

MIT License. See the LICENSE file for details.

---

## Acknowledgments

Built on the [Model Context Protocol](https://modelcontextprotocol.io/) by Anthropic.

---

## Support

For issues, questions, or security concerns, please open an issue in the Gitea repository.

**Remember**: This is designed to be **boring, predictable, and safe** — not clever, not magical, and not autonomous.