# AegisGitea MCP

**A private, security-first MCP server for controlled AI access to self-hosted Gitea**

---

## Overview

AegisGitea MCP is a Model Context Protocol (MCP) server that enables controlled, auditable, read-only AI access to a self-hosted Gitea environment.

The system is designed to let ChatGPT (Business / Developer environment) inspect repositories, code, commits, issues, and pull requests **only through explicit MCP tool calls**. All access control is managed dynamically through a dedicated bot user inside Gitea itself.

### Core Principles

- **Strong separation of concerns**: Clear boundaries between AI, MCP server, and Gitea
- **Least-privilege access**: Bot user has minimal necessary permissions
- **Full auditability**: Every AI action is logged with context
- **Dynamic authorization**: Access control via Gitea permissions (no redeployment needed)
- **Privacy-first**: Designed for homelab and private infrastructure

---

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│  ChatGPT (Business/Developer)                               │
│  - Initiates explicit MCP tool calls                        │
│  - Human-in-the-loop decision making                        │
└────────────────────┬────────────────────────────────────────┘
                     │ HTTPS (MCP over SSE)
                     ▼
┌─────────────────────────────────────────────────────────────┐
│  AegisGitea MCP Server (Python, Docker)                     │
│  - Implements MCP protocol                                  │
│  - Translates tool calls → Gitea API requests               │
│  - Enforces access, logging, and safety constraints         │
│  - Provides bounded, single-purpose tools                   │
└────────────────────┬────────────────────────────────────────┘
                     │ Gitea API (Bot User Token)
                     ▼
┌─────────────────────────────────────────────────────────────┐
│  Gitea Instance (Docker)                                    │
│  - Source of truth for authorization                        │
│  - Hosts dedicated read-only bot user                       │
│  - Determines AI-visible repositories dynamically           │
└─────────────────────────────────────────────────────────────┘
```

### Trust Model

| Component | Responsibility |
|-----------|----------------|
| **Gitea** | Authorization (what the AI can see) |
| **MCP Server** | Policy enforcement (how the AI accesses data) |
| **ChatGPT** | Decision initiation (when the AI acts) |
| **Human** | Final decision authority (why the AI acts) |

---

## Features

### Phase 1: Foundation (Current)

- MCP protocol handling with SSE lifecycle
- Secure Gitea API communication via bot user token
- Health and readiness endpoints
- ChatGPT MCP registration flow

### Phase 2: Authorization & Data Access (Planned)

- Repository discovery based on bot user permissions
- File tree and content retrieval with size limits
- Dynamic access control (changes in Gitea apply instantly)

### Phase 3: Audit & Hardening (Planned)

- Comprehensive audit logging (timestamp, tool, repo, path, correlation ID)
- Request correlation and tracing
- Input validation and rate limiting
- Defensive bounds on all operations

### Phase 4: Extended Context (Future)

- Commit history and diff inspection
- Issue and pull request visibility
- Full contextual understanding while maintaining read-only guarantees

---

## Authorization Model

### Bot User Strategy

A dedicated Gitea bot user represents "the AI":

- The MCP server authenticates as this user using a read-only token
- The bot user's repository permissions define AI visibility
- **No admin privileges**
- **No write permissions**
- **No implicit access**

This allows AI access to be enabled or disabled dynamically, **without restarting or reconfiguring** the MCP server.

**Example:**

```bash
# Grant AI access to a repository
git clone https://gitea.example.com/org/repo.git
cd repo
# Add bot user as collaborator with Read permission in Gitea UI

# Revoke AI access
# Remove bot user from repository in Gitea UI
```

---

## MCP Tool Design

All tools are:

- **Explicit**: Single-purpose, no hidden behavior
- **Deterministic**: Same input produces the same output for a given repository state
- **Bounded**: Size limits, path constraints, no wildcards
- **Auditable**: Full logging of every invocation

### Tool Categories

1. **Repository Discovery**
   - List repositories visible to bot user
   - Get repository metadata

2. **File Operations**
   - Get file tree for a repository
   - Read file contents (with size limits)

3. **Commit History** (Phase 4)
   - List commits for a repository
   - Get commit details and diffs

4. **Issues & PRs** (Phase 4)
   - List issues and pull requests
   - Read issue/PR details and comments

### Explicit Constraints

- No wildcard search tools
- No full-text indexing
- No recursive "read everything" operations
- No hidden or implicit data access

---

## Audit & Observability

Every MCP tool invocation logs:

- **Timestamp** (UTC)
- **Tool name**
- **Repository identifier**
- **Target** (path / commit / issue)
- **Correlation ID**

Logs are:

- Append-only
- Human-readable JSON
- Machine-parseable
- Stored locally by default

**Audit Philosophy**: The system must answer "What exactly did the AI see, and when?" without ambiguity.

---

## Deployment

### Prerequisites

- Docker and Docker Compose
- Self-hosted Gitea instance
- Gitea bot user with read-only access token

### Quick Start

```bash
# Clone repository
git clone https://gitea.example.com/your-org/AegisGitea-MCP.git
cd AegisGitea-MCP

# Configure environment
cp .env.example .env
# Edit .env with your Gitea URL and bot token

# Start MCP server
docker-compose up -d

# Check logs
docker-compose logs -f aegis-mcp
```

### Environment Variables

| Variable | Description | Required |
|----------|-------------|----------|
| `GITEA_URL` | Base URL of Gitea instance | Yes |
| `GITEA_TOKEN` | Bot user access token | Yes |
| `MCP_HOST` | MCP server listen host | No (default: 0.0.0.0) |
| `MCP_PORT` | MCP server listen port | No (default: 8080) |
| `LOG_LEVEL` | Logging verbosity | No (default: INFO) |
| `AUDIT_LOG_PATH` | Audit log file path | No (default: /var/log/aegis-mcp/audit.log) |

### Security Considerations

1. **Never expose the MCP server directly to the public internet**: put it behind a reverse proxy with TLS
2. **Rotate bot tokens regularly**
3. **Monitor audit logs** for unexpected access patterns
4. **Keep Docker images updated**
5. **Use a dedicated bot user**: never use a personal account token

---

## Development

### Setup

```bash
# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements-dev.txt

# Run tests
pytest tests/

# Run server locally
python -m aegis_gitea_mcp.server
```

### Project Structure

```
AegisGitea-MCP/
├── src/
│   └── aegis_gitea_mcp/
│       ├── __init__.py
│       ├── server.py           # MCP server entry point
│       ├── mcp_protocol.py     # MCP protocol implementation
│       ├── gitea_client.py     # Gitea API client
│       ├── audit.py            # Audit logging
│       ├── config.py           # Configuration management
│       └── tools/              # MCP tool implementations
│           ├── __init__.py
│           ├── repository.py   # Repository discovery tools
│           └── files.py        # File access tools
├── tests/
│   ├── test_mcp_protocol.py
│   ├── test_gitea_client.py
│   └── test_tools.py
├── docker/
│   ├── Dockerfile
│   └── docker-compose.yml
├── .env.example
├── pyproject.toml
├── requirements.txt
├── requirements-dev.txt
└── README.md
```

---

## Non-Goals

Explicitly **out of scope**:

- No write access to Gitea (no commits, comments, merges, edits)
- No autonomous or background scanning
- No global search or unrestricted crawling
- No public exposure of repositories or credentials
- No coupling to GitHub or external VCS platforms

---

## Roadmap

- [x] Project initialization and architecture design
- [ ] **Phase 1**: MCP server foundation and Gitea integration
- [ ] **Phase 2**: Repository discovery and file access tools
- [ ] **Phase 3**: Audit logging and security hardening
- [ ] **Phase 4**: Commit history, issues, and PR support

---

## Contributing

This project prioritizes security and privacy. Contributions should:

1. Maintain read-only guarantees
2. Add comprehensive audit logging for new tools
3. Include tests for authorization and boundary cases
4. Document security implications

---

## License

MIT License - See LICENSE file for details

---

## Acknowledgments

Built on the [Model Context Protocol](https://modelcontextprotocol.io/) by Anthropic.

---

## Support

For issues, questions, or security concerns, please open an issue in the Gitea repository.

**Remember**: This is designed to be **boring, predictable, and safe**: not clever, not magical, and not autonomous.
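
---

As an illustration of the audit record described under Audit & Observability, the sketch below builds one JSON object per tool invocation and appends it to a log file as a single line. The function names (`build_audit_record`, `append_audit_record`) and the JSON key names are assumptions made for this example; they are not taken from the project's `audit.py`.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def build_audit_record(tool: str, repository: str, target: str,
                       correlation_id: Optional[str] = None) -> dict:
    """Build one audit record holding the five fields listed in this README.

    The key names are hypothetical; only the set of fields comes from the README.
    """
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # always UTC
        "tool": tool,
        "repository": repository,
        "target": target,  # path, commit, or issue
        "correlation_id": correlation_id or str(uuid.uuid4()),
    }


def append_audit_record(log_path: Path, record: dict) -> None:
    """Append the record as one JSON line; earlier lines are never rewritten."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" only ever adds to the end of the file.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
```

One JSON object per line keeps the log readable with ordinary text tools and parseable line by line. Opening the file in append mode is only an application-level convention; stronger append-only guarantees would have to come from the filesystem or the log shipper.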
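
---

The Environment Variables table can also be read as a small configuration loader. The following is a minimal sketch, assuming a hypothetical `Settings` dataclass and `load_settings` function (the project's actual `config.py` may look quite different). The variable names and defaults come from the table; stripping a trailing slash from `GITEA_URL` and upper-casing `LOG_LEVEL` are choices made for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container mirroring the Environment Variables table."""
    gitea_url: str
    gitea_token: str
    mcp_host: str = "0.0.0.0"
    mcp_port: int = 8080
    log_level: str = "INFO"
    audit_log_path: str = "/var/log/aegis-mcp/audit.log"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (defaults to os.environ) and fail fast
    when a required variable is missing or empty."""
    source = os.environ if env is None else env
    missing = [name for name in ("GITEA_URL", "GITEA_TOKEN") if not source.get(name)]
    if missing:
        raise ValueError("Missing required environment variables: " + ", ".join(missing))
    return Settings(
        gitea_url=source["GITEA_URL"].rstrip("/"),  # example choice: normalize the base URL
        gitea_token=source["GITEA_TOKEN"],
        mcp_host=source.get("MCP_HOST", "0.0.0.0"),
        mcp_port=int(source.get("MCP_PORT", "8080")),
        log_level=source.get("LOG_LEVEL", "INFO").upper(),
        audit_log_path=source.get("AUDIT_LOG_PATH", "/var/log/aegis-mcp/audit.log"),
    )
```

Passing a plain dictionary instead of relying on `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"GITEA_URL": "https://gitea.example.com", "GITEA_TOKEN": "example-token"})`.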