diff --git a/API-Reference.md b/API-Reference.md new file mode 100644 index 0000000..1691279 --- /dev/null +++ b/API-Reference.md @@ -0,0 +1,118 @@ +# API Reference + +## Core Endpoints + +- `GET /`: server metadata. +- `GET /health`: health probe. +- `GET /metrics`: Prometheus metrics (when enabled). + +## OAuth Discovery and Token Exchange + +- `GET /.well-known/oauth-protected-resource` + - Returns OAuth protected resource metadata used by MCP clients. +- `GET /.well-known/oauth-authorization-server` + - Returns OAuth authorization server metadata. +- `POST /register` + - Registers an OAuth client and persists the client metadata. +- `POST /oauth/token` + - Proxies OAuth authorization-code token exchange to Gitea. + +## MCP Endpoints + +- `GET /mcp/tools`: list tool definitions. +- `GET /mcp` and `POST /mcp`: streamable HTTP transport. +- `GET /mcp/sse` and `POST /mcp/sse`: MCP SSE transport alias. +- `POST /mcp/tool/call`: direct tool-call endpoint. + +Authentication requirements: + +- MCP tool execution requires `Authorization: Bearer `. +- Missing or invalid tokens return `401` with: + - `WWW-Authenticate: Bearer resource_metadata="", scope="read:repository"` + +Scope requirements: + +- Read tools require `read:repository`. +- Write tools require `write:repository`. +- Insufficient scope returns `403`. + +## Automation Endpoints + +- `POST /automation/webhook`: ingest policy-controlled webhook events. +- `POST /automation/jobs/run`: run policy-controlled automation jobs. 
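The authentication requirements above can be illustrated from the client side. What follows is a minimal sketch, not code from the server, of how a client might read the `WWW-Authenticate` challenge that accompanies a `401`. The helper name `parse_bearer_challenge` and the example metadata URL are assumptions made for illustration only.

```python
import re

# Matches key="value" pairs such as scope="read:repository".
_PARAM_RE = re.compile(r'([A-Za-z_]+)="([^"]*)"')


def parse_bearer_challenge(header_value: str) -> dict[str, str]:
    """Split a Bearer challenge into its quoted parameters (hypothetical helper)."""
    scheme, _, params = header_value.strip().partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError(f"unsupported auth scheme: {scheme!r}")
    return dict(_PARAM_RE.findall(params))


# The URL below is a placeholder, not a real deployment.
challenge = (
    'Bearer resource_metadata="https://mcp.example.com/.well-known/'
    'oauth-protected-resource", scope="read:repository"'
)
params = parse_bearer_challenge(challenge)
print(params["scope"])
```

A client could use the `resource_metadata` value to start OAuth discovery and the `scope` value to decide which scope to request. A `403` would instead suggest the token is valid but lacks the scope the tool needs.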
+ +## Read Tools + +- `list_repositories` +- `get_repository_info` (`owner`, `repo`) +- `get_file_tree` (`owner`, `repo`, optional `ref`, `recursive`) +- `get_file_contents` (`owner`, `repo`, `filepath`, optional `ref`) +- `search_code` (`owner`, `repo`, `query`, optional `ref`, `page`, `limit`) +- `list_commits` (`owner`, `repo`, optional `ref`, `page`, `limit`) +- `get_commit_diff` (`owner`, `repo`, `sha`) +- `compare_refs` (`owner`, `repo`, `base`, `head`) +- `list_issues` (`owner`, `repo`, optional `state`, `page`, `limit`, `labels`) +- `get_issue` (`owner`, `repo`, `issue_number`) +- `list_pull_requests` (`owner`, `repo`, optional `state`, `page`, `limit`) +- `get_pull_request` (`owner`, `repo`, `pull_number`) +- `list_labels` (`owner`, `repo`, optional `page`, `limit`) +- `list_tags` (`owner`, `repo`, optional `page`, `limit`) +- `list_releases` (`owner`, `repo`, optional `page`, `limit`) +- `list_pull_request_files` (`owner`, `repo`, `pull_number`, optional `page`, `limit`) +- `list_pull_request_commits` (`owner`, `repo`, `pull_number`, optional `page`, `limit`) +- `list_issue_comments` (`owner`, `repo`, `issue_number`, optional `page`, `limit`) +- `list_branches` (`owner`, `repo`, optional `page`, `limit`) +- `get_branch` (`owner`, `repo`, `branch`) +- `get_release` (`owner`, `repo`, `release_id`) +- `get_latest_release` (`owner`, `repo`) +- `list_milestones` (`owner`, `repo`, optional `state`, `page`, `limit`) +- `get_commit_status` (`owner`, `repo`, `sha`) +- `list_org_repositories` (`org`, optional `page`, `limit`) +- `list_organizations` (optional `page`, `limit`) +- `get_repo_languages` (`owner`, `repo`) +- `list_repo_topics` (`owner`, `repo`) + +## Write Tools (Write Mode Required) + +- `create_issue` (`owner`, `repo`, `title`, optional `body`, `labels`, `assignees`, `milestone`) +- `update_issue` (`owner`, `repo`, `issue_number`, one or more of `title`, `body`, `state`, `milestone`) +- `create_issue_comment` (`owner`, `repo`, `issue_number`, `body`) 
+- `create_pr_comment` (`owner`, `repo`, `pull_number`, `body`) +- `add_labels` (`owner`, `repo`, `issue_number`, `labels` by name) +- `remove_labels` (`owner`, `repo`, `issue_number`, `labels` by name) +- `assign_issue` (`owner`, `repo`, `issue_number`, `assignees`) +- `create_label` (`owner`, `repo`, `name`, `color` hex e.g. `#00aabb`, optional `description`, `exclusive`) +- `update_label` (`owner`, `repo`, `name`, one or more of `new_name`, `color`, `description`) +- `create_pull_request` (`owner`, `repo`, `title`, `head`, `base`, optional `body`) +- `create_release` (`owner`, `repo`, `tag_name`, optional `name`, `body`, `draft`, `prerelease`, `target`) +- `edit_release` (`owner`, `repo`, `release_id`, one or more of `name`, `body`, `draft`, `prerelease`) +- `create_branch` (`owner`, `repo`, `new_branch_name`, optional `old_branch_name`) +- `create_milestone` (`owner`, `repo`, `title`, optional `description`, `due_on`) +- `edit_issue_comment` (`owner`, `repo`, `comment_id`, `body`) + +Not supported by design: merge, branch/label/release deletion, force push, repo/admin +management. + +Note: `create_issue`, `add_labels`, and `remove_labels` accept label **names**; the +server resolves them to Gitea label ids and returns a clear error for unknown labels. + +Note: the `milestone` argument on `create_issue`/`update_issue` accepts either a numeric +milestone **id** or a milestone **title** (resolved case-insensitively against open and +closed milestones; unknown titles return a clear error). On `update_issue`, `milestone: 0` +clears the issue's milestone. Gitea Projects (Kanban boards) are intentionally unsupported: +the Gitea REST API exposes no project endpoints. + +## Validation and Limits + +- All tool argument schemas reject unknown fields. +- List responses are capped by `MAX_TOOL_RESPONSE_ITEMS`. +- Text payloads are capped by `MAX_TOOL_RESPONSE_CHARS`. +- File reads are capped by `MAX_FILE_SIZE_BYTES`. + +## Error Model + +- Auth error: HTTP `401`. 
+- Policy/scope denial: HTTP `403`. +- Validation error: HTTP `400`. +- Rate limit: HTTP `429`. +- Internal errors: HTTP `500` (no stack traces in production). diff --git a/Architecture.md b/Architecture.md new file mode 100644 index 0000000..09b05b8 --- /dev/null +++ b/Architecture.md @@ -0,0 +1,175 @@ +# Architecture + +## Overview + +AegisGitea MCP is a Python 3.10+ application built on **FastAPI**. It acts as a bridge between an AI client (such as Claude, Claude Code, or Cowork) and a self-hosted Gitea instance, implementing the [Model Context Protocol (MCP)](https://modelcontextprotocol.io). + +``` +AI Client (Claude / Claude Code / Cowork) + │ + │ HTTP (Authorization: Bearer ) + ▼ +┌────────────────────────────────────────────┐ +│ FastAPI Server │ +│ server.py │ +│ - Route: GET/POST /mcp │ +│ - Route: POST /mcp/tool/call │ +│ - Route: GET /mcp/tools │ +│ - Route: GET /health │ +│ - Streamable HTTP transport │ +│ - Legacy SSE alias (GET/POST /mcp/sse) │ +└───────┬───────────────────┬────────────────┘ + │ │ + ┌────▼────┐ ┌────▼──────────────┐ + │ auth │ │ Tool dispatcher │ + │ auth.py│ │ (server.py) │ + └────┬────┘ └────────┬──────────┘ + │ │ + │ ┌────────▼──────────┐ + │ │ Tool handlers │ + │ │ tools/repo.py │ + │ └────────┬──────────┘ + │ │ + │ ┌────────▼──────────┐ + │ │ GiteaClient │ + │ │ gitea_client.py │ + │ └────────┬──────────┘ + │ │ HTTPS + │ ▼ + │ Gitea instance + │ + ┌────▼────────────────────┐ + │ AuditLogger │ + │ audit.py │ + │ /var/log/aegis-mcp/ │ + │ audit.log │ + └─────────────────────────┘ +``` + +--- + +## Source Modules + +### `server.py` + +The entry point and FastAPI application. Responsibilities: + +- Defines all HTTP routes +- Reads configuration on startup and initialises `GiteaClient` +- Applies authentication middleware to protected routes +- Dispatches tool calls to the appropriate handler function +- Handles CORS + +### `auth.py` + +API key validation. 
Responsibilities: + +- `APIKeyValidator` class: holds the set of valid keys, tracks failed attempts per IP +- Constant-time comparison to prevent timing side-channels +- Rate limiting: blocks IPs that exceed `MAX_AUTH_FAILURES` within `AUTH_FAILURE_WINDOW` +- Helper functions for key generation and hashing +- Singleton pattern (`get_validator()`) with test-friendly reset (`reset_validator()`) + +### `config.py` + +Pydantic `BaseSettings` model. Responsibilities: + +- Loads all configuration from environment variables or `.env` +- Validates values (log level enum, token format, key minimum length) +- Parses comma-separated `MCP_API_KEYS` into a list +- Exposes computed properties (e.g. base URL for Gitea API) + +### `gitea_client.py` + +Async HTTP client for the Gitea API. Responsibilities: + +- Wraps `httpx.AsyncClient` with bearer token authentication +- Maps HTTP status codes to typed exceptions (`GiteaAuthenticationError`, `GiteaNotFoundError`, etc.) +- Enforces file size limit before returning file contents +- Logs all API calls to the audit logger + +Key methods: + +| Method | Gitea endpoint | +|---|---| +| `get_current_user()` | `GET /api/v1/user` | +| `list_repositories()` | `GET /api/v1/user/repos` | +| `get_repository()` | `GET /api/v1/repos/{owner}/{repo}` | +| `get_file_contents()` | `GET /api/v1/repos/{owner}/{repo}/contents/{path}` | +| `get_tree()` | `GET /api/v1/repos/{owner}/{repo}/git/trees/{ref}` | + +### `audit.py` + +Structured audit logging using `structlog`. Responsibilities: + +- Initialises a `structlog` logger writing JSON to the configured log file +- `log_tool_invocation()`: records tool calls with result and correlation ID +- `log_access_denied()`: records failed authentication +- `log_security_event()`: records rate limit triggers and other security events +- Auto-generates UUID correlation IDs when none is provided + +### `mcp_protocol.py` + +MCP data models and tool registry. 
Responsibilities: + +- Pydantic models: `MCPTool`, `MCPToolCallRequest`, `MCPToolCallResponse`, `MCPListToolsResponse` +- `AVAILABLE_TOOLS` list: the canonical list of tools exposed to clients +- `get_tool_by_name()`: lookup helper used by the dispatcher + +### `tools/repository.py` + +Concrete tool handler functions. Responsibilities: + +- `list_repositories_tool()`: calls `GiteaClient.list_repositories()`, formats the result +- `get_repository_info_tool()`: calls `GiteaClient.get_repository()`, formats metadata +- `get_file_tree_tool()`: calls `GiteaClient.get_tree()`, flattens to a list of paths +- `get_file_contents_tool()`: calls `GiteaClient.get_file_contents()`, decodes base64 + +All handlers return a plain string. `server.py` wraps this in an `MCPToolCallResponse`. + +--- + +## Request Lifecycle + +``` +1. Client sends POST /mcp/tool/call + │ +2. FastAPI routes the request to the tool-call handler in server.py + │ +3. OAuth middleware validates the Bearer token via Gitea OIDC/JWKS or userinfo + ├── Fail → AuditLogger.log_access_denied() → HTTP 401 / 429 + └── Pass → continue + │ +4. AuditLogger.log_tool_invocation(status="pending") + │ +5. Tool dispatcher looks up the tool by name (mcp_protocol.get_tool_by_name) + │ +6. Policy engine checks read/write mode and repository/path policy + │ +7. If GITEA_TOKEN is configured, service-PAT authz checks + GET /repos/{owner}/{repo}/collaborators/{user}/permission + │ +8. Tool handler function (tools/repository.py) is called + │ +9. GiteaClient makes an async HTTP call to the Gitea API + │ +10. Result (or error) is returned to server.py + │ +11. AuditLogger.log_tool_invocation(status="success" | "error") + │ +12. MCPToolCallResponse is returned to the client +``` + +--- + +## Key Design Decisions + +**Read by default, writes opt-in.** Read tools are available by default. Write-capable tools require `WRITE_MODE=true`, repository write policy/whitelist approval, and `write:repository` authorization. 
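The three write conditions named above can be read as a conjunction. The sketch below is an illustration under stated assumptions rather than the server's actual code: the function name `write_permitted` and its parameters are hypothetical, and the YAML policy engine that also takes part in the real decision is deliberately left out.

```python
def write_permitted(
    *,
    write_mode: bool,
    repo: str,
    whitelist: frozenset[str],
    allow_all_token_repos: bool,
    token_scopes: frozenset[str],
) -> bool:
    """Return True only when every write precondition holds (hypothetical helper)."""
    # 1. The global toggle (WRITE_MODE) must be on.
    if not write_mode:
        return False
    # 2. The repository must be whitelisted unless allow-all is set.
    if not (allow_all_token_repos or repo in whitelist):
        return False
    # 3. The caller's token must carry the write scope.
    return "write:repository" in token_scopes


allowed = write_permitted(
    write_mode=True,
    repo="acme/service-a",
    whitelist=frozenset({"acme/service-a"}),
    allow_all_token_repos=False,
    token_scopes=frozenset({"read:repository", "write:repository"}),
)
print(allowed)
```

Ordering the checks from cheapest to most specific keeps the default path (write mode off) a single comparison, which matches the deny-by-default posture described here.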
+ +**Gitea controls repository access.** Without `GITEA_TOKEN`, Gitea enforces repository permissions on API calls made with the user's token. With `GITEA_TOKEN`, the service PAT can only execute after the server verifies the requesting user's actual repository permission through Gitea and writes an audit denial if the check fails. + +**Public tool discovery.** `GET /mcp/tools` requires no authentication so that MCP clients can discover the available tools without credentials. All other endpoints require authentication. + +**Minimal persisted state.** The audit log is persisted for tamper-evident review. Dynamic OAuth client registrations are persisted when DCR is enabled. Rate limit counters and short-lived authz caches are in-memory and reset on restart. + +**Async throughout.** FastAPI + `httpx.AsyncClient` means all Gitea API calls are non-blocking, allowing the server to handle concurrent requests efficiently. diff --git a/Audit.md b/Audit.md new file mode 100644 index 0000000..a64f1ca --- /dev/null +++ b/Audit.md @@ -0,0 +1,53 @@ +# Audit Logging + +## Design + +Audit logs are append-only JSON lines with hash chaining: +- `prev_hash`: previous entry hash. +- `entry_hash`: hash of current entry payload + previous hash. + +This makes tampering detectable. + +## Event Types + +- `tool_invocation` +- `access_denied` +- `security_event` + +Each event includes timestamps and correlation context. + +## Integrity Validation + +Use: + +```bash +python3 scripts/validate_audit_log.py --path /var/log/aegis-mcp/audit.log +``` + +Exit code `0` indicates valid chain, non-zero indicates tamper/corruption. + +## Operational Expectations + +- Persist audit logs to durable storage. +- Protect write permissions (service account only). +- Validate integrity during incident response and release checks. + +## Rotation + +The server appends to a single audit file and does not rotate it in process — rotating +mid-stream would break the `prev_hash`/`entry_hash` chain. 
Manage growth externally with +`logrotate` using `copytruncate` so the open file handle keeps appending: + +``` +/var/log/aegis-mcp/audit.log { + weekly + rotate 12 + compress + missingok + notifempty + copytruncate +} +``` + +Run `scripts/validate_audit_log.py` against each rotated segment to confirm the chain +remains intact across rotations before archiving. diff --git a/Automation.md b/Automation.md new file mode 100644 index 0000000..0176590 --- /dev/null +++ b/Automation.md @@ -0,0 +1,27 @@ +# Automation + +## Scope + +Current automation capabilities: +- Webhook ingestion endpoint (`POST /automation/webhook`). +- On-demand scheduled-job execution endpoint (`POST /automation/jobs/run`). +- Dependency hygiene scan job scaffold (`dependency_hygiene_scan`). +- Stale issue detection job (`stale_issue_detection`). +- Auto issue creation job scaffold (`auto_issue_creation`, write-mode and policy required). + +Planned extensions: +- Background scheduler orchestration. + +## Control Requirements + +All automation must be: +- Policy-controlled. +- Independently disableable. +- Fully audited. +- Explicitly documented with runbook guidance. + +## Enablement + +- `AUTOMATION_ENABLED=true` to allow automation endpoints. +- `AUTOMATION_SCHEDULER_ENABLED=true` reserved for future built-in scheduler loop. +- Policy rules must allow automation pseudo-tools (`automation_*`) per repository. 
diff --git a/Configuration.md b/Configuration.md new file mode 100644 index 0000000..efb9f1e --- /dev/null +++ b/Configuration.md @@ -0,0 +1,72 @@ +# Configuration + +Copy `.env.example` to `.env` and set values before starting: + +```bash +cp .env.example .env +``` + +## OAuth/OIDC Settings (Primary) + +| Variable | Required | Default | Description | +|---|---|---|---| +| `GITEA_URL` | Yes | - | Base URL of your Gitea instance | +| `OAUTH_MODE` | No | `false` | Enables OAuth-oriented validation settings | +| `GITEA_OAUTH_CLIENT_ID` | Yes when `OAUTH_MODE=true` | - | OAuth client id | +| `GITEA_OAUTH_CLIENT_SECRET` | Yes when `OAUTH_MODE=true` | - | OAuth client secret | +| `OAUTH_EXPECTED_AUDIENCE` | No | empty | Additional accepted JWT audience beyond the MCP resource and Gitea client id | +| `OAUTH_CACHE_TTL_SECONDS` | No | `300` | OIDC discovery/JWKS cache TTL | +| `OAUTH_STATE_SECRET` | Yes when `OAUTH_MODE=true` | - | HMAC secret for signed OAuth state wrappers; must be at least 32 characters (e.g. `openssl rand -hex 32`) | +| `OAUTH_REDIRECT_ALLOWLIST` | No | empty | Additional allowed redirect URIs for OAuth clients | + +## MCP Server Settings + +| Variable | Required | Default | Description | +|---|---|---|---| +| `MCP_HOST` | No | `127.0.0.1` | Interface to bind to | +| `MCP_PORT` | No | `8080` | Port to listen on | +| `PUBLIC_BASE_URL` | No | empty | Public HTTPS base URL advertised in OAuth metadata (recommended behind reverse proxy) | +| `ALLOW_INSECURE_BIND` | No | `false` | Explicit opt-in required for `0.0.0.0` bind | +| `LOG_LEVEL` | No | `INFO` | `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL` | +| `STARTUP_VALIDATE_GITEA` | No | `true` | Validate OIDC discovery endpoint at startup | +| `DCR_ENABLED` | No | `true` | Enable dynamic client registration at `/register` | +| `DCR_STORAGE_PATH` | No | `/var/lib/aegis-mcp/dcr_clients.json` | Persisted OAuth client registry path. 
Written with owner-only (`0o600`) permissions on POSIX hosts | + +## Security and Limits + +| Variable | Required | Default | Description | +|---|---|---|---| +| `MAX_AUTH_FAILURES` | No | `5` | Failed auth attempts before rate limiting | +| `AUTH_FAILURE_WINDOW` | No | `300` | Window in seconds for auth failure counting | +| `RATE_LIMIT_PER_MINUTE` | No | `60` | Per-IP request limit | +| `TOKEN_RATE_LIMIT_PER_MINUTE` | No | `120` | Per-token request limit | +| `MAX_FILE_SIZE_BYTES` | No | `1048576` | Max file payload returned by read tools | +| `MAX_TOOL_RESPONSE_ITEMS` | No | `200` | Max list items in tool responses | +| `MAX_TOOL_RESPONSE_CHARS` | No | `20000` | Max chars in text fields | +| `REQUEST_TIMEOUT_SECONDS` | No | `30` | Upstream timeout for Gitea calls | +| `SECRET_DETECTION_MODE` | No | `mask` | `off`, `mask`, `block` | +| `REPO_AUTHZ_CACHE_TTL_SECONDS` | No | `60` | TTL for cached per-user repository permission checks | + +## Write Mode + +| Variable | Required | Default | Description | +|---|---|---|---| +| `WRITE_MODE` | No | `false` | Enables write tools | +| `WRITE_REPOSITORY_WHITELIST` | Required if write mode enabled and allow-all disabled | empty | Comma-separated `owner/repo` allow list | +| `WRITE_ALLOW_ALL_TOKEN_REPOS` | No | `false` | Allow all repos accessible by token | + +## Automation + +| Variable | Required | Default | Description | +|---|---|---|---| +| `AUTOMATION_ENABLED` | No | `false` | Enables automation endpoints | +| `AUTOMATION_SCHEDULER_ENABLED` | No | `false` | Enables scheduler loop | +| `AUTOMATION_STALE_DAYS` | No | `30` | Age threshold for stale issue checks | + +## Legacy Compatibility Variables + +These are retained for compatibility but not used for OAuth-protected MCP tool execution: + +- `GITEA_TOKEN` +- `MCP_API_KEYS` +- `AUTH_ENABLED` diff --git a/Deployment.md b/Deployment.md new file mode 100644 index 0000000..e674dbc --- /dev/null +++ b/Deployment.md @@ -0,0 +1,57 @@ +# Deployment + +## Secure Defaults + +- 
Default bind is `127.0.0.1`. +- Binding `0.0.0.0` requires `ALLOW_INSECURE_BIND=true`. +- Write mode disabled by default. +- Policy checks run before tool execution. +- OAuth-protected MCP challenge responses are enabled by default for tool calls. + +## Local Development + +```bash +make install-dev +cp .env.example .env +make run +``` + +## Docker + +Use `docker/Dockerfile`: + +- Multi-stage image build. +- Non-root runtime user. +- Production env flags (`NODE_ENV=production`, `ENVIRONMENT=production`). +- Only required app files copied. +- Healthcheck on `/health`. + +Run examples: + +```bash +docker compose --profile prod up -d +docker compose --profile dev up -d +``` + +## CI/CD (Gitea Workflows) + +Workflows live in `.gitea/workflows/`: + +- `lint.yml`: ruff + format checks + mypy. +- `test.yml`: lint + tests + coverage fail-under `80`. +- `docker.yml`: lint + test + docker smoke-test gating; image publish on push to `main`/`dev` and on approved PR review targeting `main`/`dev`; tags include commit SHA plus `latest` (`main`) or `dev` (`dev`). + +Docker publish settings: +- `vars.PUSH_IMAGE=true` enables registry push. +- `vars.REGISTRY_IMAGE` sets the target image name (for example `registry.example.com/org/aegis-gitea-mcp`). +- `vars.REGISTRY_HOST` is optional and overrides the login host detection. +- `secrets.REGISTRY_USER` and `secrets.REGISTRY_TOKEN` are required when push is enabled. + +## Production Recommendations + +- Place MCP behind TLS reverse proxy. +- Set `PUBLIC_BASE_URL=https://` so OAuth metadata advertises HTTPS endpoints. +- Restrict inbound traffic to expected clients. +- Persist and monitor audit logs. +- Monitor `/metrics` and auth-failure events. +- Rotate OAuth client credentials when required. 
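The bind and `PUBLIC_BASE_URL` guidance above could be checked before startup with a few lines of standard-library Python. This is a rough sketch under assumptions: the function names `check_bind` and `is_https_base_url` are hypothetical, the example hostname is a placeholder, and the server's own config validation may behave differently.

```python
from urllib.parse import urlsplit


def check_bind(host: str, allow_insecure_bind: bool) -> None:
    """Reject a 0.0.0.0 bind unless the operator opted in (hypothetical helper)."""
    if host == "0.0.0.0" and not allow_insecure_bind:
        raise ValueError("binding 0.0.0.0 requires ALLOW_INSECURE_BIND=true")


def is_https_base_url(url: str) -> bool:
    """Return True when the URL uses https and names a host (hypothetical helper)."""
    parts = urlsplit(url)
    return parts.scheme == "https" and bool(parts.netloc)


check_bind("127.0.0.1", allow_insecure_bind=False)  # default bind passes
print(is_https_base_url("https://mcp.example.com"))  # placeholder hostname
```

Running a check like this in a pre-deploy step would surface a plain-HTTP `PUBLIC_BASE_URL` before OAuth metadata advertises it to clients.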
diff --git a/Getting-Started.md b/Getting-Started.md new file mode 100644 index 0000000..09759f8 --- /dev/null +++ b/Getting-Started.md @@ -0,0 +1,130 @@ +# Getting Started + +## Prerequisites + +- Python 3.10 or higher +- A running Gitea instance +- A Gitea OAuth2 application for this MCP server +- `make` (optional but recommended) + +## 1. Install + +```bash +git clone +cd AegisGitea-MCP + +# Install production dependencies +make install + +# Or install with dev dependencies (for testing and linting) +make install-dev +``` + +To install manually without `make`: + +```bash +python -m venv venv +source venv/bin/activate # Linux/macOS +# or: venv\Scripts\activate # Windows + +pip install -e . +# dev: pip install -e ".[dev]" +``` + +## 2. Create a Gitea OAuth2 Application + +1. In Gitea, open **User Settings > Applications**. +2. Create an OAuth2 application for AegisGitea-MCP. +3. Set the redirect URI to `https:///oauth/callback`. +4. Copy the client ID and client secret. + +## 3. Configure + +Copy the example environment file and fill in your values: + +```bash +cp .env.example .env +``` + +Minimum OAuth settings in `.env`: + +```env +GITEA_URL=https://gitea.example.com +OAUTH_MODE=true +GITEA_OAUTH_CLIENT_ID= +GITEA_OAUTH_CLIENT_SECRET= +PUBLIC_BASE_URL=https:// +OAUTH_STATE_SECRET= +``` + +`GITEA_TOKEN` is optional. If it is set, use a narrowly scoped service PAT and only grant it repository access you are prepared to expose after per-user authorization checks. If it is not set, Gitea REST calls use the authenticated user's OAuth token directly. + +See [Configuration](Configuration) for the full list of settings. + +## 4. Optional Standard API Key Mode + +For non-OAuth deployments, configure `GITEA_TOKEN` and `MCP_API_KEYS`. Generate an API key with: + +```bash +make generate-key +# or: python scripts/generate_api_key.py +``` + +Copy the printed key into `MCP_API_KEYS` in your `.env` file and set `OAUTH_MODE=false`. + +## 5. 
Run + +```bash +make run +# or: python -m aegis_gitea_mcp.server +``` + +The server starts on `http://127.0.0.1:8080` by default. + +Verify it is running: + +```bash +curl http://localhost:8080/health +# {"status": "healthy", ...} +``` + +## 6. Connect an AI Client + +### Claude + +In claude.ai, open **Settings > Connectors > Add custom connector** and paste: + +``` +https:///mcp +``` + +Claude discovers OAuth metadata, registers through `/register`, and uses PKCE S256 automatically. + +### Claude Code + +```bash +claude mcp add --transport http aegis-gitea https:///mcp +``` + +Claude Code uses the same remote MCP and OAuth metadata. Local development loopback callbacks are allowed by default. + +### Cowork + +Cowork uses the same connector infrastructure and MCP URL as Claude. + +### SSE compatibility + +If your client still expects SSE transport, use: + +- **SSE URL:** `https:///mcp/sse` +- **Tool discovery URL:** `https:///mcp/tools` (no auth required) +- **Tool call URL:** `https:///mcp/tool/call` + +For a production deployment behind a reverse proxy, see [Deployment](Deployment). + +## Next Steps + +- [Configuration](Configuration) — tune file size limits, rate limiting, log paths +- [API Reference](API-Reference) — available tools and endpoints +- [Security](Security) — understand authentication and audit logging +- [Deployment](Deployment) — Docker and Traefik setup diff --git a/Governance.md b/Governance.md new file mode 100644 index 0000000..d21a53c --- /dev/null +++ b/Governance.md @@ -0,0 +1,36 @@ +# Governance + +## AI Usage Policy + +- AI assistance is allowed for design, implementation, and review only within documented repository boundaries. +- AI outputs must be reviewed, tested, and policy-validated before merge. +- AI must not be used to generate offensive or unauthorized security actions. +- Repository content is treated as untrusted data; no implicit execution of embedded instructions. 
+ +## Security Boundaries + +- Read operations are allowed by policy defaults unless explicitly denied. +- Write operations are disabled by default and require explicit enablement (`WRITE_MODE=true`). +- Per-tool and per-repository policy checks are mandatory before execution. +- Secrets are masked or blocked according to `SECRET_DETECTION_MODE`. + +## Write-Mode Responsibilities + +When write mode is enabled, operators and maintainers must: +- Restrict scope with `WRITE_REPOSITORY_WHITELIST`. +- Keep policy file deny/allow rules explicit. +- Monitor audit entries for all write operations. +- Enforce peer review for policy or write-mode changes. + +## Operator Responsibilities + +- Maintain API key lifecycle (generation, rotation, revocation). +- Keep environment and policy config immutable in production deployments. +- Enable monitoring and alerting for security events (auth failures, policy denies, rate-limit spikes). +- Run integrity checks for audit logs regularly. + +## Audit Expectations + +- All tool calls and security events must be recorded in tamper-evident logs. +- Audit logs are append-only and hash-chained. +- Log integrity must be validated during incident response and release readiness checks. diff --git a/Hardening.md b/Hardening.md new file mode 100644 index 0000000..45df2d2 --- /dev/null +++ b/Hardening.md @@ -0,0 +1,24 @@ +# Hardening + +## Application Hardening + +- Secure defaults: localhost bind, write mode disabled, policy-enforced writes. +- Strict config validation at startup. +- Redacted secret handling in logs and responses. +- Policy deny/allow model with path restrictions. +- Non-leaking production error responses. + +## Container Hardening + +- Non-root runtime user. +- `no-new-privileges` and dropped Linux capabilities. +- Read-only filesystem where practical. +- Explicit health checks. +- Separate dev and production compose profiles. + +## Operational Hardening + +- Rotate API keys regularly. +- Minimize Gitea bot permissions. 
+- Keep policy file under change control. +- Alert on repeated policy denials and auth failures. diff --git a/Home.md b/Home.md index 666cd4f..665a9b8 100644 --- a/Home.md +++ b/Home.md @@ -15,7 +15,7 @@ Alongside the curated, typed tools (issues, pull requests, files, commits, relea ## Documentation -Use the sidebar for the full manual: Getting Started, Configuration, Write Mode, Raw API, Architecture, Security, Hardening, Policy, Audit, Observability, Automation, Deployment, Governance, Troubleshooting, API Reference and Roadmap. +Use the sidebar for the full manual: Getting Started, Configuration, Write Mode, Architecture, Security, Hardening, Policy, Audit, Observability, Automation, Deployment, Governance, Troubleshooting, API Reference and Roadmap. --- diff --git a/Observability.md b/Observability.md new file mode 100644 index 0000000..392134d --- /dev/null +++ b/Observability.md @@ -0,0 +1,48 @@ +# Observability + +## Logging + +- Structured JSON logs. +- Request correlation via `X-Request-ID`. +- Security events and policy denials are audit logged. + +### Structured event helpers + +`logging_utils` exposes reusable helpers so endpoints emit consistent, +secret-safe structured events instead of ad-hoc inline logging: + +- `log_event(logger, level, event, **context)` — emit a named event with a + `context` mapping; keys in `SENSITIVE_CONTEXT_KEYS` (e.g. `token`, + `authorization`, `password`) are masked as `***`. +- `log_nullable_field(logger, event, field, value)` — record whether a parsed + response field is `None` and its runtime type, without dumping its contents. +- `sanitize_context(context)` — the masking primitive used by the above. + +The `context` mapping is serialized into the JSON log payload under a `context` +key. These run at `DEBUG`, so they are silent unless `LOG_LEVEL=DEBUG`. 
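To make the masking behaviour described above concrete, here is a minimal sketch of what a primitive like `sanitize_context` might do. It is not the project's implementation: the names `sanitize_context_sketch` and `SKETCH_SENSITIVE_KEYS` are hypothetical, the key set contains only the three examples listed above, and case-insensitive key matching is an assumption.

```python
from collections.abc import Mapping
from typing import Any

# Only the example keys named in the text; the real set may be larger.
SKETCH_SENSITIVE_KEYS = frozenset({"token", "authorization", "password"})


def sanitize_context_sketch(context: Mapping[str, Any]) -> dict[str, Any]:
    """Return a copy of the context with sensitive values replaced by '***'."""
    return {
        key: "***" if key.lower() in SKETCH_SENSITIVE_KEYS else value
        for key, value in context.items()
    }


masked = sanitize_context_sketch(
    {"Authorization": "Bearer abc123", "repo": "acme/service-a", "issue_number": 7}
)
print(masked)
```

Returning a new dictionary rather than mutating the input keeps the caller's data intact, so the same context can still be used for the actual request after it has been logged.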
+ +`get_issue` is instrumented with these helpers (`get_issue.start`, +`get_issue.payload_shape`, `get_issue.field_check`) to make nullable-field +parsing failures diagnosable. The same pattern can be reused for other +parsing-heavy endpoints (`get_pull_request`, `list_issues`, `get_commit_diff`). + +## Metrics + +Prometheus-compatible endpoint: `GET /metrics`. + +Current metrics: +- `aegis_http_requests_total{method,path,status}` +- `aegis_tool_calls_total{tool,status}` +- `aegis_tool_duration_seconds_sum{tool}` +- `aegis_tool_duration_seconds_count{tool}` + +## Tracing and Correlation + +- Request IDs propagate in response header (`X-Request-ID`). +- Tool-level correlation IDs included in MCP responses. + +## Operational Guidance + +- Alert on spikes in 401/403/429 rates. +- Alert on repeated `access_denied` and auth-rate-limit events. +- Track tool latency trends for incident triage. diff --git a/Policy.md b/Policy.md new file mode 100644 index 0000000..f4574d3 --- /dev/null +++ b/Policy.md @@ -0,0 +1,52 @@ +# Policy Engine + +## Overview + +Aegis uses a YAML policy engine to authorize tool execution before any Gitea API call is made. + +## Behavior Summary + +- Global tool allow/deny supported. +- Per-repository tool allow/deny supported. +- Optional repository path allow/deny supported. +- Write operations are denied by default. +- Write operations also require `WRITE_MODE=true` and either: + - `WRITE_REPOSITORY_WHITELIST` match, or + - `WRITE_ALLOW_ALL_TOKEN_REPOS=true`. + +## Example Configuration + +```yaml +defaults: + read: allow + write: deny + +tools: + deny: + - search_code + +repositories: + acme/service-a: + tools: + allow: + - get_file_contents + - list_commits + paths: + allow: + - src/* + deny: + - src/secrets/* +``` + +## Failure Behavior + +- Invalid YAML or invalid schema: startup failure (fail closed). +- Denied tool call: HTTP `403` + audit `access_denied` entry. 
+- Path traversal attempt in path-scoped tools: denied by validation/policy checks. + +## Operational Guidance + +- Keep policy files version-controlled and code-reviewed. +- Prefer explicit deny entries for sensitive tools. +- Use repository-specific allow lists for high-risk environments. +- Test policy updates in staging before production rollout. diff --git a/Roadmap.md b/Roadmap.md new file mode 100644 index 0000000..80a5b60 --- /dev/null +++ b/Roadmap.md @@ -0,0 +1,72 @@ +# Roadmap + +## High-Level Evolution Plan + +1. Hardened read-only gateway baseline. +2. Policy-driven authorization and observability. +3. Controlled write-mode rollout. +4. Automation and event-driven workflows. +5. Continuous hardening and enterprise controls. + +## Threat Model Updates + +- Primary threats: credential theft, over-permissioned automation, prompt injection via repo data, policy bypass, audit tampering. +- Secondary threats: denial-of-service, misconfiguration drift, unsafe deployment defaults. + +## Security Model + +- API key authentication + auth failure throttling. +- Per-IP and per-token request rate limits. +- Secret detection and outbound sanitization. +- Tamper-evident audit logs with integrity verification. +- No production stack-trace disclosure. + +## Policy Model + +- YAML policy with global and per-repository allow/deny rules. +- Optional path restrictions for file-oriented tools. +- Default write deny. +- Write-mode repository whitelist enforcement. + +## Capability Matrix Concept + +- `Read` capabilities: enabled by default but policy-filtered. +- `Write` capabilities: disabled by default, policy + whitelist gated. +- `Automation` capabilities: disabled by default, policy-controlled. + +## Audit Log Design + +- JSON lines. +- `prev_hash` + `entry_hash` chain. +- Correlation/request IDs for traceability. +- Validation script for chain integrity. + +## Write-Mode Architecture + +- Separate write tool set with strict schemas. 
+- Global toggle (`WRITE_MODE`) + per-repo whitelist. +- Policy engine still authoritative. +- No merge, branch deletion, or force push endpoints. + +## Deployment Architecture + +- Non-root container runtime. +- Read-only filesystem where practical. +- Explicit opt-in for insecure bind. +- Separate dev and prod compose profiles. + +## Observability Architecture + +- Structured JSON logs with request correlation. +- Prometheus-compatible `/metrics` endpoint. +- Tool execution counters and duration aggregates. + +## Risk Analysis + +- Highest risk: write-mode misuse and policy misconfiguration. +- Mitigations: deny-by-default, whitelist, audit chain, tests, docs, reviews. + +## Extensibility Notes + +- Add new tools only through schema + policy + docs + tests path. +- Keep transport-agnostic execution core for webhook/scheduler integrations. diff --git a/Security.md b/Security.md new file mode 100644 index 0000000..9836f19 --- /dev/null +++ b/Security.md @@ -0,0 +1,58 @@ +# Security + +## Core Controls + +- OAuth2/OIDC bearer-token authentication for MCP tool execution. +- OIDC discovery + JWKS validation cache for JWT tokens. +- Userinfo validation fallback for opaque OAuth tokens. +- Scope enforcement: + - `read:repository` for read tools. + - `write:repository` for write tools. +- Policy engine checks before tool execution. +- Per-IP and per-token rate limiting. +- Strict schema validation (`extra=forbid`). +- Tamper-evident audit logging with hash chaining. +- Secret sanitization for logs and tool output. +- Production-safe error responses (no internal stack traces). + +## Threat Model + +### Why shared bot tokens are dangerous + +- A single leaked bot token can expose all repositories that bot can access. +- Access is not naturally bounded per end user. +- Blast radius is large and cross-tenant. + +### Why token-in-URL is insecure + +- URLs can be captured by reverse proxy logs, browser history, referer headers, and monitoring pipelines. 
+- Bearer tokens must be passed in `Authorization` headers only. + +### Why per-user OAuth reduces lateral access + +- Each MCP request executes with the signed-in user token. +- Gitea authorization stays source-of-truth for repository visibility. +- A compromised token is limited to that user’s permissions. + +## Prompt Injection Hardening + +Repository content is treated as untrusted data. + +- Tool outputs are bounded and sanitized. +- No instructions from repository text are executed. +- Text fields are size-limited before returning to LLM clients. + +## Secret Detection + +Detected classes include: + +- API key and token patterns. +- JWT-like tokens. +- Private key block markers. +- Common provider credential formats. + +Behavior: + +- `SECRET_DETECTION_MODE=mask`: redact in place. +- `SECRET_DETECTION_MODE=block`: replace secret-bearing values. +- `SECRET_DETECTION_MODE=off`: disable sanitization (not recommended). diff --git a/Troubleshooting.md b/Troubleshooting.md new file mode 100644 index 0000000..ebde992 --- /dev/null +++ b/Troubleshooting.md @@ -0,0 +1,76 @@ +# Troubleshooting + +## "Internal server error (-32603)" from Claude + +**Symptom:** Claude shows `Internal server error` with JSON-RPC error code `-32603` when trying to use Gitea tools. + +**Cause:** In user-token mode, the OAuth token stored by the client may have been issued without Gitea API scopes (e.g. `read:repository`). In service-PAT mode, the call may fail because the authenticated user does not have the required repository permission or the permission probe cannot be completed. + +**Fix:** +1. In Gitea: Go to **Settings > Applications > Authorized OAuth2 Applications** and revoke the MCP application. +2. In Claude: disconnect the MCP server and authenticate again. +3. Re-authorize: Use the MCP connector again. It will trigger a fresh OAuth flow. For repository-targeted calls in service-PAT mode, also verify the signed-in Gitea user has read/write access to the target repository. 
+ +**Verification:** Check the server logs for `oauth_auth_summary`. A working token shows: +``` +oauth_auth_summary: api_probe=pass login=alice +``` +A scopeless token shows: +``` +oauth_token_lacks_api_scope: status=403 login=alice +``` + +## "Gitea rejected the API call" (403) + +**Symptom:** Tool calls return 403 with a message about re-authorizing. + +**Cause:** The OAuth token does not have the required API scope in user-token mode, or the per-user repository permission check denied the request in service-PAT mode. + +**Fix:** Revoke and re-authorize if the token lacks API scope. If the error mentions repository permission, grant the signed-in Gitea user the required repository access or use a repository they can access. + +## Claude caches stale tokens + +**Symptom:** After fixing the OAuth configuration, Claude still sends the old token. + +**Cause:** The client caches access tokens and doesn't automatically re-authenticate when the server configuration changes. + +**Fix:** +1. Disconnect the server in the client. +2. Start a new conversation and use the integration again - this forces a fresh OAuth flow. + +## How OAuth scopes work with Gitea + +Gitea's OAuth2/OIDC implementation uses **granular scopes** for API access: + +| Scope | Access | +|-------|--------| +| `read:repository` | Read repositories, issues, PRs, files | +| `write:repository` | Create/edit issues, PRs, comments, files | +| `openid` | OIDC identity (login, email) | + +When an OAuth application requests authorization, the `scope` parameter in the authorize URL determines what permissions the resulting token has. If only OIDC scopes are requested (e.g. `openid profile email`), the token can establish identity but may not be usable for direct Gitea REST calls. When `GITEA_TOKEN` is configured, the server uses OIDC for identity and checks the user's repository permission before using the service PAT. + +The MCP server's OAuth metadata controls which scopes the client requests. 
Ensure it includes: +```yaml +scopes: + read:repository: "Read access to Gitea repositories" + write:repository: "Write access to Gitea repositories" +``` + +## Reading the `oauth_auth_summary` log + +Every authenticated request emits a structured log line: + +| Field | Description | +|-------|-------------| +| `token_type` | `jwt` or `opaque` | +| `scopes_observed` | Scopes extracted from the token/userinfo | +| `scopes_effective` | Final scopes after implicit grants | +| `api_probe` | `pass`, `fail:403`, `fail:401`, `skip:cached`, `skip:error` | +| `login` | Gitea username | + +- `api_probe=pass` — token works for Gitea API calls +- `api_probe=fail:403` — token lacks API scopes, request rejected with re-auth guidance +- `api_probe=skip:cached` — previous probe passed, cached result used +- `api_probe=skip:error` — network error during probe, request allowed to proceed +- `repository_permission_denied` in the audit log — the user lacks required read/write permission for a service-PAT call diff --git a/Write-Mode.md b/Write-Mode.md new file mode 100644 index 0000000..11b7a58 --- /dev/null +++ b/Write-Mode.md @@ -0,0 +1,55 @@ +# Write Mode + +## Threat Model + +Write mode introduces mutation risk (issue/PR changes, metadata updates). Risks include unauthorized action, accidental mass updates, and audit evasion. + +## Default Posture + +- `WRITE_MODE=false` by default. +- When enabled, writes require repository whitelist membership by default. +- Optional opt-in: `WRITE_ALLOW_ALL_TOKEN_REPOS=true` allows writes to any repo the token can access. +- Policy engine remains authoritative and may deny specific write tools. 
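
The default posture above can be sketched as a small decision function. This is a minimal illustration, not the server's actual implementation: `WriteSettings`, `parse_whitelist`, `write_allowed`, and the `policy_denied_tools` parameter are hypothetical names. The policy engine is reduced to a plain set of denied tool names, and the whitelist is assumed to be a comma-separated `owner/repo` list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteSettings:
    """Hypothetical container for the three settings named on this page."""

    write_mode: bool = False                        # WRITE_MODE
    repository_whitelist: frozenset = frozenset()   # WRITE_REPOSITORY_WHITELIST
    allow_all_token_repos: bool = False             # WRITE_ALLOW_ALL_TOKEN_REPOS


def parse_whitelist(raw: str) -> frozenset:
    """Split an assumed comma-separated `owner/repo,...` value into a set."""
    return frozenset(item.strip() for item in raw.split(",") if item.strip())


def write_allowed(
    settings: WriteSettings,
    repo: str,
    tool: str,
    policy_denied_tools: frozenset = frozenset(),
) -> bool:
    """Return True only if every gate described above permits the write."""
    # Gate 1: write mode is off unless explicitly enabled.
    if not settings.write_mode:
        return False
    # Gate 2: the policy engine stays authoritative (simplified to a deny set).
    if tool in policy_denied_tools:
        return False
    # Gate 3: either the broad opt-in is set or the repo is whitelisted.
    if settings.allow_all_token_repos:
        return True
    return repo in settings.repository_whitelist
```

In this sketch the policy check runs before the repository check, so a denied tool is rejected even when `WRITE_ALLOW_ALL_TOKEN_REPOS=true`. That ordering is a design assumption chosen to mirror "policy engine remains authoritative".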
+ +## Supported Write Tools + +- `create_issue` (optional `milestone` id or title) +- `update_issue` (optional `milestone`; `0` clears it) +- `create_issue_comment` +- `create_pr_comment` +- `edit_issue_comment` +- `add_labels` +- `remove_labels` +- `assign_issue` +- `create_label` +- `update_label` +- `create_pull_request` +- `create_release` +- `edit_release` +- `create_branch` +- `create_milestone` + +Not supported (explicitly forbidden): merge actions, branch/label/release deletion, +force push, repo/admin management, and repository content writes (file create/edit, +commits). Gitea Projects (Kanban boards) are unsupported because the Gitea REST API +exposes no project endpoints. + +## Enablement Steps + +1. Set `WRITE_MODE=true`. +2. Choose one: + - `WRITE_REPOSITORY_WHITELIST=owner/repo,...` (recommended) + - `WRITE_ALLOW_ALL_TOKEN_REPOS=true` (broader scope) +3. Review policy file for write-tool scope. +4. Verify audit logging and alerting before rollout. + +## Safe Operations + +- Start with one repository in whitelist. +- Use narrowly scoped bot credentials. +- Require peer review for whitelist/policy changes. +- Disable write mode during incident response if abuse is suspected. + +## Risk Tradeoffs + +Write mode improves automation and triage speed but increases blast radius. Use least privilege, tight policy, and strong monitoring. diff --git a/_Footer.md b/_Footer.md new file mode 100644 index 0000000..3139826 --- /dev/null +++ b/_Footer.md @@ -0,0 +1 @@ +_Generated from the docs/ directory. 
Edit the docs, not the wiki, then re-run the wiki sync._ diff --git a/_Sidebar.md b/_Sidebar.md new file mode 100644 index 0000000..f297f94 --- /dev/null +++ b/_Sidebar.md @@ -0,0 +1,27 @@ +### AegisGitea-MCP + +**Start** +- [Home](Home) +- [Getting Started](Getting-Started) +- [Configuration](Configuration) + +**Operating** +- [Write Mode](Write-Mode) +- [Automation](Automation) +- [Deployment](Deployment) + +**Internals** +- [Architecture](Architecture) +- [Policy](Policy) +- [Audit](Audit) +- [Observability](Observability) + +**Security** +- [Security](Security) +- [Hardening](Hardening) +- [Governance](Governance) + +**Reference** +- [API Reference](API-Reference) +- [Troubleshooting](Troubleshooting) +- [Roadmap](Roadmap)