Add OAuth2/OIDC per-user Gitea authentication
Introduce a GiteaOAuthValidator for JWT and userinfo validation and fallbacks, add /oauth/token proxy, and thread per-user tokens through the request context and automation paths. Update config and .env.example for OAuth-first mode, add OpenAPI, extensive unit/integration tests, GitHub/Gitea CI workflows, docs, and lint/test enforcement (>=80% cov).
@@ -1,49 +1,69 @@
 # API Reference
 
-## Endpoints
+## Core Endpoints
 
 - `GET /`: server metadata.
 - `GET /health`: health probe.
 - `GET /metrics`: Prometheus metrics (when enabled).
-- `POST /automation/webhook`: ingest policy-controlled webhook events.
-- `POST /automation/jobs/run`: run policy-controlled automation jobs.
+
+## OAuth Discovery and Token Exchange
+
+- `GET /.well-known/oauth-protected-resource`
+  - Returns OAuth protected resource metadata used by MCP clients.
+- `GET /.well-known/oauth-authorization-server`
+  - Returns OAuth authorization server metadata.
+- `POST /oauth/token`
+  - Proxies OAuth authorization-code token exchange to Gitea.
+
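As a rough illustration of what a client might send to `POST /oauth/token`, the sketch below builds the form body for an authorization-code exchange. The field names are the standard OAuth 2.0 ones (RFC 6749, plus the PKCE `code_verifier` from RFC 7636). The helper name `build_token_request`, the redirect URI, and the client id are hypothetical, and which fields this proxy actually requires is an assumption, since the diff does not say.

```python
from urllib.parse import parse_qs, urlencode


def build_token_request(code, redirect_uri, client_id, code_verifier=None):
    """Return the form-encoded body for an authorization-code token exchange."""
    fields = {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
    }
    # PKCE verifier is optional in this sketch; include it only when provided.
    if code_verifier is not None:
        fields["code_verifier"] = code_verifier
    return urlencode(fields)


# Hypothetical values, for illustration only.
body = build_token_request(
    "abc123", "http://127.0.0.1:8765/callback", "my-client-id", "verifier-xyz"
)
```

The resulting string would be sent with `Content-Type: application/x-www-form-urlencoded`; the access token itself then travels only in the `Authorization` header of later requests.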
+## MCP Endpoints
+
 - `GET /mcp/tools`: list tool definitions.
-- `POST /mcp/tool/call`: execute a tool (`Authorization: Bearer <api-key>` required except in explicitly disabled auth mode).
+- `POST /mcp/tool/call`: execute a tool.
 - `GET /mcp/sse` and `POST /mcp/sse`: MCP SSE transport.
 
-## Automation Jobs
+Authentication requirements:
 
-`POST /automation/jobs/run` supports:
-- `dependency_hygiene_scan` (read-only scaffold).
-- `stale_issue_detection` (read-only issue age analysis).
-- `auto_issue_creation` (write-mode + whitelist + policy required).
+- MCP tool execution requires `Authorization: Bearer <token>`.
+- Missing or invalid tokens return `401` with:
+  - `WWW-Authenticate: Bearer resource_metadata="<absolute metadata url>", scope="read:repository"`
+
+Scope requirements:
+
+- Read tools require `read:repository`.
+- Write tools require `write:repository`.
+- Insufficient scope returns `403`.
+
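The authentication and scope rules above can be sketched as a single decision function. This is a minimal illustration, not the server's code: the function name `authorize` and the convention that `token_scopes` is `None` for a missing or invalid token are assumptions. It also uses an exact scope match; whether `write:repository` implies read access is not stated in the docs.

```python
READ_SCOPE = "read:repository"
WRITE_SCOPE = "write:repository"


def authorize(token_scopes, is_write_tool, metadata_url):
    """Return (status, headers) for a tool call.

    token_scopes is None when the bearer token is missing or invalid
    (an assumption of this sketch), otherwise a set of granted scopes.
    """
    if token_scopes is None:
        # 401 carries the challenge that points clients at the metadata document.
        challenge = f'Bearer resource_metadata="{metadata_url}", scope="{READ_SCOPE}"'
        return 401, {"WWW-Authenticate": challenge}
    required = WRITE_SCOPE if is_write_tool else READ_SCOPE
    if required not in token_scopes:
        # Authenticated, but the token lacks the scope this tool needs.
        return 403, {}
    return 200, {}


# Hypothetical metadata URL, for illustration only.
META = "https://mcp.example.com/.well-known/oauth-protected-resource"
status, headers = authorize(None, False, META)
```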
+## Automation Endpoints
+
+- `POST /automation/webhook`: ingest policy-controlled webhook events.
+- `POST /automation/jobs/run`: run policy-controlled automation jobs.
 
 ## Read Tools
 
-- `list_repositories`.
-- `get_repository_info` (`owner`, `repo`).
-- `get_file_tree` (`owner`, `repo`, optional `ref`, `recursive`).
-- `get_file_contents` (`owner`, `repo`, `filepath`, optional `ref`).
-- `search_code` (`owner`, `repo`, `query`, optional `ref`, `page`, `limit`).
-- `list_commits` (`owner`, `repo`, optional `ref`, `page`, `limit`).
-- `get_commit_diff` (`owner`, `repo`, `sha`).
-- `compare_refs` (`owner`, `repo`, `base`, `head`).
-- `list_issues` (`owner`, `repo`, optional `state`, `page`, `limit`, `labels`).
-- `get_issue` (`owner`, `repo`, `issue_number`).
-- `list_pull_requests` (`owner`, `repo`, optional `state`, `page`, `limit`).
-- `get_pull_request` (`owner`, `repo`, `pull_number`).
-- `list_labels` (`owner`, `repo`, optional `page`, `limit`).
-- `list_tags` (`owner`, `repo`, optional `page`, `limit`).
-- `list_releases` (`owner`, `repo`, optional `page`, `limit`).
+- `list_repositories`
+- `get_repository_info` (`owner`, `repo`)
+- `get_file_tree` (`owner`, `repo`, optional `ref`, `recursive`)
+- `get_file_contents` (`owner`, `repo`, `filepath`, optional `ref`)
+- `search_code` (`owner`, `repo`, `query`, optional `ref`, `page`, `limit`)
+- `list_commits` (`owner`, `repo`, optional `ref`, `page`, `limit`)
+- `get_commit_diff` (`owner`, `repo`, `sha`)
+- `compare_refs` (`owner`, `repo`, `base`, `head`)
+- `list_issues` (`owner`, `repo`, optional `state`, `page`, `limit`, `labels`)
+- `get_issue` (`owner`, `repo`, `issue_number`)
+- `list_pull_requests` (`owner`, `repo`, optional `state`, `page`, `limit`)
+- `get_pull_request` (`owner`, `repo`, `pull_number`)
+- `list_labels` (`owner`, `repo`, optional `page`, `limit`)
+- `list_tags` (`owner`, `repo`, optional `page`, `limit`)
+- `list_releases` (`owner`, `repo`, optional `page`, `limit`)
 
 ## Write Tools (Write Mode Required)
 
-- `create_issue` (`owner`, `repo`, `title`, optional `body`, `labels`, `assignees`).
-- `update_issue` (`owner`, `repo`, `issue_number`, one or more of `title`, `body`, `state`).
-- `create_issue_comment` (`owner`, `repo`, `issue_number`, `body`).
-- `create_pr_comment` (`owner`, `repo`, `pull_number`, `body`).
-- `add_labels` (`owner`, `repo`, `issue_number`, `labels`).
-- `assign_issue` (`owner`, `repo`, `issue_number`, `assignees`).
+- `create_issue` (`owner`, `repo`, `title`, optional `body`, `labels`, `assignees`)
+- `update_issue` (`owner`, `repo`, `issue_number`, one or more of `title`, `body`, `state`)
+- `create_issue_comment` (`owner`, `repo`, `issue_number`, `body`)
+- `create_pr_comment` (`owner`, `repo`, `pull_number`, `body`)
+- `add_labels` (`owner`, `repo`, `issue_number`, `labels`)
+- `assign_issue` (`owner`, `repo`, `issue_number`, `assignees`)
 
 ## Validation and Limits
 
@@ -54,8 +74,8 @@
 
 ## Error Model
 
-- Policy denial: HTTP `403`.
-- Validation error: HTTP `400`.
 - Auth error: HTTP `401`.
+- Policy/scope denial: HTTP `403`.
+- Validation error: HTTP `400`.
 - Rate limit: HTTP `429`.
-- Internal errors: HTTP `500` without stack traces in production.
+- Internal errors: HTTP `500` (no stack traces in production).
@@ -1,23 +1,21 @@
 # Configuration
 
-All configuration is done through environment variables. Copy `.env.example` to `.env` and set the values before starting the server.
+Copy `.env.example` to `.env` and set values before starting:
 
 ```bash
 cp .env.example .env
 ```
 
----
-
-## Gitea Settings
+## OAuth/OIDC Settings (Primary)
 
 | Variable | Required | Default | Description |
 |---|---|---|---|
-| `GITEA_URL` | Yes | — | Base URL of your Gitea instance (e.g. `https://gitea.example.com`) |
-| `GITEA_TOKEN` | Yes | — | API token of the Gitea bot user |
-
-The `GITEA_TOKEN` must be a token belonging to a user that has at least read access to all repositories you want the AI to access. The server validates the token on startup by calling the Gitea `/api/v1/user` endpoint.
-
----
+| `GITEA_URL` | Yes | - | Base URL of your Gitea instance |
+| `OAUTH_MODE` | No | `false` | Enables OAuth-oriented validation settings |
+| `GITEA_OAUTH_CLIENT_ID` | Yes when `OAUTH_MODE=true` | - | OAuth client id |
+| `GITEA_OAUTH_CLIENT_SECRET` | Yes when `OAUTH_MODE=true` | - | OAuth client secret |
+| `OAUTH_EXPECTED_AUDIENCE` | No | empty | Expected JWT audience; defaults to client id |
+| `OAUTH_CACHE_TTL_SECONDS` | No | `300` | OIDC discovery/JWKS cache TTL |
 
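To make the defaults in the OAuth/OIDC table concrete, here is a small sketch of how such settings could be resolved from an environment mapping. It is not the project's loader: `OAuthSettings` and `load_oauth_settings` are hypothetical names, and treating only the literal string `true` as enabling `OAUTH_MODE` is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OAuthSettings:
    gitea_url: str
    oauth_mode: bool
    client_id: str
    client_secret: str
    expected_audience: str
    cache_ttl_seconds: int


def load_oauth_settings(env):
    """Resolve OAuth settings from a mapping such as os.environ."""
    gitea_url = env.get("GITEA_URL", "").rstrip("/")
    if not gitea_url:
        raise ValueError("GITEA_URL is required")
    oauth_mode = env.get("OAUTH_MODE", "false").strip().lower() == "true"
    client_id = env.get("GITEA_OAUTH_CLIENT_ID", "")
    client_secret = env.get("GITEA_OAUTH_CLIENT_SECRET", "")
    if oauth_mode and not (client_id and client_secret):
        raise ValueError("OAuth client id and secret are required when OAUTH_MODE=true")
    # An empty audience falls back to the client id, as the table describes.
    audience = env.get("OAUTH_EXPECTED_AUDIENCE", "") or client_id
    ttl = int(env.get("OAUTH_CACHE_TTL_SECONDS", "300"))
    return OAuthSettings(gitea_url, oauth_mode, client_id, client_secret, audience, ttl)


# Hypothetical values, for illustration only.
settings = load_oauth_settings({
    "GITEA_URL": "https://gitea.example.com/",
    "OAUTH_MODE": "true",
    "GITEA_OAUTH_CLIENT_ID": "cid",
    "GITEA_OAUTH_CLIENT_SECRET": "sec",
})
```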
 ## MCP Server Settings
 
@@ -25,84 +23,44 @@ The `GITEA_TOKEN` must be a token belonging to a user that has at least read acc
 |---|---|---|---|
 | `MCP_HOST` | No | `127.0.0.1` | Interface to bind to |
 | `MCP_PORT` | No | `8080` | Port to listen on |
-| `MCP_DOMAIN` | No | — | Public domain name (used for Traefik labels in Docker) |
-| `LOG_LEVEL` | No | `INFO` | Log level: `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL` |
-| `STARTUP_VALIDATE_GITEA` | No | `true` | Validate Gitea token and connectivity at startup via `/api/v1/user` |
+| `ALLOW_INSECURE_BIND` | No | `false` | Explicit opt-in required for `0.0.0.0` bind |
+| `LOG_LEVEL` | No | `INFO` | `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL` |
+| `STARTUP_VALIDATE_GITEA` | No | `true` | Validate OIDC discovery endpoint at startup |
 
-If startup validation fails with `403 Forbidden`, the token is authenticated but lacks permission to access `/api/v1/user`. Grant the bot user token the required API scope/permissions, or temporarily set `STARTUP_VALIDATE_GITEA=false` in controlled troubleshooting environments.
-
----
-
-## Authentication Settings
+## Security and Limits
 
 | Variable | Required | Default | Description |
 |---|---|---|---|
-| `AUTH_ENABLED` | No | `true` | Enable or disable API key authentication |
-| `MCP_API_KEYS` | Yes (if auth enabled) | — | Comma-separated list of valid API keys |
-| `MAX_AUTH_FAILURES` | No | `5` | Number of failed attempts before rate limiting an IP |
-| `AUTH_FAILURE_WINDOW` | No | `300` | Time window in seconds for counting failures |
+| `MAX_AUTH_FAILURES` | No | `5` | Failed auth attempts before rate limiting |
+| `AUTH_FAILURE_WINDOW` | No | `300` | Window in seconds for auth failure counting |
+| `RATE_LIMIT_PER_MINUTE` | No | `60` | Per-IP request limit |
+| `TOKEN_RATE_LIMIT_PER_MINUTE` | No | `120` | Per-token request limit |
+| `MAX_FILE_SIZE_BYTES` | No | `1048576` | Max file payload returned by read tools |
+| `MAX_TOOL_RESPONSE_ITEMS` | No | `200` | Max list items in tool responses |
+| `MAX_TOOL_RESPONSE_CHARS` | No | `20000` | Max chars in text fields |
+| `REQUEST_TIMEOUT_SECONDS` | No | `30` | Upstream timeout for Gitea calls |
+| `SECRET_DETECTION_MODE` | No | `mask` | `off`, `mask`, `block` |
 
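One way to read `MAX_AUTH_FAILURES` together with `AUTH_FAILURE_WINDOW` is as a sliding window of failed attempts per client. The sketch below shows that reading under stated assumptions: the class name `AuthFailureThrottle` is hypothetical, keying by IP address is an assumption, and limiting once the count reaches the maximum (rather than exceeds it) is a guess at the intended behavior.

```python
from collections import defaultdict, deque


class AuthFailureThrottle:
    """Sliding-window counter of failed auth attempts per client IP."""

    def __init__(self, max_failures=5, window_seconds=300):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)

    def record_failure(self, client_ip, now):
        """Remember one failed attempt at timestamp `now` (seconds)."""
        self._failures[client_ip].append(now)

    def is_limited(self, client_ip, now):
        """Return True when the client has too many recent failures."""
        failures = self._failures.get(client_ip)
        if not failures:
            return False
        # Drop failures that have aged out of the window.
        while failures and now - failures[0] >= self.window_seconds:
            failures.popleft()
        return len(failures) >= self.max_failures
```

Passing the timestamp in explicitly, instead of reading the clock inside the class, keeps the sketch deterministic and easy to test.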
-### API Key Requirements
-
-- Minimum length: 32 characters
-- Recommended: generate with `make generate-key` (produces 64-character hex keys)
-- Multiple keys: separate with commas — useful during key rotation
-
-```env
-# Single key
-MCP_API_KEYS=abc123...
-
-# Multiple keys (grace period during rotation)
-MCP_API_KEYS=newkey123...,oldkey456...
-```
-
-> **Warning:** Setting `AUTH_ENABLED=false` disables all authentication. Only do this in isolated development environments.
-
----
-
-## File Access Settings
+## Write Mode
 
 | Variable | Required | Default | Description |
 |---|---|---|---|
-| `MAX_FILE_SIZE_BYTES` | No | `1048576` | Maximum file size the server will return (bytes). Default: 1 MB |
-| `REQUEST_TIMEOUT_SECONDS` | No | `30` | Timeout for upstream Gitea API calls (seconds) |
+| `WRITE_MODE` | No | `false` | Enables write tools |
+| `WRITE_REPOSITORY_WHITELIST` | Required if write mode enabled and allow-all disabled | empty | Comma-separated `owner/repo` allow list |
+| `WRITE_ALLOW_ALL_TOKEN_REPOS` | No | `false` | Allow all repos accessible by token |
 
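The interaction of the three write-mode variables can be sketched as a small gate. This is an illustration under assumptions, not the project's policy code: `parse_whitelist` and `write_allowed` are hypothetical names, and comparing `owner/repo` case-insensitively is an assumption.

```python
def parse_whitelist(raw):
    """Split a comma-separated owner/repo list into a normalized set."""
    return {entry.strip().lower() for entry in raw.split(",") if entry.strip()}


def write_allowed(owner, repo, write_mode, whitelist, allow_all=False):
    """Decide whether a write tool may target owner/repo."""
    if not write_mode:
        # WRITE_MODE=false disables every write tool.
        return False
    if allow_all:
        # WRITE_ALLOW_ALL_TOKEN_REPOS=true skips the whitelist check.
        return True
    return f"{owner}/{repo}".lower() in whitelist


# Hypothetical repositories, for illustration only.
allowed = parse_whitelist("acme/api, acme/web")
```

Even when this gate passes, the token's own Gitea permissions and the `write:repository` scope still apply.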
----
-
-## Audit Logging Settings
+## Automation
 
 | Variable | Required | Default | Description |
 |---|---|---|---|
-| `AUDIT_LOG_PATH` | No | `/var/log/aegis-mcp/audit.log` | Absolute path for the JSON audit log file |
+| `AUTOMATION_ENABLED` | No | `false` | Enables automation endpoints |
+| `AUTOMATION_SCHEDULER_ENABLED` | No | `false` | Enables scheduler loop |
+| `AUTOMATION_STALE_DAYS` | No | `30` | Age threshold for stale issue checks |
 
-The directory is created automatically if it does not exist (requires write permission).
+## Legacy Compatibility Variables
 
----
+These are retained for compatibility but not used for OAuth-protected MCP tool execution:
 
-## Full Example
-
-```env
-# Gitea
-GITEA_URL=https://gitea.example.com
-GITEA_TOKEN=abcdef1234567890abcdef1234567890
-
-# Server
-MCP_HOST=127.0.0.1
-MCP_PORT=8080
-MCP_DOMAIN=mcp.example.com
-LOG_LEVEL=INFO
-STARTUP_VALIDATE_GITEA=true
-
-# Auth
-AUTH_ENABLED=true
-MCP_API_KEYS=a1b2c3d4e5f6...64chars
-MAX_AUTH_FAILURES=5
-AUTH_FAILURE_WINDOW=300
-
-# Limits
-MAX_FILE_SIZE_BYTES=1048576
-REQUEST_TIMEOUT_SECONDS=30
-
-# Audit
-AUDIT_LOG_PATH=/var/log/aegis-mcp/audit.log
-```
+- `GITEA_TOKEN`
+- `MCP_API_KEYS`
+- `AUTH_ENABLED`
@@ -2,26 +2,29 @@
 
 ## Secure Defaults
 
-- Default bind: `MCP_HOST=127.0.0.1`.
-- Binding `0.0.0.0` requires explicit `ALLOW_INSECURE_BIND=true`.
+- Default bind is `127.0.0.1`.
+- Binding `0.0.0.0` requires `ALLOW_INSECURE_BIND=true`.
 - Write mode disabled by default.
-- Policy file path configurable via `POLICY_FILE_PATH`.
+- Policy checks run before tool execution.
+- OAuth-protected MCP challenge responses are enabled by default for tool calls.
 
 ## Local Development
 
 ```bash
 make install-dev
 cp .env.example .env
-make generate-key
 make run
 ```
 
 ## Docker
 
-- Use `docker/Dockerfile` (non-root runtime).
-- Use compose profiles:
-  - `prod`: hardened runtime profile.
-  - `dev`: local development profile (localhost-only port bind).
+Use `docker/Dockerfile`:
+
+- Multi-stage image build.
+- Non-root runtime user.
+- Production env flags (`NODE_ENV=production`, `ENVIRONMENT=production`).
+- Only required app files copied.
+- Healthcheck on `/health`.
 
 Run examples:
 
@@ -30,17 +33,18 @@ docker compose --profile prod up -d
 docker compose --profile dev up -d
 ```
 
-## Environment Validation
+## CI/CD (Gitea Workflows)
 
-Startup validates:
-- Required Gitea settings.
-- API keys (when auth enabled).
-- Insecure bind opt-in.
-- Write whitelist when write mode enabled (unless `WRITE_ALLOW_ALL_TOKEN_REPOS=true`).
+Workflows live in `.gitea/workflows/`:
+
+- `lint.yml`: ruff + format checks + mypy.
+- `test.yml`: lint + tests + coverage fail-under `80`.
+- `docker.yml`: gated Docker build (depends on lint+test), SHA tag, `latest` on `main`.
 
 ## Production Recommendations
 
-- Run behind TLS-terminating reverse proxy.
-- Restrict network exposure.
-- Persist and rotate audit logs.
-- Enable external monitoring for `/metrics`.
+- Place MCP behind TLS reverse proxy.
+- Restrict inbound traffic to expected clients.
+- Persist and monitor audit logs.
+- Monitor `/metrics` and auth-failure events.
+- Rotate OAuth client credentials when required.
@@ -2,38 +2,57 @@
 
 ## Core Controls
 
-- API key authentication with constant-time comparison.
-- Auth failure throttling.
-- Per-IP and per-token request rate limits.
-- Strict input validation via Pydantic schemas (`extra=forbid`).
-- Policy engine authorization before tool execution.
-- Secret detection with mask/block behavior.
-- Production-safe error responses (no stack traces).
+- OAuth2/OIDC bearer-token authentication for MCP tool execution.
+- OIDC discovery + JWKS validation cache for JWT tokens.
+- Userinfo validation fallback for opaque OAuth tokens.
+- Scope enforcement:
+  - `read:repository` for read tools.
+  - `write:repository` for write tools.
+- Policy engine checks before tool execution.
+- Per-IP and per-token rate limiting.
+- Strict schema validation (`extra=forbid`).
+- Tamper-evident audit logging with hash chaining.
+- Secret sanitization for logs and tool output.
+- Production-safe error responses (no internal stack traces).
+
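For readers unfamiliar with the "hash chaining" item in Core Controls, the following sketch shows the general technique: each entry stores the hash of the previous one, so editing any earlier entry breaks verification. The entry layout, the SHA-256 choice, the all-zero genesis value, and the function names are assumptions for illustration; the diff does not describe the actual audit log format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def _digest(prev_hash, event):
    """Hash the previous hash together with a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log, event):
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify_chain(log):
    """Return True only if every entry links correctly to its predecessor."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical events, for illustration only.
audit_log = []
append_entry(audit_log, {"tool": "list_issues", "user": "alice"})
append_entry(audit_log, {"tool": "create_issue", "user": "alice"})
append_entry(audit_log, {"tool": "get_issue", "user": "bob"})
```

A chain like this makes tampering evident but does not prevent it; an attacker who can rewrite the whole file can also recompute every hash, which is why the head hash is usually stored or shipped elsewhere.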
+## Threat Model
+
+### Why shared bot tokens are dangerous
+
+- A single leaked bot token can expose all repositories that bot can access.
+- Access is not naturally bounded per end user.
+- Blast radius is large and cross-tenant.
+
+### Why token-in-URL is insecure
+
+- URLs can be captured by reverse proxy logs, browser history, referer headers, and monitoring pipelines.
+- Bearer tokens must be passed in `Authorization` headers only.
+
+### Why per-user OAuth reduces lateral access
+
+- Each MCP request executes with the signed-in user token.
+- Gitea authorization stays source-of-truth for repository visibility.
+- A compromised token is limited to that user’s permissions.
 
 ## Prompt Injection Hardening
 
-Repository content is treated strictly as data.
+Repository content is treated as untrusted data.
 
 - Tool outputs are bounded and sanitized.
-- No instruction execution from repository text.
-- Untrusted content handling helpers enforce maximum output size.
+- No instructions from repository text are executed.
+- Text fields are size-limited before returning to LLM clients.
 
 ## Secret Detection
 
 Detected classes include:
-- API keys and generic token patterns.
+
+- API key and token patterns.
 - JWT-like tokens.
 - Private key block markers.
-- Common provider token formats.
+- Common provider credential formats.
 
 Behavior:
+
 - `SECRET_DETECTION_MODE=mask`: redact in place.
-- `SECRET_DETECTION_MODE=block`: replace secret-bearing field values.
+- `SECRET_DETECTION_MODE=block`: replace secret-bearing values.
 - `SECRET_DETECTION_MODE=off`: disable sanitization (not recommended).
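The difference between the three `SECRET_DETECTION_MODE` values can be sketched as follows. The two regular expressions (a private key block marker and a JWT-like token) are simplified stand-ins for the detected classes listed above, and the placeholder strings and the `sanitize` function name are assumptions, not the project's actual patterns or output.

```python
import re

# Simplified example patterns; the real detector's patterns are not shown in the diff.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
]


def sanitize(value, mode="mask"):
    """Apply mask/block/off behavior to a single text value."""
    if mode == "off":
        return value
    if mode not in ("mask", "block"):
        raise ValueError(f"unknown mode: {mode}")
    if not any(pattern.search(value) for pattern in SECRET_PATTERNS):
        return value
    if mode == "block":
        # Replace the whole value when it carries a secret.
        return "[BLOCKED: secret detected]"
    # mask: redact only the matching spans, keep the surrounding text.
    for pattern in SECRET_PATTERNS:
        value = pattern.sub("[REDACTED]", value)
    return value


# Hypothetical input, for illustration only.
sample = "token=eyJhbGciOi.eyJzdWIi.c2ln end"
masked = sanitize(sample, "mask")
```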
-
-## Authentication and Key Lifecycle
-
-- Keys must be at least 32 characters.
-- Rotate keys regularly (`scripts/rotate_api_key.py`).
-- Check key age and expiry (`scripts/check_key_age.py`).
-- Prefer dedicated bot credentials with least privilege.