Add OAuth2/OIDC per-user Gitea authentication

Introduce a GiteaOAuthValidator for JWT and userinfo validation and
fallbacks, add /oauth/token proxy, and thread per-user tokens through
the request context and automation paths. Update config and
.env.example for OAuth-first mode, add OpenAPI, extensive
unit/integration tests, GitHub/Gitea CI workflows, docs, and lint/test
enforcement (>=80% cov).
2026-02-25 16:54:01 +01:00
parent a00b6a0ba2
commit 59e1ea53a8
31 changed files with 2575 additions and 660 deletions

@@ -1,49 +1,69 @@
# API Reference
## Endpoints
## Core Endpoints
- `GET /`: server metadata.
- `GET /health`: health probe.
- `GET /metrics`: Prometheus metrics (when enabled).
- `POST /automation/webhook`: ingest policy-controlled webhook events.
- `POST /automation/jobs/run`: run policy-controlled automation jobs.
## OAuth Discovery and Token Exchange
- `GET /.well-known/oauth-protected-resource`
- Returns OAuth protected resource metadata used by MCP clients.
- `GET /.well-known/oauth-authorization-server`
- Returns OAuth authorization server metadata.
- `POST /oauth/token`
- Proxies OAuth authorization-code token exchange to Gitea.
## MCP Endpoints
- `GET /mcp/tools`: list tool definitions.
- `POST /mcp/tool/call`: execute a tool (`Authorization: Bearer <api-key>` required except in explicitly disabled auth mode).
- `POST /mcp/tool/call`: execute a tool.
- `GET /mcp/sse` and `POST /mcp/sse`: MCP SSE transport.
## Automation Jobs
Authentication requirements:
`POST /automation/jobs/run` supports:
- `dependency_hygiene_scan` (read-only scaffold).
- `stale_issue_detection` (read-only issue age analysis).
- `auto_issue_creation` (write-mode + whitelist + policy required).
- MCP tool execution requires `Authorization: Bearer <token>`.
- Missing or invalid tokens return `401` with:
- `WWW-Authenticate: Bearer resource_metadata="<absolute metadata url>", scope="read:repository"`
Scope requirements:
- Read tools require `read:repository`.
- Write tools require `write:repository`.
- Insufficient scope returns `403`.
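
The authentication and scope rules above can be sketched as a small decision function. This is a minimal illustration, not the server's actual code: the names `authorize`, `required_scope`, and `challenge_header` are hypothetical, and the sketch assumes an exact scope match (whether `write:repository` also implies read access is not stated here).

```python
from __future__ import annotations

READ_SCOPE = "read:repository"
WRITE_SCOPE = "write:repository"

# Write tool names as listed in this reference; every other tool is treated as read.
WRITE_TOOLS = {
    "create_issue", "update_issue", "create_issue_comment",
    "create_pr_comment", "add_labels", "assign_issue",
}


def required_scope(tool_name: str) -> str:
    """Return the scope a tool call needs."""
    return WRITE_SCOPE if tool_name in WRITE_TOOLS else READ_SCOPE


def challenge_header(metadata_url: str) -> str:
    """Build the WWW-Authenticate value sent with a 401."""
    return f'Bearer resource_metadata="{metadata_url}", scope="{READ_SCOPE}"'


def authorize(tool_name: str, token_scopes: set[str] | None, metadata_url: str):
    """Return (status, headers). token_scopes is None for a missing or invalid token."""
    if token_scopes is None:
        return 401, {"WWW-Authenticate": challenge_header(metadata_url)}
    if required_scope(tool_name) not in token_scopes:
        return 403, {}
    return 200, {}
```

For example, `authorize("create_issue", {"read:repository"}, url)` yields `403`, while the same call with no token yields `401` plus the challenge header.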
## Automation Endpoints
- `POST /automation/webhook`: ingest policy-controlled webhook events.
- `POST /automation/jobs/run`: run policy-controlled automation jobs.
## Read Tools
- `list_repositories`.
- `get_repository_info` (`owner`, `repo`).
- `get_file_tree` (`owner`, `repo`, optional `ref`, `recursive`).
- `get_file_contents` (`owner`, `repo`, `filepath`, optional `ref`).
- `search_code` (`owner`, `repo`, `query`, optional `ref`, `page`, `limit`).
- `list_commits` (`owner`, `repo`, optional `ref`, `page`, `limit`).
- `get_commit_diff` (`owner`, `repo`, `sha`).
- `compare_refs` (`owner`, `repo`, `base`, `head`).
- `list_issues` (`owner`, `repo`, optional `state`, `page`, `limit`, `labels`).
- `get_issue` (`owner`, `repo`, `issue_number`).
- `list_pull_requests` (`owner`, `repo`, optional `state`, `page`, `limit`).
- `get_pull_request` (`owner`, `repo`, `pull_number`).
- `list_labels` (`owner`, `repo`, optional `page`, `limit`).
- `list_tags` (`owner`, `repo`, optional `page`, `limit`).
- `list_releases` (`owner`, `repo`, optional `page`, `limit`).
- `list_repositories`
- `get_repository_info` (`owner`, `repo`)
- `get_file_tree` (`owner`, `repo`, optional `ref`, `recursive`)
- `get_file_contents` (`owner`, `repo`, `filepath`, optional `ref`)
- `search_code` (`owner`, `repo`, `query`, optional `ref`, `page`, `limit`)
- `list_commits` (`owner`, `repo`, optional `ref`, `page`, `limit`)
- `get_commit_diff` (`owner`, `repo`, `sha`)
- `compare_refs` (`owner`, `repo`, `base`, `head`)
- `list_issues` (`owner`, `repo`, optional `state`, `page`, `limit`, `labels`)
- `get_issue` (`owner`, `repo`, `issue_number`)
- `list_pull_requests` (`owner`, `repo`, optional `state`, `page`, `limit`)
- `get_pull_request` (`owner`, `repo`, `pull_number`)
- `list_labels` (`owner`, `repo`, optional `page`, `limit`)
- `list_tags` (`owner`, `repo`, optional `page`, `limit`)
- `list_releases` (`owner`, `repo`, optional `page`, `limit`)
## Write Tools (Write Mode Required)
- `create_issue` (`owner`, `repo`, `title`, optional `body`, `labels`, `assignees`).
- `update_issue` (`owner`, `repo`, `issue_number`, one or more of `title`, `body`, `state`).
- `create_issue_comment` (`owner`, `repo`, `issue_number`, `body`).
- `create_pr_comment` (`owner`, `repo`, `pull_number`, `body`).
- `add_labels` (`owner`, `repo`, `issue_number`, `labels`).
- `assign_issue` (`owner`, `repo`, `issue_number`, `assignees`).
- `create_issue` (`owner`, `repo`, `title`, optional `body`, `labels`, `assignees`)
- `update_issue` (`owner`, `repo`, `issue_number`, one or more of `title`, `body`, `state`)
- `create_issue_comment` (`owner`, `repo`, `issue_number`, `body`)
- `create_pr_comment` (`owner`, `repo`, `pull_number`, `body`)
- `add_labels` (`owner`, `repo`, `issue_number`, `labels`)
- `assign_issue` (`owner`, `repo`, `issue_number`, `assignees`)
## Validation and Limits
@@ -54,8 +74,8 @@
## Error Model
- Policy denial: HTTP `403`.
- Validation error: HTTP `400`.
- Auth error: HTTP `401`.
- Policy/scope denial: HTTP `403`.
- Validation error: HTTP `400`.
- Rate limit: HTTP `429`.
- Internal errors: HTTP `500` without stack traces in production.
- Internal errors: HTTP `500` (no stack traces in production).
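
One way to picture the error model is a mapping from exception types to status codes. The exception classes and the `to_response` helper below are hypothetical names chosen for this sketch; only the status codes come from the list above.

```python
from __future__ import annotations

from http import HTTPStatus


class McpError(Exception):
    """Hypothetical base class; subclasses carry their HTTP status."""
    status = HTTPStatus.INTERNAL_SERVER_ERROR


class AuthError(McpError):
    status = HTTPStatus.UNAUTHORIZED        # 401


class PolicyDenied(McpError):
    status = HTTPStatus.FORBIDDEN           # 403 (policy or scope denial)


class ValidationFailed(McpError):
    status = HTTPStatus.BAD_REQUEST         # 400


class RateLimited(McpError):
    status = HTTPStatus.TOO_MANY_REQUESTS   # 429


def to_response(exc: Exception, production: bool = True) -> tuple[int, dict]:
    """Translate an exception into (status, body) without leaking internals."""
    if isinstance(exc, McpError) and exc.status != HTTPStatus.INTERNAL_SERVER_ERROR:
        return int(exc.status), {"error": str(exc)}
    # Unknown failures become a generic 500; details only outside production.
    detail = "internal error" if production else repr(exc)
    return 500, {"error": detail}
```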

@@ -1,23 +1,21 @@
# Configuration
All configuration is done through environment variables. Copy `.env.example` to `.env` and set the values before starting the server.
Copy `.env.example` to `.env` and set values before starting:
```bash
cp .env.example .env
```
---
## Gitea Settings
## OAuth/OIDC Settings (Primary)
| Variable | Required | Default | Description |
|---|---|---|---|
| `GITEA_URL` | Yes | | Base URL of your Gitea instance (e.g. `https://gitea.example.com`) |
| `GITEA_TOKEN` | Yes | — | API token of the Gitea bot user |
The `GITEA_TOKEN` must be a token belonging to a user that has at least read access to all repositories you want the AI to access. The server validates the token on startup by calling the Gitea `/api/v1/user` endpoint.
---
| `GITEA_URL` | Yes | - | Base URL of your Gitea instance |
| `OAUTH_MODE` | No | `false` | Enables OAuth-oriented validation settings |
| `GITEA_OAUTH_CLIENT_ID` | Yes when `OAUTH_MODE=true` | - | OAuth client id |
| `GITEA_OAUTH_CLIENT_SECRET` | Yes when `OAUTH_MODE=true` | - | OAuth client secret |
| `OAUTH_EXPECTED_AUDIENCE` | No | empty | Expected JWT audience; defaults to client id |
| `OAUTH_CACHE_TTL_SECONDS` | No | `300` | OIDC discovery/JWKS cache TTL |
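
As a rough sketch of how these OAuth variables interact, the loader below applies the documented defaults and the "required when `OAUTH_MODE=true`" rule. `OAuthSettings` and `load_oauth_settings` are hypothetical names, and treating only the string `true` (case-insensitive) as enabled is an assumption of this example.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class OAuthSettings:
    gitea_url: str
    oauth_mode: bool
    client_id: str
    client_secret: str
    expected_audience: str
    cache_ttl_seconds: int


def load_oauth_settings(env: Mapping[str, str] = os.environ) -> OAuthSettings:
    """Read OAuth settings from an environment-like mapping."""
    gitea_url = env.get("GITEA_URL", "").rstrip("/")
    if not gitea_url:
        raise ValueError("GITEA_URL is required")

    # Assumption: only "true" (any case) enables OAuth mode.
    oauth_mode = env.get("OAUTH_MODE", "false").strip().lower() == "true"
    client_id = env.get("GITEA_OAUTH_CLIENT_ID", "")
    client_secret = env.get("GITEA_OAUTH_CLIENT_SECRET", "")
    if oauth_mode and not (client_id and client_secret):
        raise ValueError("OAuth client id and secret are required when OAUTH_MODE=true")

    # An empty audience falls back to the client id, as the table describes.
    audience = env.get("OAUTH_EXPECTED_AUDIENCE", "") or client_id
    ttl = int(env.get("OAUTH_CACHE_TTL_SECONDS", "300"))
    return OAuthSettings(gitea_url, oauth_mode, client_id, client_secret, audience, ttl)
```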
## MCP Server Settings
@@ -25,84 +23,44 @@ The `GITEA_TOKEN` must be a token belonging to a user that has at least read acc
|---|---|---|---|
| `MCP_HOST` | No | `127.0.0.1` | Interface to bind to |
| `MCP_PORT` | No | `8080` | Port to listen on |
| `MCP_DOMAIN` | No | — | Public domain name (used for Traefik labels in Docker) |
| `LOG_LEVEL` | No | `INFO` | Log level: `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL` |
| `STARTUP_VALIDATE_GITEA` | No | `true` | Validate Gitea token and connectivity at startup via `/api/v1/user` |
| `ALLOW_INSECURE_BIND` | No | `false` | Explicit opt-in required for `0.0.0.0` bind |
| `LOG_LEVEL` | No | `INFO` | `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL` |
| `STARTUP_VALIDATE_GITEA` | No | `true` | Validate OIDC discovery endpoint at startup |
If startup validation fails with `403 Forbidden`, the token is authenticated but lacks permission to access `/api/v1/user`. Grant the bot user token the required API scope/permissions, or temporarily set `STARTUP_VALIDATE_GITEA=false` in controlled troubleshooting environments.
---
## Authentication Settings
## Security and Limits
| Variable | Required | Default | Description |
|---|---|---|---|
| `AUTH_ENABLED` | No | `true` | Enable or disable API key authentication |
| `MCP_API_KEYS` | Yes (if auth enabled) | — | Comma-separated list of valid API keys |
| `MAX_AUTH_FAILURES` | No | `5` | Number of failed attempts before rate limiting an IP |
| `AUTH_FAILURE_WINDOW` | No | `300` | Time window in seconds for counting failures |
| `MAX_AUTH_FAILURES` | No | `5` | Failed auth attempts before rate limiting |
| `AUTH_FAILURE_WINDOW` | No | `300` | Window in seconds for auth failure counting |
| `RATE_LIMIT_PER_MINUTE` | No | `60` | Per-IP request limit |
| `TOKEN_RATE_LIMIT_PER_MINUTE` | No | `120` | Per-token request limit |
| `MAX_FILE_SIZE_BYTES` | No | `1048576` | Max file payload returned by read tools |
| `MAX_TOOL_RESPONSE_ITEMS` | No | `200` | Max list items in tool responses |
| `MAX_TOOL_RESPONSE_CHARS` | No | `20000` | Max chars in text fields |
| `REQUEST_TIMEOUT_SECONDS` | No | `30` | Upstream timeout for Gitea calls |
| `SECRET_DETECTION_MODE` | No | `mask` | `off`, `mask`, `block` |
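
The per-IP and per-token limits can be illustrated with a sliding-window counter keyed by client IP or token. This is only a sketch under the assumption of a sliding window; the algorithm the server actually uses is not stated here, and `SlidingWindowLimiter` is a hypothetical name.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` hits per key within `window_seconds`."""

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self._limit = limit
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)

    def allow(self, key):
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False  # caller would respond with HTTP 429
        hits.append(now)
        return True


# Defaults from the table above: 60 per IP, 120 per token, per minute.
ip_limiter = SlidingWindowLimiter(limit=60)
token_limiter = SlidingWindowLimiter(limit=120)
```

A request would be accepted only when both `ip_limiter.allow(client_ip)` and `token_limiter.allow(token_id)` return `True`.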
### API Key Requirements
- Minimum length: 32 characters
- Recommended: generate with `make generate-key` (produces 64-character hex keys)
- Multiple keys: separate with commas — useful during key rotation
```env
# Single key
MCP_API_KEYS=abc123...
# Multiple keys (grace period during rotation)
MCP_API_KEYS=newkey123...,oldkey456...
```
> **Warning:** Setting `AUTH_ENABLED=false` disables all authentication. Only do this in isolated development environments.
---
## File Access Settings
## Write Mode
| Variable | Required | Default | Description |
|---|---|---|---|
| `MAX_FILE_SIZE_BYTES` | No | `1048576` | Maximum file size the server will return (bytes). Default: 1 MB |
| `REQUEST_TIMEOUT_SECONDS` | No | `30` | Timeout for upstream Gitea API calls (seconds) |
| `WRITE_MODE` | No | `false` | Enables write tools |
| `WRITE_REPOSITORY_WHITELIST` | Required if write mode enabled and allow-all disabled | empty | Comma-separated `owner/repo` allow list |
| `WRITE_ALLOW_ALL_TOKEN_REPOS` | No | `false` | Allow all repos accessible by token |
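
The interaction between these three variables can be sketched as follows. The function names are hypothetical, and the exact-match (case-sensitive) comparison of `owner/repo` entries is an assumption of this example.

```python
def parse_whitelist(csv_value):
    """Split a comma-separated `owner/repo` list, ignoring blanks."""
    return {item.strip() for item in csv_value.split(",") if item.strip()}


def validate_write_config(write_mode, whitelist_csv, allow_all):
    """Raise if write mode is enabled with nothing it could write to."""
    if write_mode and not allow_all and not parse_whitelist(whitelist_csv):
        raise ValueError(
            "WRITE_REPOSITORY_WHITELIST is required when WRITE_MODE=true "
            "and WRITE_ALLOW_ALL_TOKEN_REPOS=false"
        )


def write_allowed(repo, write_mode, whitelist_csv, allow_all):
    """Decide whether a write tool may target `repo` (an `owner/repo` string)."""
    if not write_mode:
        return False
    if allow_all:
        return True  # Gitea still enforces the user's own permissions
    return repo in parse_whitelist(whitelist_csv)
```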
---
## Audit Logging Settings
## Automation
| Variable | Required | Default | Description |
|---|---|---|---|
| `AUDIT_LOG_PATH` | No | `/var/log/aegis-mcp/audit.log` | Absolute path for the JSON audit log file |
| `AUTOMATION_ENABLED` | No | `false` | Enables automation endpoints |
| `AUTOMATION_SCHEDULER_ENABLED` | No | `false` | Enables scheduler loop |
| `AUTOMATION_STALE_DAYS` | No | `30` | Age threshold for stale issue checks |
The directory is created automatically if it does not exist (requires write permission).
## Legacy Compatibility Variables
---
These are retained for compatibility but not used for OAuth-protected MCP tool execution:
## Full Example
```env
# Gitea
GITEA_URL=https://gitea.example.com
GITEA_TOKEN=abcdef1234567890abcdef1234567890
# Server
MCP_HOST=127.0.0.1
MCP_PORT=8080
MCP_DOMAIN=mcp.example.com
LOG_LEVEL=INFO
STARTUP_VALIDATE_GITEA=true
# Auth
AUTH_ENABLED=true
MCP_API_KEYS=a1b2c3d4e5f6...64chars
MAX_AUTH_FAILURES=5
AUTH_FAILURE_WINDOW=300
# Limits
MAX_FILE_SIZE_BYTES=1048576
REQUEST_TIMEOUT_SECONDS=30
# Audit
AUDIT_LOG_PATH=/var/log/aegis-mcp/audit.log
```
- `GITEA_TOKEN`
- `MCP_API_KEYS`
- `AUTH_ENABLED`

@@ -2,26 +2,29 @@
## Secure Defaults
- Default bind: `MCP_HOST=127.0.0.1`.
- Binding `0.0.0.0` requires explicit `ALLOW_INSECURE_BIND=true`.
- Default bind is `127.0.0.1`.
- Binding `0.0.0.0` requires `ALLOW_INSECURE_BIND=true`.
- Write mode disabled by default.
- Policy file path configurable via `POLICY_FILE_PATH`.
- Policy checks run before tool execution.
- OAuth-protected MCP challenge responses are enabled by default for tool calls.
## Local Development
```bash
make install-dev
cp .env.example .env
make generate-key
make run
```
## Docker
- Use `docker/Dockerfile` (non-root runtime).
- Use compose profiles:
- `prod`: hardened runtime profile.
- `dev`: local development profile (localhost-only port bind).
Use `docker/Dockerfile`:
- Multi-stage image build.
- Non-root runtime user.
- Production env flags (`NODE_ENV=production`, `ENVIRONMENT=production`).
- Only required app files copied.
- Healthcheck on `/health`.
Run examples:
@@ -30,17 +33,18 @@ docker compose --profile prod up -d
docker compose --profile dev up -d
```
## Environment Validation
## CI/CD (Gitea Workflows)
Startup validates:
- Required Gitea settings.
- API keys (when auth enabled).
- Insecure bind opt-in.
- Write whitelist when write mode enabled (unless `WRITE_ALLOW_ALL_TOKEN_REPOS=true`).
Workflows live in `.gitea/workflows/`:
- `lint.yml`: ruff + format checks + mypy.
- `test.yml`: lint + tests + coverage fail-under `80`.
- `docker.yml`: gated Docker build (depends on lint+test), SHA tag, `latest` on `main`.
## Production Recommendations
- Run behind TLS-terminating reverse proxy.
- Restrict network exposure.
- Persist and rotate audit logs.
- Enable external monitoring for `/metrics`.
- Place MCP behind TLS reverse proxy.
- Restrict inbound traffic to expected clients.
- Persist and monitor audit logs.
- Monitor `/metrics` and auth-failure events.
- Rotate OAuth client credentials when required.

@@ -2,38 +2,57 @@
## Core Controls
- API key authentication with constant-time comparison.
- Auth failure throttling.
- Per-IP and per-token request rate limits.
- Strict input validation via Pydantic schemas (`extra=forbid`).
- Policy engine authorization before tool execution.
- Secret detection with mask/block behavior.
- Production-safe error responses (no stack traces).
- OAuth2/OIDC bearer-token authentication for MCP tool execution.
- OIDC discovery + JWKS validation cache for JWT tokens.
- Userinfo validation fallback for opaque OAuth tokens.
- Scope enforcement:
- `read:repository` for read tools.
- `write:repository` for write tools.
- Policy engine checks before tool execution.
- Per-IP and per-token rate limiting.
- Strict schema validation (`extra=forbid`).
- Tamper-evident audit logging with hash chaining.
- Secret sanitization for logs and tool output.
- Production-safe error responses (no internal stack traces).
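
To make "tamper-evident audit logging with hash chaining" concrete, here is a minimal sketch in which each entry's hash covers the previous entry's hash. The entry layout, the SHA-256 choice, and the function names are assumptions for illustration, not the project's actual log format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, event):
    # Canonical JSON so the same event always hashes identically.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain, event):
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    chain.append(entry)
    return entry


def verify_chain(chain):
    """Return True only if no entry was altered, removed, or reordered."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Editing any earlier event changes its hash, which breaks the `prev_hash` link of every later entry, so tampering is detectable on verification.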
## Threat Model
### Why shared bot tokens are dangerous
- A single leaked bot token can expose all repositories that bot can access.
- Access is not naturally bounded per end user.
- Blast radius is large and cross-tenant.
### Why token-in-URL is insecure
- URLs can be captured by reverse proxy logs, browser history, referer headers, and monitoring pipelines.
- Bearer tokens must be passed in `Authorization` headers only.
### Why per-user OAuth reduces lateral access
- Each MCP request executes with the signed-in user token.
- Gitea authorization stays source-of-truth for repository visibility.
- A compromised token is limited to that user's permissions.
## Prompt Injection Hardening
Repository content is treated strictly as data.
Repository content is treated as untrusted data.
- Tool outputs are bounded and sanitized.
- No instruction execution from repository text.
- Untrusted content handling helpers enforce maximum output size.
- No instructions from repository text are executed.
- Text fields are size-limited before returning to LLM clients.
## Secret Detection
Detected classes include:
- API keys and generic token patterns.
- API key and token patterns.
- JWT-like tokens.
- Private key block markers.
- Common provider token formats.
- Common provider credential formats.
Behavior:
- `SECRET_DETECTION_MODE=mask`: redact in place.
- `SECRET_DETECTION_MODE=block`: replace secret-bearing field values.
- `SECRET_DETECTION_MODE=block`: replace secret-bearing values.
- `SECRET_DETECTION_MODE=off`: disable sanitization (not recommended).
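
The three modes can be sketched as below. The two regular expressions are illustrative stand-ins only (a private key block marker and a JWT-like token); the real detector's patterns and replacement strings are not shown in this document, so `PATTERNS`, `sanitize`, and the placeholder texts are assumptions.

```python
import re

# Illustrative patterns only; a real detector would cover many more classes.
PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),  # JWT-like
]


def sanitize(value, mode="mask"):
    """Apply a SECRET_DETECTION_MODE-style policy to one text field."""
    if mode == "off":
        return value  # no sanitization (not recommended)
    if mode == "block":
        # Replace the whole value if any secret is present.
        if any(p.search(value) for p in PATTERNS):
            return "[BLOCKED: secret detected]"
        return value
    if mode == "mask":
        # Redact only the matching spans, keeping surrounding text.
        for pattern in PATTERNS:
            value = pattern.sub("[REDACTED]", value)
        return value
    raise ValueError(f"unknown mode: {mode!r}")
```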
## Authentication and Key Lifecycle
- Keys must be at least 32 characters.
- Rotate keys regularly (`scripts/rotate_api_key.py`).
- Check key age and expiry (`scripts/check_key_age.py`).
- Prefer dedicated bot credentials with least privilege.