Add OAuth2/OIDC per-user Gitea authentication

Introduce a GiteaOAuthValidator for JWT and userinfo validation and
fallbacks, add /oauth/token proxy, and thread per-user tokens through the
request context and automation paths. Update config and .env.example for
OAuth-first mode, add OpenAPI, extensive unit/integration tests,
GitHub/Gitea CI workflows, docs, and lint/test enforcement (>=80% cov).
2026-02-25 16:54:01 +01:00
parent a00b6a0ba2
commit 59e1ea53a8
31 changed files with 2575 additions and 660 deletions


@@ -2,38 +2,57 @@
## Core Controls
- API key authentication with constant-time comparison.
- Auth failure throttling.
- Per-IP and per-token request rate limits.
- Strict input validation via Pydantic schemas (`extra=forbid`).
- Policy engine authorization before tool execution.
- Secret detection with mask/block behavior.
- Production-safe error responses (no stack traces).
- OAuth2/OIDC bearer-token authentication for MCP tool execution.
- OIDC discovery + JWKS validation cache for JWT tokens.
- Userinfo validation fallback for opaque OAuth tokens.
- Scope enforcement:
- `read:repository` for read tools.
- `write:repository` for write tools.
- Policy engine checks before tool execution.
- Per-IP and per-token rate limiting.
- Strict schema validation (`extra=forbid`).
- Tamper-evident audit logging with hash chaining.
- Secret sanitization for logs and tool output.
- Production-safe error responses (no internal stack traces).
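To illustrate what "tamper-evident audit logging with hash chaining" means in practice, here is a minimal sketch. It is not taken from this repository: `GENESIS_HASH`, `append_entry`, and `verify_chain` are hypothetical names, and SHA-256 over a canonical JSON payload is an assumed encoding.

```python
import hashlib
import json

# Hypothetical starting value for the chain (an assumption, not project config).
GENESIS_HASH = "0" * 64


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain: list, event: dict) -> dict:
    """Append an event whose hash covers the hash of the entry before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    entry = {
        "event": event,
        "prev_hash": prev_hash,
        "hash": _entry_hash(prev_hash, event),
    }
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash depends on the one before it, changing an old entry invalidates every later entry unless the whole tail is rewritten, which is what makes tampering detectable.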
## Threat Model
### Why shared bot tokens are dangerous
- A single leaked bot token can expose all repositories that bot can access.
- Access is not naturally bounded per end user.
- Blast radius is large and cross-tenant.
### Why token-in-URL is insecure
- URLs can be captured by reverse proxy logs, browser history, referer headers, and monitoring pipelines.
- Bearer tokens must be passed in `Authorization` headers only.
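A header-only rule like the one above could be enforced roughly as follows. This is a sketch under stated assumptions: `extract_bearer_token` and the `TOKEN_QUERY_PARAMS` set are hypothetical and do not come from this codebase.

```python
from typing import Mapping, Optional
from urllib.parse import parse_qs, urlsplit

# Hypothetical query parameter names treated as token carriers.
TOKEN_QUERY_PARAMS = {"access_token", "token", "api_key"}


def extract_bearer_token(url: str, headers: Mapping[str, str]) -> Optional[str]:
    """Return the bearer token from the Authorization header, or None.

    Raises ValueError if the URL itself appears to carry a token.
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if TOKEN_QUERY_PARAMS & set(query):
        raise ValueError("tokens must not be passed in the URL")

    # HTTP header names are case-insensitive, so compare in lowercase.
    auth = next(
        (value for name, value in headers.items() if name.lower() == "authorization"),
        None,
    )
    if auth is None:
        return None

    scheme, _, token = auth.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token
```

Rejecting the request outright, instead of silently ignoring the query parameter, makes a misconfigured client visible before its token ends up in more logs.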
### Why per-user OAuth reduces lateral access
- Each MCP request executes with the signed-in user's token.
- Gitea authorization remains the source of truth for repository visibility.
- A compromised token is limited to that user's permissions.
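Per-user execution combined with the `read:repository` / `write:repository` scopes listed under Core Controls might look like the sketch below. `UserContext` and `authorize_tool` are hypothetical names. Requiring the exact scope, with no assumption that write implies read, is an illustrative choice, and so is forwarding the token as a `Bearer` header.

```python
from dataclasses import dataclass

READ_SCOPE = "read:repository"
WRITE_SCOPE = "write:repository"


@dataclass(frozen=True)
class UserContext:
    """Hypothetical per-request identity derived from a validated token."""

    login: str
    token: str
    scopes: frozenset


def authorize_tool(ctx: UserContext, is_write_tool: bool) -> dict:
    """Check the scope a tool needs and build headers for the upstream call.

    The returned headers carry the caller's own token, so the upstream
    server applies that user's repository permissions, not a shared bot's.
    """
    needed = WRITE_SCOPE if is_write_tool else READ_SCOPE
    if needed not in ctx.scopes:
        raise PermissionError(f"missing scope: {needed}")
    return {"Authorization": f"Bearer {ctx.token}"}
```

With this shape there is no shared credential to fall back on: a request without a suitable user token fails.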
## Prompt Injection Hardening
Repository content is treated strictly as data.
Repository content is treated as untrusted data.
- Tool outputs are bounded and sanitized.
- No instruction execution from repository text.
- Untrusted content handling helpers enforce maximum output size.
- No instructions from repository text are executed.
- Text fields are size-limited before returning to LLM clients.
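Size-limiting a text field before it reaches an LLM client can be as simple as the following sketch. `MAX_OUTPUT_CHARS`, `bound_text`, and the truncation marker are assumptions made for illustration, not the project's actual helper or limit.

```python
# Hypothetical limit; the real value would come from configuration.
MAX_OUTPUT_CHARS = 10_000


def bound_text(
    text: str,
    limit: int = MAX_OUTPUT_CHARS,
    marker: str = "\n[truncated]",
) -> str:
    """Return text cut to at most `limit` characters, marking any truncation."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    if len(text) <= limit:
        return text
    if limit <= len(marker):
        # Not enough room for the marker; hard-cut instead.
        return text[:limit]
    return text[: limit - len(marker)] + marker
```

The marker is counted against the limit, so the result never exceeds `limit` characters.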
## Secret Detection
Detected classes include:
- API keys and generic token patterns.
- API key and token patterns.
- JWT-like tokens.
- Private key block markers.
- Common provider token formats.
- Common provider credential formats.
Behavior:
- `SECRET_DETECTION_MODE=mask`: redact in place.
- `SECRET_DETECTION_MODE=block`: replace secret-bearing field values.
- `SECRET_DETECTION_MODE=block`: replace secret-bearing values.
- `SECRET_DETECTION_MODE=off`: disable sanitization (not recommended).
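The three modes can be pictured with a small sketch. The pattern list here is deliberately tiny and hypothetical, and so are the placeholder strings; the project's real detector is not shown in this diff and likely covers more formats.

```python
import re

# Hypothetical, intentionally small pattern set for illustration only.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # private key block marker
    re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),  # JWT-like
]

MASK = "[REDACTED]"
BLOCKED = "[BLOCKED: secret detected]"


def sanitize(value: str, mode: str = "mask") -> str:
    """Apply mask/block/off handling to a single string value."""
    if mode == "off":
        return value
    if mode == "block":
        # Replace the whole value if any pattern matches.
        if any(p.search(value) for p in SECRET_PATTERNS):
            return BLOCKED
        return value
    if mode == "mask":
        # Redact only the matching spans, keeping surrounding text.
        for pattern in SECRET_PATTERNS:
            value = pattern.sub(MASK, value)
        return value
    raise ValueError(f"unknown mode: {mode}")
```

`mask` keeps the surrounding text readable, while `block` is the safer choice when partial context could still leak information.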
## Authentication and Key Lifecycle
- Keys must be at least 32 characters.
- Rotate keys regularly (`scripts/rotate_api_key.py`).
- Check key age and expiry (`scripts/check_key_age.py`).
- Prefer dedicated bot credentials with least privilege.
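The 32-character minimum and the constant-time comparison mentioned under Core Controls could be combined as in this sketch. `is_valid_api_key` is a hypothetical helper; only `hmac.compare_digest` is a real standard-library call.

```python
import hmac

MIN_KEY_LENGTH = 32  # minimum length stated in this section


def is_valid_api_key(presented: str, expected: str) -> bool:
    """Compare a presented key to the configured one in constant time."""
    if len(expected) < MIN_KEY_LENGTH:
        # Refuse to run with a weak configured key.
        raise ValueError("configured key is shorter than 32 characters")
    # Encode to bytes so non-ASCII input is compared instead of raising.
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

Using `hmac.compare_digest` instead of `==` avoids leaking, through response timing, how many leading characters of a guess were correct.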