Add OAuth2/OIDC per-user Gitea authentication
Introduce a GiteaOAuthValidator for JWT and userinfo validation and fallbacks, add /oauth/token proxy, and thread per-user tokens through the request context and automation paths. Update config and .env.example for OAuth-first mode, add OpenAPI, extensive unit/integration tests, GitHub/Gitea CI workflows, docs, and lint/test enforcement (>=80% coverage).
@@ -2,38 +2,57 @@
## Core Controls

- API key authentication with constant-time comparison.
- Auth failure throttling.
- Per-IP and per-token request rate limits.
- Strict input validation via Pydantic schemas (`extra=forbid`).
- Policy engine authorization before tool execution.
- Secret detection with mask/block behavior.
- Production-safe error responses (no stack traces).
- OAuth2/OIDC bearer-token authentication for MCP tool execution.
- OIDC discovery + JWKS validation cache for JWT tokens.
- Userinfo validation fallback for opaque OAuth tokens.
- Scope enforcement:
  - `read:repository` for read tools.
  - `write:repository` for write tools.
- Policy engine checks before tool execution.
- Per-IP and per-token rate limiting.
- Strict schema validation (`extra=forbid`).
- Tamper-evident audit logging with hash chaining.
- Secret sanitization for logs and tool output.
- Production-safe error responses (no internal stack traces).
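
The scope rules in the list above could be checked with a small lookup before a tool runs. This is a minimal sketch, not the project's implementation: the tool names (`get_file`, `list_repositories`, `create_issue`) are hypothetical, and the space-separated scope string is an assumption based on the usual OAuth2 convention. Only the two scope names come from the list.

```python
from typing import Iterable

# Hypothetical tool names; only the two scope strings come from the list above.
REQUIRED_SCOPE = {
    "get_file": "read:repository",
    "list_repositories": "read:repository",
    "create_issue": "write:repository",
}


def granted_scopes(scope_claim: str) -> set:
    """Split a space-separated OAuth2 scope string into a set of scopes."""
    return set(scope_claim.split())


def is_tool_allowed(tool_name: str, scopes: Iterable[str]) -> bool:
    """Return True only if the token carries the scope the tool requires.

    Tools without a registered scope are denied by default.
    """
    required = REQUIRED_SCOPE.get(tool_name)
    if required is None:
        return False
    return required in set(scopes)
```

In this sketch a write scope does not imply the read scope; whether it should is a policy decision the source does not state.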
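
The "tamper-evident audit logging with hash chaining" control under Core Controls can be illustrated as follows. This is a sketch under assumptions: the entry layout, the all-zero genesis hash, and the function names are hypothetical, and SHA-256 over canonical JSON is one common choice rather than the confirmed format used by this project.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder predecessor for the first entry


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> None:
    """Append an event, linking it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append(
        {"event": event, "prev_hash": prev_hash, "hash": entry_hash(prev_hash, event)}
    )


def verify_chain(log: list) -> bool:
    """Recompute every hash in order; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this only shows that entries were altered after the fact; someone able to rewrite the whole log can recompute every hash, so the latest hash would need to be stored somewhere the log writer cannot modify.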

## Threat Model

### Why shared bot tokens are dangerous

- A single leaked bot token can expose all repositories that bot can access.
- Access is not naturally bounded per end user.
- Blast radius is large and cross-tenant.

### Why token-in-URL is insecure

- URLs can be captured by reverse proxy logs, browser history, referer headers, and monitoring pipelines.
- Bearer tokens must be passed in `Authorization` headers only.
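
The header-only rule above can be sketched as two small helpers: one that reads a bearer token from the `Authorization` header, and one that flags URLs carrying a token in the query string so the request can be rejected. The function names and the set of query parameter names are assumptions for illustration, not taken from the project.

```python
from typing import Mapping, Optional
from urllib.parse import parse_qs, urlsplit

# Assumed names of query parameters that might carry a token.
TOKEN_QUERY_PARAMS = {"token", "access_token"}


def url_carries_token(url: str) -> bool:
    """Return True if the URL's query string contains a token-like parameter."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return any(name.lower() in TOKEN_QUERY_PARAMS for name in query)


def bearer_token_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Extract the token from an `Authorization: Bearer <token>` header.

    Returns None when the header is missing, uses another scheme, or is empty.
    """
    value = ""
    for name, header_value in headers.items():
        if name.lower() == "authorization":  # header names are case-insensitive
            value = header_value
            break
    scheme, _, token = value.strip().partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()
```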

### Why per-user OAuth reduces lateral access

- Each MCP request executes with the signed-in user token.
- Gitea authorization stays source-of-truth for repository visibility.
- A compromised token is limited to that user’s permissions.
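
One way to make each request execute with the signed-in user's token is to bind the token to the request context and fail closed when none is bound. The sketch below uses Python's `contextvars`; the names `request_token` and `gitea_auth_headers` are hypothetical, and sending the token upstream as a `Bearer` credential is an assumption about how the service calls Gitea.

```python
import contextvars
from contextlib import contextmanager
from typing import Iterator

# Hypothetical context variable holding the current request's user token.
_user_token = contextvars.ContextVar("user_token")


@contextmanager
def request_token(token: str) -> Iterator[None]:
    """Bind a user token for the duration of one request, then restore the old state."""
    reset = _user_token.set(token)
    try:
        yield
    finally:
        _user_token.reset(reset)


def gitea_auth_headers() -> dict:
    """Build upstream headers from the bound token; refuse if no token is bound."""
    try:
        token = _user_token.get()
    except LookupError:
        raise PermissionError("no user token bound to this request") from None
    return {"Authorization": f"Bearer {token}"}
```

Because there is no default value, code that runs outside a request cannot silently fall back to a shared credential.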

## Prompt Injection Hardening

Repository content is treated strictly as data.
Repository content is treated as untrusted data.

- Tool outputs are bounded and sanitized.
- No instruction execution from repository text.
- Untrusted content handling helpers enforce maximum output size.
- No instructions from repository text are executed.
- Text fields are size-limited before returning to LLM clients.
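
Size-limiting a text field before it reaches an LLM client could look like the helper below. The function name, the 4000-character default, and the truncation marker are illustrative assumptions; the source does not give the actual limits.

```python
def limit_text(text: str, max_chars: int = 4000, marker: str = "\n[truncated]") -> str:
    """Return text unchanged if it fits, otherwise cut it so the result,
    including the marker, is exactly max_chars long."""
    if max_chars < len(marker):
        raise ValueError("max_chars must be at least len(marker)")
    if len(text) <= max_chars:
        return text
    return text[: max_chars - len(marker)] + marker
```

Appending a visible marker lets the client tell truncated content from content that simply ended.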

## Secret Detection

Detected classes include:
- API keys and generic token patterns.

- API key and token patterns.
- JWT-like tokens.
- Private key block markers.
- Common provider token formats.
- Common provider credential formats.

Behavior:

- `SECRET_DETECTION_MODE=mask`: redact in place.
- `SECRET_DETECTION_MODE=block`: replace secret-bearing field values.
- `SECRET_DETECTION_MODE=block`: replace secret-bearing values.
- `SECRET_DETECTION_MODE=off`: disable sanitization (not recommended).
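
The three modes above can be sketched as a single sanitizing function. The regular expressions, the `[REDACTED]` placeholder, and the `sanitize` name are illustrative assumptions; the project's real detectors are not shown in this diff and are likely broader.

```python
import re

# Illustrative patterns only; the project's actual detectors are not shown here.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # private key block marker
    re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),  # JWT-like
    re.compile(r"(?i)\b(?:api[_-]?key|token)\s*[:=]\s*\S{16,}"),  # key=value style
]

REDACTED = "[REDACTED]"


def sanitize(value: str, mode: str = "mask") -> str:
    """Apply the behavior of the given mode to one string value."""
    if mode == "off":
        return value  # sanitization disabled
    if mode == "block":
        # Replace the whole value if any pattern matches.
        if any(pattern.search(value) for pattern in SECRET_PATTERNS):
            return REDACTED
        return value
    if mode == "mask":
        # Redact only the matching spans, keeping the surrounding text.
        for pattern in SECRET_PATTERNS:
            value = pattern.sub(REDACTED, value)
        return value
    raise ValueError(f"unknown mode: {mode!r}")
```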

## Authentication and Key Lifecycle

- Keys must be at least 32 characters.
- Rotate keys regularly (`scripts/rotate_api_key.py`).
- Check key age and expiry (`scripts/check_key_age.py`).
- Prefer dedicated bot credentials with least privilege.
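
The key rules above, together with the constant-time comparison listed under Core Controls, might be checked as in the sketch below. The 32-character minimum comes from the list; the 90-day maximum age and all function names are assumptions, and the contents of `scripts/rotate_api_key.py` and `scripts/check_key_age.py` are not shown in this diff.

```python
import hmac
from datetime import datetime, timedelta, timezone

MIN_KEY_LENGTH = 32  # from the list above
MAX_KEY_AGE = timedelta(days=90)  # assumed rotation window, not stated in the source


def key_is_long_enough(key: str) -> bool:
    """Check the minimum length requirement."""
    return len(key) >= MIN_KEY_LENGTH


def key_matches(presented: str, expected: str) -> bool:
    """Compare keys with hmac.compare_digest to reduce timing side channels."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


def key_is_expired(created_at: datetime, now: datetime) -> bool:
    """Return True once the key is older than the assumed maximum age."""
    return now - created_at > MAX_KEY_AGE
```

A plain `==` comparison can return as soon as the first differing character is found, which is the timing signal `hmac.compare_digest` is designed to avoid.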