385b442b6f
Reframe the README around two transports and add a local stdio quickstart with uvx/pip and Claude Desktop / Claude Code wiring. New docs: local-quickstart.md and packaging.md (uv build/publish). Document resource-type-aware authorization and classified gitea_request in security.md; stdio env vars + audit-log fallback in configuration.md; local install in deployment.md; core+adapters in architecture.md. Add the missing root AGENTS.md contract, update CLAUDE.md with the core/adapter layout, fail-closed invariants, and the branching flow (HEAD -> feature -> dev -> main). Update roadmap/todo and .env.example. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
109 lines
5.0 KiB
Markdown
# Security

## Core Controls

- OAuth2/OIDC bearer-token authentication for MCP tool execution.
- OIDC discovery + JWKS validation cache for JWT tokens.
- Userinfo validation fallback for opaque OAuth tokens.
- Scope enforcement:
  - `read:repository` for read tools.
  - `write:repository` for write tools.
- Policy engine checks before tool execution.
- Per-IP and per-token rate limiting.
- Strict schema validation (`extra=forbid`).
- Tamper-evident audit logging with hash chaining.
- Secret sanitization for logs and tool output.
- Production-safe error responses (no internal stack traces).

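The hash-chaining control listed above can be illustrated with a minimal sketch. It is not the project's audit logger: the entry layout, the `GENESIS_HASH` starting value, and the function names `append_entry` and `verify_chain` are assumptions made for illustration. Each entry commits to the hash of the previous one, so editing or deleting an earlier entry invalidates every later hash.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed starting value for the first entry


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with this event's canonical JSON."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the last entry in the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    entry = {
        "event": event,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, event),
    }
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash in order; any edited or removed entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A verifier that walks the log from the first entry detects tampering without needing a copy of the original events. It does not detect truncation of the newest entries unless the latest hash is also stored somewhere else.
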
## Threat Model

### Why shared bot tokens are dangerous

- A single leaked bot token can expose all repositories that bot can access.
- Access is not naturally bounded per end user.
- Blast radius is large and cross-tenant.

### Why token-in-URL is insecure

- URLs can be captured by reverse proxy logs, browser history, referer headers, and monitoring pipelines.
- Bearer tokens must be passed in `Authorization` headers only.

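As a small illustration of the header-only rule, the sketch below builds (but does not send) a request with the standard library. The helper name `build_gitea_request` and the host `gitea.example.com` are placeholders, and the query-string check is only an example of a guard a client might add.

```python
import urllib.request


def build_gitea_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build a request that carries the bearer token in a header, never in the URL."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    # Illustrative guard: refuse URLs that try to smuggle a token as a query parameter.
    if "token=" in url:
        raise ValueError("tokens must not appear in the URL")
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


request = build_gitea_request("https://gitea.example.com", "/api/v1/version", "example-token")
```

The resulting URL contains no credential, so nothing sensitive reaches proxy logs or referer headers even if the URL itself is recorded.
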
### Why per-user OAuth reduces lateral access

- Each MCP request executes with the signed-in user token.
- Gitea authorization stays source-of-truth for repository visibility.
- A compromised token is limited to that user's permissions.

## Resource-type-aware authorization

The public server runs in *service-PAT mode*: a privileged bot token makes the
actual Gitea calls while the per-user OAuth identity decides what the user may
reach. Repository calls are gated by the user's collaborator permission on
`owner/repo`. The rest of the Gitea surface, reachable through the
`gitea_request` escape hatch, is gated by **resource-type-aware authorization**
(`authz.py`). Every call is classified by `(method, path)` and enforced against
a type-specific rule. **Every decision fails closed**: a call that cannot be
classified, or whose permission cannot be positively verified against Gitea, is
denied and audited.

| Resource type | Rule (service-PAT mode) |
|---------------|--------------------------|
| `repository` | Per-user collaborator permission on `owner/repo` (existing check). A repo path that cannot be parsed to `owner/repo` is denied. |
| `org` | The signed-in user must be a **verified member** of the target org (checked against Gitea, fail closed). |
| `user_owned` | A resource owned by a named user/org (`/users/{name}`, `/packages/{owner}`): allowed only when the owner is the caller, or the caller is a verified member of the owning org. |
| `user_self` | Token-owner-scoped endpoints (`/user`, `/notifications`): **denied**, because in service-PAT mode the data belongs to the bot, not the caller. |
| `misc_global` | Instance-wide read-only utilities (markdown render, version, gitignore templates): reads allowed; writes denied. |
| `admin` | **Default deny.** Allowed only when the operator opts in (`RAW_API_ALLOW_SENSITIVE=true`) **and** the signed-in user is a verified Gitea site administrator. |
| `unknown` | Denied. |

This gate runs *in addition to* the policy engine and the `WRITE_MODE` gate: a
write call is denied unless write mode is on, policy allows it, and the
resource-type rule passes. In pure-OAuth mode (no service PAT) the user's own
token already scopes every call at Gitea, so the extra gate is unnecessary.

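The rules in the table can be sketched as a classifier plus a fail-closed decision function. This is an illustration, not the contents of `authz.py`: the first-segment table, the `Verifier` callbacks, and the `authorize` signature are hypothetical, paths are assumed to be relative to `/api/v1`, and the sketch treats only `GET`/`HEAD` as reads.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical mapping from the first path segment to a resource type.
FIRST_SEGMENT_TYPES = {
    "repos": "repository",
    "orgs": "org",
    "users": "user_owned",
    "packages": "user_owned",
    "user": "user_self",
    "notifications": "user_self",
    "markdown": "misc_global",
    "version": "misc_global",
    "gitignore": "misc_global",
    "admin": "admin",
}


@dataclass(frozen=True)
class Verifier:
    """Callbacks that would check permissions against Gitea (assumed interface)."""
    can_access_repo: Callable[[str, str, str], bool]  # (user, owner, repo)
    is_org_member: Callable[[str, str], bool]         # (user, org)
    is_site_admin: Callable[[str], bool]              # (user)


def classify(path: str) -> str:
    segments = [s for s in path.split("/") if s]
    if not segments:
        return "unknown"
    return FIRST_SEGMENT_TYPES.get(segments[0], "unknown")


def authorize(method: str, path: str, user: str, verifier: Verifier,
              allow_sensitive: bool = False) -> bool:
    """Return True only when the type-specific rule is positively satisfied."""
    segments = [s for s in path.split("/") if s]
    rtype = classify(path)
    try:
        if rtype == "repository":
            # A path that cannot be parsed to owner/repo is denied.
            if len(segments) < 3:
                return False
            return verifier.can_access_repo(user, segments[1], segments[2]) is True
        if rtype == "org":
            return len(segments) >= 2 and verifier.is_org_member(user, segments[1]) is True
        if rtype == "user_owned":
            if len(segments) < 2:
                return False
            owner = segments[1]
            return owner == user or verifier.is_org_member(user, owner) is True
        if rtype == "misc_global":
            return method.upper() in ("GET", "HEAD")
        if rtype == "admin":
            return allow_sensitive and verifier.is_site_admin(user) is True
    except Exception:
        return False  # a verification error is a denial, never an allow
    return False  # user_self and unknown fall through to deny
```

Every branch returns `True` only after a positive check, and the final `return False` covers anything the classifier did not recognise, which is the fail-closed behaviour the section describes.
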
Positive verification results (org membership, site-admin) are cached briefly
in a bounded cache. Only successful checks are cached: a failed or errored
check is denied without being stored, so a transient failure never grants
access.

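A positive-only cache of this kind might look like the sketch below. The class name `PositiveCache`, the 60-second TTL, and the 1024-entry bound are assumptions chosen for illustration; the document does not state the real values.

```python
import time
from collections import OrderedDict
from typing import Callable, Hashable


class PositiveCache:
    """Remember only successful checks, for a short time, up to a fixed size."""

    def __init__(self, ttl_seconds: float = 60.0, max_entries: int = 1024,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        self._expires = OrderedDict()  # key -> expiry timestamp

    def check(self, key: Hashable, verify: Callable[[], bool]) -> bool:
        now = self._clock()
        expiry = self._expires.get(key)
        if expiry is not None:
            if expiry > now:
                return True  # fresh positive result
            del self._expires[key]  # expired: verify again
        try:
            ok = verify() is True
        except Exception:
            return False  # transient failure: deny and cache nothing
        if ok:
            self._expires[key] = now + self._ttl
            while len(self._expires) > self._max:
                self._expires.popitem(last=False)  # evict the oldest entry
        return ok
```

Because denials and errors are never stored, a user who is added to an org is re-checked on the next call, while a revoked membership can stay cached for at most one TTL.
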
## Full-API coverage: classified `gitea_request`

`gitea_request` exposes the long tail of the Gitea API that the curated typed
tools do not cover, behind three safeguards:

- **Deterministic read/write classifier.** `GET`/`HEAD` are reads; everything
  else is a write. A small, explicit override table may only *downgrade*
  provably side-effect-free render endpoints (markdown/markup) to reads, never
  the reverse, so a mutating call can never be misclassified as a read and slip
  past the `WRITE_MODE` gate.
- **Known-path gate.** A request whose top path segment is not a recognized
  Gitea `/api/v1` route prefix is denied (fail closed): unknown paths are never
  passed straight through.
- **Admin/credential denylist.** `/admin`, `*tokens*`, `*secrets*`, `*hooks*`,
  `*keys*`, `applications/oauth2`, and runner registration tokens are blocked for
  every method (including `GET`) and cannot be re-opened from `policy.yaml`;
  only `RAW_API_ALLOW_SENSITIVE=true` overrides them, and admin then still
  requires a verified site administrator (see above).

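The three checks above can be sketched together as a single pre-flight function. Everything here is illustrative: the prefix set is a small hypothetical subset of the `/api/v1` routes, the override table lists render endpoints as examples, paths are assumed to be relative to `/api/v1`, and `deny_reason` is not a name taken from the project.

```python
from typing import Optional

READ_METHODS = {"GET", "HEAD"}

# Example override table: non-GET endpoints that only render text.
# Entries can only turn a write into a read, never the other way round.
READ_OVERRIDES = {("POST", "/markdown"), ("POST", "/markdown/raw"), ("POST", "/markup")}

# Hypothetical subset of recognized top-level route prefixes.
KNOWN_PREFIXES = {"repos", "orgs", "users", "user", "packages", "notifications",
                  "markdown", "markup", "version", "gitignore", "admin"}

SENSITIVE_WORDS = ("tokens", "secrets", "hooks", "keys")


def _segments(path: str) -> list:
    return [s for s in path.lower().split("/") if s]


def is_write(method: str, path: str) -> bool:
    method = method.upper()
    if method in READ_METHODS:
        return False
    normalized = "/" + "/".join(_segments(path))
    return (method, normalized) not in READ_OVERRIDES


def is_sensitive(path: str) -> bool:
    segments = _segments(path)
    normalized = "/".join(segments)
    if segments and segments[0] == "admin":
        return True
    if "applications/oauth2" in normalized:
        return True
    return any(word in normalized for word in SENSITIVE_WORDS)


def deny_reason(method: str, path: str, write_mode: bool,
                allow_sensitive: bool = False) -> Optional[str]:
    """Return why the call is refused, or None if these three checks pass."""
    segments = _segments(path)
    if not segments or segments[0] not in KNOWN_PREFIXES:
        return "unknown path"
    if is_sensitive(path) and not allow_sensitive:
        return "sensitive path"
    if is_write(method, path) and not write_mode:
        return "write mode off"
    return None
```

A `None` result only means the call cleared these checks; the policy engine and the resource-type rule described earlier would still have to allow it.
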
## Prompt Injection Hardening

Repository content is treated as untrusted data.

- Tool outputs are bounded and sanitized.
- No instructions from repository text are executed.
- Text fields are size-limited before returning to LLM clients.

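Size-limiting text fields before they reach the client could be done along these lines. This is a sketch under assumptions: the 2000-character limit, the truncation marker, and the helper names `bound_text` and `bound_fields` are invented for illustration.

```python
MAX_FIELD_CHARS = 2000  # assumed limit; the real value is not stated here


def bound_text(value: str, limit: int = MAX_FIELD_CHARS) -> str:
    """Truncate an over-long string and say how much was dropped."""
    if len(value) <= limit:
        return value
    omitted = len(value) - limit
    return f"{value[:limit]}... [truncated {omitted} chars]"


def bound_fields(obj, limit: int = MAX_FIELD_CHARS):
    """Walk a JSON-like structure and bound every string it contains."""
    if isinstance(obj, str):
        return bound_text(obj, limit)
    if isinstance(obj, dict):
        return {key: bound_fields(value, limit) for key, value in obj.items()}
    if isinstance(obj, list):
        return [bound_fields(value, limit) for value in obj]
    return obj
```

Bounding the output limits how much untrusted repository text can reach the model in one response; it does not by itself neutralise instructions embedded in the text that remains.
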
## Secret Detection

Detected classes include:

- API key and token patterns.
- JWT-like tokens.
- Private key block markers.
- Common provider credential formats.

Behavior:

- `SECRET_DETECTION_MODE=mask`: redact in place.
- `SECRET_DETECTION_MODE=block`: replace secret-bearing values.
- `SECRET_DETECTION_MODE=off`: disable sanitization (not recommended).

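The difference between the modes can be sketched as follows. The three regular expressions are illustrative stand-ins for the detected classes, and the placeholder strings are invented. The sketch also assumes that `mask` replaces only the matched substring while `block` replaces the whole value, which is one reading of the descriptions above.

```python
import re

# Illustrative patterns only; a real detector would cover many more formats.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),                  # private key marker
    re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),  # JWT-like token
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),                                # AWS-style access key ID
]


def sanitize(value: str, mode: str = "mask") -> str:
    """Apply the assumed semantics of SECRET_DETECTION_MODE to one string."""
    if mode == "off":
        return value
    if mode == "block":
        # Assumed: the entire value is replaced when any secret is found.
        if any(pattern.search(value) for pattern in SECRET_PATTERNS):
            return "[BLOCKED: secret detected]"
        return value
    if mode == "mask":
        # Assumed: only the matching substring is redacted.
        for pattern in SECRET_PATTERNS:
            value = pattern.sub("[REDACTED]", value)
        return value
    raise ValueError(f"unknown mode: {mode}")
```

Rejecting unknown mode names, rather than silently falling back to `off`, keeps a misconfiguration from disabling sanitization.
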