feat: harden gateway with policy engine, secure tools, and governance docs
@@ -1,255 +1,61 @@
# API Reference

## HTTP Endpoints

### `GET /`

Returns basic server information. No authentication required.

**Response**

```json
{
  "name": "AegisGitea MCP",
  "version": "0.1.0",
  "status": "running"
}
```

---

### `GET /health`

Health check endpoint. No authentication required.

**Response**

```json
{
  "status": "healthy",
  "gitea_connected": true
}
```

Returns HTTP 200 when healthy. Returns HTTP 503 when Gitea is unreachable.

---

### `GET /mcp/tools`

Returns the list of available MCP tools. No authentication required (needed for ChatGPT tool discovery).

**Response**

```json
{
  "tools": [
    {
      "name": "list_repositories",
      "description": "...",
      "inputSchema": { ... }
    }
  ]
}
```

---

### `POST /mcp/tool/call`

Executes an MCP tool. **Authentication required.**

**Request headers**

```
Authorization: Bearer <api-key>
Content-Type: application/json
```

**Request body**

```json
{
  "name": "<tool-name>",
  "arguments": { ... }
}
```

**Response**

```json
{
  "content": [
    {
      "type": "text",
      "text": "..."
    }
  ],
  "isError": false
}
```

On error, `isError` is `true` and `text` contains the error message.
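
A client for the request and response shapes above might be sketched as follows. This is a minimal illustration, not part of the project: the helper names `build_tool_call` and `extract_text` are hypothetical, and the base URL used in the usage note is an assumed local deployment.

```python
import json
import urllib.request


def build_tool_call(base_url: str, api_key: str, name: str, arguments: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /mcp/tool/call request."""
    body = json.dumps({"name": name, "arguments": arguments}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/mcp/tool/call",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def extract_text(response_body: dict) -> str:
    """Join the text parts of a tool response; raise if the tool reported an error."""
    text = "\n".join(
        part["text"] for part in response_body["content"] if part["type"] == "text"
    )
    if response_body.get("isError"):
        raise RuntimeError(text)
    return text
```

To send the request, pass the result of `build_tool_call("http://127.0.0.1:8080", key, "list_repositories", {})` to `urllib.request.urlopen` and decode the JSON body before calling `extract_text`.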

---

### `GET /mcp/sse`

Server-Sent Events stream endpoint. Authentication required. Used for streaming MCP sessions.

---

### `POST /mcp/sse`

Sends a client message over an active SSE session. Authentication required.

---

## Authentication

All authenticated endpoints require a bearer token:

```
Authorization: Bearer <api-key>
```

Alternatively, the key can be passed as a query parameter (useful for tools that do not support custom headers):

```
GET /mcp/tool/call?api_key=<api-key>
```

---

## MCP Tools

### `list_repositories`

Lists all Gitea repositories accessible to the bot user.

**Arguments:** none

**Example response text**

```
Found 3 repositories:

1. myorg/backend - Backend API service [Python] ★ 42
2. myorg/frontend - React frontend [TypeScript] ★ 18
3. myorg/infra - Infrastructure as code [HCL] ★ 5
```

---

### `get_repository_info`

Returns metadata for a single repository.

**Arguments**

| Name | Type | Required | Description |
|---|---|---|---|
| `owner` | string | Yes | Repository owner (user or organisation) |
| `repo` | string | Yes | Repository name |

**Example response text**

```
Repository: myorg/backend
Description: Backend API service
Language: Python
Stars: 42
Forks: 3
Default branch: main
Private: false
URL: https://gitea.example.com/myorg/backend
```

---

### `get_file_tree`

Returns the file and directory structure of a repository.

**Arguments**

| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| `owner` | string | Yes | — | Repository owner |
| `repo` | string | Yes | — | Repository name |
| `ref` | string | No | default branch | Branch, tag, or commit SHA |
| `recursive` | boolean | No | `false` | Recursively list all subdirectories |

> **Note:** Recursive mode is disabled by default to limit response size. Enable with care on large repositories.

**Example response text**

```
File tree for myorg/backend (ref: main):

src/
src/main.py
src/config.py
tests/
tests/test_main.py
README.md
requirements.txt
```

---

### `get_file_contents`

Returns the contents of a single file.

**Arguments**

| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| `owner` | string | Yes | — | Repository owner |
| `repo` | string | Yes | — | Repository name |
| `filepath` | string | Yes | — | Path to the file within the repository |
| `ref` | string | No | default branch | Branch, tag, or commit SHA |

**Limits**

- Files larger than `MAX_FILE_SIZE_BYTES` (default 1 MB) are rejected.
- Binary files that cannot be decoded as UTF-8 are returned as raw base64.

**Example response text**

```
Contents of myorg/backend/src/main.py (ref: main):

import fastapi
...
```

---

## Error Responses

All errors follow this structure:

```json
{
  "content": [
    {
      "type": "text",
      "text": "Error: <description>"
    }
  ],
  "isError": true
}
```

Common error scenarios:

| Scenario | HTTP Status | `isError` |
|---|---|---|
| Missing or invalid API key | 401 | — (rejected before tool runs) |
| Rate limited IP address | 429 | — |
| Tool not found | 404 | — |
| Repository not found in Gitea | 200 | `true` |
| File too large | 200 | `true` |
| Gitea API unavailable | 200 | `true` |
## Endpoints

- `GET /`: server metadata.
- `GET /health`: health probe.
- `GET /metrics`: Prometheus metrics (when enabled).
- `POST /automation/webhook`: ingest policy-controlled webhook events.
- `POST /automation/jobs/run`: run policy-controlled automation jobs.
- `GET /mcp/tools`: list tool definitions.
- `POST /mcp/tool/call`: execute a tool (`Authorization: Bearer <api-key>` required except in explicitly disabled auth mode).
- `GET /mcp/sse` and `POST /mcp/sse`: MCP SSE transport.

## Automation Jobs

`POST /automation/jobs/run` supports:
- `dependency_hygiene_scan` (read-only scaffold).
- `stale_issue_detection` (read-only issue age analysis).
- `auto_issue_creation` (write-mode + whitelist + policy required).

## Read Tools

- `list_repositories`.
- `get_repository_info` (`owner`, `repo`).
- `get_file_tree` (`owner`, `repo`, optional `ref`, `recursive`).
- `get_file_contents` (`owner`, `repo`, `filepath`, optional `ref`).
- `search_code` (`owner`, `repo`, `query`, optional `ref`, `page`, `limit`).
- `list_commits` (`owner`, `repo`, optional `ref`, `page`, `limit`).
- `get_commit_diff` (`owner`, `repo`, `sha`).
- `compare_refs` (`owner`, `repo`, `base`, `head`).
- `list_issues` (`owner`, `repo`, optional `state`, `page`, `limit`, `labels`).
- `get_issue` (`owner`, `repo`, `issue_number`).
- `list_pull_requests` (`owner`, `repo`, optional `state`, `page`, `limit`).
- `get_pull_request` (`owner`, `repo`, `pull_number`).
- `list_labels` (`owner`, `repo`, optional `page`, `limit`).
- `list_tags` (`owner`, `repo`, optional `page`, `limit`).
- `list_releases` (`owner`, `repo`, optional `page`, `limit`).

## Write Tools (Write Mode Required)

- `create_issue` (`owner`, `repo`, `title`, optional `body`, `labels`, `assignees`).
- `update_issue` (`owner`, `repo`, `issue_number`, one or more of `title`, `body`, `state`).
- `create_issue_comment` (`owner`, `repo`, `issue_number`, `body`).
- `create_pr_comment` (`owner`, `repo`, `pull_number`, `body`).
- `add_labels` (`owner`, `repo`, `issue_number`, `labels`).
- `assign_issue` (`owner`, `repo`, `issue_number`, `assignees`).

## Validation and Limits

- All tool argument schemas reject unknown fields.
- List responses are capped by `MAX_TOOL_RESPONSE_ITEMS`.
- Text payloads are capped by `MAX_TOOL_RESPONSE_CHARS`.
- File reads are capped by `MAX_FILE_SIZE_BYTES`.

## Error Model

- Policy denial: HTTP `403`.
- Validation error: HTTP `400`.
- Auth error: HTTP `401`.
- Rate limit: HTTP `429`.
- Internal errors: HTTP `500` without stack traces in production.

33
docs/audit.md
Normal file
@@ -0,0 +1,33 @@
# Audit Logging

## Design

Audit logs are append-only JSON lines with hash chaining:
- `prev_hash`: previous entry hash.
- `entry_hash`: hash of current entry payload + previous hash.

Because each `entry_hash` covers the previous entry's hash, altering, reordering, or removing an earlier entry invalidates the chain from that point on, which makes tampering detectable.
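
The chaining idea can be sketched in a few lines. This is an illustration under stated assumptions, not the project's implementation: SHA-256, canonical JSON serialization, the all-zero genesis value, and the nested `payload` field are all choices made here for the example.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder used as prev_hash of the first entry


def entry_hash(payload: dict, prev_hash: str) -> str:
    """Hash the canonical JSON of the payload together with the previous hash."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def append_entry(log: list[dict], payload: dict) -> None:
    """Append a payload, linking it to the hash of the last entry."""
    prev = log[-1]["entry_hash"] if log else GENESIS_HASH
    log.append(
        {"payload": payload, "prev_hash": prev, "entry_hash": entry_hash(payload, prev)}
    )


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash in order; any mismatch means the chain is broken."""
    prev = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev:
            return False
        if entry["entry_hash"] != entry_hash(entry["payload"], prev):
            return False
        prev = entry["entry_hash"]
    return True
```

A chain like this detects edits to existing entries, but it cannot by itself detect truncation of the most recent entries; that requires recording the latest hash somewhere outside the log.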

## Event Types

- `tool_invocation`
- `access_denied`
- `security_event`

Each event includes timestamps and correlation context.

## Integrity Validation

Use:

```bash
python3 scripts/validate_audit_log.py --path /var/log/aegis-mcp/audit.log
```

Exit code `0` indicates a valid chain; a non-zero exit code indicates tampering or corruption.

## Operational Expectations

- Persist audit logs to durable storage.
- Protect write permissions (service account only).
- Validate integrity during incident response and release checks.
27
docs/automation.md
Normal file
@@ -0,0 +1,27 @@
# Automation

## Scope

Current automation capabilities:
- Webhook ingestion endpoint (`POST /automation/webhook`).
- On-demand scheduled-job execution endpoint (`POST /automation/jobs/run`).
- Dependency hygiene scan job scaffold (`dependency_hygiene_scan`).
- Stale issue detection job (`stale_issue_detection`).
- Auto issue creation job scaffold (`auto_issue_creation`, write-mode and policy required).

Planned extensions:
- Background scheduler orchestration.

## Control Requirements

All automation must be:
- Policy-controlled.
- Independently disableable.
- Fully audited.
- Explicitly documented with runbook guidance.

## Enablement

- `AUTOMATION_ENABLED=true` enables the automation endpoints.
- `AUTOMATION_SCHEDULER_ENABLED=true` is reserved for a future built-in scheduler loop.
- Policy rules must allow automation pseudo-tools (`automation_*`) per repository.
@@ -1,126 +1,46 @@
# Deployment

## Local / Development
## Secure Defaults

- Default bind: `MCP_HOST=127.0.0.1`.
- Binding `0.0.0.0` requires explicit `ALLOW_INSECURE_BIND=true`.
- Write mode disabled by default.
- Policy file path configurable via `POLICY_FILE_PATH`.

## Local Development

```bash
make install-dev
source venv/bin/activate # Linux/macOS
# venv\Scripts\activate # Windows

cp .env.example .env
# Edit .env
make generate-key # Add key to .env
make generate-key
make run
```

The server listens on `http://0.0.0.0:8080` by default.

---

## Docker

### Build
- Use `docker/Dockerfile` (non-root runtime).
- Use compose profiles:
  - `prod`: hardened runtime profile.
  - `dev`: local development profile (localhost-only port bind).

Run examples:

```bash
make docker-build
# or: docker build -f docker/Dockerfile -t aegis-gitea-mcp .
docker compose --profile prod up -d
docker compose --profile dev up -d
```

### Configure
## Environment Validation

Create a `.env` file (copy from `.env.example`) with your settings before starting the container.
Startup validates:
- Required Gitea settings.
- API keys (when auth enabled).
- Insecure bind opt-in.
- Write whitelist when write mode enabled.

### Run
## Production Recommendations

```bash
make docker-up
# or: docker-compose up -d
```

### Logs

```bash
make docker-logs
# or: docker-compose logs -f
```

### Stop

```bash
make docker-down
# or: docker-compose down
```

---

## docker-compose.yml Overview

The included `docker-compose.yml` provides:

- **Health check:** polls `GET /health` every 30 seconds
- **Audit log volume:** mounts a named volume at `/var/log/aegis-mcp` so logs survive container restarts
- **Resource limits:** 1 CPU, 512 MB memory
- **Security:** non-root user, `no-new-privileges`
- **Traefik labels:** commented out — uncomment and set `MCP_DOMAIN` to enable automatic HTTPS via Traefik

### Enabling Traefik

1. Set `MCP_DOMAIN=mcp.yourdomain.com` in `.env`.
2. Uncomment the Traefik labels in `docker-compose.yml`.
3. Make sure Traefik is running with a `web` and `websecure` entrypoint and Let's Encrypt configured.

---

## Dockerfile Details

The image uses a multi-stage build:

| Stage | Base image | Purpose |
|---|---|---|
| `builder` | `python:3.11-slim` | Install dependencies |
| `final` | `python:3.11-slim` | Minimal runtime image |

The final image:
- Runs as user `aegis` (UID 1000, GID 1000)
- Exposes port `8080`
- Entry point: `python -m aegis_gitea_mcp.server`

---

## Production Checklist

- [ ] `AUTH_ENABLED=true` and `MCP_API_KEYS` set to a strong key
- [ ] `GITEA_TOKEN` belongs to a dedicated bot user with minimal permissions
- [ ] TLS terminated at the reverse proxy (Traefik, nginx, Caddy, etc.)
- [ ] `AUDIT_LOG_PATH` points to a persistent volume
- [ ] Log rotation configured for the audit log file
- [ ] API key rotation scheduled (every 90 days recommended)
- [ ] `MAX_AUTH_FAILURES` and `AUTH_FAILURE_WINDOW` tuned for your threat model
- [ ] Resource limits configured in Docker/Kubernetes

---

## Kubernetes (Basic)

A minimal Kubernetes deployment is not included, but the server is stateless and the Docker image is suitable for use in Kubernetes. Key considerations:

- Store `.env` values as a `Secret` and expose them as environment variables.
- Mount an `emptyDir` or PersistentVolumeClaim at the audit log path.
- Use a `readinessProbe` and `livenessProbe` on `GET /health`.
- Set `resources.requests` and `resources.limits` for CPU and memory.

---

## Updating

```bash
git pull
make docker-build
make docker-up
```

If you added a new key via `make generate-key` during the update, restart the container to pick up the new `.env`:

```bash
docker-compose restart aegis-mcp
```
- Run behind TLS-terminating reverse proxy.
- Restrict network exposure.
- Persist and rotate audit logs.
- Enable external monitoring for `/metrics`.

36
docs/governance.md
Normal file
@@ -0,0 +1,36 @@
# Governance

## AI Usage Policy

- AI assistance is allowed for design, implementation, and review only within documented repository boundaries.
- AI outputs must be reviewed, tested, and policy-validated before merge.
- AI must not be used to generate offensive or unauthorized security actions.
- Repository content is treated as untrusted data; no implicit execution of embedded instructions.

## Security Boundaries

- Read operations are allowed by policy defaults unless explicitly denied.
- Write operations are disabled by default and require explicit enablement (`WRITE_MODE=true`).
- Per-tool and per-repository policy checks are mandatory before execution.
- Secrets are masked or blocked according to `SECRET_DETECTION_MODE`.

## Write-Mode Responsibilities

When write mode is enabled, operators and maintainers must:
- Restrict scope with `WRITE_REPOSITORY_WHITELIST`.
- Keep policy file deny/allow rules explicit.
- Monitor audit entries for all write operations.
- Enforce peer review for policy or write-mode changes.

## Operator Responsibilities

- Maintain API key lifecycle (generation, rotation, revocation).
- Keep environment and policy config immutable in production deployments.
- Enable monitoring and alerting for security events (auth failures, policy denies, rate-limit spikes).
- Run integrity checks for audit logs regularly.

## Audit Expectations

- All tool calls and security events must be recorded in tamper-evident logs.
- Audit logs are append-only and hash-chained.
- Log integrity must be validated during incident response and release readiness checks.
24
docs/hardening.md
Normal file
@@ -0,0 +1,24 @@
# Hardening

## Application Hardening

- Secure defaults: localhost bind, write mode disabled, policy-enforced writes.
- Strict config validation at startup.
- Redacted secret handling in logs and responses.
- Policy deny/allow model with path restrictions.
- Non-leaking production error responses.

## Container Hardening

- Non-root runtime user.
- `no-new-privileges` and dropped Linux capabilities.
- Read-only filesystem where practical.
- Explicit health checks.
- Separate dev and production compose profiles.

## Operational Hardening

- Rotate API keys regularly.
- Minimize Gitea bot permissions.
- Keep policy file under change control.
- Alert on repeated policy denials and auth failures.
28
docs/observability.md
Normal file
@@ -0,0 +1,28 @@
# Observability

## Logging

- Structured JSON logs.
- Request correlation via `X-Request-ID`.
- Security events and policy denials are audit logged.

## Metrics

Prometheus-compatible endpoint: `GET /metrics`.

Current metrics:
- `aegis_http_requests_total{method,path,status}`
- `aegis_tool_calls_total{tool,status}`
- `aegis_tool_duration_seconds_sum{tool}`
- `aegis_tool_duration_seconds_count{tool}`
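
The `_sum` and `_count` series can be combined into a mean latency per tool. The sketch below parses scraped text directly; it assumes each sample line carries exactly one `tool` label in the standard Prometheus text format, and the function name `mean_tool_latency` is hypothetical. In a real setup this division would more likely be done in a Prometheus query.

```python
import re
from collections import defaultdict

# Matches e.g.: aegis_tool_duration_seconds_sum{tool="get_file_contents"} 1.5
SAMPLE_LINE = re.compile(
    r'^(aegis_tool_duration_seconds_(?:sum|count))\{tool="([^"]+)"\}\s+(\S+)$'
)


def mean_tool_latency(metrics_text: str) -> dict[str, float]:
    """Return mean duration in seconds per tool from /metrics text output."""
    sums: dict[str, float] = defaultdict(float)
    counts: dict[str, float] = defaultdict(float)
    for line in metrics_text.splitlines():
        match = SAMPLE_LINE.match(line.strip())
        if not match:
            continue  # comments, other metrics, blank lines
        name, tool, value = match.groups()
        target = sums if name.endswith("_sum") else counts
        target[tool] += float(value)
    # Skip tools with a zero count to avoid division by zero.
    return {tool: sums[tool] / counts[tool] for tool in counts if counts[tool] > 0}
```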

## Tracing and Correlation

- Request IDs propagate in response header (`X-Request-ID`).
- Tool-level correlation IDs included in MCP responses.

## Operational Guidance

- Alert on spikes in 401/403/429 rates.
- Alert on repeated `access_denied` and auth-rate-limit events.
- Track tool latency trends for incident triage.
50
docs/policy.md
Normal file
@@ -0,0 +1,50 @@
# Policy Engine

## Overview

Aegis uses a YAML policy engine to authorize tool execution before any Gitea API call is made.

## Behavior Summary

- Global tool allow/deny supported.
- Per-repository tool allow/deny supported.
- Optional repository path allow/deny supported.
- Write operations are denied by default.
- Write operations also require `WRITE_MODE=true` and a matching entry in `WRITE_REPOSITORY_WHITELIST`.

## Example Configuration

```yaml
defaults:
  read: allow
  write: deny

tools:
  deny:
    - search_code

repositories:
  acme/service-a:
    tools:
      allow:
        - get_file_contents
        - list_commits
    paths:
      allow:
        - src/*
      deny:
        - src/secrets/*
```

## Failure Behavior

- Invalid YAML or invalid schema: startup failure (fail closed).
- Denied tool call: HTTP `403` + audit `access_denied` entry.
- Path traversal attempt in path-scoped tools: denied by validation/policy checks.

## Operational Guidance

- Keep policy files version-controlled and code-reviewed.
- Prefer explicit deny entries for sensitive tools.
- Use repository-specific allow lists for high-risk environments.
- Test policy updates in staging before production rollout.
72
docs/roadmap.md
Normal file
@@ -0,0 +1,72 @@
# Roadmap

## High-Level Evolution Plan

1. Hardened read-only gateway baseline.
2. Policy-driven authorization and observability.
3. Controlled write-mode rollout.
4. Automation and event-driven workflows.
5. Continuous hardening and enterprise controls.

## Threat Model Updates

- Primary threats: credential theft, over-permissioned automation, prompt injection via repo data, policy bypass, audit tampering.
- Secondary threats: denial-of-service, misconfiguration drift, unsafe deployment defaults.

## Security Model

- API key authentication + auth failure throttling.
- Per-IP and per-token request rate limits.
- Secret detection and outbound sanitization.
- Tamper-evident audit logs with integrity verification.
- No production stack-trace disclosure.

## Policy Model

- YAML policy with global and per-repository allow/deny rules.
- Optional path restrictions for file-oriented tools.
- Default write deny.
- Write-mode repository whitelist enforcement.

## Capability Matrix Concept

- `Read` capabilities: enabled by default but policy-filtered.
- `Write` capabilities: disabled by default, policy + whitelist gated.
- `Automation` capabilities: disabled by default, policy-controlled.

## Audit Log Design

- JSON lines.
- `prev_hash` + `entry_hash` chain.
- Correlation/request IDs for traceability.
- Validation script for chain integrity.

## Write-Mode Architecture

- Separate write tool set with strict schemas.
- Global toggle (`WRITE_MODE`) + per-repo whitelist.
- Policy engine still authoritative.
- No merge, branch deletion, or force push endpoints.

## Deployment Architecture

- Non-root container runtime.
- Read-only filesystem where practical.
- Explicit opt-in for insecure bind.
- Separate dev and prod compose profiles.

## Observability Architecture

- Structured JSON logs with request correlation.
- Prometheus-compatible `/metrics` endpoint.
- Tool execution counters and duration aggregates.

## Risk Analysis

- Highest risk: write-mode misuse and policy misconfiguration.
- Mitigations: deny-by-default, whitelist, audit chain, tests, docs, reviews.

## Extensibility Notes

- Add new tools only through schema + policy + docs + tests path.
- Keep transport-agnostic execution core for webhook/scheduler integrations.
172
docs/security.md
@@ -1,155 +1,39 @@
# Security

## Authentication
## Core Controls

AegisGitea MCP uses bearer token authentication. Clients must include a valid API key with every tool call.
- API key authentication with constant-time comparison.
- Auth failure throttling.
- Per-IP and per-token request rate limits.
- Strict input validation via Pydantic schemas (`extra=forbid`).
- Policy engine authorization before tool execution.
- Secret detection with mask/block behavior.
- Production-safe error responses (no stack traces).

### How It Works
## Prompt Injection Hardening

1. The client sends `Authorization: Bearer <key>` with its request.
2. The server extracts the token and validates it against the configured `MCP_API_KEYS`.
3. Comparison is done in **constant time** to prevent timing attacks.
4. If validation fails, the failure is counted against the client's IP address.
Repository content is treated strictly as data.

### Generating API Keys
- Tool outputs are bounded and sanitized.
- No instruction execution from repository text.
- Untrusted content handling helpers enforce maximum output size.

Use the provided script to generate cryptographically secure 64-character hex keys:
## Secret Detection

```bash
make generate-key
# or: python scripts/generate_api_key.py
```
Detected classes include:
- API keys and generic token patterns.
- JWT-like tokens.
- Private key block markers.
- Common provider token formats.

Keys must be at least 32 characters long. The script also saves metadata (creation date, expiration) to a `keys/` directory.
Behavior:
- `SECRET_DETECTION_MODE=mask`: redact in place.
- `SECRET_DETECTION_MODE=block`: replace secret-bearing field values.
- `SECRET_DETECTION_MODE=off`: disable sanitization (not recommended).
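
The difference between the three modes can be illustrated with a small sanitizer. Everything here is an assumption for the sake of the example: the regular expressions are deliberately simplistic stand-ins for the detected classes listed above, and the placeholder strings and the `sanitize` function are hypothetical, not the gateway's actual detector.

```python
import re

# Illustrative patterns only; the real detector's rules are not documented here.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # private key block marker
    re.compile(r"\beyJ[\w-]+\.[\w-]+\.[\w-]+"),  # JWT-like token
    re.compile(r"(?i)\b(?:api[_-]?key|token)\s*[=:]\s*\S{16,}"),  # key/token assignment
]


def sanitize(value: str, mode: str = "mask") -> str:
    """Apply mask/block/off behavior to a single field value."""
    if mode == "off":
        return value
    if mode not in ("mask", "block"):
        raise ValueError(f"unknown mode: {mode}")
    if not any(pattern.search(value) for pattern in SECRET_PATTERNS):
        return value
    if mode == "block":
        # Replace the whole secret-bearing field value.
        return "[BLOCKED: secret detected]"
    # mask: redact only the matching spans, keep the surrounding text.
    for pattern in SECRET_PATTERNS:
        value = pattern.sub("[REDACTED]", value)
    return value
```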

### Multiple Keys (Grace Period During Rotation)
## Authentication and Key Lifecycle

You can configure multiple keys separated by commas. This allows you to add a new key and remove the old one without downtime:

```env
MCP_API_KEYS=newkey...,oldkey...
```

Remove the old key from the list after all clients have been updated.
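
Validating a presented token against several configured keys, with a constant-time comparison for each, could look like the sketch below. It assumes the comma-separated `MCP_API_KEYS` format shown above and the 32-character minimum stated in this document; `parse_keys` and `is_valid_key` are hypothetical helper names.

```python
import hmac


def parse_keys(raw: str) -> list[str]:
    """Split a comma-separated MCP_API_KEYS value and enforce the minimum length."""
    keys = [key.strip() for key in raw.split(",") if key.strip()]
    for key in keys:
        if len(key) < 32:
            raise ValueError("API keys must be at least 32 characters")
    return keys


def is_valid_key(presented: str, configured: list[str]) -> bool:
    """Return True if the presented token equals any configured key."""
    presented_bytes = presented.encode("utf-8")
    valid = False
    for key in configured:
        # hmac.compare_digest does not short-circuit on the first differing byte.
        # Every key is checked so timing does not reveal which entry matched.
        if hmac.compare_digest(presented_bytes, key.encode("utf-8")):
            valid = True
    return valid
```

During a rotation grace period both the new and the old key pass `is_valid_key`; once the old key is removed from the list it is rejected on the next configuration load.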

---

## Key Rotation

Rotate keys regularly (recommended: every 90 days).

```bash
make rotate-key
# or: python scripts/rotate_api_key.py
```

The rotation script:
1. Reads the current key from `.env`
2. Generates a new key
3. Offers to replace the key immediately or add it alongside the old key (grace period)
4. Creates a backup of your `.env` before modifying it

### Checking Key Age

```bash
make check-key-age
# or: python scripts/check_key_age.py
```

Exit codes: `0` = OK, `1` = expiring within 7 days (warning), `2` = already expired (critical).

---

## Rate Limiting

Failed authentication attempts are tracked per client IP address.

| Setting | Default | Description |
|---|---|---|
| `MAX_AUTH_FAILURES` | `5` | Maximum failures before the IP is blocked |
| `AUTH_FAILURE_WINDOW` | `300` | Rolling window in seconds |

Once an IP exceeds the threshold, all further requests from that IP return HTTP 429 until the window resets. This is enforced entirely in memory — a server restart resets the counters.
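
A rolling-window failure tracker of this kind can be sketched as follows. The class name `AuthFailureTracker` is hypothetical, and treating an IP as blocked once it reaches `max_failures` failures inside the window is an assumption; the table above does not say whether the threshold is inclusive.

```python
import time
from collections import defaultdict, deque


class AuthFailureTracker:
    """In-memory, per-IP rolling window of failed authentication attempts."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 300.0,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._clock = clock  # injectable for tests
        self._failures: dict[str, deque] = defaultdict(deque)

    def _prune(self, ip: str) -> None:
        """Drop failures that have aged out of the window."""
        cutoff = self._clock() - self.window_seconds
        failures = self._failures[ip]
        while failures and failures[0] <= cutoff:
            failures.popleft()

    def record_failure(self, ip: str) -> None:
        self._prune(ip)
        self._failures[ip].append(self._clock())

    def is_blocked(self, ip: str) -> bool:
        """True while the IP has at least max_failures failures in the window."""
        self._prune(ip)
        return len(self._failures[ip]) >= self.max_failures
```

Because the state lives in a plain dictionary, it disappears on restart and is not shared between processes, which matches the in-memory behavior described above.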

---

## Audit Logging

All security-relevant events are written to a structured JSON log file.

### Log Location

Default: `/var/log/aegis-mcp/audit.log`
Configurable via `AUDIT_LOG_PATH`.

The directory is created automatically on startup.

### What Is Logged

| Event | Description |
|---|---|
| Tool invocation | Every call to a tool: tool name, arguments, result status, correlation ID |
| Access denied | Failed authentication attempts: IP address, reason |
| Security event | Rate limit triggers, invalid key formats, startup authentication status |

### Log Format

Each entry is a JSON object on a single line:

```json
{
  "timestamp": "2026-02-13T10:00:00Z",
  "event": "tool_invocation",
  "correlation_id": "a1b2c3d4-...",
  "tool": "get_file_contents",
  "owner": "myorg",
  "repo": "backend",
  "path": "src/main.py",
  "result": "success",
  "client_ip": "10.0.0.1"
}
```

### Using Logs for Monitoring

Because entries are newline-delimited JSON, they are easy to parse:

```bash
# Show all failed tool calls
grep '"result": "error"' /var/log/aegis-mcp/audit.log | jq .

# Show all access-denied events
grep '"event": "access_denied"' /var/log/aegis-mcp/audit.log | jq .
```
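
The same queries can be expressed by parsing each line as JSON, which does not depend on how the serializer spaces keys and values. This is a small sketch using the field names from the example entry above (`event`, `result`, `tool`); the `summarize` helper is hypothetical.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def iter_entries(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed entry per non-empty line of a JSON-lines log."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def summarize(lines: Iterable[str]) -> tuple[Counter, Counter]:
    """Count entries per event type and failed tool calls per tool."""
    events: Counter = Counter()
    failed_tools: Counter = Counter()
    for entry in iter_entries(lines):
        events[entry.get("event", "unknown")] += 1
        if entry.get("event") == "tool_invocation" and entry.get("result") == "error":
            failed_tools[entry.get("tool", "unknown")] += 1
    return events, failed_tools
```

For a log on disk, open the file and pass the handle directly, for example `with open("/var/log/aegis-mcp/audit.log") as fh: events, failed = summarize(fh)`.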

---

## Access Control Model

AegisGitea MCP does **not** implement its own repository access control. Access to repositories is determined entirely by the Gitea bot user's permissions:

- If the bot user has no access to a repository, it will not appear in `list_repositories` and `get_repository_info` will return an error.
- Grant the bot user the minimum set of repository permissions needed.

**Principle of least privilege:** create a dedicated bot user and grant it read-only access only to the repositories that the AI needs to see.

---

## Network Security Recommendations

- Run the MCP server behind a reverse proxy (e.g. Traefik or nginx) with TLS.
- Do not expose the server directly on a public port without TLS.
- Restrict inbound connections to known AI client IP ranges where possible.
- The `/mcp/tools` endpoint is intentionally public (required for ChatGPT plugin discovery). If this is undesirable, restrict it at the network/proxy level.

---

## Container Security

The provided Docker image runs with:

- A non-root user (`aegis`, UID 1000)
- `no-new-privileges` security option
- CPU and memory resource limits (1 CPU, 512 MB)

See [Deployment](deployment.md) for details.
- Keys must be at least 32 characters.
- Rotate keys regularly (`scripts/rotate_api_key.py`).
- Check key age and expiry (`scripts/check_key_age.py`).
- Prefer dedicated bot credentials with least privilege.

92
docs/todo.md
Normal file
@@ -0,0 +1,92 @@
|
||||
# TODO
|
||||
|
||||
## Phase 0 Governance
|
||||
|
||||
- [x] Add `CODE_OF_CONDUCT.md`.
|
||||
- [x] Add governance policy documentation.
|
||||
- [x] Upgrade `AGENTS.md` as authoritative AI contract.
|
||||
|
||||
## Phase 1 Architecture
|
||||
|
||||
- [x] Publish roadmap and threat/security model updates.
|
||||
- [x] Publish phased TODO tracker.
|
||||
|
||||
## Phase 2 Expanded Read Tools
|
||||
|
||||
- [x] Implement `search_code`.
|
||||
- [x] Implement `list_commits`.
|
||||
- [x] Implement `get_commit_diff`.
|
||||
- [x] Implement `compare_refs`.
|
||||
- [x] Implement `list_issues`.
|
||||
- [x] Implement `get_issue`.
|
||||
- [x] Implement `list_pull_requests`.
|
||||
- [x] Implement `get_pull_request`.
|
||||
- [x] Implement `list_labels`.
|
||||
- [x] Implement `list_tags`.
|
||||
- [x] Implement `list_releases`.
|
||||
- [x] Add input validation and response bounds.
|
||||
- [x] Add unit/failure-mode tests.
|
||||
|
||||
## Phase 3 Policy Engine
|
||||
|
||||
- [x] Implement YAML policy loader and validator.
|
||||
- [x] Implement per-tool and per-repo allow/deny.
|
||||
- [x] Implement optional path restrictions.
|
||||
- [x] Enforce default write deny.
|
||||
- [x] Add policy unit tests.
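As a rough illustration of how the Phase 3 items fit together, the sketch below evaluates a request against a policy that has already been loaded into a dictionary. The policy shape, the `is_allowed` function, the repository name `acme/app`, and the choice to deny unknown repositories are all assumptions made for this example, not the project's actual schema or behavior.

```python
from fnmatch import fnmatch

# Hypothetical policy shape, as it might look after loading a YAML file.
POLICY = {
    "tools": {"deny": ["get_commit_diff"]},
    "repositories": {
        "acme/app": {
            "tools": {"allow": ["search_code", "list_commits", "create_issue"]},
            "paths": {"deny": ["secrets/*"]},
        }
    },
}

WRITE_TOOLS = {"create_issue", "update_issue"}  # illustrative subset


def is_allowed(policy, tool, repo, path=None, write_mode=False):
    """Return True only if every policy layer permits the call."""
    # Default write deny: write tools need write mode to be switched on.
    if tool in WRITE_TOOLS and not write_mode:
        return False
    # Global per-tool deny list.
    if tool in policy.get("tools", {}).get("deny", []):
        return False
    # Per-repository rules; unknown repositories are denied (assumption).
    repo_rules = policy.get("repositories", {}).get(repo)
    if repo_rules is None:
        return False
    allow = repo_rules.get("tools", {}).get("allow")
    if allow is not None and tool not in allow:
        return False
    # Optional path restrictions, matched as glob patterns.
    if path is not None:
        for pattern in repo_rules.get("paths", {}).get("deny", []):
            if fnmatch(path, pattern):
                return False
    return True
```

Deny rules are checked before allow rules so that a broad allow entry cannot override an explicit deny.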
|
||||
|
||||
## Phase 4 Write Mode
|
||||
|
||||
- [x] Implement write tools (`create_issue`, `update_issue`, comments, labels, assignment).
|
||||
- [x] Keep write mode disabled by default.
|
||||
- [x] Enforce repository whitelist.
|
||||
- [x] Ensure no merge/deletion/force-push capabilities.
|
||||
- [x] Add write denial tests.
|
||||
|
||||
## Phase 5 Hardening
|
||||
|
||||
- [x] Add secret detection + mask/block controls.
|
||||
- [x] Add prompt-injection defensive model (data-only handling).
|
||||
- [x] Add tamper-evident audit chaining and validation.
|
||||
- [x] Add per-IP and per-token rate limiting.
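Tamper-evident audit chaining is commonly built as a hash chain: each entry stores the hash of the previous entry, so editing or removing an earlier record breaks every later link. The following is a minimal sketch of that general technique; the entry layout and function names are assumptions and do not describe the project's audit log format.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash, event):
    """Hash the previous link together with a canonical form of the event."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, event):
    """Append an event, linking it to the hash of the last entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"event": event, "prev": prev_hash,
                "hash": _entry_hash(prev_hash, event)})


def verify_chain(log):
    """Recompute every link; any modified or reordered entry fails the check."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this detects modification but does not prevent truncation of the newest entries; periodically recording the latest hash somewhere outside the log is one way to cover that gap.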
|
||||
|
||||
## Phase 6 Automation
|
||||
|
||||
- [x] Implement webhook ingestion pipeline.
|
||||
- [x] Implement on-demand scheduled jobs runner endpoint.
|
||||
- [x] Implement auto issue creation job scaffold from findings.
|
||||
- [x] Implement dependency hygiene scan orchestration scaffold.
|
||||
- [x] Implement stale issue detection automation.
|
||||
- [x] Add automation endpoint tests.
|
||||
|
||||
## Phase 7 Deployment
|
||||
|
||||
- [x] Harden Docker runtime defaults.
|
||||
- [x] Separate dev/prod compose profiles.
|
||||
- [x] Preserve non-root runtime and health checks.
|
||||
|
||||
## Phase 8 Observability
|
||||
|
||||
- [x] Add Prometheus metrics endpoint.
|
||||
- [x] Add structured JSON logging.
|
||||
- [x] Add request ID correlation.
|
||||
- [x] Add tool timing metrics.
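Structured JSON logging and request ID correlation can be combined with the standard library alone. The sketch below is one possible approach, not the project's implementation; `JsonFormatter`, `new_request_id`, and the field names are hypothetical.

```python
import json
import logging
import uuid
from contextvars import ContextVar

# Holds the ID of the request currently being handled; "-" outside a request.
request_id_var = ContextVar("request_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object carrying the request ID."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "request_id": request_id_var.get(),
        })


def new_request_id():
    """Generate a request ID and bind it to the current context."""
    rid = uuid.uuid4().hex
    request_id_var.set(rid)
    return rid
```

A `ContextVar` keeps the ID separate for each async task or thread, so concurrent requests do not overwrite each other's correlation value.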
|
||||
|
||||
## Phase 9 Testing and Release Readiness
|
||||
|
||||
- [x] Extend unit tests.
|
||||
- [x] Add policy tests.
|
||||
- [x] Add secret detection tests.
|
||||
- [x] Add write-mode denial tests.
|
||||
- [x] Add audit integrity tests.
|
||||
- [ ] Add integration-tagged tests against live Gitea (optional CI stage).
|
||||
- [ ] Final security review sign-off.
|
||||
- [ ] Release checklist execution.
|
||||
|
||||
## Release Checklist
|
||||
|
||||
- [ ] `make lint`
|
||||
- [ ] `make test`
|
||||
- [ ] Documentation review complete
|
||||
- [ ] Policy file reviewed for production scope
|
||||
- [ ] Write mode remains disabled unless explicitly approved
|
||||
40
docs/write-mode.md
Normal file
@@ -0,0 +1,40 @@
|
||||
# Write Mode
|
||||
|
||||
## Threat Model
|
||||
|
||||
Write mode introduces mutation risk (issue and PR changes, metadata updates). Risks include unauthorized actions, accidental mass updates, and audit evasion.
|
||||
|
||||
## Default Posture
|
||||
|
||||
- `WRITE_MODE=false` by default.
|
||||
- Even when write mode is enabled, writes are allowed only for repositories on the whitelist.
|
||||
- Policy engine remains authoritative and may deny specific write tools.
|
||||
|
||||
## Supported Write Tools
|
||||
|
||||
- `create_issue`
|
||||
- `update_issue`
|
||||
- `create_issue_comment`
|
||||
- `create_pr_comment`
|
||||
- `add_labels`
|
||||
- `assign_issue`
|
||||
|
||||
Not supported (explicitly forbidden): merge actions, branch deletion, force push.
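The layered checks described in this document (write mode flag, repository whitelist, policy decision, fixed set of write tools) could be expressed as a single gate. This is a sketch under assumptions: `authorize_write`, its return shape, and the `policy_allows` callback are hypothetical, and the order of checks is one reasonable choice rather than the server's documented order.

```python
SUPPORTED_WRITE_TOOLS = frozenset({
    "create_issue", "update_issue", "create_issue_comment",
    "create_pr_comment", "add_labels", "assign_issue",
})


def authorize_write(tool, repo, write_mode, whitelist,
                    policy_allows=lambda tool, repo: True):
    """Return (allowed, reason); every layer must pass for a write to proceed."""
    # Anything outside the supported set (merge, delete, force push) is refused.
    if tool not in SUPPORTED_WRITE_TOOLS:
        return (False, "unsupported write tool")
    if not write_mode:
        return (False, "write mode disabled")
    if repo not in whitelist:
        return (False, "repository not whitelisted")
    # The policy engine stays authoritative and may still deny the tool.
    if not policy_allows(tool, repo):
        return (False, "denied by policy")
    return (True, "ok")
```

Returning a reason alongside the decision makes denied attempts easier to record in the audit log.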
|
||||
|
||||
## Enablement Steps
|
||||
|
||||
1. Set `WRITE_MODE=true`.
|
||||
2. Set `WRITE_REPOSITORY_WHITELIST=owner/repo,...`.
|
||||
3. Review policy file for write-tool scope.
|
||||
4. Verify audit logging and alerting before rollout.
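For steps 1 and 2, the two environment variables might be read as follows. How the server actually parses them is not shown in this document, so treat the accepted values (`true`, case-insensitive) and the refusal to start with an empty whitelist as assumptions; `load_write_config` is a hypothetical helper.

```python
import os


def load_write_config(env=None):
    """Read WRITE_MODE and WRITE_REPOSITORY_WHITELIST from the environment."""
    env = os.environ if env is None else env
    # Anything other than "true" leaves write mode off (safe default).
    write_mode = env.get("WRITE_MODE", "false").strip().lower() == "true"
    raw = env.get("WRITE_REPOSITORY_WHITELIST", "")
    # Comma-separated owner/repo entries; blanks and stray spaces are ignored.
    whitelist = {item.strip() for item in raw.split(",") if item.strip()}
    if write_mode and not whitelist:
        raise ValueError(
            "WRITE_MODE=true requires a non-empty WRITE_REPOSITORY_WHITELIST"
        )
    return write_mode, whitelist
```

Failing fast on an empty whitelist surfaces a misconfiguration at startup instead of at the first write attempt.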
|
||||
|
||||
## Safe Operations
|
||||
|
||||
- Start with one repository in the whitelist.
|
||||
- Use narrowly scoped bot credentials.
|
||||
- Require peer review for whitelist/policy changes.
|
||||
- Disable write mode during incident response if abuse is suspected.
|
||||
|
||||
## Risk Tradeoffs
|
||||
|
||||
Write mode improves automation and triage speed but increases blast radius. Use least privilege, tight policy, and strong monitoring.
|
||||