Latte f53e1a3a5a
feat: add structured logging helpers and instrument get_issue (#14)
Adds reusable, secret-safe logging helpers to `logging_utils`:
- `log_event(logger, level, event, **context)` emits a named event with a
  sanitized `context` mapping (sensitive keys masked as `***`).
- `log_nullable_field(...)` records whether a parsed field is None plus its
  runtime type, without dumping its contents.
- `sanitize_context(...)` is the shared masking primitive.

The JSON formatter now serializes a record's `context` into the payload.

`get_issue_tool` is instrumented at DEBUG (`get_issue.start`,
`get_issue.payload_shape`, `get_issue.field_check` for labels/assignees/user)
so the nullable-field parsing that caused #13 is diagnosable going forward.

Adds tests for the helpers, the formatter, and the get_issue instrumentation,
and documents the pattern in docs/observability.md.
2026-06-22 15:40:36 +02:00


# Observability
## Logging
- Structured JSON logs.
- Request correlation via `X-Request-ID`.
- Security events and policy denials are audit logged.
### Structured event helpers
`logging_utils` exposes reusable helpers so endpoints emit consistent,
secret-safe structured events instead of ad-hoc inline logging:
- `log_event(logger, level, event, **context)` — emit a named event with a
`context` mapping; keys in `SENSITIVE_CONTEXT_KEYS` (e.g. `token`,
`authorization`, `password`) are masked as `***`.
- `log_nullable_field(logger, event, field, value)` — record whether a parsed
response field is `None` and its runtime type, without dumping its contents.
- `sanitize_context(context)` — the masking primitive used by the above.

The `context` mapping is serialized into the JSON log payload under a `context`
key. The `get_issue` instrumentation logs at `DEBUG`, so it is silent unless
`LOG_LEVEL=DEBUG`.

`get_issue` is instrumented with these helpers (`get_issue.start`,
`get_issue.payload_shape`, `get_issue.field_check`) to make nullable-field
parsing failures diagnosable. The same pattern can be reused for other
parsing-heavy endpoints (`get_pull_request`, `list_issues`, `get_commit_diff`).
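
To make the shape of these helpers concrete, here is a minimal sketch of how they could fit together. It is not the actual `logging_utils` code: the function bodies, the `JsonFormatter` class name, the case-insensitive key match, and the exact contents of `SENSITIVE_CONTEXT_KEYS` beyond the three keys named above are assumptions for illustration.

```python
import json
import logging

# Assumed contents: the doc only names `token`, `authorization`, `password`.
SENSITIVE_CONTEXT_KEYS = frozenset({"token", "authorization", "password"})
MASK = "***"


def sanitize_context(context):
    """Return a copy of `context` with sensitive keys masked.

    Matching keys case-insensitively is an assumption of this sketch.
    """
    return {
        key: MASK if key.lower() in SENSITIVE_CONTEXT_KEYS else value
        for key, value in context.items()
    }


def log_event(logger, level, event, **context):
    """Emit a named event; the sanitized mapping rides on the record as `context`."""
    logger.log(level, event, extra={"context": sanitize_context(context)})


def log_nullable_field(logger, event, field, value):
    """Record whether a field is None and its runtime type, never the value."""
    log_event(
        logger,
        logging.DEBUG,
        event,
        field=field,
        is_none=value is None,
        type=type(value).__name__,
    )


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter that serializes `context` under a `context` key."""

    def format(self, record):
        payload = {"level": record.levelname, "event": record.getMessage()}
        context = getattr(record, "context", None)
        if context is not None:
            payload["context"] = context
        return json.dumps(payload, default=str)


# Usage: wire the formatter to a handler and emit two events at DEBUG.
logger = logging.getLogger("sketch")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

log_event(logger, logging.DEBUG, "get_issue.start", issue=13, token="secret")
log_nullable_field(logger, "get_issue.field_check", "labels", None)
```

Passing the mapping through `extra` keeps the event name as the log message and leaves serialization to the formatter, so masking happens once, before any handler sees the record.
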
## Metrics
Prometheus-compatible endpoint: `GET /metrics`.

Current metrics:
- `aegis_http_requests_total{method,path,status}`
- `aegis_tool_calls_total{tool,status}`
- `aegis_tool_duration_seconds_sum{tool}`
- `aegis_tool_duration_seconds_count{tool}`
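
The `_sum` and `_count` series can be combined into a mean latency per tool. The sketch below parses a `/metrics` response body in the Prometheus text exposition format and divides one by the other. It assumes `tool` is the only label on these two series; `mean_tool_latency` is a hypothetical helper, not part of the service.

```python
import re
from collections import defaultdict

# Matches lines such as: aegis_tool_duration_seconds_sum{tool="get_issue"} 1.25
# An optional trailing integer timestamp is tolerated.
_SAMPLE = re.compile(
    r'^aegis_tool_duration_seconds_(sum|count)\{tool="([^"]*)"\}\s+(\S+)(?:\s+-?\d+)?$'
)


def mean_tool_latency(exposition_text):
    """Return {tool: mean seconds per call} from a /metrics response body."""
    totals = defaultdict(dict)
    for line in exposition_text.splitlines():
        match = _SAMPLE.match(line.strip())
        if match:
            kind, tool, value = match.groups()
            totals[tool][kind] = float(value)
    # Skip tools with no recorded calls to avoid dividing by zero.
    return {
        tool: parts["sum"] / parts["count"]
        for tool, parts in totals.items()
        if parts.get("count") and "sum" in parts
    }
```

Because both series are cumulative, this yields the mean since process start. For a recent-window mean, apply the same division to the difference between two scrapes.
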
## Tracing and Correlation
- Request IDs propagate in the `X-Request-ID` response header.
- Tool-level correlation IDs are included in MCP responses.
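
One common way to tie a request ID to every log line is a context variable plus a logging filter. The sketch below shows that pattern under stated assumptions: `request_id_var`, `RequestIdFilter`, `resolve_request_id`, and `handle` are hypothetical names, and reusing an inbound `X-Request-ID` (rather than always minting a new one) is an assumption this document does not confirm.

```python
import contextvars
import logging
import uuid

# Holds the current request's ID; the default marks logs emitted outside a request.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every record so a formatter can emit it."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True


def resolve_request_id(headers):
    """Reuse an inbound X-Request-ID if present, otherwise mint a new one."""
    for name, value in headers.items():
        if name.lower() == "x-request-id" and value:
            return value
    return uuid.uuid4().hex


def handle(headers):
    """Hypothetical handler: bind the ID for logging, echo it in the response headers."""
    request_id = resolve_request_id(headers)
    token = request_id_var.set(request_id)
    try:
        logging.getLogger("sketch.request").debug("request.start")
        return {"X-Request-ID": request_id}
    finally:
        # Restore the previous value so IDs never leak across requests.
        request_id_var.reset(token)
```

A context variable is used instead of a global because it stays isolated per thread and per asyncio task, so concurrent requests do not overwrite each other's IDs.
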
## Operational Guidance
- Alert on spikes in 401/403/429 rates.
- Alert on repeated `access_denied` and auth-rate-limit events.
- Track tool latency trends for incident triage.
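
As a rough illustration of the first alert, the sketch below computes the share of requests that ended in 401, 403, or 429 between two scrapes of `aegis_http_requests_total`. It assumes the counter has already been summed over `method` and `path` into a status-to-count mapping; `denied_ratio`, `should_alert`, and the `0.2` threshold are hypothetical and would need tuning against real traffic.

```python
ALERT_STATUSES = ("401", "403", "429")


def denied_ratio(previous, current):
    """Share of requests in the scrape interval that ended in 401/403/429.

    `previous` and `current` map status code -> cumulative request count.
    """
    deltas = {
        # Simplification: treat a counter reset as zero new requests.
        status: max(count - previous.get(status, 0), 0)
        for status, count in current.items()
    }
    total = sum(deltas.values())
    if total == 0:
        return 0.0
    return sum(deltas.get(status, 0) for status in ALERT_STATUSES) / total


def should_alert(previous, current, threshold=0.2):
    """`threshold` is an illustrative value, not a recommendation."""
    return denied_ratio(previous, current) > threshold
```

Alerting on the ratio rather than the raw count keeps the signal stable when overall traffic rises and falls.
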