| title | description | tags | category | created | updated |
|---|---|---|---|---|---|
| Prometheus | Tool overview for Prometheus as a metrics collection, query, and alerting platform | | tools | 2026-03-14 | 2026-03-14 |
Prometheus
Summary
Prometheus is an open-source monitoring system built around time-series metrics, pull-based scraping, alert evaluation, and queryable historical data. It is a standard choice for infrastructure and service monitoring in self-hosted environments.
Why it matters
Prometheus gives operators a consistent way to collect metrics from hosts, applications, and infrastructure components. It is especially valuable because it combines collection, storage, and alert evaluation in a single operational model, so there are fewer moving parts to run and reason about.
Core concepts
- Scrape targets and exporters
- Time-series storage
- PromQL for querying and aggregation
- Alerting rules for actionable conditions
- Service discovery integrations for dynamic environments
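To make the scrape-target concept concrete, here is a minimal sketch of what a target returns when Prometheus scrapes it: plain text, one sample per line, with optional labels. The function name `render_metrics` and the metric names are hypothetical, chosen only for illustration. A real service would normally use an official client library instead of formatting this by hand.

```python
def _escape_label_value(value):
    # The text exposition format escapes backslash, double quote, and newline
    # inside label values.
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(samples):
    """Render samples as Prometheus text exposition lines.

    `samples` is a list of (metric_name, labels_dict, value) tuples.
    This is an illustrative helper, not part of any Prometheus library.
    """
    lines = []
    for name, labels, value in samples:
        if labels:
            # Sort label names so the output is stable between calls.
            pairs = ",".join(
                '{}="{}"'.format(key, _escape_label_value(val))
                for key, val in sorted(labels.items())
            )
            lines.append("{}{{{}}} {}".format(name, pairs, value))
        else:
            lines.append("{} {}".format(name, value))
    return "\n".join(lines) + "\n"


# Hypothetical metrics for a small web service.
example = render_metrics([
    ("app_requests_total", {"method": "GET", "code": "200"}, 42),
    ("app_up", {}, 1),
])
```

Each distinct combination of metric name and label values becomes its own time series once scraped, which is why label choice matters for storage and query cost.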
Practical usage
Prometheus commonly fits into infrastructure as:
Targets and exporters -> Prometheus -> dashboards and alerts
Typical uses:
- Scraping node, container, and application metrics
- Evaluating alert rules for outages and resource pressure
- Providing metrics data to Grafana
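Dashboards and scripts read data back through the Prometheus HTTP API. The sketch below builds an instant-query URL for the `/api/v1/query` endpoint and parses a vector result. The helper names and the sample response body are assumptions made for illustration, and no network call is made here.

```python
import json
from urllib.parse import urlencode


def build_instant_query_url(base_url, promql):
    """Return the instant-query URL for a PromQL expression."""
    return "{}/api/v1/query?{}".format(
        base_url.rstrip("/"), urlencode({"query": promql})
    )


def parse_instant_vector(body):
    """Extract (labels, value) pairs from an instant-query JSON response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("query failed: {}".format(payload.get("error")))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected a vector, got {}".format(data["resultType"]))
    # Each sample value is a [timestamp, "value-as-string"] pair.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


# Hypothetical response body, shaped like a successful instant query for `up`.
sample_body = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "node"}, "value": [1710400000, "1"]},
        ],
    },
})

url = build_instant_query_url("http://localhost:9090", "up == 0")
rows = parse_instant_vector(sample_body)
```

In practice Grafana handles this exchange itself. A hand-written client like this is mostly useful for ad hoc checks and automation.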
Best practices
- Start with critical infrastructure and user-facing services
- Keep retention and scrape frequency aligned with actual operational needs
- Write alerts that map to a human response
- Protect Prometheus access because metrics can reveal sensitive system details
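The advice about retention and scrape frequency can be turned into a rough capacity estimate. The sketch below uses the common rule of thumb that disk usage is roughly retention time multiplied by ingested samples per second and bytes per sample. The default of 2 bytes per sample is an assumption at the upper end of the usual 1 to 2 byte range, and the function name is hypothetical. Treat the result as a planning figure, not a guarantee.

```python
def estimate_disk_bytes(active_series, scrape_interval_s, retention_days,
                        bytes_per_sample=2.0):
    """Rough disk estimate for local Prometheus storage.

    bytes_per_sample is an assumed average; measure your own instance
    before relying on it.
    """
    # Every active series yields one sample per scrape interval.
    samples_per_second = active_series / scrape_interval_s
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * samples_per_second * bytes_per_sample


# Hypothetical setup: 100,000 series, 15 s scrape interval, 15 days retention.
estimate = estimate_disk_bytes(100_000, 15, 15)
estimate_gb = estimate / 1e9  # roughly 17 GB under these assumptions
```

Doubling the scrape interval or halving the retention period halves the estimate, which makes the trade-off between resolution, history, and disk easy to see.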
Pitfalls
- Collecting too many high-cardinality metrics without a clear reason
- Treating every metric threshold as an alert
- Forgetting to monitor backup freshness, certificate expiry, or ingress paths
- Running Prometheus without a retention and storage plan
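The high-cardinality pitfall is easier to spot with numbers. As a minimal sketch, the helpers below compute the worst-case series count for one metric from the number of distinct values per label, and count the distinct series actually present in a sample of label sets. All names and figures are hypothetical.

```python
from collections import Counter


def worst_case_series(label_value_counts):
    """Upper bound on series for one metric: product of distinct label values."""
    total = 1
    for count in label_value_counts.values():
        total *= count
    return total


def series_per_metric(series):
    """Count distinct label sets per metric name.

    `series` is an iterable of (metric_name, labels_dict) pairs.
    """
    seen = set()
    counts = Counter()
    for name, labels in series:
        key = (name, frozenset(labels.items()))
        if key not in seen:
            seen.add(key)
            counts[name] += 1
    return counts


# Adding a per-user label multiplies the series count by the number of users.
without_user = worst_case_series({"method": 5, "status": 10})
with_user = worst_case_series({"method": 5, "status": 10, "user_id": 10_000})
```

Labels with unbounded values, such as user IDs or request paths, are the usual cause of runaway series counts, so they are worth reviewing before a metric ships.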