Knowledge-Base/70 - Tools/prometheus/prometheus.md


title: Prometheus
description: Tool overview for Prometheus as a metrics collection, query, and alerting platform
tags: prometheus, monitoring, observability
category: tools
created: 2026-03-14
updated: 2026-03-14

Prometheus

Summary

Prometheus is an open-source monitoring system built around time-series metrics, pull-based scraping, alert evaluation, and queryable historical data. It is widely used for infrastructure and service monitoring in self-hosted environments.

Why it matters

Prometheus gives operators a consistent way to collect metrics from hosts, applications, and infrastructure components. It is especially valuable because a single system handles collection, storage, and alert evaluation.

Core concepts

  • Scrape targets and exporters
  • Time-series storage
  • PromQL for querying and aggregation
  • Alerting rules for actionable conditions
  • Service discovery integrations for dynamic environments
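
To make the first two concepts concrete: a scrape target exposes samples as plain text, and each distinct combination of metric name and label values becomes its own time series. The sketch below renders one sample in the text exposition format. The function names, metric name, and labels are illustrative assumptions, not part of any particular exporter.

```python
def _escape(value):
    """Escape backslash, double quote, and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_sample(name, labels, value):
    """Render one sample line: metric name, optional labels, then the value."""
    if not labels:
        return "{} {}".format(name, value)
    # Sort label names so the output is stable between runs.
    pairs = ",".join(
        '{}="{}"'.format(key, _escape(val)) for key, val in sorted(labels.items())
    )
    return "{}{{{}}} {}".format(name, pairs, value)


if __name__ == "__main__":
    # Hypothetical counter exposed by an application target.
    print(render_sample("http_requests_total", {"method": "GET", "code": "200"}, 1027))
```

Changing any label value (for example `code="500"`) produces a separate series, which is why label choice drives storage cost.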

Practical usage

Prometheus usually sits in the middle of the monitoring pipeline:

Targets and exporters -> Prometheus -> dashboards and alerts

Typical uses:

  • Scraping node, container, and application metrics
  • Evaluating alert rules for outages and resource pressure
  • Providing metrics data to Grafana
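
Dashboards such as Grafana read data through the Prometheus HTTP API. As a rough sketch of that interaction, the code below builds an instant-query URL for the `/api/v1/query` endpoint and parses the JSON body of a successful vector response. The base URL and the helper names are assumptions for illustration, and no network call is made here.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Build an instant-query URL for a Prometheus server."""
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})


def parse_instant_vector(body):
    """Return (labels, value) pairs from an instant-vector response body."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(payload.get("error", "query failed"))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected an instant vector, got " + data["resultType"])
    # Each result carries its label set and a [timestamp, "value"] pair;
    # the value is sent as a string, so convert it to float.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


if __name__ == "__main__":
    # Assumed local server address; adjust for your environment.
    print(build_query_url("http://localhost:9090", "up == 0"))
```

A query such as `up == 0` returns one entry per target that Prometheus currently fails to scrape, which makes it a common starting point for outage alerts.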

Best practices

  • Start with critical infrastructure and user-facing services
  • Keep retention and scrape frequency aligned with actual operational needs
  • Write alerts that map to a human response
  • Protect Prometheus access because metrics can reveal sensitive system details
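
Retention and scrape frequency translate directly into disk usage. The Prometheus storage documentation gives a rough sizing rule: disk space is approximately retention time in seconds, times ingested samples per second, times bytes per sample, with around 1 to 2 bytes per sample on average. A minimal sketch of that arithmetic follows; the function name and the example inputs are assumptions, and real usage varies with churn and compression.

```python
def estimate_disk_bytes(retention_days, active_series, scrape_interval_s,
                        bytes_per_sample=2.0):
    """Rough disk estimate: retention seconds * samples per second * bytes per sample."""
    # Every active series yields one sample per scrape interval.
    samples_per_second = active_series / scrape_interval_s
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * samples_per_second * bytes_per_sample


if __name__ == "__main__":
    # Hypothetical setup: 100,000 series scraped every 15 s, kept for 15 days.
    estimate = estimate_disk_bytes(15, 100_000, 15)
    print("{:.1f} GB".format(estimate / 1e9))
```

With these example inputs the estimate comes to roughly 17 GB. Doubling the scrape interval or halving the retention period halves the figure, which is the trade-off the retention guidance above refers to.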

Pitfalls

  • Collecting too many high-cardinality metrics without a clear reason
  • Treating every metric threshold as an alert
  • Forgetting to monitor backup freshness, certificate expiry, or ingress paths
  • Running Prometheus without a retention and storage plan
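
The cardinality pitfall is easier to see with numbers. Each unique combination of label values is a separate time series, so the worst case for one metric is the product of the distinct values per label. The sketch below computes that upper bound; the label names and counts are made up for illustration.

```python
from math import prod


def worst_case_series(label_value_counts):
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(label_value_counts.values())


if __name__ == "__main__":
    # Hypothetical label sets for a single request counter.
    bounded = {"method": 5, "status": 10, "instance": 20}
    unbounded = dict(bounded, user_id=50_000)
    print(worst_case_series(bounded))
    print(worst_case_series(unbounded))
```

Adding a single unbounded label such as a user ID multiplies the series count by the number of users, which is usually the point at which memory and storage planning break down.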
