What is Prometheus?
Prometheus is an open-source monitoring application. It scrapes HTTP endpoints to collect metrics exposed in a simple text format.
For example, your web app might expose a metric like
http_server_requests_seconds_count{exception="None", method="GET",outcome="SUCCESS",status="200",uri="/actuator/health"} 435
which means that the endpoint /actuator/health was successfully queried 435 times via a GET request.
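To make the text format more concrete, here is a minimal Python sketch that splits such a line into metric name, labels, and value. parse_sample is a hypothetical helper written for this post, not part of Prometheus or any client library, and it ignores details of the real format such as optional timestamps, comment lines, and escaped characters.

```python
import re

# Simplified patterns for illustration only; the real exposition format has more rules.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'\s*([a-zA-Z_][a-zA-Z0-9_]*)\s*=\s*"((?:[^"\\]|\\.)*)"\s*,?')


def parse_sample(line):
    """Split one metric line into (name, labels, value). Hypothetical helper."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a metric sample: {line!r}")
    # A metric without braces has no labels at all.
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


line = ('http_server_requests_seconds_count{exception="None", method="GET",'
        'outcome="SUCCESS",status="200",uri="/actuator/health"} 435')
name, labels, value = parse_sample(line)
print(name, labels["uri"], value)
# prints: http_server_requests_seconds_count /actuator/health 435.0
```

The labels inside the curly braces are what you later filter on in queries and alert rules.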
Prometheus can also create alerts if a metric exceeds a threshold, e.g. if your endpoint returned the status code 500 more than one hundred times in the last 5 minutes.
Configuration
To set up Prometheus, we create three files:
prometheus/prometheus.yml — the actual Prometheus configuration
prometheus/alert.yml — alerts you want Prometheus to check
docker-compose.yml
Configuration File: prometheus/prometheus.yml
Add the following to prometheus/prometheus.yml:
global:
  scrape_interval: 30s
  scrape_timeout: 10s
rule_files:
  - alert.yml
scrape_configs:
  - job_name: services
    metrics_path: /metrics
    static_configs:
      - targets:
          - 'prometheus:9090'
          - 'idonotexists:564'
scrape_configs tells Prometheus where your applications are. Here we use static_configs to hard-code some endpoints.
The first target is Prometheus itself (prometheus is its service name in the docker-compose.yml). The second one is for demonstration purposes only: it is an endpoint that is always down.
rule_files tells Prometheus where to find the alert rules. We will come to these in a moment.
scrape_interval defines how often to check for new metric values.
If a scrape takes longer than scrape_timeout (e.g. slow network), Prometheus will cancel the scrape.
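The effect of scrape_timeout can be sketched roughly in Python. This is not how Prometheus is implemented; scrape_once and http_fetch are hypothetical names, and the sketch simply assumes that a scrape counts as failed when the metrics endpoint does not answer within the timeout.

```python
import urllib.request

SCRAPE_TIMEOUT_SECONDS = 10  # mirrors scrape_timeout: 10s


def http_fetch(url, timeout):
    """Fetch the metrics page; network errors and timeouts typically raise OSError subclasses."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def scrape_once(target, fetch=http_fetch, metrics_path="/metrics",
                timeout=SCRAPE_TIMEOUT_SECONDS):
    """Return (succeeded, body) for one scrape of one target. Hypothetical helper."""
    url = f"http://{target}{metrics_path}"
    try:
        return True, fetch(url, timeout)
    except OSError:
        # Unreachable host, refused connection, or timeout: give up on this scrape.
        return False, ""
```

A real scraper would repeat this for every target once per scrape_interval, which is 30 seconds in our configuration.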
Configuration File: prometheus/alert.yml
This file contains rules which Prometheus evaluates periodically. Insert this into the file:
groups:
  - name: DemoAlerts
    rules:
      - alert: InstanceDown
        expr: up{job="services"} < 1
        for: 5m
up is a built-in metric from Prometheus. It is 1 if the target was reachable in the last scrape and 0 if it was not.
{job="services"} filters the results of up to contain only metrics with the label job="services". This label is added to our metrics because we defined services as the job name in prometheus.yml.
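The rule also contains for: 5m. This means the expression has to stay true for five minutes before the alert fires; until then the alert is pending. The following Python sketch simulates that behaviour under simplifying assumptions: one rule evaluation per minute and a single target. alert_states is a hypothetical helper, not Prometheus code.

```python
from datetime import timedelta

FOR_DURATION = timedelta(minutes=5)  # mirrors for: 5m


def alert_states(samples, for_duration=FOR_DURATION):
    """Map (elapsed_time, up_value) pairs, one per rule evaluation, to alert states."""
    states = []
    active_since = None  # when the expression first became true
    for elapsed, up_value in samples:
        if up_value < 1:  # same condition as the rule's expr
            if active_since is None:
                active_since = elapsed
            held_long_enough = elapsed - active_since >= for_duration
            states.append("firing" if held_long_enough else "pending")
        else:
            # The target is back, so the alert resets.
            active_since = None
            states.append("inactive")
    return states


# Made-up values of up, one per minute: the target goes down after the first minute.
up_values = [1, 0, 0, 0, 0, 0, 0, 1]
samples = [(timedelta(minutes=m), v) for m, v in enumerate(up_values)]
print(alert_states(samples))
# ['inactive', 'pending', 'pending', 'pending', 'pending', 'pending', 'firing', 'inactive']
```

The pending phase keeps a single failed scrape from triggering an alert right away.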
Docker Compose: docker-compose.yml
Next, we want to launch Prometheus. Put this into your docker-compose.yml:
version: '3'
services:
  prometheus:
    image: prom/prometheus:v2.30.3
    ports:
      - 9000:9090
    volumes:
      - ./prometheus:/etc/prometheus
      - prometheus-data:/prometheus
    command: --web.enable-lifecycle --config.file=/etc/prometheus/prometheus.yml
volumes:
  prometheus-data:
The volume ./prometheus:/etc/prometheus mounts our prometheus folder in the right place for the image to pick up our configuration.
prometheus-data:/prometheus is used to store the scraped data so that it is still available after a restart.
The line command: --web.enable-lifecycle --config.file=/etc/prometheus/prometheus.yml is optional. With --web.enable-lifecycle you can reload configuration files (e.g. rules) without restarting Prometheus:
curl -X POST http://localhost:9000/-/reload
If you change the command, you override the default of the image and must include --config.file=/etc/prometheus/prometheus.yml.
Start Prometheus
Finally, start Prometheus with:
docker-compose up -d
and open http://localhost:9000 in your browser.
You’ll see the Prometheus UI, where you can run ad-hoc queries on your metrics, such as up.
As expected, the result tells you that Prometheus is up and the other service is not.
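The same query can also be sent to the Prometheus HTTP API instead of the UI. Here is a minimal Python sketch, assuming the port mapping from our docker-compose.yml (localhost:9000). query_url, instant_values, and query are hypothetical helper names, and the sample payload is hand-written to show the shape of an instant-query response; it is not captured output.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:9000"  # host port from our docker-compose.yml


def query_url(expr, base_url=BASE_URL):
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def instant_values(payload):
    """Map each instance label to its current value in an instant-query response."""
    if payload["status"] != "success":
        raise RuntimeError(f"query failed: {payload}")
    return {
        item["metric"].get("instance", ""): float(item["value"][1])
        for item in payload["data"]["result"]
    }


def query(expr):
    """Run the query against a running Prometheus (requires the container to be up)."""
    with urllib.request.urlopen(query_url(expr), timeout=10) as response:
        return instant_values(json.load(response))


# Hand-written example of the response shape; the timestamps are made up.
sample_payload = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "instance": "prometheus:9090", "job": "services"},
             "value": [1700000000.0, "1"]},
            {"metric": {"__name__": "up", "instance": "idonotexists:564", "job": "services"},
             "value": [1700000000.0, "0"]},
        ],
    },
}
print(instant_values(sample_payload))
# {'prometheus:9090': 1.0, 'idonotexists:564': 0.0}
```

With the container running, query("up") should return a similar mapping for the two targets from prometheus.yml.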
If you go to Alerts, you'll see that our alert is pending (or already firing).