In this article, we will explore how to set up a full Prometheus monitoring stack using docker-compose. We will cover the configuration and deployment of Prometheus for metric collection, Grafana for data visualization, and Alertmanager for alert management. This guide is intended for those who are familiar with Docker and are looking to implement a robust monitoring solution for their applications and infrastructure.
This is an updated version that combines three of my previous articles.
Prometheus
Prometheus is an open-source monitoring and alerting toolkit that collects and stores metrics from various sources. It provides a powerful query language, PromQL, to analyze and visualize metrics. Prometheus is widely used for monitoring applications, infrastructure, and services. For example, your web app might expose a metric like:
http_server_requests_seconds_count{exception="None", method="GET",outcome="SUCCESS",status="200",uri="/actuator/health"} 435
which means that the endpoint /actuator/health was successfully queried 435 times via a GET request. Prometheus can also create alerts if a metric exceeds a threshold, for example, if your endpoint returned the status code 500 more than one hundred times in the last 5 minutes.
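In PromQL, such a condition could look roughly like this. This is only a sketch based on the example metric above; the exact metric name and labels depend on what your application actually exposes:

increase(http_server_requests_seconds_count{status="500"}[5m]) > 100

We will see below how to turn an expression like this into an actual alert rule.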
Configuration
To set up Prometheus, we create three files:
prometheus/prometheus.yml — the actual Prometheus configuration
prometheus/alert.yml — alerts you want Prometheus to check
docker-compose.yml — the Compose file that starts the stack
Add the following to prometheus/prometheus.yml:
global:
  scrape_interval: 30s
  scrape_timeout: 10s

rule_files:
  - alert.yml

scrape_configs:
  - job_name: services
    metrics_path: /metrics
    static_configs:
      - targets:
          - 'prometheus:9090'
          - 'idonotexists:564'
scrape_configs tells Prometheus where your applications are. Here we use static_configs to hard-code some endpoints. The first one is Prometheus itself (prometheus is the service name in the docker-compose.yml), the second one is for demonstration purposes: it is an endpoint that is always down. rule_files tells Prometheus where to search for the alert rules; we come to this in a moment. scrape_interval defines how often to check for new metric values. If a scrape takes longer than scrape_timeout (e.g. because of a slow network), Prometheus will cancel the scrape.
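If you also want Prometheus to scrape your own application, you can add another job next to the existing one. A minimal sketch, where the service name my-app, the port 8080, and the metrics path are placeholders for whatever your container actually exposes:

scrape_configs:
  - job_name: my-app
    metrics_path: /actuator/prometheus
    static_configs:
      - targets:
          - 'my-app:8080'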
The alert.yml file contains rules that Prometheus evaluates periodically. Insert this into the file:
groups:
  - name: DemoAlerts
    rules:
      - alert: InstanceDown
        expr: up{job="services"} < 1
        for: 5m
up is a built-in metric from Prometheus. It returns 0 if a target was not reachable during the last scrape, and 1 otherwise. {job="services"} filters the results of up to metrics carrying the label job="services". This label is added to our metrics because we defined services as the job name in prometheus.yml.
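Optionally, a rule can carry labels and annotations that downstream tools (like the Alertmanager we add later) can use for routing and message templates. A sketch extending the rule above; the severity value and summary text are just examples:

groups:
  - name: DemoAlerts
    rules:
      - alert: InstanceDown
        expr: up{job="services"} < 1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: 'Instance {{ $labels.instance }} is down'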
Finally, we want to launch Prometheus. Put this into your docker-compose.yml:
version: '3'

services:
  prometheus:
    image: prom/prometheus:v2.46.0
    ports:
      - 9000:9090
    volumes:
      - ./prometheus:/etc/prometheus
      - prometheus-data:/prometheus
    command: --web.enable-lifecycle --config.file=/etc/prometheus/prometheus.yml

volumes:
  prometheus-data:
The volume ./prometheus:/etc/prometheus mounts our prometheus folder in the right place for the image to pick up our configuration. prometheus-data:/prometheus is used to store the scraped data so that it is still available after a restart. If you use --web.enable-lifecycle, you can reload configuration files (e.g. rules) without restarting Prometheus:
curl -X POST http://localhost:9000/-/reload
If you modify the command, you override the image’s default settings, so you must explicitly include the --config.file=... option.
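Before reloading, it can be useful to validate the configuration. The prom/prometheus image ships with promtool, so something along these lines should work (--entrypoint is needed because the image starts Prometheus by default):

docker run --rm --entrypoint promtool \
  -v "$(pwd)/prometheus:/etc/prometheus" \
  prom/prometheus:v2.46.0 \
  check config /etc/prometheus/prometheus.yml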
Start Prometheus
Finally, start Prometheus with:
docker-compose up -d
and open http://localhost:9000 in your browser.
You’ll see the Prometheus UI, where you can enter some ad-hoc queries on your metrics, like up:
As expected, this tells you that your Prometheus is up, and the other service is not.
If you go to Alerts, you'll see that our alert is pending (or already firing):
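If you prefer the command line, the same information is available from Prometheus' standard HTTP API:

curl http://localhost:9000/api/v1/alerts

You can also query the built-in ALERTS metric in the expression browser to see pending and firing alerts as time series.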
Grafana
Grafana is a powerful open-source platform for monitoring and observability that allows you to create, explore, and share dashboards with your team. It supports various data sources, including Prometheus, and provides a wide range of visualization options.
Configuration
Grafana can work without any configuration files. However, to streamline the setup process, we will configure Prometheus as a data source using a configuration file. Create the file grafana/provisioning/datasources/prometheus_ds.yml and add the following content:
datasources:
  - name: Prometheus
    access: proxy
    type: prometheus
    url: http://prometheus:9090
    isDefault: true
This configuration file tells Grafana about the Prometheus data source. The url field specifies the address of the Prometheus server, which is reachable at http://prometheus:9090 within the Docker network.
Next, update the docker-compose.yml file to include the Grafana service:
  grafana:
    image: grafana/grafana:10.0.0
    ports:
      - 3000:3000
    restart: unless-stopped
    volumes:
      - grafana-data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning

volumes:
  grafana-data:
The volume ./grafana/provisioning:/etc/grafana/provisioning mounts our data source configuration in the appropriate location for Grafana to pick it up. The volume grafana-data:/var/lib/grafana is used to persist Grafana’s data, such as dashboards and user settings.
Start Prometheus and Grafana:
docker-compose up -d
Open http://localhost:3000 in your browser and log in with the default credentials: admin for both the username and password.
You should see the Grafana Landing Page after login:
Add the first dashboard
After logging in, you can create a new dashboard by clicking on the “+” icon on the top right and selecting “New Dashboard.” Click on “Add visualization” to create your first visualization.
Select the Prometheus data source. You can select Prometheus here because we added the data source configuration earlier.
You can now enter PromQL queries to visualize your metrics. For example, enter the following query to display the total number of HTTP requests over time:
increase(prometheus_http_requests_total[1m])
To learn more about PromQL, see the official documentation.
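If you want to experiment a bit more, here are a few queries that only rely on metrics Prometheus exposes about itself, so they should work in this setup as well:

# requests per second, split by HTTP handler
sum by (handler) (rate(prometheus_http_requests_total[5m]))

# how long the last scrape of each target took
scrape_duration_seconds

# which targets are currently reachable
up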
Click on Apply to save and go back to the dashboard. Finally, click on the dashboard save button in the upper right corner.
Grafana also allows you to import pre-made dashboards. Visit the Grafana Dashboards page to find a wide range of community-contributed dashboards for various data sources and use cases. Simply enter the dashboard ID or upload the JSON file to import it into your Grafana instance.
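If you would rather have dashboards provisioned from files, Grafana also supports a dashboard provider alongside the data source provisioning we already use. A minimal sketch, assuming you place dashboard JSON files under grafana/provisioning/dashboards (which our existing volume already mounts into the container):

apiVersion: 1
providers:
  - name: 'default'
    type: file
    options:
      path: /etc/grafana/provisioning/dashboards

Save this as, say, grafana/provisioning/dashboards/dashboards.yml (the filename is up to you) and restart Grafana.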
Alertmanager
Alertmanager is an open-source tool that handles alerts sent by Prometheus. It takes care of deduplicating, grouping, and routing alerts to the appropriate receiver, such as email, Slack, or PagerDuty. Alertmanager also allows you to silence and inhibit alerts.
You can get all source code from GitHub. Check out the tag `part-2-grafana` if you want to follow along.
Configuration
First of all, add Alertmanager and a volume to docker-compose.yml:
  alertmanager:
    image: prom/alertmanager:v0.25.0
    ports:
      - 9093:9093
    volumes:
      - ./alertmanager:/config
      - alertmanager-data:/data
    command: --config.file=/config/alertmanager.yml --storage.path=/data

volumes:
  alertmanager-data:
Alertmanager will persist silence configurations to the volume.
The next configuration contains information about which channels to send to. For simplicity, we use e-mail. Refer to the Alertmanager docs to learn about other channels.
Create a folder alertmanager and add a file alertmanager.yml to it:
route:
  receiver: 'mail'
  repeat_interval: 4h

receivers:
  - name: 'mail'
    email_configs:
      - smarthost: 'smtp.gmail.com:587'
        from: 'your_mail@gmail.com'
        to: 'some_mail@gmail.com'
        auth_username: '...'
        auth_password: '...'
The route section configures which alerts will be sent where. In our case, we send all alerts to the mail receiver. You could add more routes and filter, for example, on alert labels (see the example in the docs). receivers configures our target channels. Note how route refers to the receiver mail on line two. Because of repeat_interval: 4h, you will get a new mail every four hours until the problem is resolved.
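As a sketch of such filtering: child routes can match on labels attached by your alert rules. In the example below, severity is assumed to be a label your rules set, and pager is a second, hypothetical receiver you would have to define under receivers:

route:
  receiver: 'mail'
  repeat_interval: 4h
  routes:
    - receiver: 'pager'
      matchers:
        - severity="critical"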
Finally, we need to tell Prometheus about the Alertmanager. Open prometheus/prometheus.yml and add the following:
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 'alertmanager:9093'
Testing
Run docker-compose up.
Open http://localhost:9093 in your browser to see the Alertmanager UI.
After a couple of minutes, the test alert fires. You can check this in your Prometheus instance.
Now, you can see the alert in the Alertmanager as well:
Check your inbox for the notification:
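You can also ask Alertmanager directly which alerts it currently knows about via its standard v2 API:

curl http://localhost:9093/api/v2/alerts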
Using Alertmanager
In the Alertmanager UI, you can view active and silenced alerts, manage notification receivers, and configure alert routing. You can also create and manage silences to suppress alerts during maintenance windows or planned outages.
To silence an alert, go to the “Silences” tab and click on “New Silence.” Select the affected alert, set the start and end times for the silence, and provide a comment. Click on “Create” to activate the silence.
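The same can be done from the command line with amtool, which ships inside the prom/alertmanager image. A sketch using docker-compose exec; the matcher, comment, and duration are just examples:

docker-compose exec alertmanager amtool silence add \
  alertname=InstanceDown \
  --comment="planned maintenance" \
  --duration=2h \
  --alertmanager.url=http://localhost:9093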
Conclusion
In conclusion, the Prometheus monitoring stack provides a comprehensive solution for monitoring applications and infrastructure. By using docker-compose, we can easily deploy and manage the components of the stack, including Prometheus, Grafana, and Alertmanager. This setup allows us to collect metrics, visualize data, and manage alerts effectively. Whether you are monitoring a small application or a large-scale infrastructure, this stack offers a powerful and flexible monitoring solution.