Prometheus has become a standard tool for monitoring and alerting on modern infrastructure. An open-source monitoring toolkit originally built at SoundCloud, it collects, stores, and queries time-series data, which makes it a common choice for monitoring Linux systems and the services that run on them.
What is Prometheus?
Prometheus is a monitoring system built around a multi-dimensional time-series database. It scrapes metrics from configured targets at given intervals, evaluates rule expressions, displays results, and triggers alerts when specified conditions are met; the separate Alertmanager component then routes those alerts as notifications. Its pull-based architecture and service discovery capabilities make it particularly well-suited for dynamic cloud environments and containerized applications.
Key Features of Prometheus
- Multi-dimensional data model with time series identified by metric name and key/value pairs
- PromQL – A flexible query language for leveraging dimensionality
- No dependency on distributed storage – Single server nodes are autonomous
- HTTP pull model with support for pushing via intermediary gateway
- Service discovery or static configuration for target discovery
- Multiple modes of graphing and dashboarding support
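To make the data model concrete, here is a small shell sketch that splits samples in the Prometheus text exposition format into metric name, label set, and value. The `split_sample` helper and the sample lines are illustrative assumptions, not part of Prometheus, and the naive parsing assumes label values contain no spaces or braces.

```shell
# split_sample: read exposition-format lines on stdin and print
# "name|labels|value" for each sample line.
# Hypothetical helper for illustration only.
split_sample() {
  awk '
    /^#/ || NF == 0 { next }            # skip HELP/TYPE comments and blank lines
    {
      series = $1; value = $2
      name = series; labels = ""
      brace = index(series, "{")
      if (brace > 0) {
        # everything before "{" is the metric name, the rest is the label set
        name = substr(series, 1, brace - 1)
        labels = substr(series, brace + 1, length(series) - brace - 1)
      }
      print name "|" labels "|" value
    }'
}

printf '%s\n' \
  '# TYPE http_requests_total counter' \
  'http_requests_total{method="GET",code="200"} 1027' \
  'node_load1 0.42' | split_sample
# prints:
#   http_requests_total|method="GET",code="200"|1027
#   node_load1||0.42
```

Each distinct combination of metric name and label values is a separate time series, which is what PromQL later filters and aggregates on.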
Installing Prometheus on Linux
Method 1: Binary Installation
The most straightforward way to install Prometheus is using the pre-compiled binaries:
# Create prometheus user
sudo useradd --no-create-home --shell /bin/false prometheus
# Create directories
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus
# Set ownership
sudo chown prometheus:prometheus /etc/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus
# Download Prometheus
cd /tmp
wget https://github.com/prometheus/prometheus/releases/download/v2.45.0/prometheus-2.45.0.linux-amd64.tar.gz
# Extract
tar xvf prometheus-2.45.0.linux-amd64.tar.gz
cd prometheus-2.45.0.linux-amd64
# Copy binaries
sudo cp prometheus /usr/local/bin/
sudo cp promtool /usr/local/bin/
# Set ownership for binaries
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool
# Copy configuration files
sudo cp -r consoles /etc/prometheus
sudo cp -r console_libraries /etc/prometheus
sudo cp prometheus.yml /etc/prometheus/prometheus.yml
# Set ownership
sudo chown -R prometheus:prometheus /etc/prometheus
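Before copying binaries from a downloaded tarball into /usr/local/bin, it is worth checking the archive against the checksum published with the release. The sketch below assumes `sha256sum` from GNU coreutils is available; `verify_sha256` is a hypothetical helper name, and the expected hash has to come from the release page, not from this guide.

```shell
# verify_sha256 FILE EXPECTED_HEX
# Returns 0 when the SHA-256 digest of FILE equals EXPECTED_HEX,
# non-zero otherwise (including when FILE cannot be read).
# Hypothetical helper; relies on sha256sum from GNU coreutils.
verify_sha256() {
  actual=$(sha256sum "$1" 2>/dev/null | awk '{ print $1 }')
  [ -n "$actual" ] && [ "$actual" = "$2" ]
}

# Usage (take the hash from the sha256sums.txt file attached to the release):
#   verify_sha256 prometheus-2.45.0.linux-amd64.tar.gz "<hash from sha256sums.txt>" \
#     && echo "checksum OK" || echo "checksum MISMATCH"
```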
Method 2: Package Manager Installation
For Ubuntu/Debian systems:
# Update package list
sudo apt update
# Install Prometheus
sudo apt install prometheus
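For CentOS/RHEL/Fedora systems (package name and availability depend on the release and the repositories you have enabled):
# Older releases
sudo yum install prometheus
# Newer releases
sudo dnf install prometheus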
Method 3: Docker Installation
Running Prometheus in a Docker container:
# Pull Prometheus image
docker pull prom/prometheus
# Run Prometheus container
docker run -d \
  --name prometheus \
  -p 9090:9090 \
  -v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml \
  prom/prometheus
Basic Configuration
The main configuration file is prometheus.yml. Here’s a basic configuration example:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
rule_files:
  - "first_rules.yml"
  - "second_rules.yml"
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']
Configuration Parameters Explained
- scrape_interval: How frequently to scrape targets
- evaluation_interval: How often to evaluate rules
- rule_files: List of files containing recording and alerting rules
- scrape_configs: Configuration for what to scrape
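The scrape interval directly determines how much data each series produces. As a rough worked example, the sketch below computes how many samples a single time series collects per day for a given interval; `samples_per_day` is a hypothetical helper used only for illustration.

```shell
# samples_per_day INTERVAL_SECONDS
# Prints how many samples one time series collects per day
# when it is scraped every INTERVAL_SECONDS seconds.
samples_per_day() {
  echo $(( 86400 / $1 ))
}

samples_per_day 15   # scrape_interval: 15s -> 5760
samples_per_day 60   # scrape_interval: 1m  -> 1440
```

Halving the interval doubles the number of stored samples, so shorter intervals are best reserved for targets where the extra resolution is actually needed.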
Creating a Systemd Service
To run Prometheus as a system service, create a systemd unit file:
sudo nano /etc/systemd/system/prometheus.service
Add the following content:
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file=/etc/prometheus/prometheus.yml \
    --storage.tsdb.path=/var/lib/prometheus/ \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries \
    --web.listen-address=0.0.0.0:9090 \
    --web.enable-lifecycle
[Install]
WantedBy=multi-user.target
Enable and start the service:
# Reload systemd
sudo systemctl daemon-reload
# Enable Prometheus service
sudo systemctl enable prometheus
# Start Prometheus
sudo systemctl start prometheus
# Check status
sudo systemctl status prometheus
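`systemctl status` only shows that the process is running. To confirm that Prometheus is actually serving requests, you can poll its readiness endpoint. The following is a minimal sketch assuming `curl` is installed and Prometheus listens on the address from the unit file; `wait_until_ready` is a hypothetical helper name.

```shell
# wait_until_ready BASE_URL [ATTEMPTS]
# Polls BASE_URL/-/ready about once per second until it answers with a
# successful HTTP status. Returns 0 when ready, 1 if it never became ready.
# Hypothetical helper for illustration.
wait_until_ready() {
  attempts=${2:-30}
  i=1
  while [ "$i" -le "$attempts" ]; do
    if curl -fsS -o /dev/null --max-time 2 "$1/-/ready" 2>/dev/null; then
      return 0
    fi
    # no need to sleep after the final attempt
    if [ "$i" -lt "$attempts" ]; then sleep 1; fi
    i=$((i + 1))
  done
  return 1
}

# Usage:
#   wait_until_ready http://localhost:9090 && echo "Prometheus is ready"
#
# After editing prometheus.yml, a reload can be triggered without a restart,
# because the unit file passes --web.enable-lifecycle:
#   curl -X POST http://localhost:9090/-/reload
```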
Installing Node Exporter
Node Exporter provides hardware and OS metrics. Install it to monitor your Linux system:
# Download Node Exporter
cd /tmp
wget https://github.com/prometheus/node_exporter/releases/download/v1.6.0/node_exporter-1.6.0.linux-amd64.tar.gz
# Extract
tar xvf node_exporter-1.6.0.linux-amd64.tar.gz
# Copy binary
sudo cp node_exporter-1.6.0.linux-amd64/node_exporter /usr/local/bin/
# Create user
sudo useradd --no-create-home --shell /bin/false node_exporter
# Set ownership
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter
Create systemd service for Node Exporter:
sudo nano /etc/systemd/system/node_exporter.service
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=multi-user.target
Start Node Exporter:
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter
sudo systemctl status node_exporter
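Once Node Exporter is running, a quick way to check that it exposes data is to fetch its metrics page and pull out a single value. The sketch below assumes the default port 9100 used elsewhere in this guide; `metric_value` is a hypothetical helper that only handles series without labels.

```shell
# metric_value NAME
# Reads exposition-format text on stdin and prints the value of the first
# sample whose series name is exactly NAME (no labels).
# Hypothetical helper for illustration.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Usage against a running Node Exporter:
#   curl -s http://localhost:9100/metrics | metric_value node_load1
```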
PromQL: Prometheus Query Language
PromQL is Prometheus’s functional query language that allows you to select and aggregate time series data in real time.
Basic Query Examples
# Get current CPU usage
100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
# Memory usage percentage
(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100
# Disk usage percentage
(1 - (node_filesystem_avail_bytes{fstype!="tmpfs"} / node_filesystem_size_bytes{fstype!="tmpfs"})) * 100
# Network I/O rate
rate(node_network_receive_bytes_total[5m])
rate(node_network_transmit_bytes_total[5m])
# Load average
node_load1
node_load5
node_load15
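The queries above can be typed into the Prometheus web UI, but they can also be run from the command line through the HTTP API's instant query endpoint. This is a minimal sketch assuming `curl` is installed and Prometheus is reachable at the given base URL; `prom_query` is a hypothetical helper name.

```shell
# prom_query BASE_URL EXPRESSION
# Sends an instant query to the Prometheus HTTP API and prints the JSON reply.
# Hypothetical helper; --data-urlencode takes care of escaping the expression.
prom_query() {
  curl -fsS -G --max-time 10 --data-urlencode "query=$2" "$1/api/v1/query"
}

# Usage:
#   prom_query http://localhost:9090 'node_load1'
#   prom_query http://localhost:9090 \
#     '(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100'
```

Single quotes around the expression keep the shell from interpreting the braces, quotes, and parentheses that PromQL uses.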
Advanced Query Functions
# Rate function - per-second average rate of increase
rate(http_requests_total[5m])
# Increase function - increase in time series
increase(http_requests_total[1h])
# Sum by labels
sum by (job) (rate(http_requests_total[5m]))
# Histogram quantiles
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
# Aggregation functions
avg(node_load1)
max(node_load1)
min(node_load1)
count(up == 1)
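To see what `rate()` computes, it helps to work through the arithmetic on a handful of raw counter samples. The sketch below is a simplified model, not Prometheus's implementation: it sums the increases between consecutive samples, treats any drop as a counter reset, and divides by the elapsed time, without the extrapolation to window boundaries that the real function applies. `simple_rate` and the sample values are assumptions for illustration.

```shell
# simple_rate: read "unix_timestamp counter_value" pairs on stdin (oldest first)
# and print the average per-second increase across the samples.
# A drop in the counter is treated as a reset to zero.
# Simplified: no extrapolation to the edges of the range window.
simple_rate() {
  awk '
    NR == 1 { first_t = $1; prev = $2; next }
    {
      delta = $2 - prev
      if (delta < 0) delta = $2        # counter reset: count up from zero
      total += delta
      prev = $2; last_t = $1
    }
    END {
      if (NR < 2 || last_t == first_t) exit 1   # two samples are needed for a rate
      printf "%g\n", total / (last_t - first_t)
    }'
}

printf '%s\n' '0 100' '60 400' | simple_rate                   # 300 / 60 s  -> 5
printf '%s\n' '0 100' '30 160' '60 40' '90 120' | simple_rate  # (60+40+80) / 90 s -> 2
```

The second call shows why counters should be wrapped in `rate()` or `increase()` rather than graphed raw: the restart between the second and third sample does not distort the result.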
Setting Up Alerting
Creating Alert Rules
Create an alert rules file:
sudo nano /etc/prometheus/alert_rules.yml
groups:
  - name: system_alerts
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"
          description: "CPU usage is above 80% for more than 5 minutes on {{ $labels.instance }}"
      - alert: HighMemoryUsage
        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 90
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High memory usage detected"
          description: "Memory usage is above 90% on {{ $labels.instance }}"
      - alert: DiskSpaceLow
        expr: (1 - (node_filesystem_avail_bytes{fstype!="tmpfs"} / node_filesystem_size_bytes{fstype!="tmpfs"})) * 100 > 85
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Disk space running low"
          description: "Disk usage is above 85% on {{ $labels.instance }} {{ $labels.mountpoint }}"
      - alert: ServiceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Service is down"
          description: "{{ $labels.job }} on {{ $labels.instance }} has been down for more than 1 minute"
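The `for:` clause in each rule means the expression has to stay true across consecutive evaluations before the alert moves from pending to firing; a single evaluation where it is false resets the timer. The sketch below simulates that behavior on a list of evaluation results. It is a simplified model, `fires_at` is a hypothetical helper, and the 60-second evaluation step is chosen for readability rather than taken from the configuration above.

```shell
# fires_at FOR_SECONDS INTERVAL_SECONDS
# Reads one 0/1 per line on stdin: whether the alert expression matched at
# each rule evaluation, starting at t=0. Prints the time (in seconds) of the
# first evaluation at which the alert would be firing, or "never".
# Simplified model of the pending -> firing transition.
fires_at() {
  awk -v hold="$1" -v step="$2" '
    {
      t = (NR - 1) * step
      if ($1 == 1) {
        if (!pending) { pending = 1; active_at = t }   # expression just became true
        if (t - active_at >= hold) { print t; found = 1; exit }
      } else {
        pending = 0                                     # any miss resets the timer
      }
    }
    END { if (!found) print "never" }'
}

printf '%s\n' 1 1 1 1 1 1 | fires_at 300 60           # -> 300
printf '%s\n' 1 1 1 0 1 1 1 1 1 1 | fires_at 300 60   # -> 540 (timer restarted at t=240)
printf '%s\n' 1 1 0 1 1 | fires_at 300 60             # -> never
```

This is why a short spike above 80% CPU does not page anyone under the rules above: the condition has to hold for the whole `for:` duration.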
Update the Prometheus configuration to include the rules file:
rule_files:
- "/etc/prometheus/alert_rules.yml"
Installing and Configuring Alertmanager
# Download Alertmanager
cd /tmp
wget https://github.com/prometheus/alertmanager/releases/download/v0.25.0/alertmanager-0.25.0.linux-amd64.tar.gz
# Extract and install
tar xvf alertmanager-0.25.0.linux-amd64.tar.gz
sudo cp alertmanager-0.25.0.linux-amd64/alertmanager /usr/local/bin/
sudo cp alertmanager-0.25.0.linux-amd64/amtool /usr/local/bin/
# Create user and directories
sudo useradd --no-create-home --shell /bin/false alertmanager
sudo mkdir /etc/alertmanager
sudo mkdir /var/lib/alertmanager
sudo chown alertmanager:alertmanager /etc/alertmanager
sudo chown alertmanager:alertmanager /var/lib/alertmanager
Create Alertmanager configuration:
sudo nano /etc/alertmanager/alertmanager.yml
global:
  smtp_smarthost: 'localhost:587'
  smtp_from: '[email protected]'
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'web.hook'
receivers:
  - name: 'web.hook'
    email_configs:
      - to: '[email protected]'
        headers:
          Subject: 'Prometheus Alert: {{ .GroupLabels.alertname }}'
        text: |
          {{ range .Alerts }}
          Alert: {{ .Annotations.summary }}
          Description: {{ .Annotations.description }}
          Instance: {{ .Labels.instance }}
          {{ end }}
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
Monitoring Different Services
Database Monitoring
For MySQL monitoring, use the MySQL exporter:
# Install MySQL exporter
wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.14.0/mysqld_exporter-0.14.0.linux-amd64.tar.gz
tar xvf mysqld_exporter-0.14.0.linux-amd64.tar.gz
sudo cp mysqld_exporter-0.14.0.linux-amd64/mysqld_exporter /usr/local/bin/
# Create MySQL user for monitoring (run these statements in the mysql client and choose a strong password)
CREATE USER 'prometheus'@'localhost' IDENTIFIED BY 'password';
GRANT PROCESS, REPLICATION CLIENT ON *.* TO 'prometheus'@'localhost';
GRANT SELECT ON performance_schema.* TO 'prometheus'@'localhost';
# Configure connection
echo 'DATA_SOURCE_NAME="prometheus:password@(localhost:3306)/"' | sudo tee /etc/default/mysqld_exporter
Web Server Monitoring
For Nginx monitoring, enable the stub_status module:
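# Add to Nginx configuration
location /nginx_status {
    stub_status on;
    access_log off;
    allow 127.0.0.1;
    deny all;
}
# Use nginx-prometheus-exporter (host networking, so that localhost inside the container reaches Nginx on the host; the exporter then listens on host port 9113)
docker run --network host nginx/nginx-prometheus-exporter:0.10.0 -nginx.scrape-uri=http://localhost/nginx_status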
Performance Optimization
Storage Configuration
# Optimize storage retention (command-line flags)
--storage.tsdb.retention.time=30d
--storage.tsdb.retention.size=10GB
# Configure remote storage for long-term retention (prometheus.yml)
remote_write:
  - url: "https://your-remote-storage/api/v1/write"
remote_read:
  - url: "https://your-remote-storage/api/v1/read"
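When choosing retention values, a rough capacity estimate helps. The Prometheus storage documentation gives the rule of thumb: needed disk space = retention time in seconds × ingested samples per second × bytes per sample, with an average of about 1 to 2 bytes per sample. The sketch below applies that formula; `estimate_disk_bytes` is a hypothetical helper, and the series counts are made-up inputs.

```shell
# estimate_disk_bytes SERIES SCRAPE_INTERVAL_S RETENTION_DAYS [BYTES_PER_SAMPLE]
# Rough capacity estimate:
#   retention_seconds * samples_per_second * bytes_per_sample
# where samples_per_second = SERIES / SCRAPE_INTERVAL_S.
# BYTES_PER_SAMPLE defaults to 2, the upper end of the documented average.
estimate_disk_bytes() {
  bytes_per_sample=${4:-2}
  retention_seconds=$(( $3 * 86400 ))
  echo $(( $1 * retention_seconds * bytes_per_sample / $2 ))
}

estimate_disk_bytes 1000 15 30      # 1,000 series   -> 345600000   (about 346 MB)
estimate_disk_bytes 100000 15 30    # 100,000 series -> 34560000000 (about 34.6 GB)
```

Treat the result as an order-of-magnitude figure; actual usage also depends on label churn and compaction.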
Memory and CPU Optimization
# Tune retention size and query limits for large deployments (command-line flags)
--storage.tsdb.retention.size=50GB
--query.max-concurrency=20
--query.timeout=2m
# Configure scrape intervals based on requirements (prometheus.yml)
scrape_configs:
  - job_name: 'critical-services'
    scrape_interval: 10s
  - job_name: 'regular-services'
    scrape_interval: 30s
  - job_name: 'batch-jobs'
    scrape_interval: 5m
Security Best Practices
Authentication and Authorization
# Enable basic authentication and TLS (command-line flag)
--web.config.file=/etc/prometheus/web.yml
# Create web configuration file
sudo nano /etc/prometheus/web.yml
basic_auth_users:
  admin: $2b$12$hNf2lSsxfm0.i4a.1kVpSOM9uxq0qD5.wLaGz0.j0M2i2UE6i6M2i
tls_server_config:
  cert_file: /etc/ssl/certs/prometheus.crt
  key_file: /etc/ssl/private/prometheus.key
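The values under `basic_auth_users` must be bcrypt hashes, never plaintext passwords. A common mistake is pasting the password itself, so a quick format check before restarting can save a debugging session. The sketch below only checks the shape of the value; `looks_like_bcrypt` is a hypothetical helper, and the `htpasswd` command in the usage note assumes the Apache utilities package is installed.

```shell
# looks_like_bcrypt VALUE
# Returns 0 if VALUE has the shape of a bcrypt hash: a $2a$, $2b$ or $2y$
# prefix, a two-digit cost, then 53 characters of salt and digest.
# Hypothetical helper; it checks the format only, not whether the hash
# matches any particular password.
looks_like_bcrypt() {
  printf '%s\n' "$1" | grep -Eq '^\$2[aby]\$[0-9]{2}\$[./A-Za-z0-9]{53}$'
}

# Usage:
#   looks_like_bcrypt '$2b$12$hNf2lSsxfm0.i4a.1kVpSOM9uxq0qD5.wLaGz0.j0M2i2UE6i6M2i' \
#     && echo "looks like a bcrypt hash"
#
# One way to generate a hash interactively (prompts for the password):
#   htpasswd -nBC 12 "" | tr -d ':\n'
#
# promtool can also validate the whole file:
#   promtool check web-config /etc/prometheus/web.yml
```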
Network Security
# Open the ports to any source (only appropriate on a trusted network)
sudo ufw allow 9090/tcp # Prometheus
sudo ufw allow 9100/tcp # Node Exporter
sudo ufw allow 9093/tcp # Alertmanager
# Or, instead of the open rule for a port, allow only a specific subnet
sudo ufw allow from 10.0.0.0/24 to any port 9090
Troubleshooting Common Issues
Service Discovery Problems
# Check service discovery
curl http://localhost:9090/api/v1/targets
# Verify configuration syntax
promtool check config /etc/prometheus/prometheus.yml
# Check rules syntax
promtool check rules /etc/prometheus/alert_rules.yml
Performance Issues
# Monitor Prometheus metrics
up{job="prometheus"}
prometheus_tsdb_symbol_table_size_bytes
prometheus_tsdb_head_series
prometheus_rule_evaluation_duration_seconds
# Check disk usage
du -sh /var/lib/prometheus/
# Monitor query performance
topk(10, rate(prometheus_http_request_duration_seconds_sum[5m]))
Integration with Grafana
Grafana provides excellent visualization capabilities for Prometheus data:
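# Install Grafana (apt-key is deprecated, so the repository key goes into a dedicated keyring)
sudo apt-get install -y apt-transport-https software-properties-common wget
sudo mkdir -p /etc/apt/keyrings/
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list
sudo apt-get update
sudo apt-get install grafana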
# Start Grafana
sudo systemctl enable grafana-server
sudo systemctl start grafana-server
Grafana's web interface listens on port 3000 by default. Log in there and add Prometheus as a data source using the URL http://localhost:9090.
Conclusion
Prometheus provides a comprehensive monitoring solution for Linux systems with its powerful data model, flexible query language, and robust alerting capabilities. By following this guide, you’ve learned how to install, configure, and optimize Prometheus for effective system monitoring. Regular maintenance, proper security configuration, and thoughtful alert design will ensure your monitoring infrastructure remains reliable and valuable for your operations team.
Remember to regularly update your Prometheus installation, review and refine your alerting rules, and monitor the monitoring system itself to maintain optimal performance and reliability.