mod_gearman is a Nagios event broker module that distributes monitoring checks across multiple servers through the Gearman job server. Moving check execution off the Nagios host reduces its load and lets the monitoring setup scale to large environments.
What is mod_gearman?
mod_gearman is an event broker module for Nagios that integrates with the Gearman distributed job queue system. It allows you to:
- Distribute check execution across multiple worker nodes
- Balance monitoring load across servers
- Improve monitoring performance and response times
- Scale your monitoring infrastructure horizontally
- Reduce single points of failure
Architecture Overview
The mod_gearman architecture consists of three main components:
- Nagios Core with mod_gearman NEB module: Sends checks to the Gearman job server
- Gearman Job Server: Manages the job queue and distributes tasks
- mod_gearman Workers: Execute the actual monitoring checks
Prerequisites
Before installing mod_gearman, ensure you have:
- A running Nagios installation
- Root or sudo access
- Development tools and libraries
- Network connectivity between all nodes
Installing Dependencies
First, install the required packages on all servers:
# On Ubuntu/Debian systems
sudo apt-get update
sudo apt-get install build-essential libgearman-dev gearman-job-server
sudo apt-get install nagios-plugins-basic nagios-plugins-standard
# On CentOS/RHEL systems
sudo yum install gcc gcc-c++ make libgearman-devel gearmand
sudo yum install nagios-plugins-all
Installing mod_gearman
Download and compile mod_gearman from source:
# Download the latest version
wget https://github.com/sni/mod_gearman/releases/download/v5.1.1/mod_gearman-5.1.1.tar.gz
tar -xzf mod_gearman-5.1.1.tar.gz
cd mod_gearman-5.1.1
# Configure and compile
./configure --with-nagios-config-dir=/usr/local/nagios/etc
make all
sudo make install
Configuring the Gearman Job Server
Start and configure the Gearman daemon on your job server:
# Start Gearman job server
# (the systemd unit is named gearman-job-server on Debian/Ubuntu and gearmand on CentOS/RHEL)
sudo systemctl start gearmand
sudo systemctl enable gearmand
# Verify it's running
sudo systemctl status gearmand
Create a custom configuration file for Gearman:
# Edit /etc/default/gearman-job-server
sudo nano /etc/default/gearman-job-server
Add the following configuration:
DAEMON_ARGS="--listen=0.0.0.0 --port=4730 --log-file=/var/log/gearman-job-server/gearman.log --verbose=INFO"
Configuring Nagios with mod_gearman
Edit your Nagios configuration to load the mod_gearman module:
# Edit nagios.cfg
sudo nano /usr/local/nagios/etc/nagios.cfg
Add the broker module line:
broker_module=/usr/local/lib/mod_gearman/mod_gearman.so config=/usr/local/etc/mod_gearman/mod_gearman.conf
Create the mod_gearman configuration file:
sudo mkdir -p /usr/local/etc/mod_gearman
sudo nano /usr/local/etc/mod_gearman/mod_gearman.conf
Add the following configuration:
# Gearman job server configuration
server=127.0.0.1:4730
# Security settings
key=your_secret_encryption_key
# Performance settings
job_timeout=60
# Worker process counts (min-worker, max-worker) are set in worker.conf
# Logging
logfile=/var/log/mod_gearman/mod_gearman.log
debug=1
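# Event handling
eventhandler=yes
services=yes
hosts=yes
# Queue settings
# The host, service, and eventhandler queues are created automatically
# by the three options above; no separate queue entries are needed.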
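A wrong path in the broker_module line usually only shows up when Nagios is restarted, so it can help to check the line first. The sketch below is a hypothetical helper, not part of Nagios or mod_gearman: `check_broker_line` is an illustrative name, and the function assumes the module path and the config= path contain no spaces.

```shell
#!/bin/bash
# Hypothetical helper (not part of mod_gearman or Nagios): verify that the
# broker_module line in a nagios.cfg points at files that exist.

check_broker_line() {
    local cfg="$1" line module conf
    # Take the first broker_module line that mentions mod_gearman.
    line=$(grep -E '^[[:space:]]*broker_module=.*mod_gearman' "$cfg" | head -n 1)
    if [ -z "$line" ]; then
        echo "no mod_gearman broker_module line in $cfg"
        return 1
    fi
    # Module path: text between "broker_module=" and the first space.
    module=${line#*broker_module=}
    module=${module%% *}
    # Config path: text after "config=" up to the next space.
    conf=${line#*config=}
    conf=${conf%% *}
    [ -f "$module" ] || { echo "missing module: $module"; return 1; }
    [ -f "$conf" ] || { echo "missing config: $conf"; return 1; }
    echo "ok: $module with $conf"
}
```

With the paths used in this guide, it could be called as `check_broker_line /usr/local/nagios/etc/nagios.cfg` before restarting Nagios.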
Setting Up Worker Nodes
On each worker node, create the worker configuration:
sudo nano /usr/local/etc/mod_gearman/worker.conf
# Gearman server connection
server=192.168.1.100:4730
# Security (must match server key)
key=your_secret_encryption_key
# Worker settings
debug=1
logfile=/var/log/mod_gearman/worker.log
# Plugin paths
plugin_path=/usr/lib/nagios/plugins
# Performance
job_timeout=60
min-worker=5
max-worker=50
max-jobs=10
idle-timeout=30
# Queues to process
hosts=yes
services=yes
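Because the key on every worker must match the one on the Nagios server, a mismatch is a common reason for workers that connect but never return usable results. One way to compare keys without copying the secret between machines is to print a hash of it on each node. The sketch below assumes the key is stored as a plain `key=` line starting at the beginning of a line; `conf_value` and `key_fingerprint` are hypothetical helper names, and `sha256sum` is assumed to be installed.

```shell
#!/bin/bash
# Hypothetical helpers: print a fingerprint of the key= value in a
# mod_gearman config file so two nodes can be compared by eye.

# Print the value of the last "name=value" line for the given option name.
conf_value() {
    local name="$1" file="$2"
    # Skip comment lines; split on the first "=" only so values may contain "=".
    awk -v name="$name" '
        /^[[:space:]]*#/ { next }
        index($0, name "=") == 1 { value = substr($0, length(name) + 2) }
        END { print value }
    ' "$file"
}

# Print the SHA-256 hash of the configured key, or fail if none is set.
key_fingerprint() {
    local key
    key=$(conf_value key "$1")
    [ -n "$key" ] || { echo "no key= line in $1" >&2; return 1; }
    printf '%s' "$key" | sha256sum | cut -d ' ' -f 1
}
```

Running `key_fingerprint /usr/local/etc/mod_gearman/worker.conf` on a worker and the same function against mod_gearman.conf on the Nagios server should print identical hashes when the keys match.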
Starting the Services
Start all required services in the correct order:
# 1. Start Gearman job server
sudo systemctl restart gearmand
# 2. Start mod_gearman workers on worker nodes
sudo /usr/local/bin/mod_gearman_worker --config=/usr/local/etc/mod_gearman/worker.conf --daemon
# 3. Restart Nagios on the main server
sudo systemctl restart nagios
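The start order matters because workers and Nagios both need a reachable job server. When the steps are scripted, a short wait loop can keep a worker from starting before gearmand is listening. This is a minimal sketch under a few assumptions: `port_open` and `wait_for_port` are hypothetical helper names, bash was built with /dev/tcp support, and the coreutils `timeout` command is available.

```shell
#!/bin/bash
# Hypothetical helpers: wait until a TCP port accepts connections.

# Return 0 if host:port accepts a TCP connection within 2 seconds.
port_open() {
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Retry port_open up to $3 times (default 10), sleeping 1 second between tries.
wait_for_port() {
    local host="$1" port="$2" tries="${3:-10}" i
    for ((i = 1; i <= tries; i++)); do
        port_open "$host" "$port" && return 0
        [ "$i" -lt "$tries" ] && sleep 1
    done
    return 1
}
```

On a worker node this could gate the start command, for example `wait_for_port 192.168.1.100 4730 30 && sudo /usr/local/bin/mod_gearman_worker --config=/usr/local/etc/mod_gearman/worker.conf --daemon`.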
Monitoring and Verification
Verify that everything is working correctly:
# Check Gearman job server status
gearman_top
# Check worker connections
echo "status" | nc localhost 4730
# Monitor log files
tail -f /var/log/mod_gearman/mod_gearman.log
tail -f /var/log/mod_gearman/worker.log
Expected output from gearman_top:
2025-08-26 06:34:00 - localhost:4730 - v1.1.21 - 3 workers
Queue Name   | Worker Available | Jobs Running | Jobs Waiting
-------------+------------------+--------------+-------------
host         |                3 |            2 |            0
service      |                3 |            5 |            1
eventhandler |                2 |            0 |            0
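The same numbers can be read from the raw reply to the `status` command shown earlier, which is handy on hosts without gearman_top. The sketch below assumes gearmand's tab-separated reply format (queue name, total jobs, running jobs, available workers, ending with a line that holds a single dot); `summarize_status` is a hypothetical function name.

```shell
#!/bin/bash
# Sketch: turn gearmand's raw "status" reply into a per-queue summary.
# Reply format (tab-separated): function, total jobs, running jobs,
# available workers; the reply ends with a line holding a single ".".

summarize_status() {
    awk -F '\t' '
        $1 == "." { exit }
        NF >= 4 {
            # Waiting jobs = total queued jobs minus those already running.
            printf "%s workers=%d running=%d waiting=%d\n", $1, $4, $3, $2 - $3
        }
    '
}

# Usage against a live server (not run here):
#   echo "status" | nc -w 5 localhost 4730 | summarize_status
```

Note that the raw reply reports total jobs, so the waiting count has to be derived by subtracting the running jobs.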
Load Balancing Strategies
Configure different load balancing methods:
Round Robin Distribution
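# No mod_gearman setting is required for this: gearmand hands each queued
# job to the next idle worker subscribed to that queue, so checks spread
# across all workers that serve the same queues.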
Host-based Distribution
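# In mod_gearman.conf on the Nagios server: give these hostgroups their own queues
hostgroups=linux-servers,windows-servers
# In worker.conf on worker1: serve only the linux-servers queue
hostgroups=linux-servers
# In worker.conf on worker2: serve only the windows-servers queue
hostgroups=windows-servers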
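With host-based distribution, a hostgroup whose queue has no subscribed worker simply stops being checked, so it is worth verifying that every routed hostgroup is served. The sketch below assumes that mod_gearman names these queues `hostgroup_<name>` and that gearmand's `status` reply is tab-separated (queue name, total jobs, running jobs, available workers); `unserved_hostgroups` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: report hostgroup queues that no worker is serving.
# Assumes mod_gearman's hostgroup_<name> queue naming and gearmand's
# tab-separated "status" reply (function, total, running, workers).

unserved_hostgroups() {
    local groups="$1"
    awk -F '\t' -v groups="$groups" '
        NF >= 4 { workers[$1] = $4 }
        END {
            n = split(groups, names, ",")
            for (i = 1; i <= n; i++) {
                queue = "hostgroup_" names[i]
                # Absent queues and queues with 0 workers are both unserved.
                if (workers[queue] + 0 == 0) print queue
            }
        }
    '
}

# Usage against a live server (not run here):
#   echo "status" | nc -w 5 localhost 4730 | unserved_hostgroups "linux-servers,windows-servers"
```

An empty result means every listed hostgroup has at least one worker attached.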
Performance Tuning
Optimize performance with these settings:
# Increase worker processes (in worker.conf)
max-worker=100
min-worker=10
# Adjust timeouts
job_timeout=300
connect_timeout=10
# Enable encryption for security
encryption=yes
keyfile=/etc/mod_gearman/secret.key
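A key file is only useful if its content is hard to guess and the file is not world-readable. The following is one possible way to create such a file, not a mod_gearman tool: `make_keyfile` is a hypothetical helper, and the 32-character length is an assumption chosen to stay within the key length mod_gearman is understood to use.

```shell
#!/bin/bash
# Sketch: create a random shared key file readable only by its owner.
# The path is passed in by the caller.

make_keyfile() {
    local path="$1"
    # Refuse to overwrite an existing key, which would lock out running workers.
    [ -e "$path" ] && { echo "refusing to overwrite $path" >&2; return 1; }
    # umask 077 makes the file mode 600 from the moment it is created.
    # 24 random bytes encode to exactly 32 base64 characters.
    (umask 077 && head -c 24 /dev/urandom | base64 > "$path")
}
```

After running `sudo bash -c` with this function and the path from the configuration above, the same file has to be copied to every worker over a secure channel such as scp.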
Troubleshooting Common Issues
Workers Not Connecting
# Check network connectivity
telnet gearman-server 4730
# Open the Gearman port in the firewall on the job server
sudo iptables -A INPUT -p tcp --dport 4730 -j ACCEPT
Jobs Timing Out
# Increase timeout values in worker.conf
job_timeout=120
server_timeout=30
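Before raising timeouts, it helps to know how long the slow check actually takes when run by hand on a worker. The sketch below wraps a command in the coreutils `timeout` utility and reports the elapsed time; `time_check` is a hypothetical helper name, and the measurement has one-second resolution.

```shell
#!/bin/bash
# Sketch: run a check command under a time limit and report how long it took.

time_check() {
    local limit="$1" start end status
    shift
    start=$(date +%s)
    # timeout(1) exits with 124 when it had to kill the command.
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    if [ "$status" -eq 124 ]; then
        echo "TIMEOUT after ${limit}s"
        return 124
    fi
    echo "exit=$status elapsed=$((end - start))s"
    return "$status"
}
```

For example, `time_check 60 /usr/lib/nagios/plugins/check_ping -H 192.168.1.100 -w 100,20% -c 500,60%` shows whether that check fits within a 60-second job_timeout on this worker.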
High Memory Usage
# Limit the jobs each worker process handles before it exits and is respawned
max-jobs=5
idle-timeout=60
Advanced Configuration
SSL/TLS Encryption
# Generate SSL certificates
openssl req -x509 -newkey rsa:4096 -keyout server.key -out server.crt -days 365 -nodes
# Configure SSL in mod_gearman.conf
use_ssl=yes
ssl_cert_file=/etc/ssl/certs/server.crt
ssl_key_file=/etc/ssl/private/server.key
Custom Queues
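# Create specialized queues for different check types by routing Nagios
# servicegroups to their own queues (in mod_gearman.conf on the Nagios server)
servicegroups=ping_checks,disk_checks,network_checks
# Assign workers to specific queues
mod_gearman_worker --config=/usr/local/etc/mod_gearman/worker.conf --servicegroups=ping_checks --max-worker=20
mod_gearman_worker --config=/usr/local/etc/mod_gearman/worker.conf --servicegroups=disk_checks --max-worker=10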
Monitoring mod_gearman Performance
Create custom Nagios checks to monitor mod_gearman:
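#!/bin/bash
# check_gearman_queue.sh
# gearmand "status" columns: queue name, total jobs, running jobs, available workers.
# Backlog = total jobs minus running jobs for the "service" queue.
QUEUE_COUNT=$(echo "status" | nc -w 5 localhost 4730 | awk '$1 == "service" {print $2 - $3}')
if [ -z "$QUEUE_COUNT" ]; then
    echo "UNKNOWN - Could not read service queue status"
    exit 3
elif [ "$QUEUE_COUNT" -gt 100 ]; then
    echo "CRITICAL - Queue backlog: $QUEUE_COUNT jobs"
    exit 2
elif [ "$QUEUE_COUNT" -gt 50 ]; then
    echo "WARNING - Queue backlog: $QUEUE_COUNT jobs"
    exit 1
else
    echo "OK - Queue backlog: $QUEUE_COUNT jobs"
    exit 0
fi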
Best Practices
- Security: Always use encryption keys for production environments
- Redundancy: Run multiple Gearman job servers for high availability
- Monitoring: Implement checks to monitor the mod_gearman infrastructure itself
- Scaling: Start with fewer workers and scale up based on load
- Logging: Enable appropriate logging levels for troubleshooting
Conclusion
mod_gearman provides an excellent solution for scaling Nagios monitoring infrastructure. By distributing checks across multiple worker nodes, you can achieve better performance, improved reliability, and easier horizontal scaling. The setup requires careful configuration but provides significant benefits for medium to large-scale monitoring environments.
Regular monitoring of the mod_gearman infrastructure itself, proper security measures, and performance tuning will ensure your distributed monitoring system operates efficiently and reliably.