mod_gearman Linux: Complete Guide to Distributing Nagios Checks Across Multiple Servers

August 26, 2025

mod_gearman is a Nagios module that distributes monitoring checks across multiple servers using the Gearman job server. Moving check execution off the Nagios host reduces its load, shortens check latency, and lets large environments scale by adding workers.

What is mod_gearman?

mod_gearman is an event broker module for Nagios that integrates with the Gearman distributed job queue system. It allows you to:

  • Distribute check execution across multiple worker nodes
  • Balance monitoring load across servers
  • Improve monitoring performance and response times
  • Scale your monitoring infrastructure horizontally
  • Reduce single points of failure

Architecture Overview

The mod_gearman architecture consists of three main components:

  • Nagios Core with mod_gearman NEB module: Sends checks to the Gearman job server
  • Gearman Job Server: Manages the job queue and distributes tasks
  • mod_gearman Workers: Execute the actual monitoring checks

Prerequisites

Before installing mod_gearman, ensure you have:

  • A running Nagios installation
  • Root or sudo access
  • Development tools and libraries
  • Network connectivity between all nodes
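
The tool-related prerequisites above can be checked up front with a short preflight script. This is a minimal sketch: the helper name missing_tools and the tool list in the usage comment are illustrative choices, not part of mod_gearman.

```shell
#!/bin/sh
# Preflight sketch: report which required commands are not on PATH.
# missing_tools is a hypothetical helper name, not a mod_gearman tool.
missing_tools() {
    for tool in "$@"; do
        # command -v exits non-zero when the command cannot be found
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example: tools used later in this guide to download and compile
# missing_tools gcc make wget tar
```

An empty result means every listed tool was found; any names printed still need to be installed.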

Installing Dependencies

First, install the required packages on all servers:

# On Ubuntu/Debian systems
sudo apt-get update
sudo apt-get install build-essential libgearman-dev gearman-job-server
sudo apt-get install nagios-plugins-basic nagios-plugins-standard

# On CentOS/RHEL systems (the Gearman packages come from the EPEL repository)
sudo yum install gcc gcc-c++ make libgearman-devel gearmand
sudo yum install nagios-plugins-all

Installing mod_gearman

Download and compile mod_gearman from source:

# Download the latest version
wget https://github.com/sni/mod_gearman/releases/download/v5.1.1/mod_gearman-5.1.1.tar.gz
tar -xzf mod_gearman-5.1.1.tar.gz
cd mod_gearman-5.1.1

# Configure and compile
./configure --with-nagios-config-dir=/usr/local/nagios/etc
make all
sudo make install

Configuring the Gearman Job Server

Start and configure the Gearman daemon on your job server:

# Start Gearman job server
# (the unit is named gearmand on CentOS/RHEL and gearman-job-server on Ubuntu/Debian)
sudo systemctl start gearmand
sudo systemctl enable gearmand

# Verify it's running
sudo systemctl status gearmand

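On Ubuntu/Debian, set the daemon options in the package's defaults file (the CentOS/RHEL package uses /etc/sysconfig/gearmand instead):

# Edit /etc/default/gearman-job-server
sudo nano /etc/default/gearman-job-server

Add the following configuration. The Debian package usually reads these options from the PARAMS variable; keep whichever variable name the file already uses. Listening on 0.0.0.0 accepts connections on every interface, so restrict port 4730 to your Nagios and worker hosts:

PARAMS="--listen=0.0.0.0 --port=4730 --log-file=/var/log/gearman-job-server/gearman.log --verbose=INFO"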

Configuring Nagios with mod_gearman

Edit your Nagios configuration to load the mod_gearman module:

# Edit nagios.cfg
sudo nano /usr/local/nagios/etc/nagios.cfg

Add the broker module line:

broker_module=/usr/local/lib/mod_gearman/mod_gearman.so config=/usr/local/etc/mod_gearman/mod_gearman.conf

Create the mod_gearman configuration file:

sudo mkdir -p /usr/local/etc/mod_gearman
sudo nano /usr/local/etc/mod_gearman/mod_gearman.conf

Add the following configuration:

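# Gearman job server configuration
server=127.0.0.1:4730

# Security settings (the same key must be set on every worker)
encryption=yes
key=your_secret_encryption_key

# Logging
logfile=/var/log/mod_gearman/mod_gearman.log
debug=1

# Which check types are handed to Gearman
eventhandler=yes
services=yes
hosts=yes

# Queue settings
# The module creates the "host", "service" and "eventhandler" queues
# itself when the three options above are enabled, so no queue list is
# needed here. Worker pool sizing (min-worker, max-worker) and
# job_timeout are worker-side settings and belong in worker.conf.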
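
The key=your_secret_encryption_key placeholder should be replaced with a random value before going to production. The sketch below shows one way to generate it. The helper names generate_key and write_keyfile are hypothetical, and the 32-character length is an assumption based on mod_gearman's documented maximum key length; adjust it if your version differs.

```shell
#!/bin/sh
# Sketch: create a random shared key for mod_gearman encryption.
# generate_key and write_keyfile are hypothetical helper names.
generate_key() {
    # 24 random bytes encode to exactly 32 base64 characters (no padding)
    head -c 24 /dev/urandom | base64
}

write_keyfile() {
    # $1 = destination file. The subshell keeps the restrictive umask
    # from leaking into the calling shell.
    ( umask 077 && generate_key > "$1" )
}

# Example (run as root), then copy the same file to every worker:
# write_keyfile /etc/mod_gearman/secret.key
```

The generated value can be pasted after key= in both configuration files, or kept in a file that only root and the Nagios user can read.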

Setting Up Worker Nodes

On each worker node, create the worker configuration:

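sudo nano /usr/local/etc/mod_gearman/worker.conf

Add the following configuration:

# Gearman server connection
server=192.168.1.100:4730

# Security (must match server key)
encryption=yes
key=your_secret_encryption_key

# Worker settings
debug=1
logfile=/var/log/mod_gearman/worker.log

# Plugins: the worker runs each command line exactly as Nagios sends it,
# so install the plugins at the same path as on the Nagios server
# (for example /usr/lib/nagios/plugins)

# Performance
job_timeout=60
min-worker=5
max-worker=50
max-jobs=10
idle-timeout=30

# Queues to process
hosts=yes
services=yes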
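
A mismatched key is an easy mistake to make, because the worker then cannot decrypt the jobs it receives. The following sketch compares the key option in two configuration files, for example after copying the server's file next to the worker's. The helper names conf_value and keys_match are hypothetical, and the parsing assumes simple option=value lines.

```shell
#!/bin/sh
# Sketch: compare the shared key in two mod_gearman config files.
# conf_value and keys_match are hypothetical helper names.
conf_value() {
    # $1 = config file, $2 = option name; prints the last value set.
    # Commented-out lines do not match because of the leading "#".
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

keys_match() {
    # Succeeds only when both files set the same, non-empty key
    a=$(conf_value "$1" key)
    b=$(conf_value "$2" key)
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example:
# keys_match mod_gearman.conf worker.conf && echo "keys match"
```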

Starting the Services

Start all required services in the correct order:

# 1. Start Gearman job server (the unit is gearman-job-server on Ubuntu/Debian)
sudo systemctl restart gearmand

# 2. Start mod_gearman workers on worker nodes
sudo /usr/local/bin/mod_gearman_worker --config=/usr/local/etc/mod_gearman/worker.conf --daemon

# 3. Restart Nagios on the main server
sudo systemctl restart nagios
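
Because the order matters, a worker start script can wait until the job server accepts connections before launching. This is a sketch under a few assumptions: wait_for_port is a hypothetical helper, bash is installed (its /dev/tcp feature opens the connection), and the coreutils timeout command is available.

```shell
#!/bin/sh
# Sketch: wait until a TCP port accepts connections.
# wait_for_port is a hypothetical helper name.
wait_for_port() {
    host=$1 port=$2 tries=${3:-10}
    while [ "$tries" -gt 0 ]; do
        # bash's /dev/tcp pseudo-device opens a TCP connection;
        # timeout bounds each attempt to 2 seconds
        if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep 1
    done
    return 1
}

# Example, using the job server address from worker.conf:
# wait_for_port 192.168.1.100 4730 && \
#     sudo /usr/local/bin/mod_gearman_worker --config=/usr/local/etc/mod_gearman/worker.conf --daemon
```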

Monitoring and Verification

Verify that everything is working correctly:

# Check Gearman job server status
gearman_top

# List each queue with its total jobs, running jobs, and available workers
echo "status" | nc -w 2 localhost 4730

# Monitor log files
tail -f /var/log/mod_gearman/mod_gearman.log
tail -f /var/log/mod_gearman/worker.log

Example output from gearman_top (layout and numbers vary with version and load):

2025-08-26 06:34:00 - localhost:4730 - v1.1.21 - 3 workers
Queue Name    | Worker Available | Jobs Running | Jobs Waiting
--------------+------------------+--------------+-------------
host          |        3         |      2       |      0
service       |        3         |      5       |      1
eventhandler  |        2         |      0       |      0
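
The raw status reply from the job server is less readable than gearman_top: each line holds four tab-separated fields (queue name, total jobs, running jobs, available workers), and a single "." ends the list. A small filter can turn that into waiting-job counts. The helper name waiting_jobs is hypothetical, and the canned input below stands in for a live server.

```shell
#!/bin/sh
# Sketch: turn raw "status" output into "queue waiting-jobs" pairs.
# waiting_jobs is a hypothetical helper; it reads the reply on stdin.
waiting_jobs() {
    # Fields: queue name, total jobs, running jobs, available workers.
    # Waiting = total - running. The terminating "." line has a single
    # field and is skipped.
    awk -F'\t' 'NF == 4 { print $1, $2 - $3 }'
}

# Canned input; against a live server this would be
#   echo status | nc -w 2 localhost 4730 | waiting_jobs
printf 'host\t2\t2\t3\nservice\t6\t5\t3\n.\n' | waiting_jobs
# prints:
#   host 0
#   service 1
```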

Load Balancing Strategies

Configure different load balancing methods:

Round Robin Distribution

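# No mod_gearman.conf setting is needed: every worker serving a queue
# pulls the next job as soon as it is free, so checks spread across
# all workers. In each worker.conf, serve the shared queues:
hosts=yes
services=yes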

Host-based Distribution

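# Assign specific hosts to specific workers
# In mod_gearman.conf on the Nagios server: route these hostgroups to
# their own queues (hostgroup_linux-servers, hostgroup_windows-servers)
hostgroups=linux-servers,windows-servers

# In worker.conf on worker1
hostgroups=linux-servers

# In worker.conf on worker2
hostgroups=windows-servers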
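
Routing by hostgroup works through queue names: mod_gearman sends general checks to the host, service and eventhandler queues, and checks for a routed group to a queue named after the group with a hostgroup_ or servicegroup_ prefix. The sketch below only illustrates that naming scheme so the queues are easy to recognize in gearman_top; queue_name is a hypothetical helper, not a mod_gearman command.

```shell
#!/bin/sh
# Sketch: the Gearman queue a check lands in, following mod_gearman's
# naming scheme. queue_name is a hypothetical helper for illustration.
queue_name() {
    # $1 = host | service | eventhandler | hostgroup | servicegroup
    # $2 = group name (only used for hostgroup / servicegroup)
    case $1 in
        host|service|eventhandler) printf '%s\n' "$1" ;;
        hostgroup|servicegroup)    printf '%s_%s\n' "$1" "$2" ;;
        *) echo "unknown check type: $1" >&2; return 1 ;;
    esac
}

queue_name hostgroup linux-servers   # prints hostgroup_linux-servers
```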

Performance Tuning

Optimize performance with these settings:

# Increase worker processes (in worker.conf)
max-worker=100
min-worker=10

# Adjust timeouts
job_timeout=300
connect_timeout=10

# Enable encryption for security
encryption=yes
keyfile=/etc/mod_gearman/secret.key
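
A rough starting point for the maximum worker count is the number of checks that must run at the same time: total checks multiplied by the average check duration, divided by the check interval. The sketch below does that arithmetic. It is a rule of thumb rather than a mod_gearman formula, workers_needed is a hypothetical helper, and the figures in the example are made up.

```shell
#!/bin/sh
# Rule-of-thumb sketch: concurrent workers needed to keep up with load.
# workers_needed is a hypothetical helper, not a mod_gearman tool.
workers_needed() {
    checks=$1 avg_seconds=$2 interval_seconds=$3
    # checks * avg / interval, rounded up using integer arithmetic
    echo $(( (checks * avg_seconds + interval_seconds - 1) / interval_seconds ))
}

# 10000 checks averaging 2 seconds each, scheduled every 300 seconds
workers_needed 10000 2 300   # prints 67
```

Treat the result as a floor across all worker nodes and leave headroom for slow checks and retries.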

Troubleshooting Common Issues

Workers Not Connecting

# Check network connectivity
telnet gearman-server 4730

# Allow the Gearman port through the firewall (run on the job server)
sudo iptables -A INPUT -p tcp --dport 4730 -j ACCEPT

Jobs Timing Out

# Increase timeout values in worker.conf
job_timeout=120
server_timeout=30

High Memory Usage

# Recycle each worker process after this many jobs, and let idle
# worker processes exit sooner (both in worker.conf)
max-jobs=5
idle-timeout=60

Advanced Configuration

SSL/TLS Encryption

# Generate SSL certificates
openssl req -x509 -newkey rsa:4096 -keyout server.key -out server.crt -days 365 -nodes

# Configure SSL in mod_gearman.conf
use_ssl=yes
ssl_cert_file=/etc/ssl/certs/server.crt
ssl_key_file=/etc/ssl/private/server.key

Custom Queues

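# Create specialized queues for different check types
# In mod_gearman.conf: give selected servicegroups their own queues
# (servicegroup_ping_checks, servicegroup_disk_checks, ...).
# The servicegroups must exist in your Nagios object configuration.
servicegroups=ping_checks,disk_checks,network_checks

# Assign workers to specific queues
# In worker.conf on a worker dedicated to ping checks
servicegroups=ping_checks
max-jobs=20

# In worker.conf on a worker dedicated to disk checks
servicegroups=disk_checks
max-jobs=10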

Monitoring mod_gearman Performance

Create custom Nagios checks to monitor mod_gearman:

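#!/bin/bash
# check_gearman_queue.sh: alert on jobs waiting in the "service" queue.
# The "status" reply has tab-separated columns:
#   queue name, total jobs, running jobs, available workers
# Waiting jobs = total - running. Match the queue name exactly so that
# queues such as servicegroup_* are not counted by mistake.
QUEUE_COUNT=$(echo "status" | nc -w 2 localhost 4730 \
    | awk -F'\t' '$1 == "service" { print $2 - $3 }')

if [ -z "$QUEUE_COUNT" ]; then
    # No reply from the job server, or the queue does not exist yet
    echo "UNKNOWN - no status for queue 'service'"
    exit 3
elif [ "$QUEUE_COUNT" -gt 100 ]; then
    echo "CRITICAL - Queue backlog: $QUEUE_COUNT jobs"
    exit 2
elif [ "$QUEUE_COUNT" -gt 50 ]; then
    echo "WARNING - Queue backlog: $QUEUE_COUNT jobs"
    exit 1
else
    echo "OK - Queue backlog: $QUEUE_COUNT jobs"
    exit 0
fi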

Best Practices

  • Security: Always use encryption keys for production environments
  • Redundancy: Run multiple Gearman job servers for high availability
  • Monitoring: Implement checks to monitor the mod_gearman infrastructure itself
  • Scaling: Start with fewer workers and scale up based on load
  • Logging: Enable appropriate logging levels for troubleshooting

Conclusion

mod_gearman scales a Nagios monitoring infrastructure by distributing checks across multiple worker nodes, which improves performance and reliability and makes horizontal scaling straightforward. The setup requires careful configuration, but the benefits are significant for medium to large monitoring environments.

Regular monitoring of the mod_gearman infrastructure itself, proper security measures, and performance tuning will ensure your distributed monitoring system operates efficiently and reliably.