- Background
- Configuration
- gmond
- gmetad
- web interface
- Using unicast
- Configuration examples
- Credits and license
Ganglia is very useful for monitoring hosts, and especially useful for clusters. This is not a step-by-step guide, so experience with system administration is required. Please add notes to this document if you think you can improve it or add more links. If you would rather not edit it directly, please leave a note on the Discussion page.
This image shows how Ganglia can be useful. In the image you can notice that the nodes are spending a lot of time in system tasks (red). 10 hosts and 40 CPUs are being monitored.
Background
gmond
Ganglia runs gmond on each monitored host. Each gmond process lets the others know the statistics of the machine it runs on in one of two ways:
- sending UDP messages to a central machine (or set of machines) running gmond
- sending UDP multicast messages
A gmond process can listen to multicast UDP messages in the network. It can also listen to unicast UDP messages sent to it.
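As a rough illustration of the two delivery modes, the fragments below contrast a multicast and a unicast udp_send_channel in gmond.conf. The multicast group 239.2.11.71 on port 8649 is, to our knowledge, what gmond --default_config emits; server1 and port 8650 are simply the values used in the sample configuration later in this document. Treat both fragments as a sketch, not as a complete configuration.

```
/* Multicast: every gmond that joined the group receives the metrics. */
udp_send_channel {
  mcast_join = 239.2.11.71
  port = 8649
  ttl = 1
}

/* Unicast: send the metrics only to the gmond running on server1. */
udp_send_channel {
  host = server1
  port = 8650
  ttl = 1
}
```

A monitored node would normally carry only one of these two styles.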
gmetad
gmetad is the program that saves the information to disk, using rrdtool. It reads the data from one of the available gmond processes that receive the information from the other hosts.
More than one source of information can be specified for gmetad, and in this case gmetad will use the first one that is reachable.
web interface
The web interface of ganglia reads the information from gmetad, using TCP.
Configuration
There are three sections in gmond.conf that are really important:
- udp_send_channel
- udp_recv_channel
- tcp_accept_channel
Please refer to the ganglia documentation. The README that comes with the software is a good source of documentation.
We used Debian, but most of the instructions should apply to other distributions as well. Feel free to edit and add notes. We used the latest sources downloaded from http://ganglia.sourceforge.net, corresponding to version 3.0.5.
gmond
Installation
In every node you wish to monitor you need to install gmond.
tar xzvf ../ganglia-3.0.X.tar.gz && cd ganglia-3.0.X
./configure
make
make install
There is an initialization script that comes with the distribution. We used a simpler one that you can find at the end of this document. Be sure to copy it to /etc/init.d/gmond. The service has to be loaded at boot:
update-rc.d gmond defaults
Configuration
You can get a default configuration useful as a starting point. You might want to read and change some of the default values. Note that we will run gmond in a small trusted network, so we will not care much about security in this installation and we will use most default values.
gmond --default_config > /etc/gmond.conf
Note that this autogenerated configuration for gmond uses multicast. Since the multicast configuration is autogenerated, we only provide a sample for the unicast configuration. It is partial and includes only what changes after running gmond --default_config > /etc/gmond.conf.
For the nodes being monitored:
globals {
  daemonize = yes
  setuid = yes
  user = ganglia
  debug_level = 0
  max_udp_msg_len = 1472
  mute = no
  deaf = no
  host_dmax = 0 /*secs */
  cleanup_threshold = 300 /*secs */
  gexec = no
}
cluster {
  name = "My Cluster"
  owner = "My Institution"
}
host {
  location = "Cluster Room"
}
udp_send_channel {
  host = server1
  port = 8650
  ttl = 1
}
udp_send_channel {
  host = server2
  port = 8650
  ttl = 1
}
tcp_accept_channel {
  port = 8649
}
/* continues .... */
For the hosts collecting the information (not being monitored in this case):
globals {
  daemonize = yes
  setuid = yes
  user = ganglia
  debug_level = 0
  max_udp_msg_len = 9600
  mute = no
  deaf = no
  host_dmax = 3600 /*secs */
  cleanup_threshold = 300 /*secs */
  gexec = no
}
cluster {
  name = "My Cluster"
  owner = "My Institution"
}
/* The host section describes attributes of the host, like the location */
host {
  location = "Cluster Room"
}
/*
 * We are not sending this information to other hosts.
 *
 * udp_send_channel {
 * }
 *
 */
/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  port = 8650
  family = inet4
}
/* You can specify as many tcp_accept_channels as you like to share
   an xml description of the state of the cluster */
tcp_accept_channel {
  port = 8649
}
... continues
Testing
Now start the service and check that gmond returns an XML document, which tells us that it is working.
telnet localhost 8649
If you don't see the XML, something went wrong. From now on we will assume it worked.
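If telnet is inconvenient, the same check can be scripted. This is a minimal sketch, assuming nc (netcat) is installed; count_hosts is a helper name invented here and is not part of Ganglia. It counts the HOST elements in the XML that gmond prints, so a result of 0 suggests that no host is reporting yet.

```shell
#!/bin/sh
# Count the <HOST ...> elements in gmond XML read from standard input.
# count_hosts is a hypothetical helper, not a Ganglia command.
count_hosts() {
    grep -o '<HOST ' | wc -l | tr -d ' '
}

# Hypothetical usage against a running gmond (assumes nc is installed):
#   nc localhost 8649 | count_hosts
```

The trailing space in the pattern keeps the match limited to opening HOST tags.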
gmetad
Gmetad will aggregate the information of all the nodes.
Before installing it, we installed the rrdtool libraries and programs. In Debian:
aptitude install librrd2-dev rrdtool
Now let's configure and compile gmetad.
tar xzvf ../ganglia-3.0.X.tar.gz && cd ganglia-3.0.X
./configure --with-gmetad
make
make install
You need a proper configuration file /etc/gmetad.conf. Here is what we included for a test:
data_source "My Cluster" node1 node2
setuid_username "ganglia"
Listing node1 and node2 means that if node1 is not reachable, gmetad will use node2 as the source of the information. This is a logical OR, not an AND.
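The data_source line can also be written more explicitly. The fragment below is a sketch that assumes the gmetad.conf syntax in which an optional polling interval (in seconds) follows the cluster name and each source may carry a :port suffix. The values 15 and 8649 are, as far as we know, the defaults and are shown only for illustration.

```
# Poll every 15 seconds; try node1 first and fall back to node2.
data_source "My Cluster" 15 node1:8649 node2:8649
```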
Note that we created a ganglia user, so we are not running the service as nobody. Also note that the service will refuse to start unless the directory /var/lib/ganglia/rrds/ exists and is owned by the user the service runs as (ganglia in our case).
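The directory requirement can be checked before starting gmetad. The following is only a sketch: check_rrd_dir is a hypothetical helper, and it assumes a find that supports -maxdepth (GNU and BSD find do).

```shell
#!/bin/sh
# Succeed only if directory $1 exists and is owned by user $2
# (a user name or a numeric uid). check_rrd_dir is a hypothetical helper.
check_rrd_dir() {
    [ -d "$1" ] || return 1
    [ -n "$(find "$1" -maxdepth 0 -user "$2" 2>/dev/null)" ]
}

# Hypothetical usage, run as root:
#   mkdir -p /var/lib/ganglia/rrds
#   chown -R ganglia /var/lib/ganglia/rrds
#   check_rrd_dir /var/lib/ganglia/rrds ganglia && echo "rrds directory looks fine"
```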
The startup script for gmond is at the end of this document. Modify it for gmetad (instructions follow the script), copy it to /etc/init.d/gmetad and then run:
update-rc.d gmetad defaults
web interface
You need to install a web server and the PHP environment. These commands will do it for you in Debian:
aptitude install php5-gd apache2 libapache2-mod-php5
a2enmod php5
/etc/init.d/apache2 restart
Copy the web directory that comes with the installation files to /var/www/ganglia.
Now load http://yourserver/ganglia/ and let's hope it works.
Using unicast
Check this page for details. Since ganglia can be used in many different scenarios, it will be up to you to choose which method to use. You might want to use unicast if you don't have many nodes to monitor and/or you're not allowed to use multicast.
Please check the sample configuration for unicast.
Configuration examples
Startup Scripts
These are the startup scripts we copied from Debian. We only provide the gmond script. The gmetad script is nearly identical: just change DAEMON, NAME and DESC as specified below.
#! /bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/usr/sbin/gmond
NAME=gmond
DESC="Ganglia Monitor Daemon"
# Exit quietly if the daemon is not installed.
test -x $DAEMON || exit 0
set -e
case "$1" in
  start)
    echo -n "Starting $DESC: "
    start-stop-daemon --start --quiet --pidfile /var/run/$NAME.pid \
        --exec $DAEMON
    echo "$NAME."
    ;;
  stop)
    echo -n "Stopping $DESC: "
    # Discard both stdout and stderr (stdout must be redirected first).
    start-stop-daemon --stop --quiet --oknodo \
        --exec $DAEMON > /dev/null 2>&1
    echo "$NAME."
    ;;
  reload)
    ;;
  restart|force-reload)
    $0 stop
    $0 start
    ;;
  *)
    N=/etc/init.d/$NAME
    # echo "Usage: $N {start|stop|restart|reload|force-reload}" >&2
    echo "Usage: $N {start|stop|restart|force-reload}" >&2
    exit 1
    ;;
esac
exit 0
For gmetad, use the following lines:
DAEMON=/usr/sbin/gmetad
NAME=gmetad
DESC="Ganglia Monitor Meta-Daemon"
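Instead of editing a copy by hand, the three assignments can be rewritten with sed. This is a sketch under the assumption that the gmond script is the one listed above, with each assignment at the start of its own line; to_gmetad_script is a name made up for this example.

```shell
#!/bin/sh
# Rewrite the DAEMON, NAME and DESC assignments of the gmond init script
# (read from standard input) so that the result starts gmetad instead.
# to_gmetad_script is a hypothetical helper name.
to_gmetad_script() {
    sed -e 's|^DAEMON=.*|DAEMON=/usr/sbin/gmetad|' \
        -e 's|^NAME=.*|NAME=gmetad|' \
        -e 's|^DESC=.*|DESC="Ganglia Monitor Meta-Daemon"|'
}

# Hypothetical usage, run as root:
#   to_gmetad_script < /etc/init.d/gmond > /etc/init.d/gmetad
#   chmod +x /etc/init.d/gmetad
```

All other lines pass through unchanged, so later fixes to the gmond script can be carried over by running the function again.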
Credits and license
This document is licensed under the GNU Free Documentation License.
Authors:
- Your name here
- Please contribute.
- Nelson Castillo
- Started the document.
Last update: 2008-06-23 (Rev 14293)