  1. Background
    1. gmond
    2. gmetad
    3. web interface
  2. Configuration
  3. gmond
    1. Installation
    2. Configuration
    3. Testing
  4. gmetad
  5. web interface
  6. Using unicast
  7. Configuration examples
    1. Startup Scripts
  8. Credits and license

Ganglia is very useful for monitoring hosts, and especially useful for clusters. This is not a step-by-step guide, so experience with system administration is required. Please add notes to this document if you think you can improve it or add more links. If you would rather not edit it directly, please leave a note on the Discussion page.

This image shows how Ganglia can be useful. In the image you can notice that the nodes are spending a lot of time in system tasks (red). 10 hosts and 40 CPUs are being monitored.

Background

gmond

Ganglia runs gmond on each monitored host. Each gmond process lets the others know the statistics for the machine it runs on in one of two ways: multicast or unicast.

A gmond process can listen to multicast UDP messages on the network. It can also listen to unicast UDP messages sent to it.

gmetad

gmetad is the program that saves the information to disk, using rrdtool. It reads the information from one of the available gmond processes, which in turn receives the information of the other hosts.

More than one source of information can be specified for gmetad, and in this case gmetad will use the first one that is reachable.

web interface

The web interface of ganglia reads the information from gmetad, using TCP.

Configuration

There are three parameters in gmond.conf that are really important:

Please refer to the ganglia documentation. The README that comes with the software is a good source of documentation.

We used Debian, but the instructions might be distribution-independent. Feel free to edit and add notes. We used the latest sources downloaded from http://ganglia.sourceforge.net, corresponding to version 3.0.5.

gmond

Installation

In every node you wish to monitor you need to install gmond.

 tar xzvf ../ganglia-3.0.X.tar.gz && cd ganglia-3.0.X
 ./configure
 make
 make install

There is an initialization script that comes with the distribution. We used a simpler one that you can find at the end of this document. Be sure to copy it to /etc/init.d/gmond. The service has to be loaded at boot:

 update-rc.d gmond defaults

Configuration

You can get a default configuration useful as a starting point. You might want to read and change some of the default values. Note that we will run gmond in a small trusted network, so we will not care much about security in this installation and we will use most default values.

 gmond --default_config > /etc/gmond.conf

Note that this autogenerated configuration for gmond uses multicast. We provide a sample configuration for unicast, since the multicast one is autogenerated. The samples are partial: they include only what changes after running gmond --default_config > /etc/gmond.conf.

For the nodes being monitored:

globals {
  daemonize = yes
  setuid = yes
  user = ganglia
  debug_level = 0
  max_udp_msg_len = 1472
  mute = no
  deaf = no
  host_dmax = 0 /*secs */
  cleanup_threshold = 300 /*secs */ 
  gexec = no
} 

cluster { 
  name = "My Cluster" 
  owner = "My Institution" 
} 

host { 
  location = "Cluster Room" 
} 

udp_send_channel { 
  host = server1
  port = 8650
  ttl = 1 
} 

udp_send_channel { 
  host = server2
  port = 8650
  ttl = 1 
} 

tcp_accept_channel { 
  port = 8649 
} 

/* continues .... */
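As a quick sanity check on a node's configuration, the unicast targets can be listed with a small awk filter. This is only a sketch: list_send_channels is a hypothetical helper name, and it assumes each udp_send_channel block is laid out one setting per line, as in the sample above. Blocks that are commented out with a leading " * " are skipped.

```shell
# Print "host:port" for every udp_send_channel block in a gmond.conf file.
# Assumes one "key = value" setting per line inside each block.
list_send_channels() {
    awk '
        /^[[:space:]]*udp_send_channel[[:space:]]*[{]/ {
            inside = 1; host = ""; port = ""; next
        }
        inside && $1 == "host" { host = $3 }
        inside && $1 == "port" { port = $3 }
        inside && /[}]/        { print host ":" port; inside = 0 }
    ' "$1"
}

# Hypothetical usage on a monitored node:
#   list_send_channels /etc/gmond.conf
```

With the node configuration above, this should list server1 and server2, both on port 8650, which must match the udp_recv_channel port on the collecting hosts.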

For the hosts collecting the information (not being monitored in this case):

globals {                    
  daemonize = yes
  setuid = yes
  user = ganglia
  debug_level = 0
  max_udp_msg_len = 9600
  mute = no             
  deaf = no             
  host_dmax = 3600 /*secs */
  cleanup_threshold = 300 /*secs */ 
  gexec = no             
} 

cluster { 
  name = "My Cluster" 
  owner = "My Institution" 
} 

/* The host section describes attributes of the host, like the location */ 
host { 
  location = "Cluster Room" 
} 

/*
 * We are not sending this information to other hosts.
 *
 * udp_send_channel { 
 *  } 
 *
 */ 

/* You can specify as many udp_recv_channels as you like as well. */ 

udp_recv_channel { 
  port = 8650
  family = inet4
}

/* You can specify as many tcp_accept_channels as you like to share 
   an xml description of the state of the cluster */ 

tcp_accept_channel { 
  port = 8649
} 

 ... continues

Testing

Now start the service and check that it returns an XML document, which tells us that it is working.

 telnet localhost 8649

If you don't see the XML, something went wrong. From now on we will assume it worked.
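The same check can be scripted instead of read by eye. A minimal sketch, assuming netcat (nc) is installed and that the dump contains the GANGLIA_XML root element; looks_like_gmond_xml is a name made up for this example.

```shell
# Succeeds if the text on stdin looks like a gmond XML dump,
# that is, if it contains the GANGLIA_XML root element.
looks_like_gmond_xml() {
    grep -q '<GANGLIA_XML'
}

# Hypothetical usage against a running daemon (requires nc):
#   nc localhost 8649 | looks_like_gmond_xml && echo "gmond is answering"
```

The function only tests for the root element, so it shows that gmond is answering, not that every node is reporting.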

gmetad

Gmetad will aggregate the information of all the nodes.

Before installing it, we installed the rrdtool libraries and programs. In Debian:

 aptitude install librrd2-dev rrdtool

Now let's configure and compile gmetad.

 tar xzvf ../ganglia-3.0.X.tar.gz && cd ganglia-3.0.X
 ./configure --with-gmetad
 make
 make install

You need a proper configuration file /etc/gmetad.conf. Here is what we included for a test:

 data_source "My Cluster" node1 node2
 setuid_username "ganglia"

Listing node1 and node2 means that if node1 is not reachable, gmetad will use node2 as the source for the information. This is a logical OR, not an AND.
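A data_source line can also carry an optional polling interval in seconds and an explicit port for each host. The sketch below writes such a file to a scratch path so it can be reviewed before it is copied to /etc/gmetad.conf. The GMETAD_CONF variable and the sample file name are assumptions made for this example, and the 15-second interval is only illustrative.

```shell
# Write a sample gmetad configuration to a scratch file.
# GMETAD_CONF is a hypothetical variable used only in this sketch.
conf=${GMETAD_CONF:-./gmetad.conf.sample}

cat > "$conf" <<'EOF'
# Poll every 15 seconds; try node1 first and fall back to node2.
data_source "My Cluster" 15 node1:8649 node2:8649
setuid_username "ganglia"
EOF
```

Port 8649 here is the tcp_accept_channel port from the gmond configuration, which is where gmetad reads the XML from.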

Note that we created a ganglia user, so we are not running the service as nobody. Also note that the service will refuse to start unless the directory /var/lib/ganglia/rrds/ exists and is owned by the user gmetad runs as (ganglia here).
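Creating that directory might look like the following. This is a sketch only: prepare_rrd_dir is a hypothetical helper, and it assumes the ganglia account already exists and that you run it as root.

```shell
# Create the directory gmetad writes its RRD files to and hand it
# to the account gmetad runs as.
prepare_rrd_dir() {
    rrd_dir=$1
    owner=$2
    mkdir -p "$rrd_dir" && chown "$owner" "$rrd_dir"
}

# Hypothetical usage as root, once the ganglia account exists:
#   prepare_rrd_dir /var/lib/ganglia/rrds ganglia
```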

Look for the gmond startup script at the end of this document. You'll need to modify it (instructions follow the script), copy it to /etc/init.d/gmetad and then run:

 update-rc.d gmetad defaults

web interface

You need to install a web server and the PHP environment. These commands will do it for you in Debian:

 aptitude install php5-gd apache2 libapache2-mod-php5
 a2enmod php5
 /etc/init.d/apache2 restart

Copy the web directory that comes with the installation files to /var/www/ganglia.

Now load http://yourserver/ganglia/ and let's hope it works.

Using unicast

Check this page for details. Since ganglia can be used in many different scenarios, it will be up to you to choose which method to use. You might want to use unicast if you don't have many nodes to monitor and/or you're not allowed to use multicast.

Please check the sample configuration for unicast.

Configuration examples

Startup Scripts

These are the startup scripts we copied from Debian. We only provide the gmond script. The gmetad script is quite similar: just change DAEMON, NAME and DESC as specified below.

#! /bin/sh

PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/usr/sbin/gmond
NAME=gmond
DESC="Ganglia Monitor Daemon"

test -x $DAEMON || exit 0
set -e

case "$1" in
  start)
        echo -n "Starting $DESC: "
        start-stop-daemon --start --quiet --pidfile /var/run/$NAME.pid \
                --exec $DAEMON
        echo "$NAME."
        ;;
  stop)
        echo -n "Stopping $DESC: "
        start-stop-daemon --stop  --quiet --oknodo \
                --exec $DAEMON > /dev/null 2>&1
        echo "$NAME."
        ;;
  reload)
  ;;
  restart|force-reload)
        $0 stop
        $0 start
        ;;
  *)
        N=/etc/init.d/$NAME
        # echo "Usage: $N {start|stop|restart|reload|force-reload}" >&2
        echo "Usage: $N {start|stop|restart|force-reload}" >&2
        exit 1
        ;;
esac

exit 0

For gmetad, use the following lines:

DAEMON=/usr/sbin/gmetad
NAME=gmetad
DESC="Ganglia Monitor Meta-Daemon"

Credits and license

This document has a GNU Free Documentation License.

Authors:

Your name here
Please contribute.
Nelson Castillo
Started the document.

Last update: 2008-06-23 (Rev 14293)