Scalable Monitoring
   
Distributed Monitoring
 
Introduction
Before we start
The Configuration
Setting up the new distributed monitoring solution
Adding a new poller
Adding a new host group to a poller
Removing a poller
Master takeover
File synchronization
Folder synchronization
Access right synchronization
One way connections
Disable Continuous global mon oconf push
Notify through master
Recovery
More information
Introduction
The op5 Monitor backend can easily be configured to be used as a distributed monitoring solution. The distributed model looks like this.
In the distributed monitoring solution:
all configuration is done at the master
all new configuration is distributed to the pollers
each poller is responsible for its own host group (site)
the master has all the status information
Before we start
There are a few things you need to take care of before you can start setting up a distributed monitoring solution. Make sure that:
you have at least two op5 Monitor servers of the same architecture and op5 Monitor version up and running
op5 Monitor >=5.2 is installed and running on both machines
the following TCP ports are open for communication between the servers:
- 15551, op5 Monitor backend communication port
- 22, ssh (for the configuration sync)
both servers can be found in DNS
the host group that the poller will be responsible for has been added to the master configuration, and at least one host has been added to that host group.
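The port and DNS requirements above can be checked before starting. The following is a minimal sketch, not part of op5 Monitor: the function names check_port, check_dns and preflight are hypothetical, and it assumes bash, coreutils timeout and getent are available. Note that getent also consults /etc/hosts, so a success does not strictly prove that the name is in DNS.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; these are not part of op5 Monitor.

# Succeed if a TCP connection to host:port can be opened within 3 seconds.
check_port() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Succeed if the name resolves through the system resolver.
check_dns() {
    getent hosts "$1" >/dev/null 2>&1
}

# Check name resolution and both required ports for one peer.
# Prints one line per failure and returns non-zero if anything failed.
preflight() {
    local peer=$1 port rc=0
    check_dns "$peer" || { echo "FAIL: $peer does not resolve"; rc=1; }
    for port in 15551 22; do
        check_port "$peer" "$port" || { echo "FAIL: $peer:$port is not reachable"; rc=1; }
    done
    return "$rc"
}
```

With the host names used in this guide, you would run preflight poller01 on master01 and preflight master01 on poller01. No output and a zero exit status mean that all three checks passed.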
The configuration
Setting up the new distributed monitoring solution
This distributed configuration will have one master and one poller:
master01
poller01
The poller will be monitoring the host group gbg.
During the setup we will use the command:
mon
The mon command makes it easier to set up a distributed or load balanced solution. For more detailed information about the command, run:
mon --help
To set up a distributed monitoring solution with one poller
1 Log in to the master over ssh, as root.
2 Add the new poller to the configuration with the following command:
mon node add poller01 type=poller hostgroup=gbg
3 Create and exchange ssh keys with the poller, as the root user:
mon sshkey push --all
mon sshkey fetch --all
4 Add master01 as master at poller01:
mon node ctrl --type=poller -- mon node add master01 type=master
5 Set up the configuration sync:
mon node ctrl --type=poller -- sed -i /^cfg_file=/d /opt/monitor/etc/nagios.cfg
6 Verify that poller01 has an empty configuration:
mon node ctrl -- mon oconf hash

This will give you a hash looking like this (the “da39” hash):
da39a3ee5e6b4b0d3255bfef95601890afd80709
7 Now push the configuration to the poller:
mon oconf push
8 Restart and push the logs from master01 to poller01:
mon node ctrl --self -- mon restart; sleep 3; mon log push
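The “da39” value in step 6 is the well-known SHA-1 digest of empty input, which is presumably why an empty configuration produces it. That mon oconf hash computes a SHA-1 over the object configuration is an inference from this value, not something the text states. The digest itself can be reproduced on any machine with sha1sum:

```shell
# SHA-1 digest of zero bytes of input.
# This only reproduces the constant; it does not call mon.
empty_sha1=$(printf '' | sha1sum | cut -d' ' -f1)
echo "$empty_sha1"
# prints da39a3ee5e6b4b0d3255bfef95601890afd80709
```

If the hash reported for the poller in step 6 differs from this value, the poller most likely still has object configuration loaded.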
Adding a new poller
In this instruction we will add a new poller to our distributed solution. Here we have the following hosts:
master01
poller01
poller02 (This is the new one.)
To add a new poller
1 Log in to the master over ssh, as root.
2 Add the new poller to the configuration with the following command:
mon node add poller02 type=poller hostgroup=gbg
3 Create and add ssh keys for the root user:
mon sshkey push poller02
mon sshkey fetch poller02
4 Add master01 as master at poller02:
mon node ctrl poller02 -- mon node add master01 type=master
5 Set up the configuration sync:
conf=/opt/monitor/etc/nagios.cfg
mon node ctrl poller02 -- sed -i /^cfg_file=/d $conf
6 Verify the configuration hash on poller02:
mon node ctrl poller02 -- mon oconf hash

This will give you a hash looking like this (the “d55” hash):
d55d3fa04bdd060bbe821b57c320fe807a096727
7 Now push the configuration to the poller:
mon oconf push
8 Restart and push the logs from master01 to poller02:
mon node ctrl --self -- mon restart; sleep 3; mon oconf push
Adding a new host group to a poller
You might want to add another host group to a poller. To do that you need to edit the merlin.conf file; there is currently no command for it.
To add a new host group to a poller
1 Open up and edit /opt/monitor/op5/merlin/merlin.conf.
2 Add a new host group in the hostgroup line like this:
hostgroup = gbg,sth,citrix_servers

Do not put any spaces between the host group names and the commas.
3 Restart monitor on the poller
mon restart
4 Send over the new configuration to the poller
mon oconf push
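Because a stray space in the hostgroup line is easy to miss, the edited file can be checked before restarting. The following is a minimal sketch: check_hostgroup_line is a hypothetical helper, not an op5 command, and it only looks for whitespace directly before or after a comma on lines starting with hostgroup or hostgroups.

```shell
# Hypothetical helper: succeed only if no "hostgroup =" line in the given
# file has whitespace next to a comma. Offending lines are printed.
check_hostgroup_line() {
    [ -r "$1" ] || return 2
    ! grep -E '^[[:space:]]*hostgroups?[[:space:]]*=.*([[:space:]],|,[[:space:]])' "$1"
}
```

For example, check_hostgroup_line /opt/monitor/op5/merlin/merlin.conf prints nothing and succeeds for the line shown in step 2, but prints the line and fails for hostgroup = gbg, sth.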
Removing a poller
In this instruction we will remove a poller called:
poller01
The poller will be removed from the master configuration and all distributed configuration on the poller will also be removed.
To remove a poller
1 Log in to the master over ssh, as root.
2 Deactivate and remove all distributed setup on the poller host.
mon node ctrl poller01 -- mon node remove master01
3 Restart monitor on the poller.
mon node ctrl poller01 -- mon restart
4 Remove the poller from the master configuration.
mon node remove poller01
5 Restart monitor on the master.
mon restart
 
Master takeover
If a poller goes down, the default behavior is for the master to take over all the checks from the poller. For this to work, all hosts monitored by the poller must also be reachable for monitoring from the master.
If the master should not take over the checks from a poller, this can be set in the Merlin configuration file.
To stop the master from taking over, edit the file /opt/monitor/op5/merlin/merlin.conf.
Add the following to the section for the poller that the master should not take over:
takeover = no
Note that this is done per poller.
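Since the option is set per poller, it can be useful to check which value a given poller section actually contains. The following is a minimal sketch: poller_option is a hypothetical helper, not an op5 command. It assumes the "poller NAME {" layout used in the examples in this document, with the opening brace on the same line as the name and spaces around the equals sign.

```shell
# Hypothetical helper: print the value of KEY set directly inside the
# "poller NAME { ... }" section of FILE (nested blocks such as sync are skipped).
# Usage: poller_option FILE NAME KEY
poller_option() {
    awk -v name="$2" -v key="$3" '
        $1 == "poller" && $2 == name { inside = 1; depth = 0 }
        inside {
            if (depth == 1 && $1 == key && $2 == "=") print $3
            depth += gsub(/[{]/, "{")   # count opening braces on this line
            depth -= gsub(/[}]/, "}")   # count closing braces on this line
            if (depth == 0) inside = 0  # left the poller section
        }
    ' "$1"
}
```

For example, poller_option /opt/monitor/op5/merlin/merlin.conf poller01 takeover prints no if the option has been added to that poller, and prints nothing if it has not been set.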
File synchronization
To synchronize files from the master server to the poller, add a sync paragraph in the file /opt/monitor/op5/merlin/merlin.conf.
In the example below we will synchronize the htpasswd.users file from the master to the poller “poller01”:
poller poller01 {
address = <ip>
port = <port>
contact_group = <contactgroup>
 
sync {
/opt/monitor/etc/htpasswd.users = /opt/monitor/etc/htpasswd.users
}
}
Note that this is done per poller
Folder synchronization
To synchronize folders to pollers, add a sync paragraph in the file /opt/monitor/op5/merlin/merlin.conf.
In the example below we will synchronize the /opt/plugins/custom folder to the poller “poller01”:
poller poller01 {
address = <ip>
port = <port>
contact_group = <contactgroup>
 
sync {
/opt/plugins/custom/
}
}
Note that this is done per poller
Access right synchronization
To synchronize access rights, the folder containing the access rights files must be added manually to the configuration. This will synchronize both local users and group rights settings.
To do this, add the following sync paragraph to /opt/monitor/op5/merlin/merlin.conf:
 
sync {
/etc/op5/
}
One way connections
If one peer is behind a firewall or uses a NAT address, it might not be possible for the master server to connect to it.
To tell the master not to connect to the poller, and let the poller open the session instead, add an option to the file /opt/monitor/op5/merlin/merlin.conf.
Under the section for the poller that the master should not try to connect to, add the following:
connect = no
Example
In the example below we have a master “master01” that cannot connect to “poller01”, but “poller01” is allowed to connect to “master01”.
poller poller01 {
address = <ip>
port = <port>
contact_group = <contactgroup>
connect = no
}
It is also possible to set this option on the poller instead; the master will then always initiate the session.
Disable Continuous global mon oconf push
Normally Merlin keeps trying to sync with a poller until the connection is reestablished, but in some cases this is not desired.
To limit this behavior on the master, add the following option to the poller's configuration in /opt/monitor/op5/merlin/merlin.conf:
max_sync_attempts = X
The master will then stop trying to sync with the poller after X attempts.
Notify through master
When a poller is not able to send notifications itself, for example because it has no access to the SMTP gateway or has no SMS gateway, it can send the notifications through the master.
To enable the poller to notify through the master, set notifies = no in both the poller configuration and the master configuration. In the example below we have configured poller-01 to notify through the master server.
Edit /opt/monitor/op5/merlin/merlin.conf on the master server:
poller poller-01 {
address = 10.11.12.13
hostgroups = poller-01-hosts
notifies = no
# other vars...
}
Edit /opt/monitor/op5/merlin/merlin.conf on the poller:
module {
notifies = no
# other vars...
}
Recovery
After a poller has been unavailable to the master (for example because of a network outage), the report data will be synced from the poller to the master.
The report data on the poller will overwrite the data on the master system.
More information
For more information and a more complex example, take a look at the HOWTO in the git repository of the open source Merlin project:
http://git.op5.org/git/?p=nagios/merlin.git;a=blob;f=HOWTO;hb=master