Monitoring
Introduction
The monitoring section in the web menu deals with problem management and the status of your network.
It is here that you will spend most of your time when using op5 Monitor. In the monitoring section you can
view host and service problems
view performance graphs
execute service and host commands
show objects on maps
handle scheduled downtime.
This chapter will give you information about the most commonly used parts of the monitoring part of op5 Monitor.
Hosts and services
Hosts and services are the objects that are monitored by op5 Monitor.
A host in detail
A host can be any kind of network device, virtual device or other object that you can reach from the op5 Monitor server.
Let us take a look at the Host information view and see what parts it is built upon. In the coming sections we will go through each part and learn how they can be used.
The table below describes each part of the Host information view briefly.
Nr
Part
Description
1
Page links
Quick links to other information about the host
Status detail lists all services on this host.
Alert history shows the alert log of the host.
Alert histogram shows a graphical view, or trend, of the problems on the host.
Availability report of the host.
Notifications shows all notifications that have been sent out about this host.
2
Host information header
Displays brief information about the host and its surroundings like
Host name and address.
Parent host.
Hostgroup membership
Extra actions and notes.
Links to configure and graphs.
Host notifications.
3
Host state information
Here you can see status information for the host like
Current status.
Current attempt.
Last state changes and notification.
What is enabled or not on this host.
4
Host commands
Here you can perform different commands for the host and/or all services on that host.
5
Comments
Manually added comments and comments from the system are shown here.
Page links
The page links give you a couple of shortcuts to more information about this host and its services.
Host information header
Here you will get a short summary of the host.
The host header information contains
the host address.
the parent host.
which host groups it is a member of.
which group will get the notifications for this host.
links to extra service actions, service notes and the performance graphs.
a link to the object in the configuration GUI.
Host state information
In this view you get all kinds of status information about the host. This is the most detailed view you can get of a host.
Host commands
The host commands part gives you various commands to handle the host. Here you can
locate the host in a status map
add a host comment
re-schedule the next check for this host
disable and enable active and passive checks
disable and enable notifications
schedule downtime
disable and enable event handlers
send custom notifications.
Comments
There are two types of comments:
automatically added
manually added
Automatically added comments can be
acknowledged comments
scheduled downtime comments
As a manually added comment you can type in almost anything you like.
Comments are designed to be short texts. If you would like to add documentation, longer descriptions and so on, consider using the Dokuwiki that is included in op5 Monitor.
A service in detail
A service is practically anything that can be measured. A service must be connected to a host.
Let us take a look at the Service information view and see what parts it is built upon. In the coming sections we will go through each part and learn how they can be used.
The picture below shows the Service information view.
 
Nr
Part
Description
1
Page links
Quick links to other information about the service and the host it is connected to.
Information for this host.
Status details for the host.
Status detail lists all services on this host.
Alert history shows the alert log of the service.
Alert histogram shows a graphical view, or trend, of the problems on the service.
Availability report of the service.
Notifications shows all notifications that have been sent out about this service.
2
Service information header
Displays brief information about the service, host and its surroundings like
Host name and address.
What service groups the service belongs to.
Extra actions and notes.
Links to configuration and graphs.
3
Service state information
Here you can see status information for the service like
Current status.
Current attempt.
Last state changes and notification.
What is enabled or not on this service.
4
Service commands
Here you can perform different commands for the service.
5
Comments
These are comments added either by scheduling downtime or as comments of their own.
Page links
The page links give you a couple of shortcuts to more information about this service and the host it is connected to.
Service header information
Here you will get a short summary of the service.
Here you may see things like
What host it belongs to.
The service groups it is a member of.
Which contact groups will get the notifications.
Service notes.
Links to extra service actions, service notes and performance graphs.
A link to the object in the configuration GUI.
Service state information
In this view you get all kinds of status information about the service. This is the most detailed view you can get of a service.
Service commands
The service commands part gives you various commands to handle the service. Here you can
Disable and enable active and passive checks
Reschedule the service check
Disable and enable notifications
Schedule downtime
Disable and enable event handlers
Submit a service comment
Send custom notification.
Comments
There are two types of comments:
Automatically added
Manually added
Automatically added comments can be
acknowledged comments
scheduled downtime comments
As a manually added comment you can type in almost anything you like.
Comments are designed to be short texts. If you would like to add documentation, longer descriptions and so on, consider using the Dokuwiki that is included in op5 Monitor.
Parenting
Parenting in op5 Monitor is used to determine whether a host is down or unreachable.
A host is...
down if it is the first host in the “tree” that op5 Monitor cannot reach
unreachable if it is located behind the down host described above.
 
Example 1 This example describes how the parenting works in practice
The picture below shows what a network looks like from the monitor server's point of view.
As you can see, everything starts with the op5-monitor server. If fw-01 is down, as shown in the picture above, all child hosts of fw-01 are considered unreachable.
 
The example above shows that you can use parenting to avoid a lot of unnecessary alerts and notifications. This is because you can tell op5 Monitor not to send any notifications for a host that is unreachable. That means you will only get a notification about fw-01 in this case, not about the hosts “below” fw-01.
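The down/unreachable rule can be illustrated with a small Python sketch. It is not op5 Monitor code: it assumes every host has exactly one parent, and all host names except op5-monitor and fw-01 are hypothetical.

```python
from typing import Dict, List, Optional, Set


def classify_host(host: str,
                  parents: Dict[str, Optional[str]],
                  not_responding: Set[str]) -> str:
    """Return "UP", "DOWN" or "UNREACHABLE" for one host."""
    if host not in not_responding:
        return "UP"
    # Walk up the tree towards the monitoring server.
    ancestor = parents.get(host)
    while ancestor is not None:
        if ancestor in not_responding:
            # A host closer to the server already fails, so this host
            # cannot be checked at all.
            return "UNREACHABLE"
        ancestor = parents.get(ancestor)
    # Every ancestor responds: this is the first failing host in the tree.
    return "DOWN"


def hosts_to_notify(parents: Dict[str, Optional[str]],
                    not_responding: Set[str]) -> List[str]:
    """Only hosts that are DOWN (not UNREACHABLE) produce notifications."""
    return sorted(h for h in parents
                  if classify_host(h, parents, not_responding) == "DOWN")


# Hypothetical topology; switch-01 and web-01 are made-up child hosts.
PARENTS = {
    "op5-monitor": None,
    "fw-01": "op5-monitor",
    "switch-01": "fw-01",
    "web-01": "switch-01",
}
# When fw-01 fails, nothing behind it answers either.
NOT_RESPONDING = {"fw-01", "switch-01", "web-01"}
```

With this data, only fw-01 is classified as DOWN, so it is the only host returned by hosts_to_notify.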
Host and service groups
Using Host groups
A host is normally placed in one or more host groups. A host group can contain any kind of hosts, grouped in any way you want. You can use host groups to:
group hosts from the same geographic area in the same host group.
put the same type of hosts in the same host group.
place all hosts that belong to a particular service in the same group.
place a customer’s hosts in a host group of their own.
Besides being a way of sorting hosts, host groups can be used to decide which users are allowed to see which hosts. More about that in Access rights.
Using host groups makes it easy to find hosts that have something in common. If, for example, you have a whole bunch of Citrix servers, you can show just these servers in a list view.
Host group commands
By clicking on the “Action” icon on a host group you will get a menu to control the host group.
From this menu you can:
Schedule downtime for all hosts and/or services in the host group.
Enable and disable notifications for all hosts and/or services in the host group.
Enable and disable active checks for all hosts and/or services in the host group.
Go directly to the configuration for this host group.
Host group reporting
From the host group command menu (see above) there are also a couple of reporting tools.
From this menu you can view Availability reports and Alert history for the host group.
Using Service groups
One of the most useful ways to use service groups is to group services by the user-facing service they provide.
Example 2 A service group example
Let us say you have a mail service for your customers. This mail service needs the following components to be working as it should:
DNS
MTA
IMAP-/POP-server
Webmail
Storage
On the hosts listed above there are services that must be working, otherwise your customers will not be able to use the email service you are supposed to deliver to them.
Place all the important services in one service group and you can then easily see if an alert and/or notification says anything about the email service in the example.
Service group commands
By clicking on a service group name (the name within parentheses) in any of the service group views you will get a menu to control the service group.
From this menu you can:
Schedule downtime for all hosts and/or services in the service group.
Enable and disable notifications for all hosts and/or services in the service group.
Enable and disable active checks for all hosts and/or services in the service group.
Go directly to the configuration for this service group.
Service group reporting
From the service group command menu (see above) there are also a couple of reporting tools.
From this menu you can view Availability reports and Alert history for the service group.
 
Another good way to use service groups is to create Service Level Agreement (SLA) reports based on service groups. If you take the example above and create an SLA report from it, you will directly see if you can deliver your service the way you promised your customers.
 
 
Problem handling
Much of your work with op5 Monitor is about problem handling. When you start working with op5 Monitor, most of your time is normally spent on configuring, tweaking and fixing problems. After a while you will find that you can start working proactively instead of only reacting to problems.
In this section we will take a look at how you can work effectively with op5 Monitor as a great help during your problem handling.
Hard and soft states
A problem is classified as a soft problem until the number of checks has reached the configured max_check_attempts value. When max_check_attempts is reached the problem is reclassified as hard and normally op5 Monitor will send out a notification about the problem. Soft problems do not result in a notification.
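The transition from soft to hard can be sketched in Python as follows. This is a simplified model for illustration only, not op5 Monitor code: the function name and the list-of-booleans input are assumptions, and re-notification intervals and recovery notifications are ignored.

```python
from typing import List, Tuple


def track_problem(check_results: List[bool],
                  max_check_attempts: int) -> List[Tuple[str, bool]]:
    """Classify each check result as OK, SOFT or HARD.

    check_results holds True for a successful check and False for a
    failed one. Each returned tuple is (state, notification_sent).
    """
    timeline = []
    failed = 0  # number of consecutive failed checks
    for ok in check_results:
        if ok:
            failed = 0
            timeline.append(("OK", False))
            continue
        failed += 1
        if failed < max_check_attempts:
            # Still retrying: a soft problem, no notification.
            timeline.append(("SOFT", False))
        else:
            # In this sketch a notification is sent only on the check
            # that turns the problem hard.
            timeline.append(("HARD", failed == max_check_attempts))
    return timeline
```

For example, with max_check_attempts set to 3, a single failed check followed by a successful one never leaves the soft state and never triggers a notification.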
Alerts and notifications management
Alerts and notifications are two of the most important things for you as a system administrator who depends on a monitoring tool like op5 Monitor.
Alerts, alarms and notifications are called different things in different monitoring systems. In op5 Monitor we define them like this:
 
 
Description
Alerts
An alert occurs whenever the status of a host or a service changes, for example:
host up
host down
service critical
service ok
and so on.
Notifications
Notifications are the messages sent out to the contacts associated with the object the notification is sent about.
Notifications are sent out on state changes. A notification is sent during one of the following alerts:
any service or host problem or recovery
acknowledgements
flapping started, stopped and disabled
downtime started, stopped and canceled
Notifications can be sent by almost anything. The following are included by default in op5 Monitor:
email
sms
dial up
Of course there are a lot of other ways to send notifications like sending them to a database, ticket handling system etc.
 
An alert can happen at any time and does not necessarily have to be associated with a notification, but a notification is always associated with an alert.
Unhandled problems view
As you can see in the GUI, op5 Monitor has many views that show host and service status. One of the most useful for a system administrator is the unhandled problems view.
In this view you will only find unacknowledged problems.
This view can be accessed from the quickbar menu.
Acknowledge problems
When a new problem is discovered you need to take care of it. The first thing you should do is to acknowledge the problem. There are many ways to acknowledge a problem.
When you acknowledge a problem you will:
make sure no more notifications are sent out.
show other users that you have seen the problem and are aware of it.
Here we will take a look at two of the ways to acknowledge a problem, by:
the GUI
SMS
Acknowledging a problem in the GUI
The most common way to acknowledge a problem is to do it in the GUI. This is easy and you will also be able to add a comment to your acknowledgement. The routine is the same whether it is a host or a service problem you are about to acknowledge.
To acknowledge a host problem:
1 Look up the host in the GUI and click on the host name.
2 Click on Acknowledge This host problem in Host commands.
3 Fill in a comment and click Submit.


With the Sticky option all notifications are suppressed until the problem goes to OK or UP. Uncheck this box to have the acknowledgement removed as soon as the problem goes to another problem state, for example from WARNING to CRITICAL or from CRITICAL to WARNING.

Use the Notify checkbox to send out a notification that this problem has been acknowledged.

With every acknowledgement a comment is added to the object. If you would like this comment to remain after the problem has returned to OK or UP, use the Persistent checkbox.
4 Click Done and you will be directed back to the host you were on when you started.
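The effect of the Sticky option described in step 3 can be summarised in a few lines of Python. This is only a sketch of the rule as described in this section; the function name and the state strings are chosen for illustration.

```python
RECOVERED_STATES = {"OK", "UP"}


def acknowledgement_kept(sticky: bool, old_state: str, new_state: str) -> bool:
    """Return True if an acknowledgement survives a state change."""
    if new_state in RECOVERED_STATES:
        # A recovery always ends the acknowledgement.
        return False
    if new_state != old_state:
        # A change between problem states (for example WARNING to
        # CRITICAL) keeps the acknowledgement only when it is sticky.
        return sticky
    # Same problem state as before: nothing changes.
    return True
```

A sticky acknowledgement therefore keeps notifications suppressed while a service moves between WARNING and CRITICAL, while a non-sticky one is dropped at the first such change.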
Acknowledging a problem by SMS
If you have received your notification by SMS you can acknowledge it by sending an SMS back to the op5 Monitor server.
To acknowledge a problem by SMS
1 Open the notification SMS on your mobile phone.
2 Forward it to the op5 Monitor server (you must forward the complete SMS exactly as you received it).
If you now take a look at the host or service you will see that it has been acknowledged and a small comment is placed in the comment part for the object.
Removing an acknowledgement
Sometimes you might need to remove an acknowledgement. Maybe you acknowledged the wrong problem, or you need to stop working on it for some reason and would like notifications to be sent out again.
To remove an acknowledgement for a host or service:
1 Look up the host or service in the GUI.
2 Click on Remove Problem acknowledgement.
Now the notifications will continue as configured for the object.
Note: The comment for the acknowledgement is not removed.
Removing multiple acknowledgements
To remove several acknowledgements:
1 Go to “tactical overview” and, in the “acknowledge service problem” widget, click on “X Acknowledged services”.
2 Click Send Multi Action below the search field.
(It is located in the top right of the list.)
3 Choose Acknowledge in the Select Action drop-down list just below the list and click Submit.
Schedule downtime
Using scheduled downtime enables you to plan for system work ahead. When a host or service is scheduled for downtime op5 Monitor suppresses alarms for that host or service. Furthermore op5 Monitor informs you about when a host or service is scheduled for downtime through the web interface. Information about the scheduled downtime is also stored so that planned system work does not affect availability reports.
It is possible to schedule downtime for
hosts
services
all members of a host group
all members of a service group.
You can also configure triggered downtime for hosts located below a host currently in scheduled downtime. To do this you need to have your parenting configured correctly. Read more in the Parenting section.
Viewing scheduled downtime
Basically the Scheduled Downtime view is a summary of all currently configured scheduled downtime for hosts and services.
In this view you can also remove scheduled downtime.
To view all scheduled downtime
1 Click Scheduled downtime in the main menu under the Monitoring menu.
Scheduling downtime
As you have seen we can schedule downtime for both hosts and services. Now we will take a look at how to schedule downtime for a host and a host group. The procedure is the same for services and service groups.
When the scheduled downtime starts a notification is sent saying that the scheduled downtime has started.
When downtime is added retroactively, this is noted in the log for the service or host.
To schedule downtime for a host
1 Find the host you would like to schedule downtime for and open the host information page (see A host in detail).
2 In the Host commands click Schedule Downtime For This Host.
3 Fill in the form
a Enter start and end time.
b Choose between fixed or flexible.
Fixed downtime starts and stops at the exact start and end times that you specify when you schedule it.
Flexible is used when you know for how long a host or service will be down but do not know exactly when it will go down.
c Use Triggered by if you would like another scheduled downtime to start the downtime. For instance, if you schedule flexible downtime for a particular host (because it is going down for maintenance), you might want to schedule triggered downtime for all of that host's "children".
Note that this option is hidden if no other scheduled downtimes are available.
d If you chose flexible in b, type in how long the scheduled downtime is supposed to be active.
e Add a comment about this scheduled downtime.
f Choose what to do with the child hosts of this host (if there are any).
4 Click Submit.
5 Click Done.
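The difference between fixed and flexible downtime in step 3b can be sketched in Python. The sketch assumes the usual Nagios-style behaviour, where a flexible downtime begins when the host enters a problem state somewhere between the start and end times and then lasts for the configured duration. The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def effective_downtime(start: datetime,
                       end: datetime,
                       flexible: bool = False,
                       duration: Optional[timedelta] = None,
                       problem_started: Optional[datetime] = None
                       ) -> Optional[Tuple[datetime, datetime]]:
    """Return the (begin, finish) period in which alarms are suppressed.

    None means that a flexible downtime was never triggered.
    """
    if not flexible:
        # Fixed: exactly the scheduled start and end times.
        return (start, end)
    if duration is None:
        raise ValueError("flexible downtime needs a duration")
    if problem_started is None or not (start <= problem_started <= end):
        # No problem occurred inside the window, so nothing was triggered.
        return None
    # Flexible: starts with the problem and lasts for the given duration.
    return (problem_started, problem_started + duration)
```

For a maintenance window from 22:00 to 02:00, a flexible downtime with a duration of 30 minutes that is triggered at 23:15 covers 23:15 to 23:45 only, while a fixed downtime covers the whole window.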
To schedule downtime for a host group
1 Locate the host group you would like to schedule downtime for by clicking on Hostgroup summary in the main menu under Monitoring.
2 Click on the hostgroup “Action” icon.
3 Click Schedule downtime for all hosts in this Hostgroup in the list of Hostgroup Commands.
4 Follow a-e in step 3 in To schedule downtime for a host.
5 Click Submit.
6 Click Done.
Remove a scheduled downtime
Sometimes it is necessary to remove a scheduled downtime. This can be done both before the scheduled downtime has started and during the downtime. If the scheduled downtime has been canceled before it has reached its end time a notification will be sent saying that the scheduled downtime has been canceled.
Removing a scheduled downtime
To remove a scheduled downtime
1 Open the scheduled downtime view by following the instructions in To view all scheduled downtime.
2 Click the delete icon under Actions.
3 Click Submit.
Now the scheduled downtime and the comment saved when you created the scheduled downtime are removed.
Schedule recurring downtime
As a good practice you should put your hosts and services in scheduled downtime when you are planning to take them down. Many downtime events are recurring, and it is easy to forget to put your objects in scheduled downtime.
This is when Recurring Downtime is a great help for you.
Scheduling a recurring downtime
Let us say that you are using Citrix and you need to reboot your Citrix servers once per week. This is a perfect case of when you should use a recurring downtime schedule.
To add a recurring downtime
1 Click Recurring downtime in the Monitoring menu.
2 Choose the object type.
3 Choose the objects to use, in this case the Citrix host group.
4 Add a comment.
5 Set start and end time.
6 Choose the days of the week and the months of the year this schedule shall be used.
7 Click Add schedule.
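The fields entered in the steps above amount to a simple matching rule, sketched here in Python. The class and field names are hypothetical and do not reflect how op5 Monitor stores schedules. The sketch also assumes that the downtime window does not cross midnight.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import FrozenSet


@dataclass(frozen=True)
class RecurringDowntime:
    comment: str
    start: time
    end: time
    weekdays: FrozenSet[int]  # 0 = Monday ... 6 = Sunday, as in datetime.weekday()
    months: FrozenSet[int]    # 1 = January ... 12 = December

    def is_active(self, moment: datetime) -> bool:
        """True if the given moment falls inside a scheduled occurrence."""
        return (moment.month in self.months
                and moment.weekday() in self.weekdays
                and self.start <= moment.time() < self.end)


# Illustrative values: weekly reboot of the Citrix servers, Sundays 02:00-03:00.
citrix_reboot = RecurringDowntime(
    comment="Weekly Citrix reboot",
    start=time(2, 0),
    end=time(3, 0),
    weekdays=frozenset({6}),
    months=frozenset(range(1, 13)),
)
```

Calling citrix_reboot.is_active() with a Sunday at 02:30 returns True, while the same time on a Monday returns False.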
Viewing your recurring downtime schedules
Once you have created a recurring downtime schedule you may
view it
edit it
delete it.
This is done from the Schedules tab.
The view looks like this
Editing a recurring downtime
To edit a recurring downtime
1 Click Recurring downtime and then Schedules.
2 Click Edit.
3 Edit the fields you would like to change and click Add schedule.
Deleting a recurring downtime
To delete a recurring downtime
1 Click Recurring downtime and then Schedules.
2 Click Delete.
3 Click Ok.
Business Services
The business services view is designed to combine your IT monitoring and your business service management (BSM) to give an overview of the applications and/or services that your organisation is providing either to customers or internally.
Viewing Business Services
To access the Business Services view click on Business Services in the main menu.
The Business Services view gives an easy overview of how your Business Processes are working.
For better viewing the following screenshot has been divided into two parts.
 
Nr
Description
1
Business Object
Lists all the Business service objects. An object can be one of the following items
Group
Service
Host
Random value
Constant value
2
Rule
Shows which rule is applied to the group.
For more information about the different rules see Rules types on page 154 in op5 Administrator manual.
3
Actions
A list of action buttons.
Click the icons to
Look up service/host in op5 monitor
Go to the configuration for the host or service
Add sub element, only available on groups
Edit object
Remove object
Clone object, only available on groups
4
Last check
This will show when the object was last checked.
The time on a group is the time for when the last sub element was checked.
5
Duration
Displays how long the group or service has been in its current state.
6
Status Information
Displays the state the group is currently in. For hosts and services the output from the op5 Monitor check is displayed.
 
Graphs
op5 Monitor includes support for graphing what's known as "performance data" returned by check plugins that support this feature.
Performance data can be anything that gives a more detailed picture of a particular check's performance characteristics than the OK/WARNING/CRITICAL levels that Monitor reacts to.
For example, check_ping returns performance data for packet loss and round trip times. This data is stored by Monitor and used to create graphs for different time periods, such as the last 24 hours and past week. This feature can be very helpful in identifying trends or potential problems in a network.
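To make the idea of performance data more concrete, the Python sketch below parses a plugin output line. It assumes the common Nagios-style plugin convention, where performance data follows a "|" character as space-separated label=value[unit];warn;crit;min;max items. The sample line is illustrative, not captured from a real system, and the function is not part of op5 Monitor. Quoted labels containing spaces and range-style thresholds are not handled.

```python
import re
from typing import Dict, NamedTuple, Optional


class Metric(NamedTuple):
    value: float
    unit: str
    warn: Optional[float]
    crit: Optional[float]


# A number followed by an optional unit, e.g. "0.800000ms" or "0%".
_VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")


def _to_float(text: str) -> Optional[float]:
    """Convert a threshold field to float, or None if empty or not a number."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_perfdata(plugin_output: str) -> Dict[str, Metric]:
    """Extract the metrics that follow the "|" in a plugin output line."""
    _, _, perfdata = plugin_output.partition("|")
    metrics: Dict[str, Metric] = {}
    for item in perfdata.split():
        label, _, rest = item.partition("=")
        fields = rest.split(";") + ["", ""]  # pad so warn and crit always exist
        match = _VALUE_RE.match(fields[0])
        if not label or match is None:
            continue  # skip anything that does not look like label=value
        metrics[label] = Metric(float(match.group(1)), match.group(2),
                                _to_float(fields[1]), _to_float(fields[2]))
    return metrics


# Illustrative check_ping style output.
SAMPLE = ("PING OK - Packet loss = 0%, RTA = 0.80 ms"
          "|rta=0.800000ms;100.000000;500.000000;0.000000 pl=0%;20;60;0")
```

Parsing SAMPLE yields two metrics, rta (round trip time in ms) and pl (packet loss in %), each with its warning and critical levels. These are the kind of values that end up in the graphs.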
Viewing graphs
From most of the views in op5 Monitor you can find the graph icon looking like this:
To view the graphs for a service or a host click on the graph icon and you will get the graph view.
The table below describes the parts of the service overview which is where all graphs are being displayed.
Nr
Description
1
The graphs. Besides the graphs themselves, this part shows information like
host and service name
warning and critical levels
last, average and max values.
2
Here you can quickly get the graphs of another host. Just type in the correct name of the host and press Enter.
3
Exports and calendar.
Click the icons to
export to PDF or XML
open up the calendar to view old data.
4
Zooming and reports
Click the icons to
zoom in the graph
show the most recent alert for this time period for this host
create an availability report for this time period for this host.
5
Host information
Here you see brief information about the host. Click the host or service name to get extended details.
6
Other graphs on this host
The list shows the rest of the graphs available for this host. Just click on one of them to view the graphs of another service.
Adding graphs for custom plugins
Sometimes you find a plugin you would like to use but there are no graphs made from the output of the plugin. Then you need to create your own template.
To create a template of your own follow the HOWTO that can be found in the documentation area of the support part at www.op5.com.
Graph basket
To view graphs from multiple sources you can add graphs to the basket.
Graphs added to the basket are shown below each other when you view the basket.
This gives you an easy way to compare graphs from one or more hosts.
To add a graph to the basket, select the graph that you would like to add and click on the + icon above the graph.
After adding the desired graphs, select graphs from the menu and then click on show basket.
Hyper Map
Hyper map visualises the relationships between hosts in a scrollable map.
To access the Hyper Map click on the icon in the menu.
You need to allow the Java applet to run.
This map is autogenerated by the parent/child relationships of the hosts. If a host does not have any parent it is connected directly to the “op5 Monitor Process”.
To navigate in the hyper map use the mouse to drag the map in the direction you want to go.
Dokuwiki
op5 Monitor comes with a dokuwiki that gives you a great way to document both your environment and things related to your monitored systems.
Of course you can also use this dokuwiki to save other kinds of related information. This makes the information easy to reach and ensures that you have all documentation in the same place.
Editing a wiki page
To edit an existing page, go to the page you want to edit and select ‘Edit this page’ in the top right corner.
A backup of the previous page will automatically be created.
Formatting a wiki page
You can format your text by using wiki markup. This consists of normal characters like asterisks, slashes or equal signs which have a special function in the wiki, sometimes depending on their position. For example, to format a word in italic, you enclose it in two pairs of slashes like //this//.
 
Description
you type
Italic
//italic//
Bold
**bold**
Underline
__underline__
Bold & Italic
**//bold & italic//**
Headings of different levels
==== Headline Level 3 ====
=== Headline Level 4 ===
== Headline Level 5 ==
 
Note:
An article with 3 or more headings automatically creates a table of contents.
For more information about formatting text please go to http://www.dokuwiki.org/syntax
More information about how to use the dokuwiki in op5 Monitor can be found in op5 Monitor Administrator Manual or at http://docuwiki.net/
Agents
op5 Monitor can do a lot on its own. But to get the most out of op5 Monitor you should use our agents.
The following agents are available from the download section in the support section at http://www.op5.com/get-op5-monitor/download/#Agents-tab.
op5 NSClient++
NRPE
MRTGEXT
Windows syslog Agent
Nagstamon
The table describes each agent briefly.
Name
Description
op5 NSClient++
This is the agent used for monitoring Microsoft Windows operating systems.
You can use it to monitor things like
CPU, memory and disk usage
services, windows events and files
You can also use the built-in NRPE support to create your own commands for op5 NSClient++.
NRPE
This is the most commonly used agent for Linux and Unix systems. NRPE is used to execute plugins on a remote machine and then send the results back to op5 Monitor.
You may also send arguments to the NRPE daemon on the remote machine to make it a bit more flexible. This must be turned on before you use the feature.
MRTGEXT
MRTGEXT was originally written as an NLM for Novell NetWare to obtain values used with the widely known MRTG, but it can also be used to poll values from op5 Monitor.
op5 Syslog Agent
op5 Syslog Agent runs as a service under Windows. It formats all types of Windows Event log entries into syslog format and sends them to a syslog host (The op5 Monitor server or the op5 LogServer).
The agent can also forward plaintext log-files.
Nagstamon
Nagstamon is a status monitor for the desktop. It can connect to several servers and resides in the systray or as a floating status bar on the desktop. It shows a brief summary of critical, warning, unknown, unreachable and down hosts and services, and pops up a detailed status overview when you move the mouse pointer over it.
More information about the agents can be found in the op5 Monitor administrator manual.