Additional Information
Disk Usage
Estimating the disk usage of an op5 Monitor server is not an exact science. The exact disk usage depends on the number of services, the output of the plugins, the number of state changes and so on. The numbers presented below should be considered estimates only, since many factors influence the actual storage needed.
Persistent data can be divided into three categories: performance graphs, database (reporting data) and logfiles.
Performance graphs
Performance data is stored in Round Robin Databases using RRDtool. This means that after some time the oldest data is dropped at the "end" and replaced by new values "at the beginning". The advantage of this approach is that the size of a particular rrd file is fixed; it does not grow over time. New data is stored with high precision and older data is stored with lower precision. When data is really outdated it is discarded.
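The fixed-size behaviour described above can be illustrated with a small sketch. This is only a conceptual model and not how RRDtool is implemented; the names SLOTS, archive and consolidate are hypothetical and chosen for illustration.

```python
from collections import deque

# Conceptual illustration only; this is NOT RRDtool's actual implementation.
SLOTS = 5  # hypothetical fixed number of slots in one archive

# A deque with maxlen behaves like a round robin archive:
# once it is full, appending a new value drops the oldest one.
archive = deque(maxlen=SLOTS)
for sample in range(1, 9):  # store eight samples: 1..8
    archive.append(sample)

retained = list(archive)  # only the newest SLOTS samples remain


def consolidate(samples, factor):
    """Average each complete group of `factor` samples into one value.

    This mimics keeping older data at lower precision.
    """
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, len(samples) - factor + 1, factor)
    ]


low_precision = consolidate(list(range(1, 9)), 4)

print(retained)
print(low_precision)
```

The archive never holds more than SLOTS values no matter how many samples are stored, which is why the file size stays constant.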
Estimates provided by the maintainers of pnp4Nagios (the graphing engine) suggest that roughly 400KB per datasource is needed.
Example: 10 hosts with 50 services per host and 10 datasources per service need a total of ~2GB of storage (500 * 10 * 400KB).
Sizing formula: num hosts * num services/host * num datasources/service * 400KB
If non-default precision settings are used, the amount of required storage will change.
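The sizing formula for performance graphs can be sketched as a small helper. The function and parameter names are hypothetical, the 400KB figure is the estimate quoted above for default precision settings, and decimal units (1GB = 1,000,000KB) are assumed.

```python
# Estimate quoted above; only valid for default precision settings.
KB_PER_DATASOURCE = 400


def rrd_storage_kb(num_hosts, services_per_host, datasources_per_service,
                   kb_per_datasource=KB_PER_DATASOURCE):
    """Return the estimated RRD storage in KB."""
    return (num_hosts * services_per_host
            * datasources_per_service * kb_per_datasource)


# The example above: 10 hosts, 50 services per host, 10 datasources per service.
example_kb = rrd_storage_kb(10, 50, 10)

# Assumption: decimal units, 1GB = 1,000,000KB.
print(f"{example_kb / 1_000_000:.1f} GB")
```

Passing a different kb_per_datasource is one way to model non-default precision settings, provided the per-datasource size is known for that configuration.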
Database usage
When an Oracle database is used, the initial database size will vary depending on database settings.
The data stored in the database can be divided into two categories: status information and reporting data. Status information contains the current state (and associated data) for all services. Reporting data contains historical data used when creating availability and SLA reports. Please note that reporting data only contains state changes (both soft and hard states), so the amount of space needed depends on how often a service changes state.
Using the example above, a default Oracle database in standalone mode on Windows results in 1-1.5GB of storage for current status information and Oracle internals.
One row in the reporting table consumes about 1KB of storage (when the service output is 1KB).
The database storage is highly dependent on the stability of the monitored environment. In an unstable environment there are many state changes, which generate reporting data and hence increase the storage required.
Normally a stable environment consisting of 10 hosts with 50 services each produces fewer than 100 state changes per day, but when calculating storage requirements one should count on higher numbers.
Sizing formula: 1.5GB + avg service output size * number of state changes.
An extremely unstable environment generating 1000 state changes per day would, in one year, require:
1.5GB + 1KB * 1000 * 365 =~ 1.9GB of storage (the actual data file storage is larger and depends on the allocation scheme used)
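The database estimate can be reproduced with a short sketch. The function and parameter names are hypothetical; the 1.5GB base and 1KB per state change are the figures given above, and decimal units (1GB = 1,000,000KB) are assumed.

```python
BASE_GB = 1.5          # status information and Oracle internals (upper estimate above)
KB_PER_GB = 1_000_000  # assumption: decimal units


def database_storage_gb(state_changes_per_day, days,
                        avg_output_kb=1.0, base_gb=BASE_GB):
    """Return the estimated database storage in GB after `days` days."""
    reporting_kb = avg_output_kb * state_changes_per_day * days
    return base_gb + reporting_kb / KB_PER_GB


# Extremely unstable environment: 1000 state changes per day for one year.
unstable_gb = database_storage_gb(1000, 365)

# Stable environment: 100 state changes per day for one year.
stable_gb = database_storage_gb(100, 365)

print(f"unstable: {unstable_gb:.2f} GB, stable: {stable_gb:.2f} GB")
```

As noted above, the actual data file storage will be larger than this figure and depends on the allocation scheme used.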
Logfile usage
Estimating logfile usage is the biggest challenge, since the storage needed depends on the number of monitor process restarts, the number of state changes, whether external commands and passive checks are logged, the number of notifications sent and so on. The main logfile (nagios.log) contains information about state changes and other information needed to generate availability reports. Optionally, passive checks and external commands can also be included in the log file.
Each state change generates a log entry containing the service output plus some additional information.
When passive checks and external commands are not logged, the storage needed for logfiles is roughly:
Sizing formula: num state changes * 1.5KB + num hosts * num services/host * 1KB * num process restarts
The number of process restarts is difficult to calculate and other events are also logged, so this is a lower bound. With the example above, if monitor is restarted 10 times a day, one year would generate at least:
1000 state changes * 1.5KB * 365 days + 500 services * 1KB * 10 restarts * 365 days =~ 2.4GB
In addition, each notification is logged to file, which means that if several contacts are notified about the same service, the amount of logfile space increases even further.
Apart from nagios.log, a number of other logfiles exist which consume relatively small amounts of disk space.
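The logfile estimate can be sketched in the same way. The function and parameter names are hypothetical; the 1.5KB per state change and 1KB per service per restart are the figures used in the example above. Notifications, passive checks, external commands and the other logfiles are not modelled.

```python
def logfile_storage_kb(state_changes_per_day, num_services, restarts_per_day,
                       days, kb_per_state_change=1.5,
                       kb_per_service_per_restart=1.0):
    """Return a rough lower bound for nagios.log growth in KB over `days` days."""
    state_change_kb = state_changes_per_day * kb_per_state_change * days
    restart_kb = (num_services * kb_per_service_per_restart
                  * restarts_per_day * days)
    return state_change_kb + restart_kb


# The example above: 1000 state changes and 10 restarts per day,
# 500 services, over one year.
example_kb = logfile_storage_kb(1000, 500, 10, 365)

# Assumption: decimal units, 1GB = 1,000,000KB.
print(f"{example_kb / 1_000_000:.2f} GB")
```

With these inputs the restart term dominates, which suggests that frequent restarts on a large configuration may contribute more to log growth than the state changes themselves.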
Hardware requirements
Since logfile usage in particular is difficult to estimate, the hardware specifications recommended by op5 should be followed. The hardware specifications for an op5 Monitor server include >140 GB of hard drive space, which is enough for a system monitoring up to 10 000 services during the expected hardware lifetime (3-4 years). The following configuration options are covered by the setup tool.