一、 Performance Improvements:
The performance improvements in Nagios Core 4 come primarily from the following areas:
1、Core Workers - Core workers are lightweight processes whose only job is to perform checks. Because they are smaller, they spawn much more quickly than the old process, which forked the full Nagios Core. In addition, they communicate with the main Nagios Core process using in-memory techniques, eliminating the disk I/O latencies that could previously slow things down, especially in large installations.
2、Configuration Verification - Configuration verification has been improved so that each configuration item is verified only once. Previously, configuration verification was an O(n^2) operation.
3、Event Queue - The event queue now uses a data structure that has O(log n) insertion time, versus the O(n) insertion time of the previous structure. This means that inserting events into the queue uses much less CPU than in Nagios Core 3.
4、Macro Resolution - Macros are now sorted on startup so macro lookup can use a binary search. In addition, the frequently accessed macros $USERx$, $ARGx$, and $HOSTADDRESS$ are special-cased and looked up early.
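The complexity difference described for the event queue can be illustrated with a small Python sketch. It is not Nagios Core's C implementation: `EventQueue`, `schedule`, and `pop_next` are hypothetical names, and the standard-library `heapq` binary heap stands in for whatever structure Nagios Core actually uses.

```python
import heapq
import itertools


class EventQueue:
    """Toy scheduler queue ordered by run time (hypothetical, for illustration)."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so events with equal run times pop in insertion order.
        self._counter = itertools.count()

    def schedule(self, run_at, event):
        # heappush is O(log n); inserting into a sorted list would be O(n).
        heapq.heappush(self._heap, (run_at, next(self._counter), event))

    def pop_next(self):
        # Remove and return the event with the earliest run time.
        run_at, _, event = heapq.heappop(self._heap)
        return run_at, event

    def __len__(self):
        return len(self._heap)


queue = EventQueue()
queue.schedule(30.0, "check web01 HTTP")
queue.schedule(10.0, "check db01 PING")
queue.schedule(20.0, "check web01 SSH")
first = queue.pop_next()  # the event scheduled for time 10.0
```

With thousands of scheduled checks, each insertion touches only a logarithmic number of entries instead of shifting a large part of a sorted list.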
二、 Object Definitions:
The following changes have been made to object definitions:
1、 The host address attribute is now optional. The address attribute is set to the host name when it is absent. Most configurations set the host name attribute to the DNS host name, making the address attribute redundant.
2、 Both hosts and services now support an hourly value attribute. The hourly value attribute is intended to represent the value of a host or service to an organization and is used by the new minimum value contact attribute.
3、 Services now support a parents attribute. A service parent performs a function similar to host parents and can be used in place of service dependencies in simple circumstances.
4、 The failure_prediction_enabled flag has been removed from both host and service object definitions.
5、 Contacts now support a minimum value attribute. The minimum value attribute is used with the host and service hourly value attributes to determine whether to notify a contact on host and service problems.
6、 The host obsess_over_host and the service obsess_over_service attributes can now both use the shortened attribute obsess.
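The fallback for the now-optional address attribute can be sketched as follows. This is only an illustration: `effective_address` is a hypothetical helper, and the dicts stand in for parsed host definitions rather than any real Nagios Core data structure.

```python
def effective_address(host):
    """Return the address to use for a host definition.

    `host` is a hypothetical dict mapping directive names to values.
    """
    # When address is absent, the host name is used in its place.
    if "address" in host:
        return host["address"]
    return host["host_name"]


with_address = {"host_name": "web01.example.com", "address": "192.0.2.10"}
without_address = {"host_name": "db01.example.com"}
```

For `without_address`, the helper returns the host name, which works only if that name is resolvable, as the text assumes for most configurations.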
1、 Contact Inheritance - According to the documentation, a service should inherit contacts from its host only if the service has no contacts of its own whatsoever (the same goes for escalations). Previously, however, the code handled the contact_groups and contacts directives separately, so a service with only 'contacts' specified was still eligible to inherit 'contact_groups' from the host. This has been updated to comply with the documentation.
2、 Timeperiods - There were several issues processing timeperiods when both exclusions and exceptions were involved. The issues have been corrected.
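The corrected contact inheritance rule can be expressed as a short sketch. The function name and the dict-based objects are hypothetical stand-ins for parsed definitions; the logic follows only the rule stated above, not the actual Nagios Core source.

```python
def effective_service_contacts(service, host):
    """Return (contacts, contact_groups) that apply to a service.

    Both arguments are hypothetical dicts that may carry 'contacts' and
    'contact_groups' lists.
    """
    own_contacts = service.get("contacts", [])
    own_groups = service.get("contact_groups", [])
    # Corrected rule: any contact setting on the service blocks inheritance.
    if own_contacts or own_groups:
        return own_contacts, own_groups
    # Only a service with no contacts at all inherits from its host.
    return host.get("contacts", []), host.get("contact_groups", [])


host = {"host_name": "web01", "contact_groups": ["admins"]}
svc_with_contact = {"service_description": "HTTP", "contacts": ["alice"]}
svc_without = {"service_description": "PING"}
```

Under the old behavior described above, `svc_with_contact` would also have picked up the `admins` group from the host; under the corrected rule it keeps only `alice`.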
The following changes have been made to the main Nagios Core configuration, nagios.cfg:
1、Because there are many ways to obtain object information, the object information is no longer stored in the object cache file if the configuration variable object_cache_file equals '/dev/null'. Setting the variable to '/dev/null' will reduce the disk I/O load.
2、Because there are many ways to obtain status information, the status information is no longer stored in the status data file if the configuration variable status_file equals '/dev/null'. Setting the variable to '/dev/null' will reduce the disk I/O load.
3、There is a new configuration variable, log_current_states, which determines whether current states will be logged in the log files when they are rotated. In Nagios Core 3, this was always the behavior, and it is the default in Nagios Core 4. Disabling the logging of current states on log rotation can save considerable disk space for large installations.
4、There is a new configuration variable, check_workers, which specifies how many worker processes are created when Nagios Core starts. If not specified, the number of worker processes is determined by the number of CPUs on the system.
5、There is a new configuration variable, query_socket, which specifies the location of the query handler socket. The default location is /usr/local/nagios/var/rw/nagios.qh.
6、The configuration variables check_result_reaper_frequency and max_check_result_reaper_time have been deprecated. Because of the new worker architecture, checks are no longer reaped; instead, results are fed back to the core by the worker processes. As a result, these variables no longer make sense.
7、All file and directory configuration variables in the main nagios.cfg can now use paths that are relative to the location of nagios.cfg.
8、Although rarely used in the past, creating Nagios objects in the main nagios.cfg configuration file was allowed. This is now prohibited.
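How a relative path in nagios.cfg might be resolved against the location of the file can be sketched as below. `resolve_cfg_path` is a hypothetical helper, the location `/usr/local/nagios/etc/nagios.cfg` is an assumed example, and the normalization step is a choice made for this sketch, not necessarily what Nagios Core does internally.

```python
import posixpath


def resolve_cfg_path(main_cfg, value):
    """Resolve a path-valued variable relative to the directory of nagios.cfg."""
    if posixpath.isabs(value):
        return value  # absolute paths are used unchanged
    base_dir = posixpath.dirname(main_cfg)
    return posixpath.normpath(posixpath.join(base_dir, value))


# Assumed example location of the main configuration file.
main_cfg = "/usr/local/nagios/etc/nagios.cfg"
objects = resolve_cfg_path(main_cfg, "objects/hosts.cfg")
status = resolve_cfg_path(main_cfg, "../var/status.dat")
null = resolve_cfg_path(main_cfg, "/dev/null")
```

Relative paths make it easier to relocate a whole configuration tree, since only the location of nagios.cfg itself changes.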
1、Additions - A new macro, $CHECKSOURCE$, has been added which contains information about what process performed a check.
2、Changes - If use_large_installation_tweaks is set, the $HOSTGROUPMEMBERS$ and $SERVICEGROUPMEMBERS$ macros are no longer exported because they can consume the available space for environment variables.
3、Macros are normally available as environment variables when check, event handler, notification, and other commands are run. This can be rather CPU intensive in large Nagios installations, so you can disable the export of environment variables completely with the enable_environment_macros option.
4、Macro information can be found here.
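A plugin or handler written in Python could read exported macros as sketched below. Nagios exports macros under environment variable names prefixed with `NAGIOS_` (for example `NAGIOS_HOSTADDRESS`); the helper `macro_from_env` and its fallback behavior are hypothetical choices for this sketch.

```python
import os


def macro_from_env(name, default=None, environ=None):
    """Look up a macro such as HOSTADDRESS via its NAGIOS_-prefixed variable."""
    env = os.environ if environ is None else environ
    # When environment macros are disabled the variable is simply absent,
    # so callers need a fallback (for example a $HOSTADDRESS$ command argument).
    return env.get("NAGIOS_" + name, default)


exported = macro_from_env("HOSTADDRESS", environ={"NAGIOS_HOSTADDRESS": "192.0.2.10"})
missing = macro_from_env("HOSTADDRESS", default="from-argv", environ={})
```

Passing values as command arguments rather than relying on the environment keeps a plugin working whether or not environment macros are enabled.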
The query handler is a general purpose communication mechanism that allows external entities to communicate with Nagios Core in a well-defined manner. As of this writing, all communication with the query handler takes place through a Unix-domain socket whose location is defined by the query_socket configuration variable.
There are currently 5 built-in query handlers.
core - provides Nagios Core management and information
wproc - provides worker process registration, management and information
nerd - provides a subscription service to the Nagios Event Radio Dispatcher (NERD)
help - provides help for the query handler
echo - implements a basic query handler that simply echoes back the queries sent to it
More information about the query handler interface, including an introduction to creating a custom query handler, can be found in the source-supplied documentation.
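As a rough illustration of talking to the query handler over its Unix-domain socket, the sketch below sends a single query and reads the reply. The framing used here (a `#` prefix for a one-shot query, the handler name, a space, the command, and a terminating NUL byte) is an assumption that should be checked against the source-supplied documentation; `build_query` and `send_query` are hypothetical helper names.

```python
import socket

# Default location given in the text for the query_socket variable.
QH_SOCKET = "/usr/local/nagios/var/rw/nagios.qh"


def build_query(handler, command):
    """Frame a query as '#<handler> <command>' plus a NUL byte (assumed framing)."""
    return ("#%s %s" % (handler, command)).encode("ascii") + b"\0"


def send_query(handler, command, path=QH_SOCKET, timeout=5.0):
    """Send one query and return the raw reply bytes read until the peer closes."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(path)
        sock.sendall(build_query(handler, command))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # peer closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

On a host running Nagios Core 4, something like `send_query("echo", "hello")` would exercise the echo handler, provided the calling user has permission to open the socket.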
Previously, all host and service checks were performed by the full Nagios Core process. This required forking the Nagios Core process for every check. The full Nagios Core process includes a lot of things that are not required to actually perform the check, including check scheduling, downtime handling, processing external commands, etc. As a result, forking the Nagios Core process was much slower than was necessary. When the actual check was run, the forked process again forked a shell to run the check and the shell forked to run the plugin.
In addition, disk files were used as the inter-process communication (IPC) mechanism between the forked Nagios process doing the checking and the main Nagios process handling the check results.
In Nagios Core 4, host and service checks are performed by lightweight worker processes. Standard worker processes start up with the main Nagios Core process, and additional special-purpose workers can be started at any time after Nagios Core starts. If the check command is "simple" (no shell escapes), the worker process can run the command directly, avoiding the 2 additional forks previously required.
Also in Nagios Core 4, the worker processes report the check results to the main Nagios Core process using in-memory IPC mechanisms (the query handler interface), eliminating the disk I/O bottleneck that used to be an issue in large installations.
When a worker process registers with the main Nagios Core process, it tells Nagios Core what checks it will handle. This feature allows external authors to create special-purpose workers which are optimized to perform certain checks. A sample special-purpose ping check worker is included with the Nagios Core source code in the worker/ping subdirectory.
More information about workers, including an introduction to creating custom workers, can be found in the source-supplied documentation.
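The idea of running "simple" commands directly, without an intermediate shell, can be sketched in Python. This is not the worker's actual logic: the set of characters treated as requiring a shell is an assumption made for this sketch, and `is_simple` and `run_check` are hypothetical names.

```python
import shlex
import subprocess

# Assumed set of characters that force a shell; the real rule lives in the
# Nagios Core source and may differ.
SHELL_CHARS = set("|&;<>()$`*?[]{}~#\\\n")


def is_simple(command_line):
    """True when the command has no characters that need shell interpretation."""
    return not any(ch in SHELL_CHARS for ch in command_line)


def run_check(command_line, timeout=10):
    """Run a check, skipping the intermediate shell when the command is simple."""
    if is_simple(command_line):
        # Direct exec: one child process, no shell in between.
        argv = shlex.split(command_line)
        return subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    # Fall back to a shell for pipes, redirects, substitutions, and so on.
    return subprocess.run(command_line, shell=True, capture_output=True,
                          text=True, timeout=timeout)
```

A typical plugin invocation with plain options would take the direct path, while a command line containing a pipe or redirect would still go through a shell.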
八、Nagios Event Radio Dispatcher (NERD):
The Nagios Event Radio Dispatcher (NERD) is a query-handler-based service that streams Nagios Core events to the subscriber. Currently, there are three channels that can be subscribed to: hostchecks, servicechecks and opathchecks.
libnagios is a library of functions that can be used by developers of query handlers and worker processes. libnagios currently contains the following components.
bitmap - bitmap library for calculating dependency graphs
dkhash - dual-keyed hash API
fanout - sparsely populated array used for downtime, comments, and worker jobs
iobroker - I/O broker library for multiplexing between running tasks and the master Nagios process
iocache - I/O caching library for bulk-reading requests and parsing them
kvvec - key/value library for parsing requests and building responses
nsock - socket library for connecting to and communicating through the qh socket
nspath - general purpose path library for converting between relative and absolute paths
nsutils - small library with worker related utilities
pqueue - pqueue library written by Volkan Yazici
runcmd - for spawning and reaping commands
skiplist - skiplist library used within Nagios Core
squeue - for maintaining a queue of the running job's timeouts
worker - for utils and stuff nifty to have if you're a worker
Documentation of Nagios Core internals is now provided as part of the source distribution. To create an HTML version of this documentation, run 'make dox' from the root of the source distribution tree. The doxygen utilities must be installed to make this documentation.
A much more complete test suite is now included with the Nagios Core source distribution.
十二、RPM Spec File:
The RPM spec file has been completely overhauled to support more current standards.
Extended Host and Service Information - The hostextinfo and serviceextinfo objects are now deprecated and should not be used. Support for them will be removed in a future version. The same information specified in the hostextinfo and serviceextinfo objects can be specified in the host and service object respectively.
-x/--dont-verify-paths command line option (Don't check for circular object paths) - Because configuration checking is now so much faster, the option to skip checking for circular object paths has been deprecated.
The following configuration variables have been deprecated: check_result_reaper_frequency, max_check_result_reaper_time, sleep_time, external_command_buffer_slots, command_check_interval
Failure Prediction - As noted above, the failure_prediction_enabled flag has been removed from both host and service object definitions. Failure prediction was never fully implemented and would require breaking the paradigm that Nagios Core knows nothing about the performance data returned by plugins. Failure prediction is much more appropriately handled by an add-on than by Nagios Core.
-o/--dont-verify-objects command line option - This option, while accepted in Nagios Core 3, has neither been advertised nor had any effect for quite some time. The option has been removed in Nagios Core 4.
Embedded Perl - Embedded Perl has historically been the least tested and the most problem prone part of Nagios Core. A significant part of the issue is that there are so many versions of Perl available. The performance enhancements provided by the new worker process architecture make up for any performance loss due to the removal of embedded Perl. In addition, the worker process architecture makes possible the implementation of a special purpose worker to persistently load and run Perl plugins. The following configuration variables that were related to embedded Perl have been obsoleted: use_embedded_perl_implicitly, enable_embedded_perl, p1_file.
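When migrating an existing nagios.cfg, a small script can flag the deprecated and obsoleted configuration variables named above. The variable names come from the text; the scanner itself (`find_stale_variables`, the simple name=value parsing, and the sample file contents) is a hypothetical sketch and not a tool shipped with Nagios Core.

```python
DEPRECATED = {
    "check_result_reaper_frequency",
    "max_check_result_reaper_time",
    "sleep_time",
    "external_command_buffer_slots",
    "command_check_interval",
}
OBSOLETE_EMBEDDED_PERL = {
    "use_embedded_perl_implicitly",
    "enable_embedded_perl",
    "p1_file",
}


def find_stale_variables(cfg_text):
    """Return (line_number, variable, status) for each stale setting found."""
    findings = []
    for number, line in enumerate(cfg_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blanks and comments; settings are treated as name=value lines.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        name = stripped.split("=", 1)[0].strip()
        if name in DEPRECATED:
            findings.append((number, name, "deprecated"))
        elif name in OBSOLETE_EMBEDDED_PERL:
            findings.append((number, name, "obsolete"))
    return findings


# Made-up excerpt used only to exercise the scanner.
sample = """\
# excerpt from an older nagios.cfg
log_file=/usr/local/nagios/var/nagios.log
sleep_time=0.25
enable_embedded_perl=1
check_workers=4
"""
findings = find_stale_variables(sample)
```

For the sample text, the scanner reports `sleep_time` on line 3 as deprecated and `enable_embedded_perl` on line 4 as obsolete.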
Object IDs - Primarily only of interest to developers, all of the first-class objects now have object IDs. First-class objects are timeperiod, command, contact, host, service, escalations, dependencies and all kinds of groups. Object IDs are not persistent and are recreated on each restart.