Creating Alert Checkers
To create an alert checker, go to /add_alert_check/ or click Add Checker on the menu bar of the alert checkers page.
When you create an alert, you first need to choose the entity type that you are building the alert for.
Examples of entity types include
You can see a complete list of entity types in the Entity Type Reference, complete with metric and attribute lists and examples.
They are mostly self-explanatory: if you want to alert on metrics that exist on ports, pick 'Port'; if you want something that has to do with a sensor, pick the sensor entity type.
Device is a special case: it allows you to alert on device-level metrics, such as whether the device is up or down, its uptime, and the ping/SNMP response times. The entity type Device has nothing to do with ports or sensors on the device itself.
Once you have picked the entity type, a few more fields need to be filled in, but these are simple: pick a name for the alert and a message to be included when an alert is sent out.
Entity type should be set to the entity type that you're creating a checker for.
Alert Name is a unique text ID used to identify your checker in the UI. It must be unique or adding the checker will fail.
Message is a meaningful text message sent along with any alerts generated by this checker. It should be used to direct the recipient to the cause and importance of the problem.
Alert Delay allows you to delay an alert for X checks before it is sent. You can use this to 'smooth' noisy alerts and suppress alerts for traffic and processor spikes.
Send Recovery allows you to enable or disable the sending of recovery notifications.
Severity is currently locked at critical.
Use Alert Delay to set the number of poller runs your alert checker should wait before it generates notifications. An alert entry that is being delayed will be in the delayed state and show as orange in the UI. This is useful when you're creating a check for processor usage and don't want to be alerted on every temporary CPU spike, only on persistently high load conditions. If you set a delay of, say, 2, it will take 3 poller runs before the state is set to failed and notifications are generated.
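The delay behaviour described above can be illustrated with a small simulation. This is only a sketch of the described behaviour, not the product's actual code: the function name alert_states and the state label "ok" are assumptions made for illustration, and it assumes that a passing check resets the delay counter.

```python
def alert_states(check_results, delay):
    """Return the alert state after each poller run.

    check_results: iterable of booleans, True when the checker's
    conditions tripped on that poller run.
    delay: the Alert Delay value (number of runs to hold the alert back).
    """
    states = []
    consecutive_failures = 0
    for tripped in check_results:
        if not tripped:
            # Assumption: a passing check resets the counter.
            consecutive_failures = 0
            states.append("ok")
            continue
        consecutive_failures += 1
        # The first `delay` failing runs are held back as "delayed";
        # the run after that sets the state to "failed".
        if consecutive_failures > delay:
            states.append("failed")
        else:
            states.append("delayed")
    return states


# A spike lasting two runs never reaches "failed" with a delay of 2.
print(alert_states([True, True, False], delay=2))
# Three consecutive failing runs do.
print(alert_states([True, True, True], delay=2))
```

With a delay of 2, the third consecutive failing run is the first one reported as failed, which matches the "3 poller runs" in the text.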
Next we have the Checker Conditions pane.
This pane allows you to configure the actual rules that will trigger your alert. The conditions are entered as text, with one condition per line. A condition consists of three values:
- the name of the metric to be tested
- a 'test' evaluator (see the syntax table below)
- a value to test against
Syntax of test evaluators:
- less or equals
- greater or equals
- match with wildcard (an asterisk matches any text, as in *processor)
- not match with wildcard (same wildcard syntax)
- match for regular expression
- not match for regular expression
- in a list
- not in a list
In this pane you also configure whether your checker requires all conditions to match to trip, or any condition.
An example of a condition to test if traffic on a port exceeds 80% would be:
ifInOctets_perc gt 80
ifOutOctets_perc gt 80
You might want to give this checker a delay of 4, so that the alert only trips when the port has exceeded 80% capacity for 20 minutes or more (assuming a 5-minute poller interval).
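To make the structure of checker conditions concrete, here is a minimal sketch of how one-per-line conditions could be parsed and combined with the "all" or "any" setting. It is an illustration under stated assumptions, not the product's implementation: the function names parse_conditions and checker_trips and the sample port values are hypothetical, and only the gt evaluator used in the example above is covered.

```python
import operator

# Only the evaluator used in this page's traffic example is sketched here.
EVALUATORS = {
    "gt": operator.gt,
}


def parse_conditions(text):
    """Split condition text into (metric, evaluator, value) tuples, one per line."""
    conditions = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        metric, evaluator, value = line.split(None, 2)
        conditions.append((metric, evaluator, value))
    return conditions


def checker_trips(conditions, metrics, require_all=False):
    """Evaluate every condition against the metrics of one entity.

    require_all=True means all conditions must match for the checker to trip;
    require_all=False means any single matching condition is enough.
    """
    results = [
        EVALUATORS[evaluator](float(metrics[metric]), float(value))
        for metric, evaluator, value in conditions
    ]
    return all(results) if require_all else any(results)


conditions = parse_conditions("ifInOctets_perc gt 80\nifOutOctets_perc gt 80")
# Hypothetical port: heavy inbound traffic, light outbound traffic.
port = {"ifInOctets_perc": 91, "ifOutOctets_perc": 12}
print(checker_trips(conditions, port, require_all=False))
print(checker_trips(conditions, port, require_all=True))
```

For the traffic example, "any" is the setting that trips the alert when either direction exceeds 80%, while "all" would require both directions to be above the threshold at once.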
The associations pane allows you to define an initial set of rules to match entities to your checker. These rules define the subset of entities of the chosen entity type that this alert checker will apply to. The initial form allows the creation of a single association, but multiple associations can be added later for more flexibility.
The format of both the device and entity association conditions is the same as for the checker conditions, and they use the same test evaluators explained in the table above. The device association conditions match against device attributes like hostname, os, distribution, location and sysObjectID. The entity association conditions likewise match against entity attributes like a port's ifDescr, ifAlias or ifSpeed, a processor's description or a BGP session's remote AS.
There are some differences between checker conditions and associations:
- instead of using metrics, you’ll be using attributes
- you can’t use a device attribute twice in the same association rule, so for example multiple “hostname match bla” statements within the same association rule won’t work. You will need to add multiple association rules.
- for a single device association line, you can have multiple entity association lines
That last difference allows for more specific filtering. For example, you might want to match all sensors of class airflow, but if that nets you too many results, you can add a match on the sensor's description, sensor_descr. Or you might want to match all ports with ifType ethernetCsmacd, but only those with a specific description.
An example to match all "Processor" memory pools on all Cisco IOS devices would be
os equals ios
mempool_descr match *processor
To match all possible devices or entities, simply use an asterisk (*).
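The way device and entity association lines combine can be sketched as follows. This is an illustrative model only, built on assumptions: it assumes every line in an association must match, that matching is case-insensitive (suggested by *processor matching "Processor" memory pools), and that wildcards behave like shell-style patterns. The function names and the sample device and entity are hypothetical, and only the equals and match evaluators from the example are covered.

```python
from fnmatch import fnmatchcase


def attribute_matches(actual, evaluator, expected):
    """Compare one attribute value using a subset of the test evaluators."""
    # Assumption: comparisons ignore case.
    actual, expected = str(actual).lower(), str(expected).lower()
    if evaluator == "equals":
        return actual == expected
    if evaluator == "match":
        # Shell-style wildcard match; "*" matches any text.
        return fnmatchcase(actual, expected)
    raise ValueError(f"evaluator not covered by this sketch: {evaluator}")


def rules_match(rules_text, attributes):
    """Return True when every rule line matches the given attributes."""
    for line in rules_text.splitlines():
        line = line.strip()
        if not line or line == "*":
            # A bare asterisk matches every device or entity.
            continue
        name, evaluator, expected = line.split(None, 2)
        if not attribute_matches(attributes.get(name, ""), evaluator, expected):
            return False
    return True


def is_associated(device_rules, entity_rules, device, entity):
    """An entity is associated when both its device and the entity itself match."""
    return rules_match(device_rules, device) and rules_match(entity_rules, entity)


# Hypothetical Cisco IOS device with two memory pools.
device = {"os": "ios", "hostname": "core-sw1"}
for descr in ("Processor", "I/O"):
    entity = {"mempool_descr": descr}
    print(descr, is_associated("os equals ios", "mempool_descr match *processor", device, entity))
```

In this model only the "Processor" pool is associated with the checker; replacing either rule with a single asterisk would remove that filter entirely.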
After creating a checker, you need to go back to the alerts_checks or alerts page and run "regenerate". This will rebuild the alerts table to include the newly associated alerts. Alert entries are also regenerated automatically at the end of a discovery run, keeping your alerts updated as you add and remove components from your network.
You should end up with all of your Cisco processor memory pools added to the checker.
Alert Checker Examples
Pre-written examples of alert checkers complete with example association rules can be found on the Alerting Examples page.