IT alerting has become a premier field of application for operational alerting. A growing number of vendors claim to do IT alerting. But look closer: a couple of requirements should sit at the top of the list when choosing the right vendor:
- Multi-modality of notifications is crucial: Voice call with text-to-speech, text, IM, email and push should be available
- So is the ability of the alerting system to track delivery and to escalate automatically, e.g. to a backup person
- Integrated on-call duty scheduling is essential, as IT alerting matters most after business hours, when IT engineers are not in front of their screens
- The more APIs the better, because your ITSM and BSM systems might talk special languages…
- Of course, high availability of the alerting solution is crucial
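To make the delivery tracking and escalation requirement from the list above a bit more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `Contact`, `escalate` and the `send` hook are hypothetical names, and a real product would also handle timeouts, retries and the on-call schedule.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Contact:
    name: str
    channels: List[str]  # ordered by preference, e.g. ["push", "voice"]


def escalate(alert: str,
             chain: List[Contact],
             send: Callable[[Contact, str, str], bool]) -> Optional[Contact]:
    """Walk the escalation chain until a delivery is confirmed.

    `send(contact, channel, alert)` is a hypothetical transport hook. It is
    assumed to return True only when the notification was delivered and
    confirmed, and False otherwise.
    """
    for contact in chain:
        for channel in contact.channels:
            if send(contact, channel, alert):
                return contact  # stop at the first confirmed delivery
    return None  # nobody was reached; the caller has to decide what happens next
```

If the primary engineer does not confirm on any channel, the loop simply moves on to the backup person, which is the behavior the list asks for.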
So much for the basic requirements, which most vendors will fulfill. But this is not the end of the story. It gets a little more sophisticated.
Automation? Please!
Please, no web forms and manual alert submission. It is 2016! Automation is the holy grail of effective IT alerting. A contemporary IT alerting solution should process data and events from ITSM/BSM systems and then automatically alert the right people. This has many advantages, most importantly the overall speed of alerting (also known as mean time to know). It also reduces errors in compiling meaningful alert messages. And it can reduce the number of hours spent in the NOC. I know a large number of companies that moved away from staffing a NOC and replaced it entirely with on-call teams, simply because they had super-reliable alerting they could trust.
Targeted alerting isn’t easy
To do targeted notifications, you need to make sure alerts can be addressed to exactly the right team and/or the right person. This usually requires very fine granularity in processing events/alerts from third-party sources like an IT monitoring system. Many vendors offer basic email/SMTP interfaces. This is good for a start. But all you get is an email with a lot of unstructured text. Now try to find the right snippets in the email body that tell you which team to alert.
What it really takes is a parameterized API, ideally a plug & play product connector to your ITSM or BSM systems. Only dedicated access to all relevant third-party event parameters gives you the details that determine the exact destination for your alerts.
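As a rough illustration of why structured parameters beat an email body, the following Python sketch routes an event to a team by looking at named fields. The field names (`category`, `service`), the team names and the rule table are all made up for this example and are not taken from any particular ITSM product.

```python
from typing import Dict, List, Tuple

# Hypothetical routing table: (event field, expected value, on-call team).
# Rules are checked in order and the first match wins.
ROUTING_RULES: List[Tuple[str, str, str]] = [
    ("category", "database", "dba-oncall"),
    ("category", "network", "network-oncall"),
    ("service", "webshop", "app-oncall"),
]
DEFAULT_TEAM = "noc-fallback"  # assumed catch-all team


def route_event(event: Dict[str, str]) -> str:
    """Return the team that should be alerted for a parameterized event."""
    for field_name, expected, team in ROUTING_RULES:
        # Compare case-insensitively so "Database" and "database" both match.
        if event.get(field_name, "").lower() == expected:
            return team
    return DEFAULT_TEAM
```

With an event such as `{"category": "Database", "severity": "critical"}`, the lookup is a simple field comparison. With a free-text email, the same decision would depend on fragile text parsing.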
Why 2-way integration matters
Well, imagine the following: Your ITSM fires an event. The alerting system picks it up and starts an alert notification process. Now, in the ITSM system the incident is closed, but your alerting process goes on. A lot of noise for nothing! Your IT engineers won’t like it. Especially at night. Instead, your ITSM system should be able to update the status in your alerting system, telling it to stop the alert notifications.
Or imagine that your on-call engineer acknowledges the alert. Wouldn’t it be cool if his or her name appeared along with the incident in the ITSM and BSM system? And any remote annotations, too? You guessed right: this requires 2-way integration. It needs to be secure, though.
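The two directions described above can be sketched in a few lines of Python. This is an in-memory toy model, not a real connector: `AlertingSystem`, its method names and the `update_itsm` callback are assumptions made for illustration, and authentication and transport security are left out entirely.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Alert:
    incident_id: str
    active: bool = True          # True while notifications should keep going out
    acknowledged_by: str = ""


@dataclass
class AlertingSystem:
    # Hypothetical callback that writes a note back to the ITSM incident.
    update_itsm: Callable[[str, str], None]
    alerts: Dict[str, Alert] = field(default_factory=dict)

    def on_incident_opened(self, incident_id: str) -> None:
        # ITSM -> alerting: a new incident starts an alert.
        self.alerts[incident_id] = Alert(incident_id)

    def on_incident_closed(self, incident_id: str) -> None:
        # ITSM -> alerting: closing the incident stops further notifications.
        alert = self.alerts.get(incident_id)
        if alert is not None:
            alert.active = False

    def acknowledge(self, incident_id: str, engineer: str) -> None:
        # Alerting -> ITSM: the acknowledgement is mirrored on the incident.
        alert = self.alerts[incident_id]
        alert.acknowledged_by = engineer
        self.update_itsm(incident_id, f"Acknowledged by {engineer}")
```

The point of the sketch is only that both systems need a way to call each other. How that call is secured depends on the products involved.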
Acknowledging an alert is all you can do? Really?
Once an IT engineer gets an alert notification and acknowledges it, is he or she really required to fire up the notebook? Where is the convenience? Give engineers a chance to solve an issue from their mobile device. Wouldn’t it be great to trigger a remedial task with the touch of a button on an iPhone? Sounds cool?
Alone in the dark? Better not.
Solving an IT issue is often a cross-team effort. Why not have instant access to the on-call engineers of other teams: network, applications, databases and so on? And have one-touch collaboration options like sharing incident and alert details or setting up an ad-hoc conference bridge. This accelerates analysis and resolution significantly.
Beware – noise alert!
Cutting through the noise is one of the most important tasks of an IT alerting solution. There is almost no greater nightmare than being woken up at night for an insignificant IT issue (which our ITSM systems will raise for sure). The only greater nightmare is probably sleeping through a critical incident that evolves into an IT disaster. So, without suppressing the really important alerts, the alerting solution needs to filter out the non-relevant incidents (which also brings us back to the above topic of granular filtering). It also needs to make sure that you are not annoyed by duplicates. And it needs to handle alert storms effectively. You certainly don’t need a congested mailbox or dozens of text messages piling up. So, take a closer look at how smart the noise reduction really is.
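To give an idea of what basic noise reduction can look like, here is a small Python sketch that drops low-severity events and suppresses duplicates within a time window. The severity scale, the five-minute window and the `NoiseFilter` class are illustrative assumptions. Real alert storm handling (for example grouping related events) is more involved and not shown here.

```python
from typing import Dict, Optional, Tuple


class NoiseFilter:
    """Drop low-severity events and suppress repeated ones (illustrative only)."""

    def __init__(self, min_severity: int = 3, dedup_window: float = 300.0):
        self.min_severity = min_severity  # assumed scale: 1 (info) to 5 (critical)
        self.dedup_window = dedup_window  # seconds during which repeats are dropped
        self._last_sent: Dict[Tuple[str, str], float] = {}

    def should_notify(self, source: str, check: str,
                      severity: int, now: float) -> bool:
        """Decide whether an event deserves a notification at time `now`."""
        if severity < self.min_severity:
            return False  # not important enough to wake anybody up
        key = (source, check)
        last: Optional[float] = self._last_sent.get(key)
        if last is not None and now - last < self.dedup_window:
            return False  # same source and check were notified recently
        self._last_sent[key] = now
        return True
```

Passing the timestamp in as `now` keeps the filter easy to test. In production you would typically feed it the event time or a monotonic clock.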
I could go on and list tons of things that a great IT Alerting system should provide. Have a look at what Enterprise Alert provides out-of-the-box. We addressed all those points above. And a lot more.