As a sysadmin, is there really anything worse than an incessant notification noise? Hearing the same ping over and over and over again?
Yes, there is — and that’s when you have to contend with the gnawing feeling that a few of those alerts actually matter. Alert fatigue is a real problem for IT professionals. But never fear: In this blog, we’ll walk you through how to develop an alerting strategy that works.
Otherwise, you risk giving in to the impossible-to-ignore urge to throw your noisy phone across the room, and missing important notifications as a result.
1. Decide which information deserves an alert
At the risk of your exiting this blog immediately, let’s revisit our days in class and talk philosophy. How do you determine which information warrants an alert?
Reddit user Aggietallboy had an interesting take on how to separate the important and unimportant alerts and reports:
“What are you going to do or change about your behavior, based on the content of this report?”
This calls forth an important alert consideration for sysadmins: Will you sleep better with informational alerts, actionable alerts, or a mix of both?
What Aggietallboy alludes to is actionable alerts: alerts that require you to do something. For example, if you get a disk space alert, you’ll likely want to act sooner rather than later.
In contrast, we have informational alerts: alerts that include nice-to-know information. However, these alerts usually won’t result in the world (AKA your environment) imploding upon itself if you ignore them. They’re “nice to know” versus “need to know.”
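If it helps to see the split in concrete terms, here is a minimal Python sketch of routing alerts by whether they require action. The `Alert` class, the `route` function, and the "page" and "digest" channel names are all hypothetical, made up for illustration rather than taken from any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    name: str
    actionable: bool  # True if someone has to do something in response


def route(alert: Alert) -> str:
    """Return the (hypothetical) channel an alert should be sent to."""
    # Actionable alerts interrupt a human right away;
    # informational alerts can wait for a periodic digest.
    return "page" if alert.actionable else "digest"


alerts = [
    Alert("disk space low", actionable=True),
    Alert("nightly backup succeeded", actionable=False),
]
for alert in alerts:
    print(f"{alert.name}: {route(alert)}")
```

Where you draw the line is up to you. The point is simply that every alert gets a deliberate answer to the "what will I do about it?" question.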
Ultimately, your alerting strategy boils down to knowing your own tolerance for alerts before alert fatigue sets in.
Struggling to fine-tune what’s important enough to be an alert and what isn’t? It’s tough to strike that perfect balance, so we relied on our trusty in-house sysadmins — and a relevant Reddit thread — to find out how sysadmins should think about their alerting strategy.
2. Identify actionable alerts
Actionable alerts are the need-to-knows. These are the alerts that can make you cry, ruin your day, and warrant an emergency delivery of pineapple-flavored Monster.
What constitutes an actionable alert is up to you. Ask yourself this question: What do I need to know so I can potentially save my environment from catastrophe — and myself from a weekend the devil himself can’t even imagine?
Those are your actionable alerts.
Below are a few examples of actionable alerts sysadmins may find useful.
Disk space alerts
If you’ve ever been in the unfortunate circumstance of running out of disk space, you’ll know why this is an actionable alert.
Running out of disk space can make your server unresponsive. And there’s nothing fun about the tedious, slow, horrible process of expanding disk space to make the server operational again. It’s best to get in front of the problem before it becomes one, and that’s why these alerts live in so many sysadmins’ repertoires.
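As a rough illustration, a disk space check can be as small as the Python sketch below. It uses the standard library's `shutil.disk_usage`. The `disk_alert` helper and the 90% threshold are assumptions chosen for the example, not a recommendation, so tune them to your own environment.

```python
import shutil


def disk_alert(used_bytes: int, total_bytes: int, threshold: float = 0.90) -> bool:
    """Return True when the used fraction meets or exceeds the threshold."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes >= threshold


if __name__ == "__main__":
    # "/" is an assumed path; point this at the volume you care about.
    usage = shutil.disk_usage("/")
    if disk_alert(usage.used, usage.total):
        print(f"ALERT: disk is {usage.used / usage.total:.0%} full")
    else:
        print("Disk usage is under the threshold")
```

Setting the threshold well below 100% is the whole idea: the alert should leave you enough time to act before the disk actually fills up.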
Virus detection alerts
Virus detection alerts often require immediate action. You need to know if a malware stub — or worse, a threat actor — has snuck into your environment. The sooner you catch these threats, the better your chances of intervening before any real damage is done to your environment.
Even if you have the best antivirus software available, it’s always good practice to know when a threat is found so you can make sure the threat has been properly mitigated. Automated solutions can only get you so far, and nothing replaces a human who can double-check to make sure the threat has been quarantined or eliminated from the environment.
Help desk tickets
We know — but hear us out.
If there’s a spike in help desk tickets, it may indicate an issue that a sysadmin needs to solve. For example, what if tickets come flooding in that indicate that users with a certain computer are having trouble connecting to the internet? It may be up to a sysadmin to see if there’s a driver update that fixes a known flaw.
Keeping an eye on help desk tickets can also help you identify trends over time and make informed decisions. After all, continuous improvement is a help desk best practice. Is this the third time this month that these network drives have caused issues? If so, it may be time to evaluate other brands.
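One simple way to define a "spike" is to compare today's ticket count against a recent daily average. The Python sketch below does exactly that. The `is_ticket_spike` function, the sample counts, and the 2x factor are all hypothetical, and a real help desk would likely pull these numbers from its ticketing system.

```python
from statistics import mean


def is_ticket_spike(history: list[int], today: int, factor: float = 2.0) -> bool:
    """Flag today as a spike when it exceeds `factor` times the recent daily average."""
    if not history:
        return False  # no baseline yet, so there is nothing to compare against
    return today > factor * mean(history)


# Hypothetical daily ticket counts for the past week (average is 11).
last_week = [12, 9, 14, 11, 10, 13, 8]
print(is_ticket_spike(last_week, today=40))  # True: well above twice the average
print(is_ticket_spike(last_week, today=15))  # False: a busy day, but not a spike
```

A plain average is a blunt instrument, but it is often enough to tell "Monday morning" apart from "something is broken."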
Service outage alerts
And now, a joke:
Q: How do you know when a service your organization uses experiences an outage?
A: Your users — all of them — will tell you.
When Debbie from finance and Harold from HR send you panicked messages asking why they can’t access their email, it’s better to be armed with information ahead of time.
Service outage alerts tell you when a service you use is degraded or offline. Having this information can help ease concerns before entire departments panic because they can’t access a specific tool or service.
These alerts are the gifts that keep on giving. If you get an alert when an outage happens, you can proactively let your organization know about the outage. This saves you from a barrage of Slack messages from panicked h̶i̶g̶h̶e̶r̶-̶u̶p̶s̶ professionals who run to you, the expert, for answers on why something isn’t working.
3. Identify informational alerts
Let’s talk more in-depth about informational alerts, which are the alerts that provide good-to-know — not need-to-know — context.
Here are a few examples of informational alerts and their value to sysadmins.
Backup alerts
Did your devices successfully get backed up at their regularly scheduled time? Awesome! Did something go haywire and interrupt the backup process? Not so awesome — but you should know either way.
Backup alerts save you a lot of time in the long run. When things go smoothly, they spare you the time and stress of wondering whether all your devices are backed up. And if a backup fails, you’ll know right away so you can remedy the issue and make way for a successful backup.
While your business may not immediately crumble to the ground if a backup is unsuccessful, it’s nice to know if something went wrong so you can be prepared when threat actors sneak in and steal your information — or when Lucy from HR spills coffee all over her computer.
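For a sense of what a backup check might look like, here is a small Python sketch that reports whether the last successful backup is recent enough. The `backup_status` function and the 26-hour window (a daily schedule plus some slack) are assumptions for illustration. Your backup tool will have its own way of reporting the last successful run.

```python
from datetime import datetime, timedelta


def backup_status(last_success: datetime, now: datetime,
                  max_age: timedelta = timedelta(hours=26)) -> str:
    """Return 'ok' if the last successful backup is recent enough, else 'stale'."""
    return "ok" if now - last_success <= max_age else "stale"


# Fixed timestamps keep the example reproducible.
now = datetime(2024, 1, 10, 8, 0)
print(backup_status(datetime(2024, 1, 10, 2, 0), now))  # ok: 6 hours old
print(backup_status(datetime(2024, 1, 8, 2, 0), now))   # stale: over two days old
```

Checking the age of the last success, rather than waiting for a failure message, also catches the case where the backup job quietly never ran at all.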
Service usage alerts
Who does the CEO run to when they have questions about how much money goes to Microsoft for that volume license? They run to you, my friend, the sysadmin. Please pause for a virtual hug.
Service usage alerts can keep you posted on which services you’re using, how much you’re using, and how much you’re paying. About to run out of cloud service bandwidth? There should be an informational alert for that. And even better, once you’ve got this alert up and running, you can easily generate a report to show your CEO when they come running to your cubicle in a panic because why are you paying for volume licensing — and what even is a volume license?
Does the thought of streamlined alerts make you happy? We’re fans of streamlined things, too. PDQ Deploy & Inventory and PDQ Connect can help you streamline systems management by keeping track of what’s installed on which machines, making deployment a breeze.
Give the MacGyver of sysadmin tools a try during a 14-day trial of PDQ Deploy & Inventory. Or take advantage of a 14-day trial of PDQ Connect, our agent-based solution, to simplify your remote device management.