We manage how we get alerted based on many factors, such as the customer's contractual SLA and the urgency of their request or incident. **An alert or notification is something that requires a human to perform an action.** Based on the severity of the issue (service request or incident), we prioritize accordingly in [DoIT](http://doit.sphs.ro).

!!! warning "Major Priority Alerts"
    Anything that wakes up a human in the middle of the night should be **immediately human actionable**. If it is not, we need to adjust the alert so that it does not page at those times.

| Priority | Alerts | Response |
| -------- | ------ | -------- |
| Major | Major-Priority Spearhead Alert 24/7/365. | Requires **immediate human action**. |
| Normal | Normal-Priority Spearhead Alert during **business hours only**. | Requires human action that same working day. |
| Minor | Minor-Priority Spearhead Alert 24/7/365. | Requires human action at some point. |
| Notification | Suppressed Events. No response required. | Informational only. We do not need these to clutter our ticketing or inboxes. If they are enabled, they should be sent only to the specific people who need them, not to groups. |

Incidents (IN) and service requests (SR) share the same priorities. The actual response and resolution times vary according to the contractual agreement with each customer. These SLA details are available in DoIT on the organization page of the respective customer.

If you're setting up a new alert or notification, use the table above to decide how you want to alert people. For example, be mindful not to create new Major-priority alerts if they don't require an immediate response.

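As a rough illustration of the table and the guidance above, the priorities can be written down as data plus two small helpers. This is only a sketch and not part of DoIT: the names `Priority`, `Policy`, `choose_priority` and `alerts_now` are hypothetical, and the business hours (Monday to Friday, 09:00 to 18:00) are an assumption, since this page does not define them.

```python
from dataclasses import dataclass
from datetime import datetime, time
from enum import Enum


class Priority(Enum):
    MAJOR = "Major"
    NORMAL = "Normal"
    MINOR = "Minor"
    NOTIFICATION = "Notification"


@dataclass(frozen=True)
class Policy:
    schedule: str  # "always", "business_hours" or "suppressed"
    response: str  # expected response, paraphrased from the table


# Hypothetical encoding of the priority table on this page.
POLICIES = {
    Priority.MAJOR: Policy("always", "immediate human action"),
    Priority.NORMAL: Policy("business_hours", "human action that same working day"),
    Priority.MINOR: Policy("always", "human action at some point"),
    Priority.NOTIFICATION: Policy("suppressed", "informational only, no response"),
}

# Assumption: business hours are Monday to Friday, 09:00 to 18:00 local time.
BUSINESS_START = time(9, 0)
BUSINESS_END = time(18, 0)


def choose_priority(action_needed: str) -> Priority:
    """Map how soon a human must act to a priority from the table."""
    mapping = {
        "immediately": Priority.MAJOR,
        "same_day": Priority.NORMAL,
        "eventually": Priority.MINOR,
        "none": Priority.NOTIFICATION,
    }
    return mapping[action_needed]


def alerts_now(priority: Priority, when: datetime) -> bool:
    """Return True if an alert of this priority would go out at `when`."""
    schedule = POLICIES[priority].schedule
    if schedule == "suppressed":
        return False
    if schedule == "always":
        return True
    # "business_hours": weekdays only, inside the assumed working window.
    is_weekday = when.weekday() < 5
    return is_weekday and BUSINESS_START <= when.time() < BUSINESS_END
```

With this sketch, an alert that only needs attention at some point maps to `choose_priority("eventually")`, which is `Priority.MINOR`, and a Normal alert raised on a weekend would wait for the next business day rather than go out straight away.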
!!! info "Alert Channels"
    Presently, email is our only notification method, so keeping an eye on your email is essential!
    SMS and push notifications are in the pipeline for DoIT.

## Examples

#### "Production service is failing for 75% of requests, automation is unable to resolve."
This would be a **Major** priority IN, requiring immediate human action to resolve.

![Major Urgency](../assets/img/screenshots/prio-high.png)

#### "A customer sends an email stating: 'Production server disk space is filling, expected to be full in 48 hours. Log rotation is insufficient to resolve.'"
This would be a **Normal** priority SR, requiring human action soon, but not immediately.

![Normal Urgency](../assets/img/screenshots/prio-norm.png)

#### "An SSL certificate is due to expire in one week."
This would be a **Minor** priority SR, requiring human action at some point soon.

![Minor Urgency](../assets/img/screenshots/prio-low.png)