<!DOCTYPE html> <!--[if lt IE 7 ]><html class="no-js ie6"><![endif]--> <!--[if IE 7 ]><html class="no-js ie7"><![endif]--> <!--[if IE 8 ]><html class="no-js ie8"><![endif]--> <!--[if IE 9 ]><html class="no-js ie9"><![endif]--> <!--[if (gt IE 9)|!(IE)]><!--> <html class="no-js" lang="en"> <!--<![endif]--> <head> <meta charset="utf-8"> <title>Alerting Principles - Spearhead Systems Incident Response Documentation</title> <!-- Author and License --> <meta name="author" content="Spearhead Systems, Inc." /> <meta name="dcterms.license" content="http://www.apache.org/licenses/LICENSE-2.0" /> <!-- Page Description --> <meta name="keywords" content="pagerduty, incident, response" /> <meta name="robots" content="index, follow, noarchive" /> <!-- Mobile --> <meta name="viewport" content="width=device-width, initial-scale=1.0, minimum-scale=1.0" /> <meta name="theme-color" content="#1f293a" /> <!-- Canonical Link --> <link rel="canonical" href="https://response.spearhead.systems/oncall/alerting_principles/"> <!-- Favicon --> <link rel="shortcut icon" type="image/x-icon" href="../../assets/img/icon.png" /> <link rel="icon" type="image/x-icon" href="../../assets/img/icon.png" /> <!-- Apple --> <meta name="apple-mobile-web-app-title" content="Spearhead Systems Incident Response Documentation" /> <meta name="apple-mobile-web-app-capable" content="yes" /> <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent" /> <link rel="apple-touch-icon" href="../../assets/img/icon.png"> <!-- Open Graph --> <meta property="og:url" content="https://response.spearhead.systems/oncall/alerting_principles/" /> <meta property="og:title" content="Alerting Principles - Spearhead Systems Incident Response Documentation" /> <meta property="og:site_name" content="Spearhead Systems Incident Response Documentation" /> <meta property="og:description" content="A collection of information about the Spearhead Systems incident response process. 
Not only how to prepare new employees for on-call responsibilities, but also how to handle major incidents, both in preparation and after-work." /> <meta property="og:image" content="https://response.spearhead.systems/assets/img/cover.png" /> <meta property="og:locale" content="en_US" /> <meta property="og:type" content="website" /> <!-- Twitter --> <meta name="twitter:card" content="summary_large_image" /> <meta name="twitter:title" content="Alerting Principles - Spearhead Systems Incident Response Documentation" /> <meta name="twitter:description" content="A collection of information about the Spearhead Systems incident response process. Not only how to prepare new employees for on-call responsibilities, but also how to handle major incidents, both in preparation and after-work." /> <meta name="twitter:image" content="https://response.spearhead.systems/assets/img/cover.png" /> <!-- Style --> <style> @font-face { font-family: 'Icon'; src: url('../../assets/fonts/icon.eot?52m981'); src: url('../../assets/fonts/icon.eot?#iefix52m981') format('embedded-opentype'), url('../../assets/fonts/icon.woff?52m981') format('woff'), url('../../assets/fonts/icon.ttf?52m981') format('truetype'), url('../../assets/fonts/icon.svg?52m981#icon') format('svg'); font-weight: normal; font-style: normal; } </style> <link rel="stylesheet" href="../../assets/stylesheets/application-a422ff04cc.css"> <link rel="stylesheet" href="../../assets/stylesheets/palettes-05ab2406df.css"> <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Colfax Regular:400,700|Roboto+Mono"> <style> body, input { font-family: 'Colfax Regular', Helvetica, Arial, sans-serif; } pre, code { font-family: 'Roboto Mono', 'Courier New', 'Courier', monospace; } </style> <link rel="stylesheet" href="../../assets/css/extra.css"> <!-- Scripts --> <script src="../../assets/javascripts/modernizr-4ab42b99fd.js"></script> </head> <body class="palette-primary-green palette-accent-blue-grey"> <div class="backdrop"> 
<div class="backdrop-paper"></div> </div> <input class="toggle" type="checkbox" id="toggle-drawer"> <input class="toggle" type="checkbox" id="toggle-search"> <label class="toggle-button overlay" for="toggle-drawer"></label> <header class="header"> <nav aria-label="Header"> <div class="bar default"> <div class="button button-menu" role="button" aria-label="Menu"> <label class="toggle-button icon icon-menu" for="toggle-drawer"> <span></span> </label> </div> <div class="stretch"> <div class="mainlogo"> <a href="/" title="Go to homepage."> <img src="../../assets/img/logo.png" title="PagerDuty" /> </a> </div> <div class="title"> <span class="path"> Incident Response <i class="icon icon-link"></i> </span> <span class="path"> On-Call <i class="icon icon-link"></i> </span> Alerting Principles </div> </div> <div class="button button-twitter" role="button" aria-label="Twitter"> <a href="https://twitter.com/spearhead_sys" title="@spearhead_sys on Twitter" target="_blank" class="toggle-button icon icon-twitter"></a> </div> <div class="button button-github" role="button" aria-label="GitHub"> <a href="https://github.com/spearheadsys" title="@spearheadsys on GitHub" target="_blank" class="toggle-button icon icon-github"></a> </div> <div class="button button-search" role="button" aria-label="Search"> <label class="toggle-button icon icon-search" title="Search" for="toggle-search"></label> </div> </div> <div class="bar search"> <div class="button button-close" role="button" aria-label="Close"> <label class="toggle-button icon icon-back" for="toggle-search"></label> </div> <div class="stretch"> <div class="field"> <input class="query" type="text" placeholder="Search" autocapitalize="off" autocorrect="off" autocomplete="off" spellcheck="false"> </div> </div> <div class="button button-reset" role="button" aria-label="Search"> <button class="toggle-button icon icon-close" id="reset-search"></button> </div> </div> </nav> </header> <main class="main"> <div class="drawer"> <nav 
aria-label="Navigation"> <a href="https://github.com/spearheadsys/issue-response-docs" class="project"> <!-- <div class="banner"> <div class="logo"> <img src="../../assets/img/icon.png"> </div> <div class="name"> <strong> Spearhead Systems Incident Response Documentation <span class="version"> </span> </strong> <br> spearheadsys/issue-response-docs </div> </div> --> </a> <div class="scrollable"> <div class="wrapper"> <!-- <ul class="repo"> <li class="repo-download"> <a href="https://github.com/spearheadsys/issue-response-docs/archive/master.zip" target="_blank" title="Download" data-action="download"> <i class="icon icon-download"></i> Download </a> </li> <li class="repo-stars"> <a href="https://github.com/spearheadsys/issue-response-docs/stargazers" target="_blank" title="Stargazers" data-action="star"> <i class="icon icon-star"></i> Stars <span class="count">–</span> </a> </li> </ul> <hr/> --> <div class="toc"> <ul> <li> <a class="" title="Home" href="../.."> Home </a> </li> <li> <span class="section">On-Call</span> <ul> <li> <a class="" title="Being On-Call" href="../being_oncall/"> Being On-Call </a> </li> <li> <a class="current" title="Alerting Principles" href="./"> Alerting Principles </a> <ul> <li class="anchor"> <a title="Examples" href="#examples"> Examples </a> </li> </ul> </li> </ul> </li> <li> <span class="section">Before an Incident</span> <ul> <li> <a class="" title="Severity Levels" href="../../before/severity_levels/"> Severity Levels </a> </li> <li> <a class="" title="Different Roles" href="../../before/different_roles/"> Different Roles </a> </li> <li> <a class="" title="Call Etiquette" href="../../before/call_etiquette/"> Call Etiquette </a> </li> </ul> </li> <li> <span class="section">During an Incident</span> <ul> <li> <a class="" title="During An Incident" href="../../during/during_an_incident/"> During An Incident </a> </li> <li> <a class="" title="Security Incident" href="../../during/security_incident_response/"> Security Incident </a> 
</li> </ul> </li> <li> <span class="section">After an Incident</span> <ul> <li> <a class="" title="Post-Mortem Process" href="../../after/post_mortem_process/"> Post-Mortem Process </a> </li> <li> <a class="" title="Post-Mortem Template" href="../../after/post_mortem_template/"> Post-Mortem Template </a> </li> </ul> </li> <li> <span class="section">Training</span> <ul> <li> <a class="" title="Overview" href="../../training/overview/"> Overview </a> </li> <li> <a class="" title="Incident Commander" href="../../training/incident_commander/"> Incident Commander </a> </li> <li> <a class="" title="Deputy" href="../../training/deputy/"> Deputy </a> </li> <li> <a class="" title="Scribe" href="../../training/scribe/"> Scribe </a> </li> <li> <a class="" title="Subject Matter Expert" href="../../training/subject_matter_expert/"> Subject Matter Expert </a> </li> <li> <a class="" title="Glossary" href="../../training/glossary/"> Glossary </a> </li> </ul> </li> <li> <a class="" title="About" href="../../about/"> About </a> </li> </ul> </div> </div> </div> </nav> </div> <article class="article"> <div class="wrapper"> <h1>Alerting Principles</h1> <p>We manage how we get alerted based on many factors, such as the customer's contractual SLA and the urgency of their request or incident. <strong>An alert or notification is something which requires a human to perform an action</strong>. Based on the severity of the issue (service request or incident), we prioritize accordingly in <a href="http://doit.sphs.ro">DoIT</a>.</p> <div class="admonition warning"> <p class="admonition-title">Major Priority Alerts</p> <p>Anything that wakes up a human in the middle of the night should be <strong>immediately human actionable</strong>. 
If it is not, then we need to adjust the alert so that it does not page at those times.</p> </div> <table> <thead> <tr> <th>Priority</th> <th>Alerts</th> <th>Response</th> </tr> </thead> <tbody> <tr> <td>Major</td> <td>Major-Priority Spearhead Alert 24/7/365.</td> <td>Requires <strong>immediate human action</strong>.</td> </tr> <tr> <td>Normal</td> <td>Normal-Priority Spearhead Alert during <strong>business hours only</strong>.</td> <td>Requires human action that same working day.</td> </tr> <tr> <td>Minor</td> <td>Minor-Priority Spearhead Alert 24/7/365.</td> <td>Requires human action at some point.</td> </tr> </tbody> </table> <p>Incidents (IN) and service requests (SR) share the same priorities. The actual response and resolution times vary and are based on contractual agreements with the customer. These details (SLA) are available in DoIT on the organization page of the respective customer.</p> <p>If you're setting up a new alert/notification, consult the table above to decide how you want to alert people. For example, avoid creating new Major-priority alerts if they don't require an immediate response.</p> <div class="admonition info"> <p class="admonition-title">Alert Channels</p> <p>Presently, email is our only notification method, so keeping an eye on your email is essential! SMS and push notifications are in the pipeline for DoIT. 
</p> </div> <h2 id="examples">Examples<a class="headerlink" href="#examples" title="Permanent link">#</a></h2> <h4 id="production-service-is-failing-for-75-of-requests-automation-is-unable-to-resolve_">"Production service is failing for 75% of requests, automation is unable to resolve."<a class="headerlink" href="#production-service-is-failing-for-75-of-requests-automation-is-unable-to-resolve_" title="Permanent link">#</a></h4> <p>This would be a <strong>Major</strong> priority IN, requiring immediate human action to resolve.</p> <p><img alt="Major Urgency" src="../../assets/img/screenshots/prio-high.png" /></p> <h4 id="a-customer-sends-an-email-stating-that-production-server-disk-space-is-filling-expected-to-be-full-in-48-hours-log-rotation-is-insufficient-to-resolve">A customer sends an email stating that "Production server disk space is filling, expected to be full in 48 hours. Log rotation is insufficient to resolve."<a class="headerlink" href="#a-customer-sends-an-email-stating-that-production-server-disk-space-is-filling-expected-to-be-full-in-48-hours-log-rotation-is-insufficient-to-resolve" title="Permanent link">#</a></h4> <p>This would be a <strong>Normal</strong> priority SR, requiring human action soon, but not immediately.</p> <p><img alt="Normal Urgency" src="../../assets/img/screenshots/prio-norm.png" /></p> <h4 id="an-ssl-certificate-is-due-to-expire-in-one-week">"An SSL certificate is due to expire in one week."<a class="headerlink" href="#an-ssl-certificate-is-due-to-expire-in-one-week" title="Permanent link">#</a></h4> <p>This would be a <strong>Minor</strong> priority SR, requiring human action some time soon.</p> <p><img alt="Minor Urgency" src="../../assets/img/screenshots/prio-low.png" /></p> <aside class="copyright" role="note"> Copyright © Spearhead Systems, Inc. 
– Documentation built with <a href="http://www.mkdocs.org" target="_blank">MkDocs</a> using the <a href="http://squidfunk.github.io/mkdocs-material/" target="_blank"> Material </a> theme. </aside> <footer class="footer"> <nav class="pagination" aria-label="Footer"> <div class="previous"> <a href="../being_oncall/" title="Being On-Call"> <span class="direction"> Previous </span> <div class="page"> <div class="button button-previous" role="button" aria-label="Previous"> <i class="icon icon-back"></i> </div> <div class="stretch"> <div class="title"> Being On-Call </div> </div> </div> </a> </div> <div class="next"> <a href="../../before/severity_levels/" title="Severity Levels"> <span class="direction"> Next </span> <div class="page"> <div class="stretch"> <div class="title"> Severity Levels </div> </div> <div class="button button-next" role="button" aria-label="Next"> <i class="icon icon-forward"></i> </div> </div> </a> </div> </nav> </footer> </div> </article> <div class="results" role="status" aria-live="polite"> <div class="scrollable"> <div class="wrapper"> <div class="meta"></div> <div class="list"></div> </div> </div> </div> </main> <script> var base_url = '../..'; var repo_id = 'spearheadsys/issue-response-docs'; </script> <script src="../../assets/javascripts/application-997097ee0c.js"></script> </body> </html>