Deployed ac0e045 with MkDocs version: 0.16.1

Marius Pana 2017-01-11 09:59:16 +02:00
parent 8ea0b783b3
commit dd08fa3110
6 changed files with 61 additions and 57 deletions

CNAME

@ -1 +0,0 @@
response.spearhead.systems


@ -281,14 +281,14 @@
<ul>
<li class="anchor">
<a title="Team Leader (IC)" href="#team-leader-ic">
Team Leader (IC)
<a title="Team Leader (TL)" href="#team-leader-tl">
Team Leader (TL)
</a>
</li>
<li class="anchor">
<a title="Deputy" href="#deputy">
Deputy
<a title="Sysadmin" href="#sysadmin">
Sysadmin
</a>
</li>
@ -466,11 +466,11 @@
<h1>Different Roles</h1>
<p>There are several roles for our incident response teams at Spearhead Systems. Certain roles only have one person per incident (e.g. support engineer), whereas other roles can have multiple people (e.g.System/Solution Architects, juniors, etc.). It's all about coming together as a team, working the problem, and getting a solution quickly.</p>
<p>There are several roles for our incident response teams at Spearhead Systems. Certain roles only have one person per incident (e.g. support engineer), whereas other roles can have multiple people (e.g. Sysadmins, Solution Architects, etc.). It's all about coming together as a team, working the problem, and getting a solution quickly.</p>
<p>Here is a rough outline of our role hierarchy, with each role discussed in more detail on the rest of this page.</p>
<p><img alt="Incident Response Structure" src="../../assets/img/misc/incident_response_roles.png" /></p>
<hr />
<h2 id="team-leader-ic">Team Leader (IC)<a class="headerlink" href="#team-leader-ic" title="Permanent link">#</a></h2>
<h2 id="team-leader-tl">Team Leader (TL)<a class="headerlink" href="#team-leader-tl" title="Permanent link">#</a></h2>
<h3 id="what-is-it">What is it?<a class="headerlink" href="#what-is-it" title="Permanent link">#</a></h3>
<p>A Team Leader acts as the single source of truth for what is currently happening and what is going to happen during a major incident. They come in all shapes, sizes, and colors. TLs are also the key elements in a project (boards in DoIT).</p>
<h3 id="why-have-one">Why have one?<a class="headerlink" href="#why-have-one" title="Permanent link">#</a></h3>
@ -508,25 +508,27 @@
<h3 id="how-can-i-become-one">How can I become one?<a class="headerlink" href="#how-can-i-become-one" title="Permanent link">#</a></h3>
<p>Take a look at our <a href="../../training/incident_commander/">Team Leader training guide</a>.</p>
<hr />
<h2 id="deputy">Deputy<a class="headerlink" href="#deputy" title="Permanent link">#</a></h2>
<h2 id="sysadmin">Sysadmin<a class="headerlink" href="#sysadmin" title="Permanent link">#</a></h2>
<h3 id="what-is-it_1">What is it?<a class="headerlink" href="#what-is-it_1" title="Permanent link">#</a></h3>
<p>A Deputy is a direct support role for the Incident Commander. This is not a shadow where the person just observes, the Deputy is expected to perform important tasks during an incident.</p>
<p>A Sysadmin is a direct support role for the Team Leader. This is not a shadow role where the person just observes; the Sysadmin is expected to perform important tasks during an incident.</p>
<h3 id="why-have-one_1">Why have one?<a class="headerlink" href="#why-have-one_1" title="Permanent link">#</a></h3>
<p>It's important for the IC to focus on the problem at hand, rather than worrying about documenting the steps or monitoring timers. The deputy helps to support the IC and keep them focussed on the incident.</p>
<p>It's important for the TL to focus on the problem at hand, rather than worrying about documenting the steps or monitoring timers. The Sysadmin supports the TL and helps them stay focussed on the incident.</p>
<h3 id="what-are-the-responsibilities_1">What are the responsibilities?<a class="headerlink" href="#what-are-the-responsibilities_1" title="Permanent link">#</a></h3>
<p>The Deputy is expected to:</p>
<p>The Sysadmin is expected to:</p>
<ol>
<li>Bring up issues to the Incident Commander that may otherwise not be addressed (keeping an eye on timers that have been started, circling back around to missed items from a roll call, etc).</li>
<li>Be a "hot standby" Incident Commander, should the primary need to either transition to a SME, or otherwise have to step away from the IC role.</li>
<li>Page SME's or other on-call engineers as instructed by the Incident Commander.</li>
<li>Manage the incident call, and be prepared to remove people from the call if instructed by the Incident Commander.</li>
<li>Liaise with stakeholders and provide status updates on Slack as necessary.</li>
<li>Bring up issues to the TL that may otherwise not be addressed (keeping an eye on timers that have been started, circling back around to missed items from a roll call, etc.).</li>
<li>Be a "hot standby" TL, should the primary need to either transition to a SME, or otherwise have to step away from the TL role.</li>
<li>Page SMEs or other on-call engineers as instructed by the Team Leader.</li>
<li>Manage the incident call, and be prepared to remove people from the call if instructed by the Team Leader.</li>
<li>Liaise with stakeholders and provide status updates on DoIT (using worklogs and comments), Slack, and email/telephone as necessary.</li>
</ol>
<h3 id="who-are-they_1">Who are they?<a class="headerlink" href="#who-are-they_1" title="Permanent link">#</a></h3>
<p>Any Incident Commander can act as a deputy. Deputies need to be trained as an Incident Commander as they may be required to take over command.</p>
<p>Any Team Leader can act as a Sysadmin. Sysadmins need to be trained as Team Leaders, as they may be required to take over command.</p>
<h3 id="how-can-i-become-one_1">How can I become one?<a class="headerlink" href="#how-can-i-become-one_1" title="Permanent link">#</a></h3>
<p>Take a look at our <a href="../../training/deputy/">Deputy training guide</a>. Deputies also need to be <a href="../../training/incident_commander/">trained as an Incident Commander</a>.</p>
<p>Take a look at our <a href="../../training/deputy/">Sysadmin training guide</a>. Sysadmins also need to be <a href="../../training/incident_commander/">trained as Team Leaders</a>.</p>
<hr />
<p>TODO:::move scribe responsibilities to TL and Sysadmin
::: or assign this to our juniors?</p>
<h2 id="scribe">Scribe<a class="headerlink" href="#scribe" title="Permanent link">#</a></h2>
<h3 id="what-is-it_2">What is it?<a class="headerlink" href="#what-is-it_2" title="Permanent link">#</a></h3>
<p>A Scribe documents the timeline of an incident as it progresses, and makes sure all important decisions and data are captured for later review.</p>
@ -547,12 +549,13 @@
<p>Anyone can act as a scribe during an incident; the scribe is chosen by the Incident Commander at the start of the call. Typically the Deputy will act as the Scribe, but that doesn't necessarily need to happen, and for larger incidents may not be possible.</p>
<h3 id="how-can-i-become-one_2">How can I become one?<a class="headerlink" href="#how-can-i-become-one_2" title="Permanent link">#</a></h3>
<p>Follow our <a href="../../training/scribe/">Scribe training guide</a>, and then notify the Incident Commanders that you would like to be considered for scribing for the next incident.</p>
<p>TODO::: END move scribe responsibilities to TL and Sysadmin</p>
<hr />
<h2 id="subject-matter-expert">Subject Matter Expert<a class="headerlink" href="#subject-matter-expert" title="Permanent link">#</a></h2>
<h3 id="what-is-it_3">What is it?<a class="headerlink" href="#what-is-it_3" title="Permanent link">#</a></h3>
<p>A Subject Matter Expert (SME), sometimes called a "Resolver", is a domain expert or designated owner of a component or service that is part of the PagerDuty software stack.</p>
<p>A Subject Matter Expert (SME), sometimes called a "Resolver" or "Architect", is a domain expert or designated owner of a component or service that is part of the Spearhead Systems service delivery concept.</p>
<h3 id="why-have-one_3">Why have one?<a class="headerlink" href="#why-have-one_3" title="Permanent link">#</a></h3>
<p>The IC and deputy are not all-knowing super beings. When there is a problem with a service, an expert in that service is needed to be able to quickly help identify and fix issues.</p>
<p>The TL and Sysadmins are not all-knowing super beings. When there is a problem with a service or a particular system, an expert in that service is needed to be able to quickly help identify and fix issues.</p>
<h3 id="what-are-the-responsibilities_3">What are the responsibilities?<a class="headerlink" href="#what-are-the-responsibilities_3" title="Permanent link">#</a></h3>
<ol>
<li>Being able to diagnose common problems with the service.</li>
@ -576,8 +579,8 @@
<p>Because all of the other roles will be actively working on identifying the cause and resolving the issue, we need a role that is focused purely on the customer interaction side of things, so that it can be done properly, with the due care and attention it needs.</p>
<h3 id="what-are-the-responsibilities_4">What are the responsibilities?<a class="headerlink" href="#what-are-the-responsibilities_4" title="Permanent link">#</a></h3>
<ol>
<li>Post any publicly facing messages regarding the incident (Twitter, StatusPage, etc).</li>
<li>Notify the IC of any customers reporting that they are affected by the incident.</li>
<li>Post any publicly facing messages regarding the incident (DoIT, Twitter, StatusPage, etc).</li>
<li>Notify the TL of any customers reporting that they are affected by the incident.</li>
</ol>
<h3 id="who-are-they_4">Who are they?<a class="headerlink" href="#who-are-they_4" title="Permanent link">#</a></h3>
<p>Any member of the Support Team can act as a customer liaison.</p>


@ -517,6 +517,7 @@
<li><a href="http://shop.oreilly.com/product/9780596001308.do">Incident Response</a> (O'Reilly)</li>
<li><a href="http://extfiles.etsy.com/DebriefingFacilitationGuide.pdf">Debriefing Facilitation Guide</a> (Etsy)</li>
<li><a href="https://www.fema.gov/national-incident-management-system">US National Incident Management System (NIMS)</a> (FEMA)</li>
<li><a href="https://www.heavybit.com/library/video/every-minute-counts-coordinating-herokus-incident-response/">Every Minute Counts: Leading Heroku's Incident Response</a> (Blake Gentry)</li>
</ul>
<aside class="copyright" role="note">

File diff suppressed because one or more lines are too long


@ -4,7 +4,7 @@
<url>
<loc>https://response.spearhead.systems/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
@ -13,13 +13,13 @@
<url>
<loc>https://response.spearhead.systems/oncall/being_oncall/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>https://response.spearhead.systems/oncall/alerting_principles/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
@ -29,19 +29,19 @@
<url>
<loc>https://response.spearhead.systems/before/severity_levels/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>https://response.spearhead.systems/before/different_roles/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>https://response.spearhead.systems/before/call_etiquette/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
@ -51,13 +51,13 @@
<url>
<loc>https://response.spearhead.systems/during/during_an_incident/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>https://response.spearhead.systems/during/security_incident_response/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
@ -67,13 +67,13 @@
<url>
<loc>https://response.spearhead.systems/after/post_mortem_process/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>https://response.spearhead.systems/after/post_mortem_template/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
@ -83,37 +83,37 @@
<url>
<loc>https://response.spearhead.systems/training/overview/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>https://response.spearhead.systems/training/incident_commander/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>https://response.spearhead.systems/training/deputy/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>https://response.spearhead.systems/training/scribe/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>https://response.spearhead.systems/training/subject_matter_expert/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>https://response.spearhead.systems/training/glossary/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>
@ -122,7 +122,7 @@
<url>
<loc>https://response.spearhead.systems/about/</loc>
<lastmod>2017-01-10</lastmod>
<lastmod>2017-01-11</lastmod>
<changefreq>daily</changefreq>
</url>


@ -463,6 +463,7 @@
<p>While it might not initially seem that this would be applicable to an IT operations environment, we've found that many of the lessons learned from major incidents in these situations can be directly applied to our industry too. The principles are the same and span many different environments.</p>
<p><a href="https://www.fema.gov/pdf/emergency/nims/NIMS_core.pdf"><img alt="NIMS" src="../../assets/img/thumbnails/nims_core.png" /></a> <a href="https://www.fema.gov/pdf/emergency/nims/nims_training_program.pdf"><img alt="NIMS Training" src="../../assets/img/thumbnails/nims_training.png" /></a></p>
<p>If you want to learn more about NIMS, we recommend the <a href="https://training.fema.gov/is/courseoverview.aspx?code=IS-100.b">ICS-100</a> and <a href="https://training.fema.gov/is/courseoverview.aspx?code=IS-700.a">ICS-700</a> online training courses, which go over NIMS and the Incident Command System (you can also take an online examination after training in order to get a certificate from FEMA). There is also a wealth of <a href="https://training.fema.gov/nims/">additional training material and courses from FEMA</a> on NIMS, which we encourage you to look at.</p>
<p>If you're based in the US and interested in taking a more active incident response role in your community, we recommend investigating your local <a href="https://www.fema.gov/community-emergency-response-teams">CERT programs</a> (Community Emergency Response Teams). Many cities offer CERT training, after which you can volunteer as a CERT contributor within your community. Not only is it an opportunity to get real world experience with disaster response, but the skills you learn can be applied to everyday life too.</p>
<p>Also take a look at the <a href="../../#additional-reading">Additional Reading</a> section on the home page.</p>
<aside class="copyright" role="note">