This documentation covers parts of the Spearhead Systems Issue Response process. It is a copy of [PagerDuty's](https://github.com/PagerDuty/incident-response-docs/) documentation and a cut-down version of our own internal documentation, which we use at Spearhead Systems for any issue (incident or service request) and to prepare new employees for on-call responsibilities. It covers not only how to prepare for an incident, but also what to do during and after one. It is intended for on-call practitioners and anyone involved in an operational incident response process, or anyone wishing to establish a formal one. See the [about page](about.md) for more information on what this documentation is and why it exists. This documentation complements the material in our [existing wiki](https://sphsys.sharepoint.com) and may not yet be open sourced.

!!! note "Issue, Incident and Service Request"
    At Spearhead we use the term *issue* to describe any request from our customers. Issues fall into two categories: "Service Requests (SR)" and "Incidents (IN)". Note that we also use the term "incident" to cover both service requests and incidents. For brevity we will use SR and IN throughout this documentation.

    A "service request" is usually initiated by a human and is generally not critical to the normal functioning of the business, while an "incident" is an issue that is causing, or can cause, an interruption to normal business functions.

![Issue Response at Spearhead Systems](./assets/img/headers/sph_ir.jpg)

## Being On-Call

If you've never been on-call before, you might be wondering what it's all about. These pages describe what the expectations of being on-call are, along with some resources to help you.

* [Being On-Call](oncall/being_oncall.md) - _A guide to being on-call, both what your responsibilities are, and what they are not._
* [Alerting Principles](oncall/alerting_principles.md) - _The principles we use to determine what things page an engineer, and what time of day they page._

## Before an Incident

Reading material for things you probably want to know before an incident occurs. You likely don't want to be reading these during an actual incident.

* [Severity Levels](before/severity_levels.md) - _Information on our severity level classification: what constitutes a Low issue, what counts as a "Major Incident", and so on._
* [Different Roles for Incidents](before/different_roles.md) - _Information on the roles during an incident: Incident Commander, Scribe, etc._
* [Incident Call Etiquette](before/call_etiquette.md) - _Our etiquette guidelines for incident calls, before you find yourself in one._

## During an Incident

Information and processes for use during an incident.

* [During an Incident](during/during_an_incident.md) - _Information on what to do during an incident, and how to constructively contribute._
* [Security Incident Response](during/security_incident_response.md) - _Security incidents are handled differently to normal operational incidents._

## After an Incident

Our follow-up processes: how we make sure we don't repeat mistakes and keep improving.

* [Post-Mortem Process](after/post_mortem_process.md) - _Information on our post-mortem process: what's involved and how to write or run a post-mortem._
* [Post-Mortem Template](after/post_mortem_template.md) - _The template we use for writing our post-mortems for major incidents._

## Training

So, you want to learn about incident response? You've come to the right place.

* [Training Overview](training/overview.md) - _An overview of our training guides and additional training material from third parties._
* [Incident Commander Training](training/incident_commander.md) - _A guide to becoming our next Incident Commander._
* [Deputy Training](training/deputy.md) - _How to be a deputy and back up the Incident Commander._
* [Scribe Training](training/scribe.md) - _A guide to scribing._
* [Subject Matter Expert Training](training/subject_matter_expert.md) - _A guide on responsibilities and behavior for all participants in a major incident._
* [Glossary of Incident Response Terms](training/glossary.md) - _A collection of terms that you may hear being used, along with their definitions._

## Additional Reading

Useful material and resources from external parties that are relevant to incident response.

* [Incident Management for Operations](http://shop.oreilly.com/product/0636920036159.do) (O'Reilly)
* [Incident Response](http://shop.oreilly.com/product/9780596001308.do) (O'Reilly)
* [Debriefing Facilitation Guide](http://extfiles.etsy.com/DebriefingFacilitationGuide.pdf) (Etsy)
* [US National Incident Management System (NIMS)](https://www.fema.gov/national-incident-management-system) (FEMA)