A couple of weeks ago, while I was teaching the HDI Problem Management Professional course, an interesting topic came up regarding service disruption reporting. The very first unit in the course sets the context for problem management as one of two IT service management processes we refer to as “service resolution and restoration” processes. The other process is incident management. The fact that problem management was presented as a service restoration process led to a discussion on whether communication should go to the affected stakeholder(s) after the service is restored and, if so, what kind. If a communication is sent, who should draft and send it, when, and what should it contain? While there are varying opinions, guidance, and examples on this topic on the web, I offered the following guidance to the students.
Incidents and problems vary in impact and can result in a service disruption to one or more users. When the service disruption has been significant, it is often appropriate to send a communication to the affected users and other stakeholders explaining what happened, when, why, and how, along with what was done to restore the service and what actions are being taken to prevent it from happening again. This information is often documented in what is referred to as an “Incident Report” or “Service Disruption Report.”
The purpose of the Service Disruption Report is to:
- Serve as a communication tool to users and stakeholders
- Build and maintain confidence and trust in the service provider
- Document details of the incident or problem, its impact, and the steps taken to resolve it
- Indicate the actionable steps that have been or will be taken to prevent recurrence
The report should be sent out after service is restored and, ideally, after the root cause has been identified. It should be precise, honest, empathetic, and serious, and it should reflect a positive tone. Finger pointing should never occur. In fact, you should seldom, if ever, use names. The report should not contain empty promises such as “We will do everything we can to ensure this never happens again.” It should state clearly what went wrong, the steps taken to restore service, what you’re doing to prevent it from happening again, and by when. It should look and sound professional, with well-constructed, grammatically correct sentences in “customer-speak,” not “tech-speak.” The use of acronyms should be limited, and any acronym used should be defined.
When writing the report, consider how and what you’re saying from the following three perspectives:
- Confidentiality – what can and cannot be included
- Liability – be careful not to place your organization at risk (especially true with external customers)
- Competition – do not leave yourself vulnerable
Common content in a service disruption report usually consists of the following:
- Start/End date and time of the service outage
- Customer(s) affected (estimate, if necessary)
- Outage summary – Short description, its duration, impact, and root cause (if known)
- Outage details – Sequence (timeline) of events that occurred and actions taken
- Business impact
- Business processes impacted
- Locations affected
- Root cause (provide details)
- Preventive actions – Itemized list of actions to prevent it from happening again (with completion date)
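For teams that collect these details in a tool before handing them to the writers, the content list above can be sketched as a simple record. This is only an illustration: the class names, field names, and methods below are hypothetical and are not part of the HDI course material. The sketch assumes the report is drafted as plain text and that an empty field means the section has not been filled in yet.

```python
from dataclasses import dataclass, field, fields
from datetime import datetime
from typing import List


@dataclass
class PreventiveAction:
    # Hypothetical structure: one itemized action with its completion date.
    description: str
    completion_date: str


@dataclass
class ServiceDisruptionReport:
    # Field names are assumptions that mirror the content list above.
    start: datetime
    end: datetime
    customers_affected: str          # may be an estimate
    summary: str
    timeline: List[str]              # sequence of events and actions taken
    business_impact: str
    processes_impacted: List[str]
    locations_affected: List[str]
    root_cause: str                  # leave empty if not yet known
    preventive_actions: List[PreventiveAction] = field(default_factory=list)

    def duration_minutes(self) -> int:
        """Length of the outage in whole minutes."""
        return int((self.end - self.start).total_seconds() // 60)

    def missing_sections(self) -> List[str]:
        """Names of sections that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def render(self) -> str:
        """Plain-text draft for the report writers to polish."""
        lines = [
            f"Outage: {self.start:%Y-%m-%d %H:%M} to {self.end:%Y-%m-%d %H:%M}"
            f" ({self.duration_minutes()} minutes)",
            f"Customers affected: {self.customers_affected}",
            f"Summary: {self.summary}",
            "Timeline:",
            *[f"  - {event}" for event in self.timeline],
            f"Business impact: {self.business_impact}",
            f"Business processes impacted: {', '.join(self.processes_impacted)}",
            f"Locations affected: {', '.join(self.locations_affected)}",
            f"Root cause: {self.root_cause or 'not yet known'}",
            "Preventive actions:",
            *[f"  - {a.description} (by {a.completion_date})"
              for a in self.preventive_actions],
        ]
        return "\n".join(lines)
```

A check such as `missing_sections()` could be run before the draft goes to the writers, so that a report without a root cause or preventive actions is flagged rather than sent with gaps. The rendered text is a starting point only; tone, wording, and review remain the writers' job.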
Technical details can be provided by the support staff, but the writing of the report should be limited to a few skilled writers, perhaps within your company’s communications department. If the report is to go to external customers, a review by the legal department is recommended.
See also the HDI white paper “Communicating and Staffing for Unplanned Outages.”