Incident Categorization: A Method to the Madness


by Julie Mohr
May 22, 2012

 

Incident categorization is a challenge for many organizations. Whether it is due to culture, politics, complexity, or an inability to agree, every organization, at some point, runs up against incident categorization. Why does it cause so much difficulty? Every organization is different. Their products and services are different. Their service levels are different. Their customers are different and their knowledge is different. Even their reports are different. Each one of these distinctive factors affects how incidents are tracked and monitored. Thus, there can be no master categorization scheme; each organization must figure out what works best in its own environment.

What Is Categorization?

Incident management drives process improvement by using the accurate analysis of incidents to identify improvement opportunities. To facilitate this analysis, incidents are prioritized and categorized during the incident management process. Prioritization ranks the importance of an incident by its impact on the business and its urgency, which depends on the timing of the incident (that is, when the incident occurred). Categorization is the process of arranging the incidents into classes or categories. In the incident management process, this provides us with the ability to track similar incidents related to the products and services provided to the business.

Why Is Categorization Important?

When an incident is first categorized, it enables the analyst to run a search for knowledge in the form of incidents, problems, or known errors. When an incident can be categorized in only one way, the search against previous knowledge is more effective. If knowledge is not available, categorization provides the structure to begin gathering the information necessary to diagnose and categorize the new knowledge. Categorizing the incident speeds up the process and creates greater efficiency within the process flow.

If an issue cannot be resolved, the next value-add of categorization is identifying the group(s) to which a given incident can be escalated. Once escalation groups have been tied to specific categories, the organization can begin eliminating errors in the escalation process. Finally, another benefit of effective categorization is the ability to produce meaningful reports and conduct trend analysis, which helps the organization take a more proactive approach to managing services.
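
One way to picture escalation groups tied to categories is a simple lookup table. The sketch below is illustrative only: the category names, the group names, the fallback group, and the `escalation_group` helper are all hypothetical and not taken from any particular service desk tool.

```python
# Hypothetical mapping from (category, type) to the group that owns escalations.
ESCALATION_GROUPS = {
    ("Email", "Delivery"): "Messaging Team",
    ("Email", "Access"): "Identity Team",
    ("Payroll", "Access"): "Identity Team",
}

# Assumed fallback for categories that have no owner yet.
DEFAULT_GROUP = "Service Desk Lead"


def escalation_group(category: str, type_: str) -> str:
    """Return the escalation group for a categorized incident."""
    return ESCALATION_GROUPS.get((category, type_), DEFAULT_GROUP)


print(escalation_group("Email", "Delivery"))
print(escalation_group("Printing", "Paper jam"))
```

Because the category alone decides the route, escalation errors tend to show up as gaps or mistakes in the table rather than as judgment calls made ticket by ticket.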

Categorization Dependencies

Within the service lifecycle, other aspects of value delivery are tied to categorization. The service catalog provides a view of the services that are in the operational environment. Organizing services into categories that make sense to your customers makes it easier for them to find information about those services and request them. Incident categorization is directly related to service categorization. All too often organizations try to categorize incidents before they fully understand how to categorize their services. Even worse, if you try to categorize incidents without understanding your services, then your categories are likely to be very technology-focused and will lack the ability to provide a view of the impacted service from the customer’s point of view. This limits your organization’s ability to proactively manage services.

Categorization has several other critical dependencies. It is linked to the specific skills needed to support the organization’s products and services, which impacts your analysts’ training and career development opportunities. But it is also an essential step in establishing expectations when the organization develops its operational level agreements (OLAs). Going back to our escalation example, what types of incidents should be solved at the service desk? Which can be immediately escalated to level 2? The organization needs to have a foundational understanding of the relationship between incidents and services, and of the level of support provided at the service desk, level 2, and beyond.

Event management also depends directly on incident categorization. Developing automation tools and features that support event filtering and correlation, which will help you identify incidents and select the appropriate control actions, is important to ensuring the success of a given process. Likewise, proactive problem management is nearly impossible to achieve without good categorization. If an analyst can log a single incident under five or six different categories, just imagine trying to run a master report that includes all of the incidents and reports related to a specific service, issue, or component. Such a report might identify some similarities between incidents and problems, but without the full picture we may not be able to conduct trend analysis.

With so many dependencies and requirements, it’s no wonder that incident categorization is difficult. So how do we actually create a workable categorization scheme?

The Basics

Categorization is based upon a hierarchical structure that has multiple levels of classification. The hierarchy is often described as a category/type/item (CTI) structure. Once the analyst picks a high-level category, he will next select a type, followed by an item. If this is done effectively, the category defines a subset of types and the selection of a type identifies a subset of items. This type of hierarchy simplifies the incident categorization, reduces error, and helps tie unique CTIs to their owners. At its core, then, categorization is like a set of buckets. Each bucket holds a bunch of incidents and these incidents are logically grouped according to a subset of characteristics. The first decision to make has to do with identifying the highest level of the hierarchy.
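
The cascading selection described above can be sketched as a nested mapping in which each choice narrows the next. This is a minimal illustration, not a reference design: the services, types, items, and helper names are invented for the example.

```python
# Hypothetical CTI tree: category -> type -> list of items.
CTI = {
    "Email": {
        "Delivery": ["Outbound mail", "Inbound mail"],
        "Access": ["Password", "Mailbox permissions"],
    },
    "Payroll": {
        "Access": ["Password", "Account locked"],
        "Reporting": ["Pay stub", "Year-end form"],
    },
}


def types_for(category: str) -> list[str]:
    """Types the analyst may choose once a category is selected."""
    return sorted(CTI.get(category, {}))


def items_for(category: str, type_: str) -> list[str]:
    """Items the analyst may choose once a category and type are selected."""
    return list(CTI.get(category, {}).get(type_, []))


def is_valid(category: str, type_: str, item: str) -> bool:
    """True only if the full category/type/item path exists in the tree."""
    return item in items_for(category, type_)


print(types_for("Email"))
print(items_for("Payroll", "Reporting"))
print(is_valid("Email", "Reporting", "Pay stub"))
```

Restricting each level to the children of the previous choice is what keeps an analyst from logging a combination that does not exist.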

  • Step One: Identify the Buckets

Incidents can be categorized by type, by caller, by technology, by incident, or by service. The first question to ask is, Which of these is most important to the customer? Typically, organizations that are implementing service management will start with the service. This provides substantial value because it helps the organization understand service performance and identify service improvements. But this high-level classification will not work for all organizations. External service providers, for example, may choose to make the customer the highest level (bucket). The key is to keep the upper level (or primary level) broad, but not too broad. Ten to fifteen high-level choices should keep the level of detail manageable.

  • Step Two: Verify the Buckets

How do you get these high-level choices? Start with three to six months of historical records and begin to sort the incidents according to your high-level criteria. Be sure to limit the organization to those ten to fifteen available selections.
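
As a rough illustration of this sorting exercise, the sketch below tallies historical records by a candidate top-level field and warns when the number of buckets exceeds the suggested limit. The record layout, the `service` field, and the sample data are assumptions made for the example; a real export would hold months of tickets.

```python
from collections import Counter

MAX_BUCKETS = 15  # upper end of the ten-to-fifteen range suggested above

# Hypothetical historical records standing in for a real ticket export.
history = [
    {"id": 1, "service": "Email"},
    {"id": 2, "service": "Payroll"},
    {"id": 3, "service": "Email"},
    {"id": 4, "service": "VPN"},
    {"id": 5, "service": "Email"},
]


def tally(records, key):
    """Count how many records fall into each candidate bucket."""
    return Counter(record[key] for record in records)


counts = tally(history, "service")
for bucket, n in counts.most_common():
    print(f"{bucket}: {n}")

if len(counts) > MAX_BUCKETS:
    print(f"{len(counts)} buckets found; consider merging some.")
```

Running the same tally against a different key (caller, technology, and so on) is a quick way to compare candidate top levels before committing to one.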

  • Step Three: Dump the Buckets

The next step involves identifying the next level of classification, which is accomplished by looking at the incidents that were put into the Category bucket and deciding how to further divide those tickets up effectively into Type. The second level should be specific, but not too specific. The next level (Item) will provide the details, giving you greater insight into a given incident. Again, the level of detail here has to be driven by the organization’s needs and the type of incidents it captures. An example of this type of structure would be as follows:

  • Choice 1: What is the affected service? (Select service.)
  • Choice 2: What is the type of issue affecting the service? (Select type.)
  • Choice 3: What is the specific item that has a fault? (Select item.)

  • Step Four: Pilot the Buckets

The next step is to establish a structure that can be tested in the live environment. At this point the CTI structure is just temporary, to allow for modifications based upon actual calls that are received. Each call that does not fit into the structure should be reviewed to determine whether or not a change to the temporary structure is required. To keep the flow and handling of incidents moving, most structures include an “Other” category that accommodates incidents that do not fit within any existing structures. “Other” incidents should be analyzed on a weekly or biweekly basis to determine whether additional CTI structures are needed. However, long-term use of “Other” is not recommended and should be avoided. Moreover, if a high percentage of incidents are going into the “Other” bucket, a re-examination of the temporary structure is needed.

Possible Pitfalls and Stumbling Blocks

There are many pitfalls and stumbling blocks to look out for when it comes to incident categorization. If your categorization scheme has lots of CTI structures (buckets) and too many/too few tickets in each bucket, this is an indication that you may not have the right buckets or number of buckets. The ideal number is hard to define exactly, but can be more easily expressed in percentages. For example, if you have a CTI element (bucket) that holds 25 percent of your ticket volume, then the structure may not be detailed enough. If a bucket holds less than two percent, then it is probably too specific.
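
The percentage guideline above can be turned into a quick check. The sketch below is one possible reading of it, treating 25 percent or more as "too broad" and less than two percent as "too specific". The bucket names, the ticket counts, and the `review_buckets` helper are hypothetical.

```python
TOO_BROAD = 0.25     # a bucket holding 25 percent or more may need more detail
TOO_SPECIFIC = 0.02  # a bucket holding less than 2 percent may be too specific


def review_buckets(counts):
    """Label each bucket by its share of total ticket volume."""
    total = sum(counts.values())
    findings = {}
    for bucket, n in counts.items():
        share = n / total
        if share >= TOO_BROAD:
            findings[bucket] = "too broad"
        elif share < TOO_SPECIFIC:
            findings[bucket] = "too specific"
        else:
            findings[bucket] = "ok"
    return findings


# Hypothetical ticket counts per CTI bucket (1,000 tickets in total).
ticket_counts = {
    "Email/Access/Password": 300,
    "Email/Delivery/Outbound mail": 240,
    "VPN/Connection/Client": 230,
    "Payroll/Access/Account locked": 220,
    "Payroll/Reporting/Year-end form": 10,
}

for bucket, verdict in review_buckets(ticket_counts).items():
    print(f"{bucket}: {verdict}")
```

The thresholds are starting points rather than rules; a flagged bucket is a prompt to look at the tickets inside it, not proof that the structure is wrong.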

Above all, you should avoid making any changes to the categorization scheme as you move through the incident management process. If you find that a particular incident has been incorrectly categorized, the best way to handle this is to create a closure categorization, not to change the categorization it was assigned when it was opened. This gives the organization the opportunity to improve the process and better train the analysts recording the incidents.
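
A closure categorization can be modeled as a second field that sits alongside the original one. The following sketch assumes a simple record with an opening and a closing CTI; the class, field names, and sample incidents are illustrative, not drawn from any specific tool.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

CTI = Tuple[str, str, str]  # (category, type, item)


@dataclass
class Incident:
    incident_id: int
    opened_as: CTI                   # set at logging and never overwritten
    closed_as: Optional[CTI] = None  # set once, when the incident is closed

    def close(self, cti: CTI) -> None:
        """Record the closure categorization without touching the original."""
        self.closed_as = cti

    @property
    def recategorized(self) -> bool:
        """True if the closure categorization differs from the opening one."""
        return self.closed_as is not None and self.closed_as != self.opened_as


def recategorization_rate(incidents) -> float:
    """Share of closed incidents whose categorization changed at closure."""
    closed = [i for i in incidents if i.closed_as is not None]
    if not closed:
        return 0.0
    return sum(i.recategorized for i in closed) / len(closed)


first = Incident(1, ("Email", "Access", "Password"))
first.close(("Email", "Access", "Password"))

second = Incident(2, ("Email", "Delivery", "Outbound mail"))
second.close(("Network", "DNS", "Resolver"))

print(recategorization_rate([first, second]))
```

Keeping both values makes it possible to report how often the opening categorization was wrong, which is the kind of data that points to training or process gaps.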

It is also important to focus on capturing information that is fact-based, not symptom-based. A specific incident can have many different symptoms. Categorizing by symptom may lead to multiple categorizations of a single type of incident, which would generate unreliable data and make accurate reporting difficult at best. Also, you may find that some incidents present initial “symptoms” that point to a particular category, type, or item, but deeper analysis proves that the true issue was something very different. It will be easier to find the solution the next time the issue occurs if both categorizations are recorded. (This goes back to something we discussed at the beginning of this article: searching for knowledge.)

Finally, IT organizations tend to focus their CTI structures on the internal view of IT. While this will help the organization identify component improvements, it will not drive service improvements. Data collection must be business-driven, not IT-driven. Taking the external view will provide data that supports better decision making and analysis, based upon the business’s needs.

Critical Success Factors

First, figure out what data you need from the incident management system. If you can get all parties involved to agree on the content of the incident management process and service reports, then it will help the organization define the outcome of the categorization activity. Service level agreements are instrumental in identifying what we should measure; categorization makes those measurements possible.

Second, training is essential. Even the most well-defined categorization scheme is subject to error. Organizations that train their analysts to categorize correctly and handle exceptions will reap the benefits of high-quality data and will minimize categorization redundancies.

Third, because any change to the categorization scheme could change the way existing data is structured, once the environment has stabilized, categorization should undergo the change management process. This will mitigate the effect of any changes to the structure on the underlying data, which will help maintain the highest possible level of accuracy and the validity of historical analysis. You should plan and prepare for any possible risks, but you should also encourage change. Organizations are not static, and neither should the categorization scheme be.

Last, reporting is important for overall quality improvement: of services, of processes, of technologies, of people, and of the overall customer experience. All service management processes use this data to support decision making. It is important to keep this in mind when data is structured, captured, and used in reports that are inputs to these processes. Their needs must be taken into account.

The Benefits of Good Categorization

The benefits of a good categorization scheme are many. Categorization can simplify the incident-logging process, reduce redundancy, and strengthen the organization’s ability to manage knowledge and use it to support decision making. Understanding the underlying data can enable the organization to take a proactive, crossfunctional view of service management and identify improvement opportunities. It can also provide a better overall picture of an organization’s services and how they are meeting customer expectations and service level targets.

Someone once said that nothing worth doing is easy. This is the case with categorization, as it is with life. This is a tough exercise, no doubt, but one that will pay off in the end. The data collected in the incident management process represents every touch point, every aspect of the customer experience. If we capture that knowledge in such a way that it can be reused to support continual improvement, we can improve our services, improve our customer satisfaction, and improve our operational efficiency and effectiveness. That is definitely something worth doing!

 

Julie Mohr, president of Mind the IT Gap, is a dynamic, engaging change agent who brings integrity and passion to everything she does. Through her books, articles, speaking, consulting, and teaching, Julie’s purpose is to change the world through thought-provoking dialogue and interaction. She received her BS in computer science from The Ohio State University and she currently runs an online university that provides exceptional learning experiences. Feel free to contact Julie at jlmohr@mindtheitgap.com

Tag(s): process, practices and processes, metrics and measurements
