HDI’s SPOCcast is your single point of contact podcast for service management and support insights. For Episode 4, I conducted an interview with Jim Bolton via Skype to discuss how problem management can help businesses and organizations find causes for and permanently fix what’s going wrong in the IT realm. Here are some excerpts from that conversation, including some notes about HDI’s service management training and certification courses.
RA: When we conduct our research, we see that the majority of support organizations are handling more cases…and the majority of those—53 or 54 percent—are incidents (unplanned interruptions). How much can a good problem management program help?
JB: Historically…we’ve spent so much time getting better and better and better at fixing it fast…but we really haven’t done as much about fixing it permanently, and I think that’s the sad part and the missing part…. We’re not investing in fixing it so that it doesn’t fail. And there are some challenges associated with that…. A lot of organizations haven’t adopted the concept that “a penny saved is a penny earned.” So we’re spending a lot of time getting better and better and better…about fixing it fast, knowledge management, and many other techniques that we’re using. And what we hopefully would be moving toward is looking at those [incident] counts…and saying, “Wow. Maybe we can eliminate this one permanently.”
The challenge, then, is we haven’t done a good job of quantifying that work. We haven’t done a good job—from the problem management perspective—of going back to the customer and saying, “Just so you know, here are the things we did to keep these incidents from occurring.” We have a tendency, I think, to forget that we’re not needing to call as much anymore. Things become stable, and it becomes expected—and, by the way, that’s a good thing. However, in the process of things becoming stable…we really need to kind of stay in front of our customer and stay in front of our user and let them know, “Oh, and by the way, here are things we did this month to make it more stable….” The customers would have no visibility of that if we didn’t share it.
RA: And when we do share, we have a tendency to do it in IT-speak. “Our RC1024.5 server did not go down this month.” And people say, “Great! What’s that?”
JB: Exactly. And we neglected to mention that when that goes down, it affects 1,500 users, or 15,000 users at X cost per minute or X cost per second when the system goes down. “Which services for which customers?” That’s always the question that I ask. Which services for which customers? Who all was affected by that?
RA: Perhaps the best-known part of problem management is…root cause analysis. There’s been a good bit of discussion around that topic lately that says, “Why are we wasting time on this? There is no root cause.”
JB: We sometimes need to talk about root causes instead of root cause, because a lot of organizations will invest some level of energy in getting to a root cause but haven’t investigated all of the potential root causes…. The example that I like to use was the Gulf oil spill a few years ago, which we’re all very familiar with. And in the end, after they had spent probably close to a year doing root cause analysis, they actually found eight root causes, that all eight needed to be addressed in order to make sure that something like this didn’t occur again…. We need to ask, “Well, why did that happen? And why did that happen? Well, why didn’t that get seen?” One of the techniques we talk about is 5 Whys….
RA: What a lovely segue into a question that I have about doing and documenting problem management. You mentioned the 5 Whys; and there’s also the Ishikawa diagram and also some other ways of documenting root cause analysis and in the process of problem management. What do you think is the most effective approach, if you have a favorite, and why?
JB: I should preface my answer to say, in the [HDI] problem management course we talk specifically about which techniques are best to use based on the type of problem that folks are looking at…. There are a couple that I believe are effective initially, and those are maybe the most important for many of our listeners, because we need to get problem management started so that the organization begins to see the value of it. And we should start with the techniques that are simpler and easier to understand. I love 5 Whys…I started using it and realized, oh—what an eye-opener! We haven’t really thought about this.
I’m not a big fan of Ishikawa diagrams…what I’d say is this: We have a tendency to ignore certain areas that we should be considering…. We should make sure—and that’s one of the values of Ishikawa diagrams—that we’re looking at…did we consider process? Did we consider people? Did we consider technology? Have we got some ideas on each one of those branches? The reason I say I’m not a big fan of Ishikawa diagrams, now that I’ve already sold it to you, is that sometimes we spend way too much time in the meeting then trying to figure out, “Was that really a people issue, or was that a training issue? Do we need a separate bone for training?” And what I would say is, skip that conversation entirely.
I love brainstorming because it’s quick and it allows everybody to put up their ideas on the board without being contested.... We have to be careful that the brainstorming meeting doesn’t turn into “blamestorming.”
RA: One of the things I like best about the culture of DevOps is blameless post mortems. If something doesn’t go right, we don’t try to pin the tail on whoever it is that may have made a mistake.
JB: That goes back to culture, and that goes back to leadership, and that goes back to the fact that we want to be teaming together.
[We have a] …course coming out: Service Management Optimization. [It] starts with “Which services are we providing to which customers?” so it gets us into service catalog and service level management. If we don’t have that core conversation up front, we can’t then move to the next process we should be looking at…which is incident management, because we can’t get our categorization done correctly, we can’t get our incident prioritization done correctly, which then moves us to change management—how do we get these changes to be approved more quickly? And then lastly, if we’re logging all of our changes into our change management tool…prepares us for process number 4…problem management…All four of those are so tightly tied together.
How do we get this to happen in my organization? How do we get my organization to invest in problem management? We get what we measure, and when we measure our success based on fixing things fast—which we should, that’s an important part of our success—we often miss the point…How do we reduce the counts entirely?
Listen to the entire podcast for additional insights from Jim.
About Jim Bolton
Jim Bolton is the founder and president of Propoint Solutions and the co-author of Problem Management: A Practical Guide, published by TSO. Jim has more than a decade of experience in architecting and delivering IT service management solutions and has extensive experience in diagnosing and solving complex organizational, process, and technical challenges. He has been named an IT Industry Legend by Cherwell Software.
Roy Atkinson is one of the top influencers in the service and support industry. His blogs, presentations, research reports, white papers, keynotes, and webinars have gained him an international reputation. In his role as senior writer/analyst, he acts as HDI's in-house subject matter expert, bringing his years of experience to the community. He holds a master’s certificate in advanced management strategy from Tulane University’s Freeman School of Business, and he is a certified HDI Support Center Manager. Follow him on Twitter @RoyAtkinson.