Predictive IT Analytics

by Jake McTigue


Manufacturing, financial, retail, and pharmaceutical companies have long used predictive analytics for everything from anticipating parts shortages to calculating credit scores. But across industries, the software can also warn IT organizations about potential system failures before they affect the business.

For enterprise IT, we’re not advocating all-purpose analytic workbenches that demand deep, and expensive, data mining and statistics expertise. Rather, we’re recommending that organizations consider a focused analytics suite, such as Hewlett-Packard’s Service Health Analyzer, IBM’s SPSS-powered Tivoli product, Netuitive’s eponymous offering, and systems from the likes of CA and EMC. All these products come with ready-made dashboards, reports, alerts, and key performance indicators set up for IT system and application measurement and prediction.

Just tap into your data sources, tune the dials, and start seeing the future.

Poised For Growth

“Everyone has expected an explosion in ‘analytics for IT’ for some time,” says David Stodder, director of business intelligence research at TDWI (The Data Warehousing Institute). “But from what I see, the usage is still fairly selective.” However, embedding analytics in IT management consoles will help get these capabilities into more shops, Stodder says, as not only HP and IBM but also specialized application performance management providers incorporate analytics.

David Menninger, research director at Ventana Research, agrees that adoption of predictive analytics systems, for IT or any other purpose, is still modest. Of the 2,400 organizations in Ventana’s Business Analytics benchmark research, only about one in five is using predictive IT analytics.

So most IT organizations have lived just fine without these suites until now. What’s changed? The push toward private clouds and service-oriented IT, combined with an unprecedented number of data sources and the attendant complexity. A highly virtualized infrastructure, big data, and predictive analytics go together. In fact, big, clean, fast-moving flows of real-time information are the lifeblood of predictive IT analytics systems.

Unfortunately, IT hasn’t always been good at collecting operational data. In the age of single-server, single-application architectures, infrastructure fragmentation made it prohibitively costly to build data silos filled with accurate transactional information on which to perform analysis. But virtualization and the cloud model have changed everything. Centralization is back, and it’s better than before.

However, a cloud architecture—public, private, or hybrid—also brings complexity, which is the enemy of uptime. And this is the main reason predictive analytics will become a must-have for enterprise IT sooner rather than later. Consider that in the most recent InformationWeek Virtualization Management Survey, respondents ranked high availability number one among a dozen features. Similarly, the most-cited driver for private clouds is improved application availability.

Predictive IT analytics can get us to higher availability by helping us cut through the complexity inherent in modern cloud infrastructures, which are built one layer at a time, with each layer dependent on the one before it. This complexity makes it difficult for even experienced network architects to understand the interrelations among infrastructure components. It also makes failures substantially more difficult to troubleshoot. We’d better get a handle on this problem now, because complexity will only increase as enterprises adopt more advanced converged architectures.

Tell Me Everything

Delivering insight into IT environments with thousands of variables begins with detailed operational data. Analytics engines take common environmental metrics, like disk queue length, processor and storage link utilization, and application monitoring statistics, and separate them into dependent and independent variables. The systems explore the relationships among these variables and ultimately model them using equations. Advanced predictive analytics engines don’t necessarily store the original metric data, though some can. Rather, they store the mathematical model that governs the relationships among variables, which keeps data volumes small and evaluation fast. That speed is critical in an industry that delivers value by notifying operators of failures that are about to occur.
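To make the "store the model, not the data" idea concrete, here is a minimal sketch in Python. It assumes a single independent variable and plain least-squares fitting, which is far simpler than what commercial engines do; the names LinearModel and fit_linear and the sample values are ours, invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearModel:
    """Stored relationship: dependent ~= intercept + slope * independent."""
    slope: float
    intercept: float

    def predict(self, x: float) -> float:
        return self.intercept + self.slope * x


def fit_linear(xs: list[float], ys: list[float]) -> LinearModel:
    """Ordinary least squares for one independent variable."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired samples")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("independent variable has no variance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return LinearModel(slope=slope, intercept=mean_y - slope * mean_x)


# Hypothetical samples: storage link utilization (%) as the independent
# variable, disk queue length as the dependent variable.
link_utilization = [10.0, 20.0, 30.0, 40.0, 50.0]
disk_queue_length = [1.0, 2.0, 3.0, 4.0, 5.0]

model = fit_linear(link_utilization, disk_queue_length)

# The raw samples could now be discarded; only slope and intercept are kept.
projected_queue = model.predict(80.0)
```

The stored model is two numbers regardless of how many samples went into it, which is the trade the paragraph above describes: less data to keep, and a prediction that costs one multiplication and one addition.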

Predictive analytics systems don’t inherently understand the difference between a server and a router. They don’t know about causality. They do, however, know about correlation. These systems see and connect variables; predictive power is derived from the strength of the data set and the degree of correlation the model is capable of seeing. The further into the future the analytics system can look, the better the deliverable.
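A rough illustration of correlation with a look-ahead, in Python: the sketch below measures how strongly one metric now tracks another metric a few samples later. It uses the standard Pearson coefficient; the function names and the sample series are hypothetical, and we are not claiming any vendor computes it this way.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series of at least 2 samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(leading: list[float], trailing: list[float], lag: int) -> float:
    """Correlate leading[t] with trailing[t + lag]."""
    if lag < 0 or lag >= len(leading):
        raise ValueError("lag must be between 0 and len(leading) - 1")
    if lag == 0:
        return pearson(leading, trailing)
    return pearson(leading[:-lag], trailing[lag:])


# Hypothetical series, one sample per interval. The response times were
# constructed to follow memory pressure two intervals later.
memory_pressure = [1.0, 2.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
response_time_ms = [5.0, 5.0, 10.0, 20.0, 40.0, 30.0, 60.0, 50.0]

same_time = lagged_correlation(memory_pressure, response_time_ms, 0)
two_ahead = lagged_correlation(memory_pressure, response_time_ms, 2)
```

Here two_ahead is stronger than same_time, so memory pressure would be a useful early signal for response time in this made-up data. Note that nothing in the code knows why the two move together, which is exactly the correlation-versus-causality limit described above.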

One wrinkle is that if you’re dealing with an infrastructure that’s inherently unpredictable—say, subject to random spikes in demand—it’s very, very difficult to predict failure. Difficult, but not impossible, because predictive IT analytics vendors realize that no single technique is going to cut it. Modern systems are hybrids, combining multiple types of analysis with dependency graphs that link the discrete elements of a given critical application or infrastructure element into functional units, which are then correlated, analyzed, and sliced in many ways to yield useful analytical results.
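As a simple picture of what a dependency graph buys you, the Python sketch below walks a hand-written graph to collect every element a critical application relies on, directly or indirectly. The component names and the functional_unit helper are hypothetical; real products discover and maintain these graphs in their own ways.

```python
def functional_unit(depends_on: dict[str, list[str]], root: str) -> set[str]:
    """Return root plus everything it depends on, directly or transitively."""
    seen: set[str] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already visited; also guards against cycles
        seen.add(node)
        stack.extend(depends_on.get(node, []))
    return seen


# Hypothetical dependency graph: each key depends on the listed elements.
depends_on = {
    "payments-app": ["payments-db", "core-switch"],
    "payments-db": ["vm-host-07", "san-array-2"],
    "vm-host-07": ["core-switch"],
    "reporting-app": ["reporting-db"],
}

payments_unit = functional_unit(depends_on, "payments-app")
```

Metrics from every member of payments_unit can then be analyzed together, while the unrelated reporting stack stays out of the picture.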

In the full InformationWeek Predictive Analytics IT Pro Impact report, we delve into exactly how these systems work. But the general process for enabling incident prediction follows a similar pattern regardless of which package you select.

First, metrics are collected from all applicable infrastructure silos and defined for the analysis engine. Then these variables are put into “buckets,” which contain data that together predicts the health of a given system. Using a database utilization example, we might put the application monitor, underlying network, storage system, and virtualization host metrics in one bucket and label it “Mission-Critical Application.” The analytics system then knows that these metrics are connected and begins developing models from there.
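The bucket idea can be sketched as a small data structure. The Python below is illustrative only: MetricBucket, the metric names, and the sample values are assumptions of ours, not any product's schema.

```python
from dataclasses import dataclass, field


@dataclass
class MetricBucket:
    """A labeled group of metric series that together describe one system."""
    label: str
    metrics: dict[str, list[float]] = field(default_factory=dict)

    def add_sample(self, metric_name: str, value: float) -> None:
        self.metrics.setdefault(metric_name, []).append(value)

    def aligned_rows(self) -> list[dict[str, float]]:
        """One row per time step, truncated to the shortest series."""
        names = sorted(self.metrics)
        series = (self.metrics[name] for name in names)
        return [dict(zip(names, row)) for row in zip(*series)]


# Hypothetical samples from four silos, all filed under one bucket.
bucket = MetricBucket(label="Mission-Critical Application")
samples = {
    "app_monitor.response_ms": [120.0, 135.0, 410.0],
    "network.link_utilization_pct": [35.0, 38.0, 72.0],
    "storage.disk_queue_length": [1.0, 2.0, 9.0],
    "vm_host.cpu_ready_pct": [2.0, 3.0, 11.0],
}
for name, values in samples.items():
    for value in values:
        bucket.add_sample(name, value)

rows = bucket.aligned_rows()
```

Each row lines up one reading from every silo for the same interval, which is the shape a modeling step would need in order to treat the metrics as connected.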

Technically, you could also do this with a pure statistical analysis product, like the SPSS SDK offered by IBM. But without a great deal of expertise, you’d be in for a world of hurt. The beauty of predictive analytics systems designed for IT is the ease with which we can perform preliminary correlation without extensive specialized training.

But don’t lose sight of the ultimate goal of predictive analytics systems: enabling automation.

Do You Trust Us?

All the predictive data in the world does us no good if we’re not acting on it. In our most recent InformationWeek IT Automation Survey, fewer than half (40 percent) of our 388 respondents describe their organizations’ use of data center automation tools as significant or extensive. Those with little or no use of data center automation blame too many other high-priority projects and too little budget. We get that, but look at the results when we asked about frequency of network problems: 27 percent of respondents encounter issues every day or week; 32 percent say they have server problems just as often. Especially with massive, distributed infrastructures, there isn’t always time for human operators to verify every recommendation before taking preventative action, perhaps as simple as moving a struggling virtual machine to a host with more memory.

So the question is, in a wide-open, highly virtualized, or cloud infrastructure, why can’t the analytics system itself drive that automation? That’s exactly what vendors have in mind.

The catch? IT automation is a matter of trust, and we’re not the profession most likely to take a leap of faith. “Automation earns its way into an organization,” says Ventana’s Menninger, starting with manual reviews of the suggested actions and then, over time, more automation. “It’s never fully on autopilot.”

Jeff Scheaffer, HP’s director of business service management, says he sees customers using HP’s Service Health Analyzer to incorporate analytical responses directly into automation runbook systems and using analytical data to take direct action without any human intervention. We’re betting your first thought is: “Too risky.” But it’s not such a hare-brained idea. Early adopters have expert teams poised to evaluate the recommendation to trigger a runbook action. Only after a few recommendations are vetted does the team begin to let the analytics system drive automation. The system is permitted to trigger some simple tasks; those with potentially negative consequences are still evaluated by the infrastructure team. As trust grows, so does automation.
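That graduated approach can be expressed as a small policy gate. The Python sketch below is our own illustration, not HP's implementation: the threshold of three vetted recommendations, the rule that a bad recommendation resets trust, and all of the names are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = "low"    # simple tasks, e.g. moving a virtual machine
    HIGH = "high"  # tasks with potentially negative consequences


class Decision(Enum):
    AUTO_EXECUTE = "auto_execute"
    HUMAN_REVIEW = "human_review"


@dataclass
class TrustGate:
    required_vetted: int = 3  # assumed threshold; tune to your comfort level
    vetted_ok: int = 0

    def record_review(self, recommendation_was_sound: bool) -> None:
        """Log the expert team's verdict on one recommendation."""
        if recommendation_was_sound:
            self.vetted_ok += 1
        else:
            self.vetted_ok = 0  # assumed policy: a bad call resets trust

    def decide(self, risk: Risk) -> Decision:
        """Only low-risk actions run unattended, and only once trust is earned."""
        if risk is Risk.LOW and self.vetted_ok >= self.required_vetted:
            return Decision.AUTO_EXECUTE
        return Decision.HUMAN_REVIEW


gate = TrustGate()
before_vetting = gate.decide(Risk.LOW)
for _ in range(3):
    gate.record_review(recommendation_was_sound=True)
after_vetting = gate.decide(Risk.LOW)
risky_action = gate.decide(Risk.HIGH)
```

In this sketch, high-risk actions always go to the infrastructure team no matter how much trust has accumulated, which mirrors the "never fully on autopilot" point above.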

Is Your Network Ready?

Not all analytics suites are created equal in terms of IT focus, and they’re expensive to buy and set up. Netuitive, for example, prices its product based on the number of managed “elements,” which could be a physical component (such as a server or a virtualization host) or something like a specific transaction set that makes up a business process (such as payments). It also levies a small charge for each integration to a data source. Netuitive says deals are typically in the low six figures but can reach eight figures for the largest customers.

Expect to do a detailed evaluation. Consider how well your systems will integrate. If an operating system or network doesn’t have the instrumentation to describe its own behavior, these analytics tools will predict squat. Start by reviewing case studies and references. A solid predictive analytics system should be able to do a couple of things almost out of the box: 

It should quickly and easily plug into your data silos for collecting metrics and begin to develop correlations. If your network/system management product is already collecting metrics from pretty much everything, you’re off to a good start. If you have to manually integrate the analysis engine into metric streams at the hardware, software, and application levels, you’re not. 

It should then begin predicting performance across multiple categories. If the system under evaluation doesn’t generate useful predictive information within a very short time period—a month, tops—keep looking. After all, these systems are engineered to handle volumes of data and correlation that would boggle manual analysis techniques. If an analytics product can’t handle your network without handholding, throwing labor at it isn’t going to help.

And there’s no substitute for testing. All the major vendors will cite Bayesian logic, neural networks, and multivariate regression techniques, but the ways in which these techniques are implemented and correlated dictate how effectively the system predicts problems. IBM’s acquisition of SPSS added a bewildering 350 algorithms to the company’s analytical portfolio, for example.

Another important point to remember is that you have to hook in all your data streams for these systems to be of any use. No metrics, no predictive analytics. A powerful network and systems management product that’s able to collect metrics from all corners of the infrastructure and feed the analytics suite is worth its weight in proprietary API calls.

What About SMBs?

Predictive analytics software for IT is not yet available as a service, though it’s only a matter of time. Predixion Software, Opera Solutions, and other vendors offer general predictive analytics in the form of software as a service. Apptio focuses on IT management and planning in a software-as-a-service model, but it doesn’t yet have predictive analytics, Ventana’s Menninger says.

TDWI’s Stodder has a slightly different take. “The SaaS model could be compelling,” he says. “But the issue is, how do the tools get to the data? If they have to embed a lot of code or sensors into the on-premises systems, at some point, one has to ask just how much of the product is really SaaS?” One option is that the dashboard or portal is SaaS, perhaps lowering the cost of having in-house people develop the interface. A more likely scenario, Stodder says, is that the analytics is offered as part of a platform- or infrastructure-as-a-service offering.

At that point, the technology will be more suitable for smaller shops. For now, though, you need to run performance monitoring at almost every level of the stack and have some automation and analytics expertise in place.

*    *    *    *    *

Predictive IT analytics use may be in its infancy, but there’s been considerable development in the last year. “The problem is the ‘shoes for the cobbler’s children’ thing,” says Stodder. “IT usually does not have the budget for analytics. However, as IT becomes a critical part of customer-facing operations, we could see more use.” Along with analytics for infrastructure monitoring, Stodder expects these systems will be used to improve the alignment of IT with business objectives. For example, executives could do “what if” analyses to understand the IT resources that would be required for an initiative.

With technology available to safeguard critical applications, aid in business alignment, and help with convergence, can you really afford not to know what’s coming next?


For more of Jake’s take on predictive analytics, check out the InformationWeek Analytics report “Predictive Analytics for IT” at (ID = S4530312).


Jake McTigue is the IT manager for Carwild Corp. and a senior consulting network engineer for NSI. He is responsible for IT infrastructure and has been involved in server virtualization since 2002. He has been a project lead on consolidation and virtualization projects for public safety, education, and private-sector applications, and he has been instrumental in articulating the benefits of virtualization for organizations all over the northeast.


