Shining a Light on IT Operations Metrics

Massive amounts of data streaming out of IT systems can help administrators and others stay up to date on system performance and the overall end-user experience. Here are some key metrics that can help you measure what’s actually happening.

by Jay Menon
Date Published July 27, 2021 - Last Updated July 26, 2021

This article was originally published in InformationWeek.

Modern IT departments operate in a pressure cooker. Their highly distributed infrastructures are spread across legacy data centers and hybrid and multiple public clouds. Now, the Internet of Things and the edge are part of the mix.

Massive amounts of data flow through these multifaceted environments, and it falls on IT to make sense of it all, ensuring that small issues don't escalate into serious problems and that the infrastructure can deliver measurable outcomes for the business. Dashboards are instrumental for visualizing and quickly making sense of that data.

Metrics, which are numeric representations of data measured over time, give IT operators and site reliability engineers (SREs) a window into how a system has behaved historically. That data provides insight into how the systems should perform in the future and aids investigations when something goes awry. IT monitoring tools act as giant data lakes holding time series data, while dashboards give everyone from IT administrators to C-level executives ways to easily digest the data.

The visualization market is evolving. We’re seeing different kinds of databases, such as Prometheus, which bring with them new tools to leverage the data. Prometheus, together with its query language PromQL, is growing in the DevOps space and becoming a de facto standard for monitoring containers and microservices because it makes it easier to manage time series data. Prometheus also comes with client libraries that offer four core metric types:

Counter: Represents a single monotonically increasing counter where the value can only increase or be reset to zero on restart.
Gauge: Represents a single numerical value that can arbitrarily go up and down, such as CPU utilization.
Histogram: Samples observations and counts them in configurable buckets. It also provides a sum of all observed values.
Summary: Samples observations and provides a total count of observations and a sum of all observed values. It also calculates configurable quantiles over a sliding time window.
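
To make the differences between the four types concrete, here is a minimal, library-free Python sketch of their semantics. It is a toy illustration, not the Prometheus client library: the class and method names are hypothetical, the histogram stores per-bucket counts and derives the cumulative view that Prometheus exposes, and the summary approximates the sliding time window with a fixed number of recent observations.

```python
from bisect import bisect_left
from collections import deque
import math


class Counter:
    """Toy counter: the value only goes up (a restart would start it at zero)."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class Gauge:
    """Toy gauge: a single value that can move up or down arbitrarily."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value


class Histogram:
    """Toy histogram: counts observations per bucket and tracks their sum."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds)
        # One slot per upper bound, plus a final catch-all (+Inf) slot.
        self.bucket_counts = [0] * (len(self.upper_bounds) + 1)
        self.count = 0
        self.sum = 0.0

    def observe(self, value):
        # bisect_left places a value equal to a bound into that bound's bucket.
        self.bucket_counts[bisect_left(self.upper_bounds, value)] += 1
        self.count += 1
        self.sum += value

    def cumulative(self):
        # Running totals: each entry counts observations at or below its bound.
        totals, running = [], 0
        for bucket in self.bucket_counts:
            running += bucket
            totals.append(running)
        return totals


class Summary:
    """Toy summary: count, sum, and quantiles over the most recent observations.

    Assumption: a count-based window stands in for a sliding time window.
    """

    def __init__(self, window=100):
        self.count = 0
        self.sum = 0.0
        self._recent = deque(maxlen=window)

    def observe(self, value):
        self.count += 1
        self.sum += value
        self._recent.append(value)

    def quantile(self, q):
        if not self._recent:
            return math.nan
        ordered = sorted(self._recent)
        rank = max(1, math.ceil(q * len(ordered)))  # nearest-rank method
        return ordered[rank - 1]


# Hypothetical usage: five request latencies (in seconds) and two CPU readings.
requests = Counter()
cpu_percent = Gauge()
latency_hist = Histogram([0.1, 0.5, 1.0])
latency_summary = Summary(window=5)

for seconds in [0.05, 0.2, 0.7, 0.1, 3.0]:
    requests.inc()
    latency_hist.observe(seconds)
    latency_summary.observe(seconds)

cpu_percent.set(72.5)
cpu_percent.set(41.0)  # a gauge may go down as well as up
```

In this sketch the histogram and the summary both record a count and a sum. The difference is that the histogram keeps bucket counts, which can be aggregated later, while the summary computes its quantiles directly from the recent observations.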

There’s more to learn about this subject. You can read the full article at InformationWeek.

Jay Menon is Assistant Product Manager at OpsRamp.
