How to measure observability
It's easy to make something complicated, but it’s hard to make something simple. In my recent white paper, Add observability to your monitoring strategy, I strove to make understanding observability simple.
I wanted to clarify this very important (and hot) topic so that anyone, irrespective of profession, domain, or subject knowledge, could understand observability. I defined observability as the use of data, in context, to understand your IT business environment, so you can act now and plan for tomorrow. It is a nice, simple sentence that anyone can understand, and it applies to any scenario, with any tech stack, in any industry.
Simply knowing what observability means is not enough. When I ask IT owners if they have observability, their answers range from a confident yes to a categorical no. However, when I ask, “How do you know?” there is one answer that I receive more than any other… “Well, we’re not exactly sure, but it’s what we think.”
This sounds like a lot of confusion to me. And that’s why I created the observability matrix.
What is it?
The observability matrix is a tangible tool that’s product and industry agnostic. It measures observability maturity, that is, how well you understand the data you deal with.
Why use it?
It helps you deliver your strategic observability goals. As any seasoned programme manager, product manager, or engineer (or anyone else involved in product, service, or goal delivery) will tell you, it is easier to deliver something if you can watch it and measure it. Because the observability matrix measures observability in a way anyone can understand, you can use it to give clarity to everyone involved in the execution plan for your strategic observability goals.
The ITRS Observability Matrix
I’ve designed a chart that allows you to plot a measurement of observability against the type of data being monitored.
Observability maturity (x-axis)
If observability is another word for understanding, then you should measure observability based on the ability to understand. Steps 1-5 on this axis show the progressive outcomes on the way to operational resilience, with each step being more difficult to achieve than the one before.
Each step is simple to understand and gives you a tangible outcome to achieve. If that’s not clear enough, you can add the functionality needed to achieve each step. The beauty of this matrix is that the functionality can change over time as new functions get invented, but the five outcome-based steps will remain the same.
Remember that question I ask clients? How do they know if they are achieving observability? Now I can point them to each step on the x-axis and say this is what it looks like. Can you, or can you not, do this? Here’s the functionality you need to achieve it. If you don’t have this functionality, then you can’t do it.
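The yes-or-no check described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the functionality names (`function_a` and so on) are hypothetical placeholders, not the functionality listed on the actual matrix, and I am assuming the steps are cumulative, so your level is the highest step you reach without skipping one.

```python
from typing import Dict, Set

# Hypothetical mapping of each step (1-5) to the functionality it needs.
# The names are placeholders; substitute the functionality from your own matrix.
REQUIRED: Dict[int, Set[str]] = {
    1: {"function_a"},
    2: {"function_b"},
    3: {"function_c", "function_d"},
    4: {"function_e"},
    5: {"function_f"},
}


def observability_level(available: Set[str],
                        required: Dict[int, Set[str]] = REQUIRED) -> int:
    """Return the highest step reached without a gap (0 if step 1 is not met)."""
    level = 0
    for step in sorted(required):
        # A step counts only if every piece of functionality it needs is present.
        if not required[step] <= available:
            break  # assumption: steps are cumulative, so stop at the first gap
        level = step
    return level
```

For example, a team that has `function_a`, `function_c`, and `function_d` but not `function_b` would sit at step 1 in this sketch, because step 2 is the first one it cannot do.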
Monitoring maturity (y-axis)
Showing observability maturity is not enough. Remember, the tool needs to help deliver strategic observability goals. So, I added a second axis to show the type of data being monitored. This is the tried-and-tested monitoring maturity scale that has been used for years to denote the type of monitoring a client wishes to achieve.
There are two benefits to doing this:
1. Users are only interested in achieving observability within their world. Their world is defined by the scope of the data to be monitored. For example, an infrastructure team only wants to achieve infrastructure observability.
The reality is, they have no interest in whether payments have been settled on time. Having both axes allows you to plot differing use cases on one chart, giving you an enterprise view.
2. The more complex the data, the harder observability is to achieve.
For example, Level 5 observability for one user executing one action on one application on one server is simple to achieve. Doing the same across a multi-region, multi-institution, multi-tech stack with dynamic, on-demand entities that come and go is far more difficult.
Showing this on the chart allows you to recognise this fact when creating execution plans. We noted this with the “Easier” and “Harder” tags in two of the chart corners.
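To make the idea of plotting differing use cases on one chart concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not the matrix itself: the `UseCase` type, the `enterprise_view` function, and the numeric monitoring levels are all hypothetical, and the positions in the usage note are invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class UseCase:
    name: str
    observability_step: int  # x-axis position, 1-5
    monitoring_level: int    # y-axis position (hypothetical numeric scale)


def enterprise_view(use_cases: List[UseCase]) -> Dict[Tuple[int, int], List[str]]:
    """Group use cases by their (x, y) cell so they can be read as one chart."""
    cells: Dict[Tuple[int, int], List[str]] = defaultdict(list)
    for uc in use_cases:
        if not 1 <= uc.observability_step <= 5:
            raise ValueError(f"{uc.name}: observability step must be 1-5")
        cells[(uc.observability_step, uc.monitoring_level)].append(uc.name)
    return dict(cells)
```

Calling `enterprise_view([UseCase("infrastructure team", 3, 1), UseCase("payments team", 2, 4)])` would place each team in its own cell, which is the enterprise view in miniature.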
How do you fare on the matrix?
Many organisations have transaction-level monitoring and some pockets of time-series data. Few, however, have a comprehensive approach that aggregates data to understand the impact of IT performance on transactions or customer experience.
We want to help you with your observability quest. To find out how we can help, please read the white paper, Add observability to your monitoring strategy.