Observability is a management strategy focused on keeping the most relevant, important and core issues at or near the top of an operations process flow. This first edition of the Performance Lab was the perfect opportunity to gather with a group of invitees and explore how companies use Observability to improve online performance and user experience.
On November 2nd, MeasureWorks gave several speakers the opportunity to talk about the impact of performance and Observability on their online operations. The speakers were Ayeleth Klokman of UWV, Myrese Sonneville of HEMA, Jeroen Tjepkema of MeasureWorks and Henrix Rexed of Dynatrace. New insights were shared under the motto ‘Get fast, stay fast’.
You can read more about the speakers down below.
UWV: Managing customer experience
Ayeleth Klokman from UWV was the first to share her experience with Observability. UWV is the largest service provider in the public domain. The organization uses over 200 different applications to process client data, which is a lot of silos to keep track of. Ayeleth is part of the UWV monitoring center (UMOC), which is responsible for delivering application insights to the individual departments within UWV. With so many moving parts, this can be a daunting task. Users complain when things are not working, and at the same time the internal departments are critical of delivery quality.
To make sure the UMOC has a clear overview, they use a method called application chain monitoring. This way of monitoring allows UWV to track all operational processes, both offline and online, from the top down. From functional processes to individual components, they can automatically identify when something isn’t going as planned and zoom in on the elements that need improvement. On top of that, there’s also real user monitoring, which keeps track of multiple data streams from the end user’s perspective.
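To make the top-down idea concrete, here is a minimal sketch of how such a chain could be modelled. It is not UWV's implementation: the `Node` class, its methods and the example chain are all hypothetical and invented for illustration. A parent element is treated as healthy only when every element below it is healthy, and a failing path can be followed down to the component that causes it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One element of the chain: a process, an application or a component."""
    name: str
    healthy: bool = True  # only meaningful for leaf components
    children: list["Node"] = field(default_factory=list)

    def is_healthy(self) -> bool:
        # A leaf reports its own status; a parent is healthy only if all children are.
        if not self.children:
            return self.healthy
        return all(child.is_healthy() for child in self.children)

    def failing_paths(self, prefix: str = "") -> list[str]:
        # Zoom in from the top: return the full path to every unhealthy leaf.
        path = f"{prefix} > {self.name}" if prefix else self.name
        if not self.children:
            return [] if self.healthy else [path]
        paths: list[str] = []
        for child in self.children:
            paths.extend(child.failing_paths(path))
        return paths


# Hypothetical chain; all names are made up for this example.
chain = Node("Benefit application", children=[
    Node("Web portal", children=[Node("Login service"), Node("Form renderer")]),
    Node("Back office", children=[Node("Document store", healthy=False)]),
])

print(chain.is_healthy())
print(chain.failing_paths())
```

In this sketch a single unhealthy component makes the whole functional process report a problem, and `failing_paths` shows where to zoom in.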
The foundation of the application chain monitoring is their Observability monitoring environment. This setup gives them insight into infrastructure and application performance. By using OpenTelemetry, they can import outside sources into the same environment, giving them better insight into their total business processes. By evaluating the following four steps, UWV knows whether they did the right thing for their internal customers and end users:
- Understanding the customer (Which answers do you need? What is your route?)
- Fit for purpose (Making sure they don’t spend too much money and seeing which applications or individual components are most important)
- Requirements of the customer (Which adjustments are needed? Is everything done in time? Does this happen through the right authorizations, etc.)
- Monitoring as a service (Making changes as soon as possible, looking for what’s needed most, taking everything into account)
“We often hear: ‘Oh, is that possible? That’s exactly what we’re looking for!’” (Ayeleth Klokman, UWV)
All of this ensures that the internal departments get the right amount of data for the precise things they want. Ayeleth summarized it in her presentation by saying, “We often hear: ‘Oh, is that possible? That’s exactly what we’re looking for!’”
HEMA: The importance of site speed and awareness
HEMA is one of the most famous Dutch brands, and it recognized that the performance of its webshop is part of its brand experience. Myrese Sonneville is one of the three people tasked with improving site speed and climbing the SEO ranks. For the past two years, the goal has been to always stay close to what customers want. But how do you achieve this through SEO and site speed?
The trigger for HEMA to invest in site speed was the rising number of customer complaints. The site would become too slow and tough to navigate, especially during high traffic. Using customer service, Mopinion and NPS data, HEMA started to focus on the main issues. They took the first steps with the help of two external parties. The developers of Emakina improved HEMA’s web performance (and therefore site speed as well), while a Google consultant helped improve their SEO performance by reducing the loading time and creating a long-term vision.
“The important benefit of our approach is that site speed starts to come alive within HEMA” (Myrese Sonneville, HEMA)
But HEMA wanted stronger results and started working with MeasureWorks for the final steps. In Q2 of this year, they created a Scrum team and integrated MeasureWorks into it. This enabled them to improve the site speed and reach their targets for the mobile and desktop website. They did this by initially focusing on two KPIs, Largest Contentful Paint (LCP) and Cumulative Layout Shift (CLS), and on high-traffic pages such as the product page and listing page, while continuously measuring and sharing progress through web performance data.
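As an illustration of what tracking these two KPIs can look like, here is a minimal sketch that rates a page against the thresholds from Google's public Core Web Vitals guidance (LCP good up to 2.5 seconds and poor above 4.0 seconds; CLS good up to 0.1 and poor above 0.25, both assessed at the 75th percentile of page loads). The function names and the sample values are assumptions made for this example; they are not HEMA's data or tooling.

```python
import math

# Thresholds from Google's Core Web Vitals guidance:
# (upper bound for "good", upper bound for "needs improvement").
THRESHOLDS = {
    "LCP": (2.5, 4.0),   # seconds
    "CLS": (0.1, 0.25),  # unitless layout-shift score
}


def percentile_75(samples: list[float]) -> float:
    """Nearest-rank 75th percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def rate(metric: str, samples: list[float]) -> str:
    """Rate a metric as good, needs improvement or poor based on its 75th percentile."""
    good, needs_improvement = THRESHOLDS[metric]
    p75 = percentile_75(samples)
    if p75 <= good:
        return "good"
    if p75 <= needs_improvement:
        return "needs improvement"
    return "poor"


# Invented measurements for a hypothetical product page.
product_page_lcp = [1.8, 2.1, 2.2, 2.4, 2.6, 3.1, 1.9, 2.0]
product_page_cls = [0.02, 0.05, 0.12, 0.30, 0.08, 0.15, 0.04, 0.20]

print("LCP:", rate("LCP", product_page_lcp))
print("CLS:", rate("CLS", product_page_cls))
```

Rating the 75th percentile rather than the average keeps the focus on what most visitors actually experience, including the slower page loads.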
But what was the most important part of success? Making sure the entire organization felt the importance of this task. “It’s important that site speed starts coming alive within HEMA”, were the exact words of Myrese. Raising awareness, inspiring people and encouraging them to improve together is the real way to push forward. It’s by standing on a soap box and telling people from every team why what you’re doing matters so much. That’s what gives them the drive and motivation to do this as a team.
It’s not without reason that Myrese gave three pieces of advice at the end of her talk:
- Have a clear focus and integrate expertise into your team
- Frequently share your knowledge with other teams within your company
- Keep focus (and faith)
MeasureWorks: Why Observability matters
Both HEMA and UWV showed us cases of Observability, but each approached it from a different perspective. This raises the question: what is Observability? In his presentation, Jeroen Tjepkema set out to define Observability and explain why it matters.
By definition, Observability is a performance management strategy focused on keeping the most relevant, important and core issues at or near the top of an operations process flow. In other words, remove the noise and focus on what matters right now.
But how do you filter noise? The foundation is measuring performance itself. Performance is the sum of site speed, reliability, availability, and reachability. If that sum is positive, this is called time neutralization: the moment when the quality of delivery is noticeable but does not influence the user’s preference for one service over another. That is when you gain ground on your competitors or become more productive.
Measuring performance is not easy. We can measure anything, from clicks to infrastructure to databases. We build dashboards to visualize and create thresholds for alerting. But how do you know what’s right?
There are three key takeaways for building a solid performance strategy:
- 1. Look at the whole, not the silo: the most important factor in any monitoring strategy is to be complete in what you measure. The best way to make sense of the complexity of your application delivery chain is to look at the whole picture rather than splitting it into silos. This is called system thinking.
- 2. Measure from the outside in: 100% CPU by itself means nothing, unless you know that you have 300% more visitors than usual. Behavioral trends are the best leading indicator of performance errors. Therefore, always start from the user’s perspective and then connect the dots by measuring application, network and infrastructure metrics.
- 3. Force outliers: measuring the behavior of a complete system turns the separate silos into noise and allows the actual anomalies to stand out. By measuring the behavior of all elements as a whole, we turn causality into a likelihood.
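The CPU example in the second takeaway can be sketched in a few lines. The code below is a simplified illustration, not the method presented at the event: it assumes that CPU load is judged relative to visitor numbers and uses a plain z-score against recent history to decide whether a sample stands out. The function names, the limit of 3.0 and all sample values are hypothetical.

```python
from statistics import mean, stdev


def cpu_per_visitor(cpu_percent: float, visitors: int) -> float:
    """CPU cost of serving one visitor; guards against division by zero."""
    return cpu_percent / max(visitors, 1)


def is_anomalous(
    history: list[tuple[float, int]],
    current: tuple[float, int],
    z_limit: float = 3.0,
) -> bool:
    """Flag a sample whose CPU cost per visitor deviates strongly from the baseline."""
    baseline = [cpu_per_visitor(cpu, visitors) for cpu, visitors in history]
    mu, sigma = mean(baseline), stdev(baseline)
    value = cpu_per_visitor(*current)
    if sigma == 0:
        return value != mu
    return abs((value - mu) / sigma) > z_limit


# Invented history of (CPU %, visitors) pairs under normal conditions.
history = [(25.0, 1000), (26.0, 1050), (24.0, 980), (27.0, 1100), (25.5, 1020)]

busy_day = (100.0, 4000)   # 100% CPU, but with 300% more visitors than usual
real_issue = (100.0, 1000)  # 100% CPU at normal traffic

print(is_anomalous(history, busy_day))
print(is_anomalous(history, real_issue))
```

In this sketch the busy day is not flagged, because the CPU cost per visitor is in line with the baseline, while the same CPU reading at normal traffic stands out as an anomaly. The raw infrastructure metric only becomes meaningful once it is connected to user behavior.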
With Observability, consider system thinking
Why does Observability matter so much? It boils down to avoiding failures. The last thing any of us wants is to be in the news because of malfunctions, a lack of capacity, or long loading times. In the real world, long queues are easy to spot. When you’re in a traffic jam or waiting at the supermarket (or even worse nowadays, at the airport), you can see the queue grow and adjust your expectations. Online queues are a whole different story. Users are simply browsing and assume everything works, with no visibility of any queue. If it doesn’t work, they have only a few options: refresh or click away.
However, there is a good way to trace when it’s time to step in. And that’s where Observability comes into play. By having insights into all your data, you can – in this case – recognize patterns and anticipate when you need to step in before failure happens.
When executed right, Observability becomes your superpower. Who doesn’t want that?
You can view all slides of the MeasureWorks presentation here as visual support. If you have any questions about the presentation, make sure to reach out to us. We will answer them quickly.
If you want to know more about Observability at MeasureWorks, it may be time to get your Observability Assessment. We collect telemetry data from the entire application chain and transform it into preventative actions. Some examples are identifying the root cause faster, predictive alerting and a better user experience for happy users.
You can get your 30-day free trial now and we’ll get you started within 24 hours.