Note: This post was originally written for the Netreo blog. You can check out the original here.
With the rapid pace of change in applications and the infrastructure they run on, it’s more important now than ever to monitor what’s happening. Using an application performance monitoring (APM) tool can help provide the necessary visibility and insight you need for proactive and reactive problem resolution. But whether you’re looking for a replacement or have no monitoring in place, where do you start?
The APM market is flooded with monitoring tools. There are front-end tools, back-end tools, and full-stack monitoring tools. The options seem almost limitless. Do you just go with the tools that make Gartner’s Magic Quadrant? Maybe. But I’d argue that’s not the be-all and end-all. Choosing the right tool is an important decision. The cost of choosing the wrong monitoring tool, both in price and in people, could be the difference between troubleshooting through the night and sleeping through it.
In this post, you’ll learn about five critical capabilities you should look for when evaluating an APM tool.
APM Basics
Application performance monitoring refers to the process of collecting and analyzing data generated by application use across the infrastructure. An APM tool’s main job is to store all of this data and give you the visibility needed to assess an application’s performance. This visibility should help you meet your service-level agreements (SLAs) and keep users happy by keeping mean time to repair (MTTR) low.
Essential Capabilities
The five capabilities below are features that any APM tool you’re considering must have, at a minimum, to do its job.
1. Efficient and Scalable Data Collection
As application use generates data, the APM tool needs to be able to collect and store all of it. Generated data takes various forms, like metrics and logs, both structured and unstructured. Data can also vary depending on the programming language used in your applications. All of this collection must work across complex application architectures that span on-prem and multiple cloud networks. Your APM tool has to be able to collect this data into its storage system and keep it there until you need it.
Data collection and storage are often an issue for APM tools. Some can collect all of your data, but doing so takes too much time. Others solve this by collecting and storing only a sample of the data, but then you lose the granularity needed to find a needle in a haystack. An APM tool should be able to collect and store data effectively and do it at scale when required.
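To make the sampling trade-off concrete, here is a minimal sketch. It assumes a very simple "keep every n-th record" sampler and made-up response times; real APM tools use more sophisticated sampling strategies, so treat the names and numbers as purely illustrative.

```python
def sample_every_nth(records, n):
    """Keep every n-th record, starting with the first one."""
    return records[::n]


# Hypothetical response times in milliseconds: 1,000 normal requests,
# one of which (at index 437) is a severe outlier.
response_times_ms = [100] * 1000
response_times_ms[437] = 9000

# Store only one record in ten.
sampled = sample_every_nth(response_times_ms, 10)

full_max = max(response_times_ms)   # worst case seen with full collection
sampled_max = max(sampled)          # worst case seen in the sample

print(f"records kept: {len(sampled)} of {len(response_times_ms)}")
print(f"slowest request (full data): {full_max} ms")
print(f"slowest request (sampled):   {sampled_max} ms")
```

In this sketch the sample keeps indexes 0, 10, 20, and so on, so the slow request at index 437 never reaches storage. The sampled data looks healthy even though a user waited nine seconds.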
2. Effective Root Cause Analysis
When an application running on your infrastructure has a problem, you need a way to get to the root cause of this problem as quickly as you can. Your users likely don’t care that you and your team need to triage the situation and track down the issue. You need an APM tool that helps parse through all of the data it has collected about your application and its infrastructure and tells you exactly what the cause of the problem is. At a minimum, the tool should help reduce the analysis you need to do by specifying the infrastructure component or lines of code that are likely a source of the problem.
Root cause analysis has been a key requirement of all APM tools for many years, with many sadly falling short. With today’s much more complex applications and services, this is a must. Some APM tools utilize artificial intelligence to develop patterns from the data being collected. These patterns help the tool more quickly identify the cause of any performance degradation and also notify you of the specifics. Something that is becoming more common is AIOps, which uses AI capabilities to automatically fix certain operations issues for you. An APM tool that doesn’t utilize AI to help with IT operations is becoming less useful in today’s big data world.
3. Anomalous Behavior Detection
Today’s applications are very complex, and they sometimes run in even more complex environments. Your organization may still have old monolithic applications running alongside new cloud-native applications across public and private cloud infrastructure. The idea of looking at a map of your application’s architecture and spotting anything out of the ordinary is obsolete. You need to be able to identify when something that has never occurred before has happened. This is where anomaly detection can help.
An APM tool that can detect anomalous behavior can greatly reduce MTTR. It should be able to collect all the data in real time and alert you when something happens that hasn’t been seen before.
Relying on manually configuring all of your alerts is not an option, so you shouldn’t consider any APM tool that can only alert on rules you define yourself. You need a tool that can intelligently identify patterns of data that aren’t consistent with previous data and then let you know.
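As a rough illustration of comparing new data against previous data, the sketch below flags a value that sits far from a historical baseline using a z-score. This is one simple statistical technique chosen for illustration, not a description of how any particular APM product works. The function name, the threshold of three standard deviations, and the latency values are all assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history, value, threshold=3.0):
    """Return True if value is more than `threshold` standard
    deviations away from the mean of the historical values."""
    baseline = mean(history)
    spread = stdev(history)  # needs at least two historical values
    if spread == 0:
        # A perfectly flat history: any different value is unusual.
        return value != baseline
    return abs(value - baseline) / spread > threshold


# Hypothetical response times (ms) observed for one endpoint.
history = [120, 118, 125, 122, 119, 121, 123, 120]

normal_reading = is_anomalous(history, 124)   # close to the baseline
spike_reading = is_anomalous(history, 400)    # far outside past behavior

print(normal_reading, spike_reading)
```

Nobody had to define a "latency above 300 ms" rule up front. The baseline comes from the data itself, which is the point of this capability.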
This can even extend to applications. Does the tool detect new applications running across the infrastructure and start collecting data on them? This is a must nowadays for an APM tool that monitors modern applications.
4. Application Transaction Tracing
With older monolithic applications, monitoring was a lot simpler. Sometimes all of the various components of an application were on one server, representing one application tier. Over time, applications would run on two or three different tiers, such as application and database tiers. In those architectures, it was easier to track down the various application transaction requests and events.
Now, with applications running as microservices, containers, and serverless functions in various infrastructure environments, doing this by hand is close to impossible. You need your APM tool to be able to identify an application transaction and trace it across all of the application’s components, no matter what its architecture is or where it runs in your infrastructure. This can help identify the exact location of a specific issue when needed.
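The idea of following one transaction across components can be sketched with a few lines of code. The example assumes each component reports a "span" that carries a shared trace ID and the ID of the span that called it. The `Span` fields, service names, and durations are hypothetical, and the self-time calculation assumes child calls run one after another rather than in parallel.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: str             # shared by every span in one transaction
    span_id: str
    parent_id: Optional[str]  # None for the entry point of the transaction
    service: str
    duration_ms: float        # total time, including calls to children


def group_by_trace(spans):
    """Collect spans from all services into one list per transaction."""
    traces = defaultdict(list)
    for span in spans:
        traces[span.trace_id].append(span)
    return dict(traces)


def bottleneck(spans):
    """Return the span that spent the most time in its own work,
    that is, its duration minus the time spent in its direct children."""
    child_total = defaultdict(float)
    for span in spans:
        if span.parent_id is not None:
            child_total[span.parent_id] += span.duration_ms
    return max(spans, key=lambda s: s.duration_ms - child_total[s.span_id])


# Hypothetical spans reported by four different components.
spans = [
    Span("t1", "a", None, "gateway", 500),
    Span("t1", "b", "a", "orders", 450),
    Span("t1", "c", "b", "database", 400),
    Span("t1", "d", "b", "cache", 20),
    Span("t2", "e", None, "gateway", 30),
]

traces = group_by_trace(spans)
slowest = bottleneck(traces["t1"])
print(f"trace t1 is slowest in: {slowest.service}")
```

The gateway span has the longest total duration, but most of that time is spent waiting on downstream calls. Subtracting child time points at the database as the component worth investigating.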
Transaction tracing also helps you better understand how your applications work. Whether during development or in production, tracing helps give you some visibility into the inner workings of your applications. You can then see where in the code or the infrastructure an application may tend to suffer some performance degradation that may not be impacting users but can be a clear area of concern to address. Only consider an APM tool that provides this critical capability.
5. Business Impact Analysis
APM tools have historically been more of a concern for IT professionals. But with the shift to cloud infrastructure and cloud-native applications, where an organization pays per resource used, having more business-friendly interfaces is a critical capability. IT leaders need a way to utilize all this generated data to determine how the business is impacted.
One way an APM tool helps here is by providing visibility into how much of a cloud resource an application is using. For example, the APM tool can show that one application is using more cloud storage than others. Not only could this be affecting performance, but it could also be increasing costs for the business.
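A small sketch shows how usage data turns into a business-facing number. The application names, storage figures, and per-gigabyte price below are invented for illustration and do not reflect any real provider’s pricing.

```python
# Hypothetical price; real cloud pricing varies by provider, region, and tier.
PRICE_PER_GB_MONTH = 0.02

# Hypothetical storage usage per application, in gigabytes.
storage_gb = {
    "checkout": 500,
    "search": 12000,
    "reports": 1500,
}


def monthly_storage_cost(usage_gb, price_per_gb):
    """Convert per-application storage usage into a monthly cost."""
    return {app: gb * price_per_gb for app, gb in usage_gb.items()}


def top_consumer(costs):
    """Return the application with the highest cost."""
    return max(costs, key=costs.get)


costs = monthly_storage_cost(storage_gb, PRICE_PER_GB_MONTH)
most_expensive = top_consumer(costs)

for app, cost in sorted(costs.items(), key=lambda item: item[1], reverse=True):
    print(f"{app:<10} ${cost:,.2f} per month")
print(f"largest storage cost: {most_expensive}")
```

A ranked cost view like this is the kind of output a business-focused dashboard or report could present without requiring the reader to interpret raw metrics.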
An APM tool needs to be able to provide user interface views tailored to the business analyst, so the UI must be intuitive, interactive, and easy to use. Many in IT are entirely comfortable with a command line if needed. But that’s not the case for everyone, and certainly not for more business-minded leaders. The APM tool must allow teams to build dashboards and reports that provide business-focused insights and support application decisions that benefit the business.
Netreo Can Help
Netreo includes a suite of tools that can help you effectively monitor the performance of your applications. The Netreo platform includes various features that fit right into the five capabilities mentioned above.
Netreo’s Stackify Retrace is strong on collecting data because it’s capable of importing various forms of data from metrics to logs. It does this for applications written in over six languages, including Java, .NET, and Python. Collection can further include data from different infrastructure environments like Kubernetes and serverless functions. Retrace also provides insights into the ways that satisfied users are using an application from a performance perspective. This certainly can affect the business’s bottom line.
Netreo’s main platform product fulfills many of the five capabilities as well. It collects data from across various infrastructures and detects applications. Its cloud monitoring feature also gathers data from AWS, Azure, and GCP cloud applications and services. Netreo uses AIOps to process the large volumes of collected data, which can help find the root cause of application issues or any sudden anomalies. Its use of synthetic transactions to identify application dependencies that could adversely affect performance adds to its critical capabilities as well.
Go and Look
As you can see, there are many different capabilities to consider when choosing an APM tool. You learned about five of them. Each organization is different, so you may need some additional capabilities specific to yours.
There are many more APM tools to consider than there are capabilities. But if you focus on the tools that provide these critical capabilities and deliver them well, you’ll be well on your way to effective application performance monitoring.