5 reasons to avoid agents for application performance management

This vendor-written tech primer has been edited by Network World to eliminate product promotion, but readers should note it will likely favor the submitter's approach.

Despite rapid evolution in the application performance management (APM) market, few enterprise IT organizations would say they have sufficiently solved their application performance problems. If anything, the complexity challenges posed by virtualization, agile development practices, multi-tier application architectures and other IT mega-trends are outpacing the capabilities of legacy APM products.

In light of this, IT organizations must judiciously evaluate the effectiveness of technologies and management practices they use to manage application performance. One conventional APM practice that deserves some scrutiny is the derivation of application health and performance data from host-based instrumentation.

Most APM technologies rely on agents deployed on servers or within application components to gather diagnostic data. These agents typically perform byte-code instrumentation or call-stack sampling within the Java Virtual Machine (JVM) or the .NET Common Language Runtime (CLR) -- basically using profiling techniques common to software development tools.
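
To make the technique concrete, here is a minimal sketch of call-stack sampling. It is written in Python purely for illustration rather than for the JVM or CLR: a background loop periodically walks every thread's stack and tallies which functions appear most often. The worker function, sampling interval and duration are illustrative assumptions, not taken from any particular APM product.

```python
# Minimal sketch of call-stack sampling, the technique an in-process APM agent
# applies inside the runtime; shown here in Python purely for illustration.
import collections
import sys
import threading
import time

def sample_stacks(interval=0.01, duration=1.0):
    """Periodically capture every thread's call stack and count frame hits."""
    counts = collections.Counter()
    deadline = time.time() + duration
    while time.time() < deadline:
        for frame in sys._current_frames().values():
            # Walk each stack from the innermost frame outward.
            while frame is not None:
                counts[(frame.f_code.co_filename, frame.f_code.co_name)] += 1
                frame = frame.f_back
        time.sleep(interval)  # the sampling loop itself is the agent's overhead
    return counts

def busy_worker():
    total = 0
    for i in range(10_000_000):
        total += i * i
    return total

if __name__ == "__main__":
    threading.Thread(target=busy_worker, daemon=True).start()
    for (filename, func), hits in sample_stacks().most_common(5):
        print(f"{hits:5d}  {func}  ({filename})")
```

Production agents do the equivalent inside the runtime itself, typically through interfaces such as JVMTI or the CLR profiling API, which is part of why they are sensitive to runtime versions and configuration.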

Certainly, this practice can yield useful information for managing application performance, including memory usage and the frequency and duration of function calls. However, this legacy APM approach suffers from five inherent drawbacks that make it increasingly untenable in today's IT environments.

* Susceptibility to changes in application code, architecture and environment. During test and development, software engineers often use profilers to locate hot spots and remove bottlenecks in their code. Annotated source code and deep call stacks are fine for developers, but operations teams in production need to answer higher-level questions about application health and performance. To provide this view, agent-based APM tools require complicated configurations that are sensitive to changes in the application code, architecture or environment.

This limitation may not have been a serious problem in the static environments of the past, but today's applications undergo ongoing, iterative development, use loosely coupled multi-tier architectures, run on heterogeneous software and hardware platforms, and operate in virtualized environments where virtual machines are spun up, spun down and migrated across the data center. With such rapid change at the application tier, host-based data gathering requires continual recertification and redeployment to ensure that it is functioning properly.

* System and network overhead. APM vendors that rely on host-based data gathering claim that their approach imposes "minimal overhead" or "low overhead" on system performance, yet these vendors seldom offer guarantees. While the actual overhead incurred depends on the specificity of the data gathered and the application itself, less than 5% performance overhead is an optimistic general estimate.

Five percent overhead might be acceptable for some applications, but the problem is compounded when organizations use multiple monitoring tools, each with its own agent consuming system resources. In addition, host-based data gathering consumes bandwidth and creates noise on the network as data is sent to a central server, and it can perturb the very system being monitored.

* Management complexity. Host-based APM tools require agents or other collectors to be deployed and maintained on every system they monitor. They must be patched, updated and upgraded regularly. With each operating-system service pack, these tools need to be recertified. Many IT organizations take this management burden for granted, treating it as a necessary evil.

Newer entrants to the APM market have rightfully targeted traditional vendors such as CA and HP for the extraordinary time and cost required to deploy their technologies. These newer vendors skirt the complexity problem by limiting the deployment options and the level of detail of the collected data. In essence, these low-cost, host-based APM tools trade specificity for simplicity.

* Limited visibility of network performance. A strictly host-based view of application performance can provide only secondary indicators of network performance issues that affect application delivery. To compensate, most APM tools that rely on host-based data also offer a separate network-monitoring component.

* Skewed end-user experience measurements within virtualized environments. In virtualized environments, the hypervisor schedules CPU time across multiple guest operating systems, and as a side effect each guest's sense of time is distorted. When a guest is not scheduled, time stops from its perspective; when it is scheduled again, the hypervisor must catch it up by advancing time rapidly. This stopping and speeding up of time produces incorrect measurements whenever a host-based agent attempts to measure the end-user experience. In some cases, the agent can query the hypervisor to determine how long time was stopped, as sketched below. Although this workaround gives a rough sense of how inaccurate the metrics are, there is no definitive solution to the problem.
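
As a rough illustration of that workaround, the sketch below reads the "steal" time that KVM and Xen hypervisors expose to Linux guests in /proc/stat (CPU time the guest wanted but did not receive) and compares it before and after a timed operation. The timed operation and the reporting logic are assumptions for illustration; real agents rely on hypervisor-specific interfaces, and the check only flags a skewed measurement rather than correcting it.

```python
# Rough sketch: flag measurements taken while the hypervisor withheld CPU from
# this Linux guest, using the aggregate "steal" counter in /proc/stat.
import time

def read_steal_ticks(path="/proc/stat"):
    """Return the 'steal' field (in clock ticks) from the cpu summary line."""
    with open(path) as f:
        fields = f.readline().split()
    # fields: ['cpu', user, nice, system, idle, iowait, irq, softirq, steal, ...]
    return int(fields[8]) if len(fields) > 8 else 0

def timed_with_steal(fn):
    """Time a call and report how much steal time accrued while it ran."""
    steal_before = read_steal_ticks()
    start = time.monotonic()
    result = fn()
    elapsed = time.monotonic() - start
    return result, elapsed, read_steal_ticks() - steal_before

if __name__ == "__main__":
    _, elapsed, steal = timed_with_steal(lambda: sum(i * i for i in range(2_000_000)))
    print(f"measured {elapsed * 1000:.1f} ms, steal ticks during window: {steal}")
    if steal:
        print("hypervisor withheld CPU during the measurement; treat it as approximate")
```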

Not surprisingly, APM vendors that rely on host-based data minimize these limitations in their marketing claims. Fortunately, detailed health and performance information can be obtained without gathering data from the hosts at all.

Recent gains in processing power and storage capacity have made feasible a network-based APM approach that performs deep, real-time analysis of application transactions as they pass over the wire. By reassembling application transactions from network traffic and analyzing the application details contained at Layer 7, network-based APM can give IT teams valuable insight into application performance, such as a particularly slow database procedure, a specific web server error, or a method used in a transaction.
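
As a rough sketch of what this looks like in principle, the example below passively observes plaintext HTTP traffic, pairs each request with its response on the same TCP flow, and reports status codes and server response times without any agent on the monitored host. It uses the scapy packet library; the port, the request-matching logic and the output format are illustrative assumptions, and commercial systems reassemble complete TCP streams and parse many protocols rather than matching single packets as done here.

```python
# Illustrative wire-data sketch: pair HTTP requests with responses seen on the
# network (e.g. from a mirror/SPAN port) and report per-transaction latency.
from scapy.all import sniff, IP, TCP, Raw  # requires scapy and capture privileges

pending = {}  # (client_ip, client_port, server_ip, server_port) -> request timestamp

def observe(pkt):
    if not (pkt.haslayer(IP) and pkt.haslayer(TCP) and pkt.haslayer(Raw)):
        return
    payload = bytes(pkt[Raw].load)
    ip, tcp = pkt[IP], pkt[TCP]
    if payload[:4] in (b"GET ", b"POST", b"PUT ", b"HEAD"):  # request toward the server
        pending[(ip.src, tcp.sport, ip.dst, tcp.dport)] = pkt.time
    elif payload.startswith(b"HTTP/"):  # response back to the client
        started = pending.pop((ip.dst, tcp.dport, ip.src, tcp.sport), None)
        if started is not None:
            status = payload.split(b" ", 2)[1].decode(errors="replace")
            latency_ms = float(pkt.time - started) * 1000
            print(f"{ip.dst}:{tcp.dport} <- {ip.src}:{tcp.sport} "
                  f"status {status}, response time {latency_ms:.1f} ms")

if __name__ == "__main__":
    # Listen for plaintext HTTP on port 80 as a simple illustration.
    sniff(filter="tcp port 80", prn=observe, store=False)
```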

In addition, this approach extracts valuable network performance metrics from Layer 2 through Layer 4, providing detailed TCP analysis and other network-level information.

Now is the time to reassess old assumptions. Hard-pressed IT teams cannot afford the headaches of host-based instrumentation any longer, and IT organizations owe it to themselves to consider new approaches to managing application performance. Network-based APM offers a simpler alternative that delivers comprehensive health and performance data while avoiding the problems inherent in gathering performance data on the hosts themselves.

ExtraHop Networks provides network-based application performance management (APM) solutions. The ExtraHop Application Delivery Assurance system performs real-time transaction monitoring at sustained speeds of up to 10Gbps in a single appliance and delivers application-level visibility with no agents, configuration, or overhead. For more information, visit www.extrahop.com.
