When instrumentation is needed

The agent and process attach connectors become interesting when sampling alone isn’t enough to identify the root cause of your problem, for instance when you need to know exactly how many times a method is called within a given time frame, or exactly how long each call lasts.

However, as we point out in many of our case studies, we’ve noticed over the years that the vast majority of problems can be understood and diagnosed without instrumentation. On top of querying stacktraces with the thread, stacktrace and node filters and with aggregation, stacktrace sampling can reveal an approximation of a given method’s response time and a lower bound on the number of times it was called. In many cases, that’s enough to understand the relationships between method calls.
Nevertheless, instrumentation introduces a new notion of *completeness* and also allows deep object inspection, which is sometimes unavoidable. That’s why we’ve made two different connectors available to deploy our agent in a remote JVM and perform bytecode instrumentation.
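
To make the sampling-based approximation concrete, here is a minimal sketch of how a response time estimate and a lower bound on the call count can be derived from a single thread's samples. It is not djigger's actual code: the class `SamplingEstimate`, its `estimate` method and the frame names are hypothetical, and samples are assumed to be taken at a fixed interval.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only (not part of djigger).
final class SamplingEstimate {

    /** Holds the two figures derived from one thread's samples. */
    static final class Estimate {
        final long approxTimeMs; // samples containing the method x sampling interval
        final int minCallCount;  // number of separate runs of consecutive hits

        Estimate(long approxTimeMs, int minCallCount) {
            this.approxTimeMs = approxTimeMs;
            this.minCallCount = minCallCount;
        }
    }

    /**
     * samples: stack snapshots of a single thread, in chronological order,
     * each snapshot being a list of frame names.
     */
    static Estimate estimate(List<List<String>> samples, String method, long intervalMs) {
        int hits = 0;
        int runs = 0;
        boolean previousHit = false;
        for (List<String> stack : samples) {
            boolean hit = stack.contains(method);
            if (hit) {
                hits++;
                // A hit that follows a miss means the method was entered again.
                if (!previousHit) {
                    runs++;
                }
            }
            previousHit = hit;
        }
        return new Estimate(hits * intervalMs, runs);
    }

    public static void main(String[] args) {
        List<List<String>> samples = Arrays.asList(
            Arrays.asList("App.main", "Service.work"),
            Arrays.asList("App.main", "Service.work"),
            Arrays.asList("App.main"),
            Arrays.asList("App.main", "Service.work"));
        Estimate e = estimate(samples, "Service.work", 100);
        System.out.println("approx time: " + e.approxTimeMs
            + " ms, at least " + e.minCallCount + " calls");
    }
}
```

Several short calls can hide inside one run of consecutive hits, and calls shorter than the sampling interval can be missed entirely, which is why the count is only a lower bound. Getting exact figures is what instrumentation is for.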

The connectors

The code we use to instrument Java bytecode can be loaded in two different ways: via the -javaagent option (when the JVM is booting) or via Sun’s Virtual Machine attach mechanism (in memory, while the JVM is running).

The details you need to connect to a remote JVM via either of these modes are provided on our installation page.
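
For readers unfamiliar with the two loading modes, the sketch below shows the two entry points that the standard `java.lang.instrument` API defines for any Java agent: `premain` for the -javaagent case and `agentmain` for the attach case. The class name `SketchAgent` and the `loadMode` field are hypothetical and only there to make the example observable; this is not djigger's agent code.

```java
import java.lang.instrument.Instrumentation;

// Hypothetical agent skeleton, for illustration only.
final class SketchAgent {

    // Remembers how the agent was loaded so the sketch has a visible effect.
    static volatile String loadMode = "not loaded";

    /** Invoked by the JVM at startup when launched with -javaagent:<jar>=<args>. */
    public static void premain(String args, Instrumentation inst) {
        install("premain", args, inst);
    }

    /** Invoked by the JVM when the agent is loaded into an already running process. */
    public static void agentmain(String args, Instrumentation inst) {
        install("agentmain", args, inst);
    }

    private static void install(String mode, String args, Instrumentation inst) {
        loadMode = mode;
        // A real agent would register a ClassFileTransformer on 'inst' here
        // in order to rewrite bytecode; this sketch only reports how it was loaded.
        boolean canRetransform = inst != null && inst.isRetransformClassesSupported();
        System.out.println("agent loaded via " + mode + ", args=" + args
            + ", retransform supported=" + canRetransform);
    }
}
```

The JVM finds these methods through the `Premain-Class` and `Agent-Class` attributes of the agent jar's manifest, so one jar can support both modes.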

Javaagent

Our java agent has two roles :

receiving, interpreting and executing sampling and instrumentation directives from the client
shipping the results back to the client

The two main advantages of using “javaagent” mode are as follows :

you’ll be able to connect to that agent from just about anywhere (provided the network routes are in working order)
it’s mostly platform / JVM vendor / JDK agnostic (the agent will work with JDK 1.6+)

The main drawback is that you need to reboot the JVM that you wish to instrument.

Process Attach

Process Attach allows us to deploy our agent code dynamically, without rebooting the target JVM or modifying any flags or options. However, in order to process attach successfully to a target JVM, your djigger client instance needs to meet the following requirements :

it has to run on the same host as the monitored JVM
it has to run on the same JRE as the target JVM
the tools.jar of the exact target JVM version has to be made available on the client’s classpath

PS: we might work on a PA-proxy in the future if people find it useful, which would essentially eliminate the first requirement.

Instrumenting a method

After establishing a connection to your agent with one of the two aforementioned connectors, you’ll notice a new tab at the bottom of the client’s window. That tab will be used to keep track of which methods you’ve instrumented, what their statistics are and will also be the starting point for deep-dive transaction analysis.

Let’s say we’re starting a javaagent-enabled JVM from Eclipse (of course, it could be any application, but for illustration purposes, I’m using the BasicJMXJVM class from the collector project, available here) :

We’re connecting locally via agent on port 13987 :

Now we can see the additional subscription tab at the bottom :

Exact path vs All paths to node

As we support what we call contextual instrumentation, we provide two mechanisms to instrument a method. You can either decide to gather statistics on all the invocations made on that method or only on the invocations made via the exact path leading to the node you’ve right-clicked. This is an important distinction and you can choose which analysis you want to perform by either clicking the “all paths” or “only this path” entry.

Let’s click the “all paths” entry, for instance :

You’ll see that the method you’ve chosen is now listed on the left side of the subscription tab :

At this point, the agent will start gathering statistics on calls made to that method.
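
The difference between “all paths” and “only this path” can be sketched as two matching rules applied to the call stack at the moment the method is invoked. This is a simplified illustration under our own assumptions, not djigger's implementation: the class `PathMatch`, its methods and the frame names are hypothetical, and a call stack is modelled as a list of frame names ordered from the root to the invoked method.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical matcher, for illustration only.
final class PathMatch {

    /** "All paths": match every invocation of the target, whoever the caller is. */
    static boolean matchesAllPaths(List<String> callStack, String target) {
        return !callStack.isEmpty()
            && callStack.get(callStack.size() - 1).equals(target);
    }

    /** "Only this path": match only when the whole call path equals the selected one. */
    static boolean matchesExactPath(List<String> callStack, List<String> selectedPath) {
        return callStack.equals(selectedPath);
    }

    public static void main(String[] args) {
        List<String> viaA = Arrays.asList("App.main", "A.run", "Dao.query");
        List<String> viaB = Arrays.asList("App.main", "B.run", "Dao.query");

        // Both invocations count when instrumenting all paths to Dao.query.
        System.out.println(matchesAllPaths(viaA, "Dao.query"));
        System.out.println(matchesAllPaths(viaB, "Dao.query"));

        // Only the invocation reached through A.run counts for the exact path.
        System.out.println(matchesExactPath(viaA, viaA));
        System.out.println(matchesExactPath(viaB, viaA));
    }
}
```

With “only this path”, statistics stay specific to one caller chain, which helps when the same method is fast from most callers and slow from one.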

Object Content Capture

While instrumenting a method, you can enrich the captured event with additional data such as the values of the method parameters or the returned value. To do so, you need to add a new subscription directly from the subscription tab and provide a capture expression (in Javassist syntax).

Javassist reference: https://www.javassist.org/

Example extracting data from the returned value of the instrumented method:

Understanding and analyzing the results

As of djigger 1.4.2, we provide two ways of visualizing the resulting data : a statistics table and a list of the individual calls made to that method, also known as “transactions”.

Statistics table

If the method has been called at least once after instrumenting it, you should see statistics appear in the Event tab about 15 seconds later. A timestamp and duration are provided by default.
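
As a rough idea of what such statistics amount to, the sketch below aggregates a list of captured events into a count, minimum, maximum and average duration. The `Event` class and its fields are hypothetical stand-ins for the timestamp and duration mentioned above, and the figures shown in djigger's table may be computed differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.LongSummaryStatistics;

// Hypothetical aggregation, for illustration only.
final class EventStats {

    /** One instrumented call: when it started and how long it lasted. */
    static final class Event {
        final long timestampMs;
        final long durationMs;

        Event(long timestampMs, long durationMs) {
            this.timestampMs = timestampMs;
            this.durationMs = durationMs;
        }
    }

    /** Summarizes the durations of all events (count, min, max, average). */
    static LongSummaryStatistics summarize(List<Event> events) {
        return events.stream().mapToLong(e -> e.durationMs).summaryStatistics();
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
            new Event(1_000L, 10L),
            new Event(1_500L, 30L),
            new Event(2_200L, 20L));
        LongSummaryStatistics stats = summarize(events);
        System.out.println("calls=" + stats.getCount()
            + " min=" + stats.getMin()
            + " max=" + stats.getMax()
            + " avg=" + stats.getAverage());
    }
}
```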

Deep analysis of an individual call

In the real world, it’s likely that not every call made on that method will be slow, so you might want to look at individual transactions and figure out how the response time was wasted in these specific instances.

By right-clicking on an event and selecting “Analyze Transaction in - Sampling tree”, you can analyze that specific transaction on its own :

This opens a new window containing the visualization trees you’re now familiar with, except that in this window, stacktraces and events have been filtered to only show information relevant to this specific transaction.

You can then use the stacktrace and node filters just as you would in sampling mode with the regular tree pane of djigger.
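
One plausible way to picture this per-transaction filtering is to keep only the samples taken on the transaction's thread during the transaction's time window. The sketch below does exactly that. It rests on our own assumption that correlation is done by thread id and timestamp, which the text above does not confirm, and the `Sample` class and `forTransaction` method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical filter, for illustration only; the correlation rule is an assumption.
final class TransactionFilter {

    /** One stacktrace sample: which thread, when, and the frame on top of the stack. */
    static final class Sample {
        final long threadId;
        final long timestampMs;
        final String topFrame;

        Sample(long threadId, long timestampMs, String topFrame) {
            this.threadId = threadId;
            this.timestampMs = timestampMs;
            this.topFrame = topFrame;
        }
    }

    /** Keeps the samples of the given thread taken within [startMs, startMs + durationMs]. */
    static List<Sample> forTransaction(List<Sample> samples, long threadId,
                                       long startMs, long durationMs) {
        long endMs = startMs + durationMs;
        List<Sample> kept = new ArrayList<>();
        for (Sample s : samples) {
            boolean sameThread = s.threadId == threadId;
            boolean inWindow = s.timestampMs >= startMs && s.timestampMs <= endMs;
            if (sameThread && inWindow) {
                kept.add(s);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Sample> samples = Arrays.asList(
            new Sample(1, 900, "Dao.query"),    // same thread, before the transaction
            new Sample(1, 1_050, "Dao.query"),  // same thread, inside the window
            new Sample(2, 1_060, "Cache.get"),  // other thread, inside the window
            new Sample(1, 1_400, "Idle.wait")); // same thread, after the transaction
        List<Sample> kept = forTransaction(samples, 1, 1_000, 200);
        System.out.println(kept.size() + " sample(s) kept, top frame: " + kept.get(0).topFrame);
    }
}
```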

Recommendations

While sampling is a near-free operation for the JVM, instrumentation can be expensive, as its overhead is a direct function of how frequently the instrumented method is called in the application’s code (as well as of the complexity of the probe’s code). Keep this in mind at all times and, before instrumenting a new node, ask yourself whether it’s a reasonable idea.

However, on dev or test environments, and for debugging purposes, even if you cause high overhead, you might still want to pay the extra CPU cost and instrument a frequently called node in order to better understand what’s going on in the code. In the end, you’re responsible for what you do with the tool.
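
A back-of-the-envelope way to judge whether instrumenting a node is reasonable is to multiply the call frequency by the cost of one probe execution. The sketch below does this calculation. The class `ProbeOverhead` and the numbers used are purely hypothetical, and real probe costs depend on what the probe captures.

```java
// Hypothetical estimate, for illustration only; the figures are made up.
final class ProbeOverhead {

    /** Fraction of one CPU core spent in the probe: calls per second x cost per call. */
    static double cpuFraction(double callsPerSecond, double probeCostNanos) {
        return callsPerSecond * probeCostNanos / 1_000_000_000.0;
    }

    public static void main(String[] args) {
        // A method called 1,000 times per second with an assumed 200 ns probe.
        System.out.println("rarely called node:     " + cpuFraction(1_000, 200));
        // The same probe on a method called 1,000,000 times per second.
        System.out.println("frequently called node: " + cpuFraction(1_000_000, 200));
    }
}
```

Under these assumed figures, the same probe goes from a negligible cost to a sizeable share of a core simply because the method is called a thousand times more often.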