Aggregation

The logic behind thread stack aggregation

Interpreting our sampler’s data relies on grouping stacktraces that share an identical path, starting at the root (usually Thread.run() or a main() call), and splitting into a new branch wherever the methods diverge into different code paths. Next to each node in the tree you’ll therefore see a stacktrace count (in brackets, e.g. [20]) and a percentage value (for instance “7%”):

io.djigger.collector.test.BasicJMXJVM.main 7% [20]

This is how it looks in djigger:

The percentage is calculated relative to the top-level node (the root of the tree).

If you’re filtering your tree correctly (via stacktrace and node filters, see the sections below), this percentage of stacktraces can be interpreted as a high-fidelity approximation of the percentage of time that your application and/or thread(s) spend in that node.
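The grouping described above can be sketched in a few lines of Java. This is an illustrative sketch only, not djigger's actual implementation: the class and method names are hypothetical, stacktraces are assumed to be lists of method names ordered root first, and a synthetic root node is assumed to hold the total sample count against which percentages are computed.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: merges root-first stacktraces into a tree of counted nodes.
class StackAggregationSketch {

    /** One node of the aggregated tree: a method name plus the number of samples passing through it. */
    static final class Node {
        final String method;
        int count;
        final Map<String, Node> children = new LinkedHashMap<>();

        Node(String method) {
            this.method = method;
        }
    }

    /** Merges stacktraces (each ordered root first) into one tree under a synthetic root. */
    static Node aggregate(List<List<String>> stacktraces) {
        Node root = new Node("<root>");
        for (List<String> stack : stacktraces) {
            root.count++;
            Node current = root;
            for (String method : stack) {
                // Identical prefixes share nodes; a new branch starts where the paths diverge.
                current = current.children.computeIfAbsent(method, Node::new);
                current.count++;
            }
        }
        return root;
    }

    /** Percentage of all samples under the root that pass through the given node. */
    static long percent(Node node, Node root) {
        return Math.round(100.0 * node.count / root.count);
    }

    /** Prints each node as "method percent% [count]", indented by depth. */
    static void print(Node node, Node root, String indent) {
        for (Node child : node.children.values()) {
            System.out.println(indent + child.method + " "
                    + percent(child, root) + "% [" + child.count + "]");
            print(child, root, indent + "  ");
        }
    }

    public static void main(String[] args) {
        Node root = aggregate(List.of(
                List.of("Thread.run", "Worker.process", "Dao.query"),
                List.of("Thread.run", "Worker.process", "Dao.query"),
                List.of("Thread.run", "Worker.process", "Json.parse"),
                List.of("Thread.run", "Idle.park")));
        print(root, root, "");
    }
}
```

With these four made-up samples, Worker.process is reached by three of them (75%) and Dao.query by two (50%), which is the kind of reading the count and percentage next to each node are meant to support.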

Why does it work?

It works for the same reason that watching a movie works. Feeding your brain anywhere between 20 and 60 frames per second is enough to create the illusion of real motion. In the same way, you don’t have to know everything a thread is doing while it executes code (that would be too expensive anyway, and the overhead would skew the results). You just need to know enough to understand and diagnose your problem.

So, using a very simple and intuitive statistical approach called “sampling” and various connectors at the technical level (JMX, Process Attach, etc.), we poll the JVM threads and “ask” them at a certain rate what they’re currently working on, i.e. what their stack looks like. The more often we ask (the higher the sampling rate), the more accurate the resulting picture will be. In that picture, the symbolic names of the Java packages, classes and methods, the code line numbers, and their order in the stacktrace are the primary pieces of information that you’ll use to investigate code behaviour.
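To make the polling idea concrete, here is a minimal sketch that samples the threads of the current JVM through the standard java.lang.management API. It is an assumption-laden illustration, not djigger's connector code: the class StackSamplerSketch and its Sample type are hypothetical, and a real remote JMX or Process Attach connection would obtain its ThreadMXBean differently.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: polls the local JVM for thread stacks at a fixed interval.
class StackSamplerSketch {

    /** One observation: which thread, in which state, with which stack (root frame first). */
    static final class Sample {
        final long timestampMillis;
        final long threadId;
        final String threadName;
        final Thread.State state;
        final List<String> frames;

        Sample(long timestampMillis, long threadId, String threadName,
               Thread.State state, List<String> frames) {
            this.timestampMillis = timestampMillis;
            this.threadId = threadId;
            this.threadName = threadName;
            this.state = state;
            this.frames = frames;
        }
    }

    /** Takes one snapshot of all live threads visible through the given MXBean. */
    static List<Sample> snapshot(ThreadMXBean threads) {
        long now = System.currentTimeMillis();
        List<Sample> samples = new ArrayList<>();
        // false, false: do not collect locked monitors or ownable synchronizers.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            StackTraceElement[] stack = info.getStackTrace();
            List<String> frames = new ArrayList<>();
            // getStackTrace() lists the most recent call first; reverse it to root-first order.
            for (int i = stack.length - 1; i >= 0; i--) {
                frames.add(stack[i].getClassName() + "." + stack[i].getMethodName()
                        + ":" + stack[i].getLineNumber());
            }
            samples.add(new Sample(now, info.getThreadId(), info.getThreadName(),
                    info.getThreadState(), frames));
        }
        return samples;
    }

    /** Polls the JVM `count` times, sleeping `intervalMillis` between snapshots. */
    static List<Sample> sample(int count, long intervalMillis) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<Sample> all = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            all.addAll(snapshot(threads));
            Thread.sleep(intervalMillis);
        }
        return all;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Sample> samples = sample(5, 100);
        System.out.println(samples.size() + " stacktraces collected");
    }
}
```

A shorter interval yields more samples per second and therefore a finer picture, at the price of more overhead in the observed JVM.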

Do you aggregate everything or just per-thread?

Both. By default everything is aggregated, but per-thread aggregation is supported as well.

Our aggregation mechanism allows you to group not only stacktraces coming from the same thread or transaction over time, but also stacktraces coming from different threads doing the same thing (and which are likely having the same problem). This is a very powerful feature.

Of course, the thread timeline allows for filtering at the thread level too and will let you identify stacktraces coming from a specific thread id and in a specific time window (see the section “Thread Timeline”).

If you’d like to read more arguments in favor of stacktrace sampling, take a look at our FAQ which lays out most of our sampling-oriented philosophy.

We also encourage you to take a look at this slide deck presented in front of the Swiss Java User Group in 2016, which still contains relevant information and explanations regarding approaches to performance analysis and latency monitoring in a JVM. In particular, aggregation and stacktrace sampling concepts are detailed in this subset of slides.

Filtering

Stacktrace filter

The stacktrace filter modifies the tree based on your query so that you can select the stacktraces (branches of the tree) that are relevant for your analysis. This lets you filter out the noise of all the threads that do nothing in the JVM or that do things you’re not currently interested in. It’s a very important lever if you’re going to calculate percentages, because the percentages are computed over the stacktraces that remain. If you’re only interested in the stacktraces that have something to do with the io.djigger packages, type the following text into the Stacktrace filter box and hit RETURN:

io.djigger
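Conceptually, a stacktrace filter keeps or drops whole stacktraces. The sketch below illustrates that idea under the assumption that a plain query such as io.djigger matches any frame containing that text; the exact matching rules of djigger's filter are not specified here, and the class name StacktraceFilterSketch is hypothetical.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch: a stacktrace filter selects entire stacktraces, never single frames.
class StacktraceFilterSketch {

    /** Keeps only the stacktraces in which at least one frame satisfies the predicate. */
    static List<List<String>> filterStacktraces(List<List<String>> stacktraces,
                                                Predicate<String> frameMatcher) {
        return stacktraces.stream()
                .filter(stack -> stack.stream().anyMatch(frameMatcher))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> stacktraces = List.of(
                List.of("java.lang.Thread.run", "io.djigger.collector.test.BasicJMXJVM.main"),
                List.of("java.lang.Thread.run", "java.lang.Object.wait"));
        // Assumed equivalent of typing "io.djigger" into the filter box (substring match).
        List<List<String>> kept =
                filterStacktraces(stacktraces, frame -> frame.contains("io.djigger"));
        System.out.println(kept.size() + " of " + stacktraces.size() + " stacktraces kept");
    }
}
```

Because the idle stacktrace is removed entirely, it no longer counts toward the root total, which is why this filter changes the percentages shown in the tree.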

Here’s an example where I’m filtering on a specific method name:

Node filter

The node filter is designed as a clarity and analysis filter. It allows you to filter out packages in the branches of your tree. For instance, when you see a bunch of chained filters or multiple invocation calls that hinder the readability of your stack, you can simply filter them out by using the NOT operator.

NOT (sun OR invoke)
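In contrast to the stacktrace filter, a node filter removes individual frames and leaves the rest of each stacktrace in place. The following sketch expresses NOT (sun OR invoke) with java.util.function.Predicate, assuming that each term is a substring match on the frame name; that matching rule and the class name NodeFilterSketch are illustrative assumptions.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch: a node filter drops matching frames but keeps the stacktrace itself.
class NodeFilterSketch {

    /** Returns the stack with every frame removed that does not satisfy `keep`. */
    static List<String> filterNodes(List<String> stack, Predicate<String> keep) {
        return stack.stream().filter(keep).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Predicate<String> sun = frame -> frame.contains("sun");
        Predicate<String> invoke = frame -> frame.contains("invoke");
        // NOT (sun OR invoke), with substring matching assumed for each term.
        Predicate<String> query = sun.or(invoke).negate();

        List<String> stack = List.of(
                "java.lang.Thread.run",
                "sun.reflect.NativeMethodAccessorImpl.invoke0",
                "java.lang.reflect.Method.invoke",
                "io.djigger.collector.test.BasicJMXJVM.main");
        System.out.println(filterNodes(stack, query));
    }
}
```

The two reflection frames disappear while Thread.run and BasicJMXJVM.main remain, so the branch stays in the tree with the same sample count but is shorter and easier to read.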

Operators

You can combine operators to build complex queries:

At this point, the following operators are supported:

Regex support is in our plans.

Visualization

We provide four kinds of views, each offering a different insight into the data. You can create and delete as many visualization panes as you wish.

Tree View & Reverse Tree View

In the reverse tree view, the tree view’s roots become leaves and vice versa. The reverse tree view is very useful for quickly getting an idea of what the threads are actually doing (on-CPU vs. off-CPU, I/O, sync / wait, etc.).
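The reversal can be pictured as flipping every stacktrace before aggregating it, so that the method on top of the stack becomes the first level of the tree. The sketch below only counts that first level; the names are hypothetical and stacktraces are assumed to be stored root first.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: first level of a reverse tree, i.e. sample counts per top-of-stack method.
class ReverseTreeSketch {

    /** Returns a copy of the stack in leaf-first order (the input is assumed to be root first). */
    static List<String> reverse(List<String> rootFirstStack) {
        List<String> leafFirst = new ArrayList<>(rootFirstStack);
        Collections.reverse(leafFirst);
        return leafFirst;
    }

    /** Counts samples per first-level node of the reverse tree. */
    static Map<String, Integer> topLevelCounts(List<List<String>> rootFirstStacks) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> stack : rootFirstStacks) {
            if (stack.isEmpty()) {
                continue; // nothing was on the stack for this sample
            }
            counts.merge(reverse(stack).get(0), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = topLevelCounts(List.of(
                List.of("Thread.run", "Worker.process", "SocketInputStream.read"),
                List.of("Thread.run", "Worker.process", "SocketInputStream.read"),
                List.of("Thread.run", "Pool.take", "Object.wait")));
        System.out.println(counts);
    }
}
```

In this made-up data set two of three samples end in a socket read and one in Object.wait, which hints at I/O and waiting rather than on-CPU work.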

Block view & Reverse Block View

Block views are great for quickly providing a visual overview of the distribution of the execution time throughout the different layers of packages.

Pattern-based coloration goes even further by grouping the nodes matching a pattern into the same color for instantaneous layer-based interpretation.

Roots are on the left side of the frame, and leaves on the right side. The height of a node (or rectangle) indicates the percentage of time spent in it. You can drag the rectangle edges to increase their width, which makes class and method names easier to read in the nodes you want to analyze.

Thread timeline

A chronological view of each thread’s state. Each thread timeline shows the different states that a thread was in at the time of each thread dump (or “snapshot”).

Each state has a different color. The description of each Java thread state can be found on this page.
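As a rough model of the timeline, each snapshot contributes one state per thread, and lining those states up in chronological order gives one row per thread. The sketch below renders such rows as text using the standard Thread.State enum; the Observation type, the one-letter symbols and the class name are hypothetical stand-ins for djigger's colored view.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: builds one chronological row of thread states per thread id.
class ThreadTimelineSketch {

    /** The state one thread was observed in at one snapshot. */
    static final class Observation {
        final long timestampMillis;
        final long threadId;
        final Thread.State state;

        Observation(long timestampMillis, long threadId, Thread.State state) {
            this.timestampMillis = timestampMillis;
            this.threadId = threadId;
            this.state = state;
        }
    }

    /** Groups observations by thread id; the input is assumed to be in chronological order. */
    static Map<Long, List<Thread.State>> timelines(List<Observation> chronological) {
        Map<Long, List<Thread.State>> byThread = new TreeMap<>();
        for (Observation o : chronological) {
            byThread.computeIfAbsent(o.threadId, id -> new ArrayList<>()).add(o.state);
        }
        return byThread;
    }

    /** One character per state, standing in for the color used in a graphical timeline. */
    static char symbol(Thread.State state) {
        switch (state) {
            case RUNNABLE:      return 'R';
            case BLOCKED:       return 'B';
            case WAITING:       return 'W';
            case TIMED_WAITING: return 'T';
            default:            return '.'; // NEW and TERMINATED
        }
    }

    /** Renders a timeline as text, one character per snapshot. */
    static String render(List<Thread.State> timeline) {
        StringBuilder row = new StringBuilder();
        for (Thread.State state : timeline) {
            row.append(symbol(state));
        }
        return row.toString();
    }

    public static void main(String[] args) {
        Map<Long, List<Thread.State>> rows = timelines(List.of(
                new Observation(0, 1, Thread.State.RUNNABLE),
                new Observation(0, 2, Thread.State.WAITING),
                new Observation(100, 1, Thread.State.BLOCKED),
                new Observation(100, 2, Thread.State.WAITING)));
        rows.forEach((id, states) -> System.out.println("thread " + id + ": " + render(states)));
    }
}
```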

Thread selection

You can select one or more threads in the Thread timeline pane. It will automatically filter the stacktraces in the lower (tree) pane.

You can unselect or reselect all of the threads via the Right-Click menu available in the Thread timeline pane.

Zooming and Unzooming

You can zoom into the time window of your choice via drag & drop:

You can then reset the zoom by right-clicking any thread timeline and choosing “Zoom out”.
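Thread selection and zooming both narrow down the set of stacktraces that feed the tree pane: one by thread, the other by time. A minimal sketch of that combined restriction follows, assuming each sample carries a timestamp and a thread id; the Sample type and the select method are hypothetical names, not djigger's API.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: restricts samples to selected threads and a zoomed time window.
class TimelineSelectionSketch {

    /** A stacktrace together with the thread and time it was captured at. */
    static final class Sample {
        final long timestampMillis;
        final long threadId;
        final List<String> frames;

        Sample(long timestampMillis, long threadId, List<String> frames) {
            this.timestampMillis = timestampMillis;
            this.threadId = threadId;
            this.frames = frames;
        }
    }

    /** Keeps samples of the selected threads whose timestamp lies in [fromMillis, toMillis]. */
    static List<Sample> select(List<Sample> samples, Set<Long> threadIds,
                               long fromMillis, long toMillis) {
        return samples.stream()
                .filter(s -> threadIds.contains(s.threadId))
                .filter(s -> s.timestampMillis >= fromMillis && s.timestampMillis <= toMillis)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Sample> samples = List.of(
                new Sample(0, 1, List.of("Thread.run", "Worker.process")),
                new Sample(100, 1, List.of("Thread.run", "Worker.process")),
                new Sample(100, 2, List.of("Thread.run", "Idle.park")),
                new Sample(500, 1, List.of("Thread.run", "Worker.flush")));
        // Thread 1 selected, zoomed to the first 200 ms.
        List<Sample> visible = select(samples, Set.of(1L), 0, 200);
        System.out.println(visible.size() + " of " + samples.size() + " samples remain");
    }
}
```

Resetting the zoom corresponds to widening the window back to the full capture, and reselecting all threads to passing every thread id.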

Reverse Thread Lookup

From the tree view and reverse tree view, you can see which threads are going through the node that is currently selected. They are highlighted in the Thread timeline frame (each matching thread timeline is surrounded by a thin red rectangle).
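The lookup can be thought of as the inverse of aggregation: given the path from the root to the selected node, find every thread that produced at least one stacktrace starting with that path. The sketch below assumes stacktraces are kept per thread id in root-first order; all names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: finds the threads whose stacktraces pass through a selected tree node.
class ReverseThreadLookupSketch {

    /** True if the root-first stack begins with the path leading to the selected node. */
    static boolean passesThrough(List<String> stack, List<String> nodePath) {
        return stack.size() >= nodePath.size()
                && stack.subList(0, nodePath.size()).equals(nodePath);
    }

    /** Ids of all threads with at least one stacktrace passing through the node. */
    static Set<Long> threadsThrough(Map<Long, List<List<String>>> stacksByThread,
                                    List<String> nodePath) {
        Set<Long> ids = new TreeSet<>();
        for (Map.Entry<Long, List<List<String>>> entry : stacksByThread.entrySet()) {
            for (List<String> stack : entry.getValue()) {
                if (passesThrough(stack, nodePath)) {
                    ids.add(entry.getKey());
                    break; // one matching stacktrace is enough to highlight the thread
                }
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<Long, List<List<String>>> stacksByThread = Map.of(
                1L, List.of(List.of("Thread.run", "Worker.process", "Dao.query")),
                2L, List.of(List.of("Thread.run", "Idle.park")),
                3L, List.of(List.of("Thread.run", "Worker.process", "Json.parse")));
        System.out.println(threadsThrough(stacksByThread, List.of("Thread.run", "Worker.process")));
    }
}
```

For the reverse tree view the same idea would apply with leaf-first stacks, so that the selected path starts at the top of the stack instead of the root.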