FAQ
We don’t need to redirect you to some kind of fancy white paper page. Our entire Java performance approach and philosophy is laid out right here, on this page.
Why is sampling so freakin’ great?
Here are the reasons why we believe you should always have a 24/7 sampler running alongside your applications.
It’s very cheap & safe
If you’re using JMX connections or signals, then you’re mostly relying on your JVM vendor’s implementation of the MXBeans and the JMX server. You don’t really have to trust our code, as we’re only polling those beans. And hopefully you trust your JVM vendor’s code.
Sure you’re only polling the beans, but doesn’t that cause overhead?
You’re right, just because our code isn’t executed within your JVM doesn’t mean it’s necessarily safe. However, 4 years of experience performing sampling with djigger and other tools have shown us that it’s essentially overhead-free. We’re planning to release detailed benchmark results obtained under real-life conditions to back that up in the near future. For now, you’ll just have to trust us, or run your own tests.
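For the curious, here’s roughly what the polling itself boils down to. This is just an illustrative sketch, not djigger’s actual connector: the JMX service URL, the 100 ms interval and the store() helper are placeholders.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Minimal sketch: poll the standard ThreadMXBean of a remote JVM over JMX.
public class JmxSamplerSketch {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://localhost:9010/jmxrmi"); // placeholder URL
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            ThreadMXBean threads = ManagementFactory.newPlatformMXBeanProxy(
                connection, ManagementFactory.THREAD_MXBEAN_NAME, ThreadMXBean.class);
            while (true) {
                // One "sample": a full stack trace of every live thread.
                ThreadInfo[] dump = threads.dumpAllThreads(false, false);
                store(dump);            // persist or analyze the sample
                Thread.sleep(100);      // sampling interval, e.g. 100 ms
            }
        }
    }

    static void store(ThreadInfo[] dump) {
        // Placeholder: a real sampler would persist the stack traces somewhere.
        System.out.println(dump.length + " threads sampled");
    }
}
```

Everything here goes through the standard java.lang.management and javax.management.remote APIs, which is the whole point: the heavy lifting happens inside your JVM vendor’s code, not ours.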
It provides a lot of information
When it comes to monitoring and trying to understand a dynamic system such as a Java application and its underlying stack, there are always two levers you can pull to get better:
- drawing more information out of the system (and doing it in a more efficient way)
- improving the analytics you run against that information (and doing it in a more efficient way)
Those two areas are absolutely independent and we’re putting tremendous effort into both. However, we believe it’s always better to solve a problem with the second lever than with the first, so before intruding any further into application code and thus introducing more overhead, we always ask ourselves “is it really necessary?” and “can’t I get the same answer to my question by analyzing my sampling results in a smarter way?”.
It’s enough for most serious problems
Having worked in the traditional IT industry for over seven years now, for a variety of clients (banks, public administrations, insurance companies) in a few different countries, I can assure you that the vast majority of the problems I’ve run into either *were* or could have been solved by sampling alone.
If you disagree or don’t believe me, you can use our instrumentation connectors anyway. But I can assure you many people are surprised once they understand just to what extent sampling data can help answer their questions.
Also, if you’re part of a tech company or a company with a deep and rich technical culture, you might find the sampling approach limited. Nevertheless, I suggest you try out our tool, because there may still be a few inspiring ideas in there for you. Also, we’re always open to hearing your constructive criticism and we love all forms of contribution. So help us make djigger better!
It’s particularly good at finding system-wide issues
Your application uses too much memory?
Let the sampler run for a few minutes and identify the hot object constructors with a single stacktrace + node filter query. Chances are, some of the top constructors in the results are where you should look first. And by the way, you instantly have access to the corresponding allocation stacktraces.
Your application uses too much CPU?
Use one of our reverse views and progressively eliminate the I/O-, Wait- and Sync-related operations. In less than a minute, you’ll find out which threads are on CPU and why.
You have a major regression and response time is affected?
Compare a sampling session from your previous build with that of the new one and filter your business cases in the stacktrace trees. You should be able to spot the difference very quickly, especially if you stress the system.
There are many more use cases, including debugging, code path discovery, etc.
We’re currently working on illustrating these in the Case Study section of our website, on our blog, and on YouTube in the form of short video tutorials.
But also great for discovery and debugging
You’re responsible for operating, testing, tuning, or fixing an application but you’ve seen no architecture documents and have no idea about how it’s implemented?
As long as it’s Java, djigger will let you quickly get a picture of what is happening inside the application.
We’re also planning on releasing a new visualization technique that will allow you to follow even more precisely and chronologically what any thread is doing and how code is interleaved between multiple threads. Great for thread-safety and parallelism issues in general.
… and actually okay at transaction analysis
Although this can be difficult to cover when multiple calls are happening concurrently on the same JVM, you can still investigate individual transactions very easily on controlled test environments. Even in production, we analyze individual calls on a regular basis without instrumentation, if they stick out enough from the haystack (for example, due to their unusual path in the tree, or because they’re running for such a long time that they’re distorting the response time distribution, or because they’re still running at night when no one uses the application anymore).
But obviously that’s not always enough. That’s why djigger also supports bytecode instrumentation (see section “Deep analysis of an individual call” on the “Bytecode instrumentation” page).
It works hand in hand with an APM
Using a sampler is absolutely compatible with the use of any APM, although your vendor might say otherwise. First of all, we’re only using standard Java interfaces to retrieve data. But more importantly, sampling concurrently with the execution of the APM’s code will allow you to measure its overhead and verify that your APM is not causing more harm than it is helping!
In addition, our Persistence Store will provide an additional source of information which is always extremely valuable.
It’s a global & complete approach
Every Java thread is recorded. Every transaction is recorded too, whether via instrumentation or via sampling; you just need to use a proper sampling rate at connection level.
Also remember, the stacktraces we record are presented as they truly exist inside the JVM: by default, we don’t filter out any packages, classes or methods, and we even display code line numbers if you need them!
That’s one of the biggest advantages of sampling: you may give up continuous *event capturing*, but you gain *completeness* of stacktrace depth.
But I already have an APM !
Here’s a list of cases where most APMs have fallen short for us over the years, and where djigger will bring tremendous value:
- There are all those cases where your APM should have recorded a certain transaction, but didn’t.
- There’s always that new or less critical application that has performance problems, but you’ve got no licenses left to instrument it with your APM.
- There are also cases where your APM just doesn’t support a certain library or technology which means you’ll simply be blind.
- Then there are cases where the APM is reporting results that are plain wrong. How do you tell those cases apart if you don’t have an alternative source of information?
- There are cases where you could have retrieved the data, but because of performance, scalability or lack of flexibility in the APM itself, you just won’t be able to.
- There are many cases where you have to monitor proprietary or unsupported software and need to set custom entry point definitions in your APM to enable monitoring. djigger helps you with code discovery and lets you get those tasks done very quickly (see section “But also great for discovery and debugging”).
- Finally, and maybe most importantly: instrumentation-heavy APMs will always try to cover your use cases by implementing and supporting as many specific bytecode modifiers as they can, because if they don’t, you just won’t get ANY data. The sampling approach of djigger, on the other hand, guarantees that each and every thread and transaction is recorded, because we record everything the threads do, globally, in the entire JVM. The only difference is that we don’t do it continuously but rather in a discrete way. You just need to adjust your sampling rate to a value that fits your use case.
Why did you implement an agent then?
First of all, bytecode instrumentation and sampling are not incompatible; they actually make a very healthy combination.
As powerful as sampling may be, there are certain things it just can’t do (reality is a bitch), and thus there are questions you won’t be able to answer if you don’t use bytecode instrumentation.
For example, without instrumentation, we can’t gather and keep track of absolute statistics from method calls, which is one of the pillars of monitoring and is just generally useful to understand, for instance, how your business works and what your users are actually doing on your system.
You also don’t get the full, continuous chronology of method calls, which can be necessary in certain specific cases.
If for some reason you’re trying to analyze something happening very quickly (with an execution time below 1 ms), it will be hard to sample unless you execute it many, many times. Instrumentation can avoid having to do that.
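To make that last point concrete: a single 0.5 ms execution has only about a 0.5% chance of being hit by any given sample taken every 100 ms, so it only becomes visible if it runs very often. An instrumented wrapper, by contrast, records every single invocation. The snippet below is just an illustration of that idea, not djigger’s agent; the class and method names are made up.

```java
// Illustration only: what instrumentation buys you for very fast calls.
// A 0.5 ms method sampled every 100 ms shows up in roughly 0.5% of samples,
// whereas a timed wrapper records every single invocation.
public class TimedCall {
    private static long invocations = 0;
    private static long totalNanos = 0;

    static String lookup(String key) {
        long start = System.nanoTime();
        try {
            return doLookup(key);                   // the sub-millisecond call
        } finally {
            invocations++;
            totalNanos += System.nanoTime() - start;
        }
    }

    static String doLookup(String key) {
        return key.toUpperCase();                   // stand-in for real work
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1_000; i++) {
            lookup("key-" + i);
        }
        System.out.printf("%d calls, avg %.3f ms%n",
            invocations, totalNanos / (invocations * 1_000_000.0));
    }
}
```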
You can’t correlate cross-layer executions
Because of the nature of RPC mechanisms in Java and of JVM internals, it is not currently possible to perform cross-JVM tracing without injecting correlation IDs (for instance thread IDs) into payloads. Bytecode instrumentation can do that.
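As a rough illustration of the idea (generic code, not djigger’s agent, and the header name is made up), this is essentially what injecting a correlation ID into an outgoing payload amounts to:

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.UUID;

// Generic illustration of correlation-id propagation: the caller tags the
// outgoing request, the callee reads the tag, and both sides record it so
// that the two halves of the call can be matched up later.
public class CorrelationExample {
    public static void callRemoteService(String endpoint) throws Exception {
        String correlationId = UUID.randomUUID().toString();
        HttpURLConnection conn =
            (HttpURLConnection) new URL(endpoint).openConnection();
        conn.setRequestProperty("X-Correlation-Id", correlationId); // injected id (hypothetical header name)
        System.out.println("outgoing call tagged with " + correlationId);
        conn.getResponseCode(); // the receiving JVM reads the header and records it too
        conn.disconnect();
    }
}
```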
You can’t inspect objects
Sometimes, you need to dynamically inspect the arguments associated with a method call or retrieve data out of an object in order to diagnose issues more accurately. This is another limit of stacktrace sampling.
For all these reasons, we’ve still implemented an agent and are working very hard on providing more functionality every day.
Why did you bother with jstack-style output files?
The way we started doing thread dump sampling was simply by sending SIGQUIT signals at regular time intervals to the remote JVM. On UNIX, that’s done by executing the kill -3 PID command (but watch out: depending on how your JVM is configured, this can be an expensive operation). On Windows, it’s done by pressing Ctrl-Break in the CMD window of the java process (provided such a window is visible).
Taking thread dumps can be done with a lot of tools, but to perform proper sampling, the collection had to be implemented as a script, so as to sustain precise polling intervals and to automate the process so that it could run for hours on end.
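A stripped-down version of that idea (illustrative only, UNIX-specific, and assuming the target JVM writes its thread dumps to an output you can collect) could look like this:

```java
// Illustrative only: send SIGQUIT to a target JVM at a fixed interval so the
// thread dumps accumulate on that JVM's standard output. UNIX-specific.
public class SignalSampler {
    public static void main(String[] args) throws Exception {
        String pid = args[0];            // PID of the JVM to sample
        long intervalMillis = 1000;      // one thread dump per second
        while (true) {
            new ProcessBuilder("kill", "-3", pid)
                .inheritIO()
                .start()
                .waitFor();
            Thread.sleep(intervalMillis);
        }
    }
}
```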
There are obviously other means than UNIX and Windows signals that allow you to do this, the most famous of which is probably jstack, but most of them are JVM-vendor dependent or have some sort of shortcoming (no headless mode, manual steps required, etc.). jstack itself is HotSpot/Oracle specific.
The great advantage of using signals or a jstack-like tool for thread dump sampling is that you can do it regardless of what the JVM’s command line flags are at the moment you’re asked to analyze the program’s performance, and still do so in a fairly non-intrusive way (unlike process attach). The only problem is that in some rare cases, you may indirectly cause overhead if you’re sampling at a high frequency.
There are a couple of known causes - although independent of the act of sampling itself - that can lead to serious stress on a JVM if you’re not careful with signals. For instance: IBM allows their JVM users to map heap dump events and thread dump events to the same signal. Now obviously, if you’re dumping the entire heap of your application every second, you’re going to have performance issues (need I explain?). Another example is if the standard output of the JVM has any kind of pre-existing bottleneck (a synchronized appender, an awful underlying file system, etc.). In that case, seeing as thread dumps contain a lot of text, you might put pressure on that pre-existing bottleneck and indirectly cause delays in the application that weren’t there before.
Disclaimer: these are just a few examples of bad things that may happen while sampling, and since we’re human, there might and probably will be other instances where performing sampling (regardless of the chosen connector) leads to a performance or availability issue. You’re doing this at your own risk. As the provider of this free software, we cannot be held accountable for your decisions. That being said, we’ll be glad to help you understand what’s going on if you report your problems to us. Just contact us via our corporate contact page.
But today, we’re still jstack-format compatible, for many reasons. However, what we won’t do is support every thread dump output format out there. I found out, for example, that IBM changes their format even in minor releases of their JDK. Obviously, we have neither the resources nor the patience to keep up with those changes. In addition, we strongly recommend the JMX way of performing thread dump sampling whenever you can: it’s the safest, cleanest, most painless way of doing it. If, however, you still need to parse non-Oracle thread dump output formats for some reason, you can write a simple converter to turn your file format into the jstack format and then parse it with djigger. Just ask us if you have any questions or need help. Just because we won’t support every format out-of-the-box doesn’t mean we won’t help you.
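If you do go down that road, the converter really can be that simple. Here’s a hedged skeleton: the input parsing is entirely hypothetical and depends on your vendor’s format; only the rough jstack-style output layout matters.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hedged skeleton of a thread dump converter. The input format handled by the
// helper methods below is entirely hypothetical; the point is only that the
// output mimics the familiar jstack layout:
//   "thread name"
//      java.lang.Thread.State: RUNNABLE
//         at com.example.Foo.bar(Foo.java:42)
public class DumpConverter {
    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        try (PrintWriter out = new PrintWriter(args[1])) {
            for (String line : lines) {
                if (isThreadHeader(line)) {
                    out.printf("\"%s\"%n", extractThreadName(line));
                    out.printf("   java.lang.Thread.State: %s%n", extractState(line));
                } else if (isFrame(line)) {
                    out.printf("\tat %s%n", extractFrame(line)); // e.g. com.example.Foo.bar(Foo.java:42)
                }
            }
        }
    }

    // The methods below depend entirely on your vendor's dump format.
    static boolean isThreadHeader(String line) { return line.startsWith("THREAD "); }
    static String extractThreadName(String line) { return line.substring("THREAD ".length()); }
    static String extractState(String line) { return "RUNNABLE"; }
    static boolean isFrame(String line) { return line.trim().startsWith("at "); }
    static String extractFrame(String line) { return line.trim().substring("at ".length()); }
}
```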
Are you going to support other languages and/or stacks?
We’re planning on it. We currently have Node.js, Python, and to a lesser extent, .NET on our radar. But we need more feedback from the community to be able to decide whether and in which order we’ll support these languages and the corresponding runtimes.