Distributed Transaction Tracing
Key concepts
Distributed tracing is an important APM feature that lets users follow transactions across multiple threads and processes. Because many users have to monitor and analyze distributed applications, djigger provides an API for matching related transactions that take place on different components.
The first thing to understand is that distributed tracing is protocol-specific and, in fact, protocol-implementation-specific. There is no magic way to track a transaction’s path across a distributed application’s tiers. For every protocol traced, a correlation id has to be injected into the message, for example as a header or by modifying its payload. Once injected by the client, the id can be read and removed by the server. Both of these steps require bytecode injection.
Once the data is sent back to djigger, correlation can take place and users can drill down from a client-side transaction into the server-side one, which is particularly useful for tracking 1-N relationships across tiers.
djigger provides an API for writing your own probes, allowing you to trace any protocol as long as you’ve figured out where to place the correlation id and from which methods it should be injected and removed. Removing the id can be critical for certain protocols, which will fail if the message received doesn’t exactly match what the client would have sent.
Note: Because distributed tracing has to be implemented on a per-protocol basis, and although we’ve already provided several probes, djigger will keep relying on its community to add, maintain and support new protocols as well as protocol versions.
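The inject/read/remove cycle described above can be illustrated with a minimal sketch in plain Java. It is not djigger's implementation: the message is modeled as a simple header map, and the class name, method names and the header name X-Correlation-Id are assumptions made for this example only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Illustrative only: a message is modeled as a plain map of headers.
final class CorrelationHeaderSketch {
    // Hypothetical header name, chosen for this example.
    static final String HEADER = "X-Correlation-Id";

    // Client side: tag the outgoing message with a fresh id and return that id.
    static String inject(Map<String, String> headers) {
        String id = UUID.randomUUID().toString();
        headers.put(HEADER, id);
        return id;
    }

    // Server side: read the id and remove it, so the message is back to
    // what the client originally built. Returns null if no id was present.
    static String extractAndRemove(Map<String, String> headers) {
        return headers.remove(HEADER);
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put("Content-Type", "text/plain");
        String sent = inject(headers);
        String received = extractAndRemove(headers);
        System.out.println("ids match: " + sent.equals(received));
        System.out.println("id removed: " + !headers.containsKey(HEADER));
    }
}
```

In a real tracer, both methods would be woven into the protocol implementation's own classes through bytecode injection rather than called explicitly.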
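To make the correlation step more concrete, here is a rough sketch of how server-side events could be indexed by correlation id so that one client-side transaction resolves to N server-side events. The class and method names are hypothetical and do not reflect how djigger stores or queries its data.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: an in-memory index from correlation id to server events.
final class CorrelationIndexSketch {
    private final Map<String, List<String>> serverEventsById = new HashMap<>();

    // Called for each server-side event that carried a correlation id.
    void recordServerEvent(String correlationId, String eventName) {
        serverEventsById
                .computeIfAbsent(correlationId, k -> new ArrayList<>())
                .add(eventName);
    }

    // Drill down from a client-side id to every matching server-side event.
    // Returns an empty list when nothing was recorded for that id.
    List<String> drillDown(String correlationId) {
        return serverEventsById.getOrDefault(correlationId, Collections.emptyList());
    }
}
```

Keeping a list per id is what allows a 1-N relationship: a single client call that fans out into several server calls still resolves through one lookup.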
Activating an existing tracer
Tracers are a special type of subscription, which can be activated like any other subscription. For instance, the following lines of configuration can be added to a connection’s definition in the Connections.xml file in order to turn on HTTP tracing in an application which uses Apache’s HTTP client:
<subscriptions>
<io.djigger.monitoring.java.instrumentation.subscription.RegexSubscription>
<classNamePattern>
<pattern>.*HttpClientTracerTest</pattern>
<flags>0</flags>
</classNamePattern>
<methodNamePattern>
<pattern>.*</pattern>
<flags>0</flags>
</methodNamePattern>
<tagEvent>true</tagEvent>
</io.djigger.monitoring.java.instrumentation.subscription.RegexSubscription>
<io.djigger.monitoring.java.instrumentation.subscription.HttpClientTracer/>
</subscriptions>
We’re using the RegexSubscription class to identify the classes and methods targeted by the tracer (in this case, all of the methods of our test class), and then we’re naming the tracer to be activated on this subscription (HttpClientTracer).
Creating a new tracer
In this example, we will create a new tracer for the HTTP protocol. We’ll assume the server class implements a Java Servlet and the client component uses Apache’s HTTP client.
Id injection
For HTTP, a simple and common practice is to inject the id into the HTTP request as a custom header. HTTP servers are usually flexible and have no problem with unknown or unexpected headers, so we won’t bother cleaning up the id in this example.
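As an illustration of the custom-header approach, the sketch below tags a request using the JDK's built-in java.net.http.HttpRequest (Java 11+). This is a stand-in chosen so the example runs without dependencies; the tracer described here targets Apache's HTTP client and injects the header through bytecode transformation instead. The header name X-Correlation-Id and the class name are assumptions.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative only: builds a request carrying a correlation id header.
final class HttpHeaderInjectionSketch {
    // Hypothetical header name, chosen for this example.
    static final String HEADER = "X-Correlation-Id";

    // Returns a GET request for the target URI, tagged with the given id.
    // No network call is made; the request is only constructed.
    static HttpRequest tag(URI target, String correlationId) {
        return HttpRequest.newBuilder(target)
                .header(HEADER, correlationId)
                .GET()
                .build();
    }
}
```

On the receiving side, a servlet could read the same value with HttpServletRequest.getHeader(String), which returns null when the header is absent.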
Server transformation
Our strategy for the server side will be to modify any class implementing the interface javax.servlet.Servlet, and more specifically the service() method of these classes. This way, every servlet will be able to pick up our correlation ids and send them back to djigger along with the rest of the instrumentation data.
Check out the code of the Servlet tracer here to see exactly how the transformation has been implemented using Javassist.
Client transformation
The client will inject the correlation id via the method sendRequestHeader() located in the class DefaultBHttpClientConnection.
A unique id, obtained through the getCurrentTracer() method of the utility collector class InstrumentationEventCollector, will be used to tag the transaction. The code of the transformation can be found here.
Activating the subscriptions
Here’s the configuration of the connections for HTTP tracing:
<!-- HttpClientTracerTest -->
<Connection connectionClass="io.djigger.client.AgentFacade">
<samplingParameters samplingRate="100"/>
<connectionProperties>
<property name="host" value="localhost"/>
<property name="port" value="12123"/>
<property name="username" value=""/>
<property name="password" value=""/>
</connectionProperties>
<subscriptions>
<io.djigger.monitoring.java.instrumentation.subscription.RegexSubscription>
<classNamePattern>
<pattern>.*HttpClientTracerTest</pattern>
<flags>0</flags>
</classNamePattern>
<methodNamePattern>
<pattern>.*</pattern>
<flags>0</flags>
</methodNamePattern>
<tagEvent>true</tagEvent>
</io.djigger.monitoring.java.instrumentation.subscription.RegexSubscription>
<io.djigger.monitoring.java.instrumentation.subscription.HttpClientTracer/>
</subscriptions>
<attributes/>
</Connection>
<!-- ServletTracerTest -->
<Connection connectionClass="io.djigger.client.AgentFacade">
<samplingParameters samplingRate="100"/>
<connectionProperties>
<property name="host" value="localhost"/>
<property name="port" value="12124"/>
<property name="username" value=""/>
<property name="password" value=""/>
</connectionProperties>
<subscriptions>
<io.djigger.monitoring.java.instrumentation.subscription.RegexSubscription>
<classNamePattern>
<pattern>.*ServletTracerTest</pattern>
<flags>0</flags>
</classNamePattern>
<methodNamePattern>
<pattern>.*</pattern>
<flags>0</flags>
</methodNamePattern>
<tagEvent>true</tagEvent>
</io.djigger.monitoring.java.instrumentation.subscription.RegexSubscription>
<io.djigger.monitoring.java.instrumentation.subscription.HttpClientTracer/>
<io.djigger.monitoring.java.instrumentation.subscription.ServletTracer/>
</subscriptions>
<attributes/>
</Connection>
Demo
Check out the classes HttpClientTracerTest and ServletTracerTest, which are dummy programs used to showcase the tracing functionality.
They can be started using the Eclipse run configurations provided in the folder djigger/java-monitoring-commons/src/test/run/
Once the collector and both “showcase” programs are running, you should start finding HTTP client events (PUTs, GETs, DELETEs…):
After making sure that all components are connected:
Open a client and connect to your collector:
Look for events in the last 5 minutes by firing a search in the top filter with no input:
We now see instrumentation events originating from the HTTP client. Let’s look at the sequence tree to inspect the corresponding transaction based on instrumentation data:
We now see a list of events which are related as part of a single transaction. Let’s drill down on one of the HTTP client calls:
We land on the “other” side and are able to examine and investigate the transaction further, into the corresponding server-side servlet call: