Mastering Distributed Tracing – book review | Phil (aka MP3Monster)'s Blog

Tags

CNCF, Jaeger, monitoring, open tracing, Tracing

Recently we have been working on ‘knowing what I don’t know’ when it comes to OpenTracing, and how such tech may intersect with traditional logging and the use of Fluentd.

As part of that, I have read the Packt book Mastering Distributed Tracing, written by Yuri Shkuro, who has been key to the OpenTracing API and Jaeger and is the technical lead for Uber’s tracing team.

Whilst I have a good relationship with Packt, the fact they published the book is pretty much coincidental.

Understanding tracing, as opposed to traditional logging, is very important when moving into the world of microservices and reactive frameworks such as Node.js, where threads are picked up and put down and you don’t know where and when the next service in a solution will pick up the next related activity. Add to this that solutions are more polyglot than ever, not only in the sense of the different languages that may be used but also in a more diverse range of middleware, e.g. historically you’d probably use JMS-based messaging if you were a Java developer and MSMQ for .NET. Now you may be using AWS SNS as easily as Kafka. This means the mechanisms for passing and tracing events through these services need to be more unifying than ever.
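To make that idea of a unifying mechanism a little more concrete, here is a minimal sketch of how a trace identity can travel with a request regardless of the transport. It is not code from the book and it does not use the OpenTracing or Jaeger client libraries. The function names (new_trace_context, start_child, inject, extract) are made up for illustration, and the header layout assumes the W3C Trace Context `traceparent` format (version, trace id, span id, flags). The carrier is just a dictionary, which could stand in for HTTP headers or message attributes on SNS or Kafka.

```python
import secrets

# Header name and layout assumed from the W3C Trace Context format:
#   version "-" trace-id (32 hex) "-" span-id (16 hex) "-" flags (2 hex)
TRACE_HEADER = "traceparent"


def new_trace_context():
    """Start a brand new trace with random trace and span ids."""
    return {"trace_id": secrets.token_hex(16), "span_id": secrets.token_hex(8)}


def start_child(parent):
    """Create a span in the same trace, remembering which span caused it."""
    return {
        "trace_id": parent["trace_id"],
        "span_id": secrets.token_hex(8),
        "parent_id": parent["span_id"],
    }


def inject(context, carrier):
    """Write the context into a dict-like carrier (HTTP headers, message attributes)."""
    carrier[TRACE_HEADER] = f"00-{context['trace_id']}-{context['span_id']}-01"


def extract(carrier):
    """Read the context back out of a carrier; None if absent or malformed."""
    value = carrier.get(TRACE_HEADER)
    if value is None:
        return None
    parts = value.split("-")
    if len(parts) != 4:
        return None
    _version, trace_id, span_id, _flags = parts
    return {"trace_id": trace_id, "span_id": span_id}


# Service A: starts the trace and sends it along with the outgoing call.
outgoing_headers = {}
inject(new_trace_context(), outgoing_headers)

# Service B: possibly another language and transport, but the same contract.
incoming = extract(outgoing_headers)
work = start_child(incoming)
```

The point of the sketch is that only the carrier format has to be agreed between services. Which language or messaging product sits on either side does not matter, as long as each one can inject and extract the same fields.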

Complexity of Observability

It is these problems that the book clearly explains, describing how OpenTracing as a standard addresses them and providing illustrations using Jaeger with the various client libraries supporting Java, Go, etc. The fact that Jaeger is open source and vendor-neutral (sponsored by CNCF) is great. Jaeger is also wrapped up with the Istio service mesh (another project in the CNCF stable), which is deployed as standard with Oracle Kubernetes Container Engine (OKE) and will in due course be incorporated into OpenShift.

As OpenTracing supports multiple programming languages, the book addresses this by providing illustrations in several different languages. So whilst it may not address every programmer’s needs, you can see how the implementation may differ based on language capabilities, which makes it easy to extrapolate how it will apply in the reader’s preferred programming language. Of course, you can also find lots of sample code in other languages online.

But the book is also very clear about what OpenTracing can’t do and where gaps remain, and about how it differs from traditional log monitoring methods. It also covers how OpenTracing relates to other initiatives, such as some of the W3C activities, and the alignment between products (Zipkin, LightStep, etc.).

This gives a great blend of the underlying theory and the practical implementation considerations, such as the use of sampling, visualization of the data, and scaling up to enterprise level.
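Sampling is worth a small illustration, as it is what keeps tracing affordable at scale. The following is a minimal sketch of a probabilistic, decide-once-per-trace sampler. It is an assumption-laden illustration, not Jaeger’s actual sampler implementation: the function name should_sample is hypothetical, and it assumes trace ids are uniformly random hex strings so that the low 64 bits can be used as a bucket number.

```python
def should_sample(trace_id_hex: str, rate: float) -> bool:
    """Keep roughly `rate` of all traces, deciding purely from the trace id.

    Because the decision depends only on the trace id, every service that
    sees the same trace reaches the same answer without coordinating.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    # Treat the low 64 bits of the id as a number in [0, 2**64).
    bucket = int(trace_id_hex[-16:], 16)
    return bucket < rate * 2**64
```

With a rate of 0.01, about one trace in a hundred would be kept, and either all spans of a trace are recorded or none are, which avoids ending up with partial traces.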

The only niggle (and we’re being very fussy here) would be that the book might address a few more points:

Address Jaeger’s handling of the Log interface in a bit more detail, as the OpenTracing site doesn’t say too much about it, and the book focuses on the smart approach with Spring Cloud. That said, the Jaeger issues (such as this) make it clear that this is an area still maturing.
Blending time-stamped logs produced outside Jaeger into the same storage backend (Elasticsearch or Cassandra).
The relationship and overlap with CNCF’s Fluentd, and how the two complement each other.
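On the second and third points, one common way to relate ordinary logs to traces is to stamp every log line with the current trace id, so that whatever ships the logs (Fluentd, for example) carries the id along and the two can be joined later. The sketch below is my own assumption of how that might look using only the Python standard library. It is not something the book or Jaeger prescribes, and the names current_trace_id, TraceIdFilter and build_logger are hypothetical.

```python
import contextvars
import logging

# Holds the trace id of whatever request is currently being handled.
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Copies the current trace id onto every log record."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True  # never drop the record, only enrich it


def build_logger(name, stream):
    """Create a logger whose lines carry a trace_id field."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s")
    )
    handler.addFilter(TraceIdFilter())
    logger.addHandler(handler)
    return logger
```

A request handler would set current_trace_id from the extracted trace context before doing any work, and every log line written during that request could then be searched by the same id that appears in the tracing UI.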

As with many of the technical books I read, I’ve been mind mapping as I go. Whilst the mind map is no substitute for the book, once you’ve read it I’m sure you’ll find it a great memory jogger.