• Home
  • Site Aliases
    • www.cloud-native.info
  • About
    • Background
    • Presenting Activities
    • Internet Profile
      • LinkedIn
    • About
  • Books & Publications
    • Log Generator
    • Logs and Telemetry using Fluent Bit
      • Fluent Bit book
      • Book Resources in GitHub
      • Fluent Bit Classic to YAML Format configurations
    • Logging in Action with Fluentd, Kubernetes and More
      • Logging in Action with Fluentd – Book
      • Fluentd Book Resources
      • Fluentd & Fluent Bit Additional stuff
    • API & API Platform
      • API Useful Resources
    • Oracle Integration
      • Book Website
      • Useful Reading Sources
    • Publication Contributions
  • Resources
    • GitHub
    • Oracle Integration Site
    • Oracle Resources
    • Mindmaps Index
    • Useful Tech Resources
      • Fluentd & Fluent Bit Additional stuff
      • Recommended Tech Podcasts
      • Official Sources for Product Logos
      • Java and Graal Useful Links
      • Python Setup & related stuff
  • Music
    • Monster On Music
    • Music Listening
    • Music Reading

Phil (aka MP3Monster)'s Blog

~ from Technology to Music

Phil (aka MP3Monster)'s Blog

Tag Archives: Prometheus

Observing the Observer (Fluent Bit monitoring)

25 Thursday Apr 2024

Posted by mp3monster in Fluentbit, General, Technology

≈ Leave a comment

Tags

development, FluentBit, Grafana, Jaeger, logs, metrics, OpenSearch, Prometheus, Traces

In the Fluent Bit book I touch upon the point that we should be observing the observer. After all, if we don’t monitor our observability stack, then we’ll be operating blind and may never know until things go catastrophically wrong, and we’re getting complaints that production business solutions are down. One of the peer review comments was it would be really good to have a visual representation in the book. While I’d love to incorporate such diagrams, for them to be readable, they do use up a lot of space on the printed page, and very long chapters can also put some readers off. So, given the point wasn’t a key theme, we simply couldn’t incorporate the diagram.

But the suggestion is a good one. So we’ve created the visual representation here.

Annotated diagram showing how Fluent Bit could be used to monitor an open-source observation stack.

The diagram shows how we can monitor our observation tech stack so we don’t become ‘blind’ without knowing about it. The Outflow from the right of the diagram would send a signal to a notification service, which could be as simple as email or as user-friendly as Slack. We’ve indicated which kinds of metrics, logs, and traces could offer the most value from our observation tech stack.

If we’re running everything within a Kubernetes cluster, it would be easy to say we don’t need such a sophisticated setup as we can use Kubernetes liveness probes if the containers are well configured. While it is true if one of our services starts to fail, a liveness check should pick it up and recycle the container. But such probes only worry about the HTTP response code, not the cause. If we don’t monitor and capture more information we’ll never understand the problem. At worst we could end up seeing Kubernetes starting and then killing our containers in a vicious cycle and struggling to resolve the cause. So collecting the logs and metrics remains just as important.

How to Publish Fluent Bit Metrics and Logs

To publish Fluent Bit’s metrics to Prometheus, we need to configure the fluentbit-metrics input plugin (it does sound odd as an input, but there are reasons that become clearer in the book). We then route the output that supports using Fluent Bit as a Prometheus node exporter or makes use of the remote write API.

The log output for Fluent Bit can be configured via the command line or in the SERVICE blog (using the attributes log_file and log_level in the configuration file. Today this is setting the log threshold and identifying the log file. We can then, of course, configure a tail input plugin against the file if we want to send the logs to OpenSearch. We can also set plugin-specific logging thresholds as overrides to the Fluent Bit wide setting in the SERVICE block of configuration.

Configuring the other monitoring tools

  • Grafana‘s configuration will allow it to publish Prometheus scrapable metrics and Traces that are OTLP compliant can be found documented here.
  • Prometheus provides metrics on itself (details here) and logging controls as part of its command line and generates logfmt or JSON logs, details here.
  • OpenSearch‘s logs can be accessed as documented here. The Logs are created with Log4j2, which means out of the box, it will be easy to parse them. Configuring the output of slow query reports does need to be switched on. OpenSearch also illustrates a pre OpenTelemetry/OpenMetrics approach to sharing internal metrics by writing them as logs. However, there are ways to convert such log events to OTLP Metrics with Fluent Bit.
  • Jaeger provides metrics endpoints that are Prometheus-compatible, along with JSON-based logs, and are documented here. There is some support for tracing.

Share this:

  • Click to share on Facebook (Opens in new window) Facebook
  • Click to share on X (Opens in new window) X
  • Click to share on Pocket (Opens in new window) Pocket
  • Click to share on Reddit (Opens in new window) Reddit
  • Click to email a link to a friend (Opens in new window) Email
  • Click to share on WhatsApp (Opens in new window) WhatsApp
  • Click to print (Opens in new window) Print
  • Click to share on Tumblr (Opens in new window) Tumblr
  • Click to share on Mastodon (Opens in new window) Mastodon
  • Click to share on Pinterest (Opens in new window) Pinterest
  • More
  • Click to share on Bluesky (Opens in new window) Bluesky
  • Click to share on LinkedIn (Opens in new window) LinkedIn
Like Loading...

Cloud Observability in Action – Book Review

04 Thursday Jan 2024

Posted by mp3monster in Book Reviews, Books, General, manning

≈ Leave a comment

Tags

book, development, FluentBit, Fluentd, manning, Michael Hausenblas, o11y, observability, OpenTelemetry, Prometheus, review

With the Christmas holidays happening, things slowed down enough to sit and catch up on some reading – which included reading Cloud Observability in Action by Michael Hausenblas from Manning. You could ask – why would I read a book about a domain you’ve written about (Logging In Action with Fluentd) and have an active book in development (Fluent Bit with Kubernetes)? The truth is, it’s good to see what others are saying on the subject, not to mention it is worth confirming I’m not overlapping/duplicating content. So what did I find?

Observability in Action by Michael Hausenblas
Cloud Observability in Action by Michael Hausenblas

Cloud Observability In Action has been an easygoing and enjoyable read. Tech books can sometimes get a bit heavy going or dry, not the case here. Firstly, Michael went back to first principles, making the difference between Observability and monitoring – something that often gets muddied (and I’ve been guilty of this, as the latter is a subset of the former). Observability doesn’t roll off the tongue as smoothly as monitoring (although I rather like the trend of using O11y). This distinction, while helpful, particularly if you’re still finding your feet in this space, is good. What is more important is stepping back and asking what should we be observing and why we need to observe it. Plus, one of my pet points when presenting on the subject – we all have different observability needs – as a developer, an ops person, security, or auditors.

Next is Michael’s interesting take on how much O11y code is enough. Historically, I’ve taken the perspective – that enough is a factor of code complexity. More complex code – warrants more O11y or logging as this is where bugs are most likely to manifest themselves; secondly, I’ve looked at transaction and service boundaries. The problem is this approach can sometimes generate chatty code. I’ve certainly had to deal with chatty apps, and had to filter out the wheat from the chaff. So Michael’s approach of cost/benefit and measuring this using his B2I ratio (how much code is addressing the business problems over how much is instrumentation) was a really fresh perspective and presented in a very practical manner, with warnings about using such a measure too rigidly. It’s a really good perspective as well if you’re working on hyperscaling solutions where a couple of percentage point improvements can save tens of thousands of dollars. Pretty good going, and we’re only a couple of chapters into the book.

The book gets into the underlying ideas and concepts that inform OpenTelemetry, such as traces and spans, metrics, and how these relate to Observability. Some of the classic mistakes are called out, such as dimensioning metrics with high cardinality and why this will present real headaches for you.

As the data is understood, particularly metrics you can start to think about how to identify what normal is, what is abnormal, or an outlier. That then leads to developing Service Level Objectives (SLOs), such as an acceptable level of latency in the solution or how many errors can be tolerated.

The book isn’t all theory. The ideas are illustrated with small Go applications, which are instrumented, and the generated metrics, traces, and logs. Rather than using a technology such as Fluentd or Fluent Bit, Michael starts by keeping things simple and directly connecting the gathering of the metrics into tools such as Prometheus, Zipkin, Jaeger, and so on. In later chapters, the complexity of agents, aggregators, and collectors is addressed. Then, the choices and considerations for different backend solutions from cloud vendor-provided services such as OpenSearch, ElasticSearch, Splunk, Instana and so on. Then, the front-end visualization of the data is explored with tools such as Grafana, Kibana, cloud-provided tools, and so on.

As the book progresses, the chapters drill down into more detail, such as the differences and approaches for measuring containerized solutions vs. serverless implementations such as Lambda and the kinds of measures you may want. The book isn’t tied to technologies typically associated with modern Cloud Native solutions, but more traditional things like relational databases are taken into account.

The closing chapters address questions such as how to address alerting, incident management, and implementing SLOs. How to use these techniques and tools can help inform the development processes, not just production.

So I would recommend the book, if you’re trying to understand Observability (regardless of a cloud solution or not). If you’re trying to advance from the more traditional logging to a fuller capability, then this book is a great guide, showing what, why, and how to evaluate the value of doing so.

To come back to my opening question. The books have small points of overlap, but this is no bad thing, as it helps show how the different viewpoints intersect. I would actually say that the Observability in Action shows how the wider landscape fits together, the underlying value propositions that can help make the case for implementing a full observability solution. Then, Logging in Action and the new book, Fluent Bit with Kubernetes, give you some of the common context, and we drill into the details of how and what can be done with Fluent Bit and Fluentd. All Manning needs now is content to deep dive into Prometheus, Grafana, Jaeger, and OpenSearch to provide an end-to-end coverage of first principles to the art of the possible in Observability.

I also have to thank Michael for pointing his readers and sections of Logging in Action that directly relate and provide further depth into an area.

Further reading

  • Michael’s medium blog
  • Michael’s website
  • Return on Investment Driven Observability
  • CNCF Observability Whitepaper
  • My additional resources for Fluent Bit and Fluentd which includes some of the related content

Share this:

  • Click to share on Facebook (Opens in new window) Facebook
  • Click to share on X (Opens in new window) X
  • Click to share on Pocket (Opens in new window) Pocket
  • Click to share on Reddit (Opens in new window) Reddit
  • Click to email a link to a friend (Opens in new window) Email
  • Click to share on WhatsApp (Opens in new window) WhatsApp
  • Click to print (Opens in new window) Print
  • Click to share on Tumblr (Opens in new window) Tumblr
  • Click to share on Mastodon (Opens in new window) Mastodon
  • Click to share on Pinterest (Opens in new window) Pinterest
  • More
  • Click to share on Bluesky (Opens in new window) Bluesky
  • Click to share on LinkedIn (Opens in new window) LinkedIn
Like Loading...

LogSimulator New Feature – Custom Targets with OCI Logging example

14 Friday Oct 2022

Posted by mp3monster in Cloud, development, General, logsimulator, Oracle, Technology

≈ Leave a comment

Tags

book, loggenerator, logging, logsimulator, OCI, Prometheus

Those who have been using my Logging in Action book will know that to help test the configuration of monitoring tools including Fluentd we have built a LogGenerator that can very easily play and replay logging events into a variety of destinations and formats. all written in Groovy to make the utility easy to run as a script and extend without needing to set up a proper Java development environment.

With the number of different destinations built into the script and the logic to load the source log events and format them the utility is getting rather large for a single file. Rather than letting it continue to grow as we add more destinations to pump log events too, I’ve extended the implementation so you can point to a Groovy file that implements the logic to send the log events. It only requires three simple methods to be implemented.

To demonstrate the feature we have created a custom extension and fully documented it. The extension allows you to send log events to the OCI Logging service. This includes an optional crude aggregation mechanism as sending individual log events is a little inefficient over REST. By doing this we can send synthetic or playback logs as if we’re an application in real-life to ensure that any alerting or routing for the logging works properly before we get anywhere production and do not need to run the application and induce error events.

Beyond this, we’re also thinking about creating a plugin to fire log events at Prometheus so we can send events using the Prometheus pushgateway. As a result, we can tune Prometheus’ configuration.

More improvements – refactoring the existing code

We will refactor the existing code to use the same approach which should make the code more maintainable, but the changes won’t stop the utility from working as it always has (so we won’t break out the existing output channels from the core).

We have also started to improve the code commenting – so hopefully it will make the code a bit more navigable.

Share this:

  • Click to share on Facebook (Opens in new window) Facebook
  • Click to share on X (Opens in new window) X
  • Click to share on Pocket (Opens in new window) Pocket
  • Click to share on Reddit (Opens in new window) Reddit
  • Click to email a link to a friend (Opens in new window) Email
  • Click to share on WhatsApp (Opens in new window) WhatsApp
  • Click to print (Opens in new window) Print
  • Click to share on Tumblr (Opens in new window) Tumblr
  • Click to share on Mastodon (Opens in new window) Mastodon
  • Click to share on Pinterest (Opens in new window) Pinterest
  • More
  • Click to share on Bluesky (Opens in new window) Bluesky
  • Click to share on LinkedIn (Opens in new window) LinkedIn
Like Loading...

Is The 12 Factor App right about Logging?

05 Wednesday Oct 2022

Posted by mp3monster in development, Fluentd, General, Technology

≈ Leave a comment

Tags

12 Factor, 12 Factor App, conference, development, Grafana, JAX, logging, London, OpenSearch, Prometheus, Splunk, stdout

The 12 Factor App definition is now ten years old.  In the world of software that is a long time. So perhaps it’s time to revisit and review what it says.  As I have spent a lot of time around Logging – I’ve focussed on Factor 11 – Logging.

I have been fortunate enough to present at the hybrid JAX London conference on this subject. It was great to get out and see people at a conference rather than just with a screen and a chat console of online-only events.

You can see my presentation here:

Continue reading →

Share this:

  • Click to share on Facebook (Opens in new window) Facebook
  • Click to share on X (Opens in new window) X
  • Click to share on Pocket (Opens in new window) Pocket
  • Click to share on Reddit (Opens in new window) Reddit
  • Click to email a link to a friend (Opens in new window) Email
  • Click to share on WhatsApp (Opens in new window) WhatsApp
  • Click to print (Opens in new window) Print
  • Click to share on Tumblr (Opens in new window) Tumblr
  • Click to share on Mastodon (Opens in new window) Mastodon
  • Click to share on Pinterest (Opens in new window) Pinterest
  • More
  • Click to share on Bluesky (Opens in new window) Bluesky
  • Click to share on LinkedIn (Opens in new window) LinkedIn
Like Loading...

    I work for Oracle, all opinions here are my own & do not necessarily reflect the views of Oracle

    • About
      • Internet Profile
      • Music Buying
      • Presenting Activities
    • Books & Publications
      • Logging in Action with Fluentd, Kubernetes and More
      • Logs and Telemetry using Fluent Bit
      • Oracle Integration
      • API & API Platform
        • API Useful Resources
        • Useful Reading Sources
    • Mindmaps Index
    • Monster On Music
      • Music Listening
      • Music Reading
    • Oracle Resources
    • Useful Tech Resources
      • Fluentd & Fluent Bit Additional stuff
        • Logging Frameworks and Fluent Bit and Fluentd connectivity
        • REGEX for BIC and IBAN processing
      • Java and Graal Useful Links
      • Official Sources for Product Logos
      • Python Setup & related tips
      • Recommended Tech Podcasts

    Oracle Ace Director Alumni

    TOGAF 9

    Logs and Telemetry using Fluent Bit


    Logging in Action — Fluentd

    Logging in Action with Fluentd


    Oracle Cloud Integration Book


    API Platform Book


    Oracle Dev Meetup London

    Blog Categories

    • App Ideas
    • Books
      • Book Reviews
      • manning
      • Oracle Press
      • Packt
    • Enterprise architecture
    • General
      • economy
      • ExternalWebPublications
      • LinkedIn
      • Website
    • Music
      • Music Resources
      • Music Reviews
    • Photography
    • Podcasts
    • Technology
      • AI
      • APIs & microservices
      • chatbots
      • Cloud
      • Cloud Native
      • Dev Meetup
      • development
        • languages
          • java
          • node.js
      • drone
      • Fluentbit
      • Fluentd
      • logsimulator
      • mindmap
      • OMESA
      • Oracle
        • API Platform CS
          • tools
        • Helidon
        • ITSO & OEAF
        • Java Cloud
        • NodeJS Cloud
        • OIC – ICS
        • Oracle Cloud Native
        • OUG
      • railroad diagrams
      • TOGAF
    • xxRetired
    • AI
    • API Platform CS
    • APIs & microservices
    • App Ideas
    • Book Reviews
    • Books
    • chatbots
    • Cloud
    • Cloud Native
    • Dev Meetup
    • development
    • drone
    • economy
    • Enterprise architecture
    • ExternalWebPublications
    • Fluentbit
    • Fluentd
    • General
    • Helidon
    • ITSO & OEAF
    • java
    • Java Cloud
    • languages
    • LinkedIn
    • logsimulator
    • manning
    • mindmap
    • Music
    • Music Resources
    • Music Reviews
    • node.js
    • NodeJS Cloud
    • OIC – ICS
    • OMESA
    • Oracle
    • Oracle Cloud Native
    • Oracle Press
    • OUG
    • Packt
    • Photography
    • Podcasts
    • railroad diagrams
    • Technology
    • TOGAF
    • tools
    • Website
    • xxRetired

    Enter your email address to subscribe to this blog and receive notifications of new posts by email.

    Join 2,555 other subscribers

    RSS

    RSS Feed RSS - Posts

    RSS Feed RSS - Comments

    January 2026
    M T W T F S S
     1234
    567891011
    12131415161718
    19202122232425
    262728293031  
    « Nov    

    Twitter

    Tweets by mp3monster

    History

    Speaker Recognition

    Open Source Summit Speaker

    Flickr Pics

    Gogo Penguin at the BarbicanGogo Penguin at the BarbicanGogo Penguin at the BarbicanGogo Penguin at the Barbican
    More Photos

    Social

    • View @mp3monster’s profile on Twitter
    • View philwilkins’s profile on LinkedIn
    • View mp3monster’s profile on GitHub
    • View mp3monster’s profile on Flickr
    • View mp3muncher’s profile on WordPress.org
    • View philmp3monster’s profile on Twitch
    Follow Phil (aka MP3Monster)'s Blog on WordPress.com

    Blog at WordPress.com.

    • Subscribe Subscribed
      • Phil (aka MP3Monster)'s Blog
      • Join 233 other subscribers
      • Already have a WordPress.com account? Log in now.
      • Phil (aka MP3Monster)'s Blog
      • Subscribe Subscribed
      • Sign up
      • Log in
      • Report this content
      • View site in Reader
      • Manage subscriptions
      • Collapse this bar
     

    Loading Comments...
     

    You must be logged in to post a comment.

      Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use.
      To find out more, including how to control cookies, see here: Our Cookie Policy
      %d