Migrating from Fluentd to Fluent Bit


Earlier in the year, I made a utility available that supported the migration from Fluent Bit classic configuration format to YAML. I also mentioned I would explore the migration of Fluentd to Fluent Bit. I say explore because while both tools have a common conceptual foundation, there are many differences in the structure of the configuration.

We discussed the bigger ones in the Logs and Telemetry book. But as we’ve been experimenting with creating a Fluentd migration tool, it is worth exploring the fine details and discussing how we’ve approached it as part of a utility to help the transformation.

Routing

Many of the challenges come from the key difference in how events are routed and consumed from the buffer. Fluentd assumes that an event is consumed by a single output; if you want to direct an event to more than one output, you need to copy it. Fluent Bit looks at things very differently: every output plugin has the potential to receive every event, with the determination controlled by the match attribute. These two approaches put a different emphasis on the ordering of declarations. Fluent Bit focuses on routing, using tags and match declarations to control where output goes. The following Fluentd fragment shows the copy approach in action:

  <match *>
    @type copy
    <store>
      @type file
      path ./Chapter5/label-pipeline-file-output
      <buffer>
        delayed_commit_timeout 10
        flush_at_shutdown true
        chunk_limit_records 50
        flush_interval 15
        flush_mode interval
      </buffer>
      <format>
        @type out_file
        delimiter comma
        output_tag true
      </format> 
    </store>
    <store>
      @type relabel
      @label common
    </store>
  </match>

Hierarchical

We can also see that Fluentd’s directives are more hierarchical (e.g., buffer and format sit within the store) than the structures used by Fluent Bit, so we need to be able to ‘flatten’ the hierarchy. As a result, it makes sense that where a copy occurs, we define each store in the copy declaration as its own output plugin.
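As a rough illustration (not necessarily the converter’s exact output), the copy block above could flatten into Fluent Bit YAML along these lines; the second output is a placeholder, since the real destination depends on the plugins behind the common label:

pipeline:
    outputs:
        # first store: the file output with its path and format flattened into one plugin
        - name: file
          match: '*'
          path: ./Chapter5/label-pipeline-file-output
          format: csv
        # second store: relabel has no direct equivalent, so whatever the common label
        # feeds becomes an output in its own right (stdout used here as a placeholder)
        - name: stdout
          match: '*'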

Buffering

There is a notable difference between the outputs’ buffer configurations: in Fluent Bit, an output can only control how much filesystem storage may be used, whereas in Fluentd, as the preceding example shows, we can also set the flushing frequency and control the number of chunks involved (regardless of storage type).

Pipelines

Fluentd allows us to implicitly define multiple pipelines of sources and destinations, as the ordering of declarations and event consumption is key. In addition, we can group plugin behavior with the Fluentd label attribute. The YAML representation of a Fluent Bit configuration doesn’t support this idea, as the following Fluentd configuration illustrates:

<source>
  @type dummy
  tag dummy
  auto_increment_key counter
  dummy {"hello":"me"}
  rate 1
</source>
<filter dummy>
 @type stdout
 </filter>
<match dummy>
  @id redisTarget
  @type redislist
  port 6379
</match>
<source>
  @id redisSource
  @type redislist
  tag redisSource
  run_interval 1
</source>
<match *>
  @type stdout
</match>
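In Fluent Bit YAML, the same two implicit pipelines have to be expressed through tags and match values rather than declaration order. A rough sketch of the shape (redislist isn’t a core Fluent Bit plugin, so stdout stands in to show where the mapped plugins would sit):

pipeline:
    inputs:
        - name: dummy
          tag: dummy
          dummy: '{"hello":"me"}'
        # an input producing events tagged redisSource would sit here
    filters:
        - name: stdout
          match: dummy
    outputs:
        # stand-in for the redislist output that consumed the dummy-tagged events
        - name: stdout
          match: dummy
        # Fluentd's final match * only saw events not already consumed; in Fluent Bit
        # the match must be narrowed so it doesn't also pick up the dummy events
        - name: stdout
          match: redisSource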

Secondary outputs

Fluentd also supports the idea of a secondary output, as the following fragment illustrates: if the primary output fails, the event can be written to an alternate location. Fluent Bit doesn’t have an equivalent mechanism, so for the mapping tool we’ve taken the view that we should create a separate output.

<match *>
    @type roundrobin
    <store> 
      @type forward
      buffer_type memory
      flush_interval 1s  
      weight 50
      <server>
        host 127.0.0.1
        port 28080
      </server>  
    </store>
    <store>
      @type forward
      buffer_type memory
      flush_interval 1s        
        weight 50
      <server>
        host 127.0.0.1
        port 38080
      </server> 
    </store>
  <secondary>
    @type stdout
  </secondary>
</match>

The reworked structure requires consideration for the matching configuration, which isn’t so easily automated and can require manual intervention. To help with this, we’ve included an option to add comments to link the new output to the original configuration.
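For example, the secondary stdout from the fragment above surfaces as an output in its own right, with a comment pointing back at the source block (the comment wording here is illustrative rather than exactly what the tool emits):

pipeline:
    outputs:
        # derived from the <secondary> of the roundrobin <match *> block; in Fluentd
        # this was only used when the primary forward outputs failed
        - name: stdout
          match: '*'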

Configuration differences

While the plugins have a degree of consistency, a closer look shows that there are attributes, and as a result plugin features, that don’t translate. To address this, we comment out such attributes so they remain visible in the new configuration and can be adjusted manually.

Conclusion

While the tool we’re slowly piecing together will do a lot of the work in converting Fluentd to Fluent Bit, there aren’t exact correlations for all attributes and plugins. So the utility will only be able to perform the simplest of mappings without developer involvement. But we can at least help show where the input is needed.


Speaker Upgrade – how I decided what was good


With some recent good news from work, I decided to treat myself to a speaker upgrade – Acoustic Energy 500s sat on IsoAcoustics Aperta stands. While these would be considered audiophile – albeit at the lower end – we’re not talking audio exotica like the B&W Nautilus at nearly a hundred thousand pounds or the Cosmotron 130 at around the million-pound mark.

Bowers & Wilkins – Nautilus Speaker – a snip at £90,000

So how can I decide and justify the expenditure, even if it’s a fraction of the loose change from the back of the sofa from buying these monsters? As friends have said to me in the past, the Samsung speakers on my stereo are just as good. Well there are a raft of things that will prevent speakers from performing well, from positioning, to the quality of their source.

Cosmotron speaker – priced at £1M

The source material is often one of the biggest issues, particularly for rock and pop pushing the envelope with CDs. We saw what became known as the loudness wars, where the dynamic range of the music was reduced. But music with a wide dynamic range on good speakers is great. One characteristic of good speakers is the containment of distortion: if a song is often quiet with occasional moments of loudness, the speaker drivers (cones) can react properly when a sudden spike in the signal occurs – the sudden movement of the magnet driving the cone is handled, rather than the speaker surface straining against its mounts.

Better speakers will result in better control of the cone (the visible bit of the speaker), making the cone’s movements more precise and revealing more detail in the music. You’ll go from hearing a cymbal to being able to tell how the cymbal was struck; a drum is no longer just a thump, but something you’ll start to hear resonate.

The cone moves backward and forward to move the air, and that affects the air inside the speaker, not just outside. We don’t want the speaker casing to behave like a suction cup, preventing air movement and inhibiting the cone’s movement.

Improvements in speaker performance can help you recognize little details. For example, with a vocal performance, you’ll start to hear fine details, such as air drawn over the microphone as the singer inhales. You can also hear changes as a singer moves close to or away from the microphone, even if they alter their vocal volume.

I was experimenting with a loaned hi-fi kit once, listening to a Jamie Cullum live performance, and a detail that leapt out as I swapped in and out a piece of equipment was what sounded like background ambient noise, such as air conditioning. But suddenly, it became clear I wasn’t picking up ambient noise but the fan that was positioned behind Jamie.

It is always useful to have some good go-to pieces of music for trying out hi-fi. Being familiar with the music and knowing the production values applied means that if there are improvements, you’ll pick them up. So, what are my go-to pieces at the moment?

  • Tori Amos – Me and a Gun — although any part of Little Earthquakes is good. This song is an acapella performance, recounting a rape. With just a voice, the miking of the vocal is very close, and you can hear the inhalation and the rawness of the performance.
  • Beth Orton – Weather Alive — probably Beth’s best album to date. Here is another incredible voice, but also more delicate than Tori Amos, so the better the HiFi, the purer the performance will sound.
  • GoGo Penguin – Branches Break from Man Made Objects – although just about any of their work will be good. This is a trio of piano, bass, and drums in a jazz/minimalist classical/chill beat crossover. This is a recording that should feel like it’s being performed in a big live sounding room. But you’ll hear each instrument clearly, particularly down to recognizing the loudness, varying attack, and decay of each note played.
  • Rush – Red Sector A from Grace Under Pressure – perhaps not the best-produced album in the world, but before the loudness wars really took hold. Rush were a real bunch of prog rock musos, with the late Neil Peart considered by many to be one of the best drummers ever. This track will test the HiFi in terms of control – the drumming has a huge range of very fine cymbal work, some really deep bass drums, and tom-tom runs that make Phil Collins’ In The Air Tonight sound like child’s play.
  • Elbow – One Day Like This from The Seldom Seen Kid (Live At Abbey Road Studios) – with a high-quality recording (Abbey Road’s special Half Speed Mastered edition), you’ll get a sense of staging and, as the song grows, scale with the choir. The strings will be natural and nuanced, and in the early parts of the performance you’ll hear how dry Guy’s voice is – not a hint of vibrato or sibilance.
  • Peter Gabriel – The Book Of Love from Scratch My Back – another performance that should give a sense of staging and breadth, with great dynamics as the strings swell and subside, fronted by Peter’s voice, which should sound weathered and world-worn.

The list of music could go on. But, ultimately, it’s a very individual choice.

Final anecdote

Buying Hi-Fi is a law of diminishing returns. As you get better and better, the parts needed are more expensive and produced in fewer numbers, making the R&D more expensive, with costs to be covered by a small number of sales. But still, these esoteric, bank-crushing systems are amazing.

Some years back, I went to a HiFi show; if you’ve never been to such a show, then picture this: a corridor of hotel rooms stripped of the beds and furnishings other than some chairs. Each company has a room and typically sets up its demo kit where the head of the bed would usually be, with everything positioned and mounted on professional hi-fi tables, etc., for the absolute best performance. The classic hotel room layout means that as you walk in, you won’t see what is set up, so the seconds it takes to walk past what is normally the bathroom are almost a blind test: you can’t see the HiFi, but you can hear it.

So here we are, starting to walk into a room that was pretty busy – so we didn’t see the main space for a minute or so – and we hear a beautifully played, unaccompanied double bass. I could have sworn there was a musician in the room performing; the performance had that warmth, depth, and volume you’d expect, with no hint of any recording artifacts. When we got to the main part of the room, we were stunned to see two speakers, big and rather boxy – no audio exotica beauty like the Nautilus or Cosmotron – definitely all function and little thought to form. With them, three large pieces of silver HiFi sat on big chunky slabs of marble on the floor – what I assume were a pre-amp and a power amp for each speaker. Plus a source, which might have been a turntable, but honestly, I can’t remember; whatever it was, the sound was breathtakingly natural.

Chord Ultima monoblock power amplifier – £35,000 per unit; you’d need two, plus a pre-amp, for a basic arrangement.

I do remember the price tags; at the time, prices were around £50k a component – so little change out of a quarter of a million. It left me wishing I’d won the National Lottery.

Fluent Bit – using Lua script to split up events into multiple records


One of the really advanced features of Fluent Bit’s use of Lua scripts is the ability to split a single log event so that downstream processing sees multiple log events. In the Logs and Telemetry book, we didn’t have the space to explore this possibility. Here, we’ll build upon our understanding of how to use Lua in a filter. Before we look at how it can be done, let’s consider why it might be done.

Why Split Fluent Bit events

This case primarily focuses on the handling of log events. There are several reasons that could drive us to perform the split. Such as:

  • Log events contain metrics data (particularly application or business metrics). Older systems can emit some metrics through logging such as the time to complete a particular process within the code. When data like this is generated, ideally, we expose it to tools most suited to measuring and reporting on metrics, such as Prometheus and Grafana. But doing this has several factors to consider:
    • A log record with metrics data is unlikely to generate the data in a format that can be directed straight to Prometheus.
    • We could simply transform the log to use a metrics structure, but it is a good principle to retain a copy of the logs as they’re generated so we don’t lose any additional meaning, which points to creating a second event with a metrics structure. We may wish to monitor for the absence of such metrics being generated, for example.
  • When transactional errors occur, the logs can sometimes contain sensitive details such as PII (Personally Identifiable Information). We really don’t want PII data being unnecessarily propagated as it creates additional security risks – so we mask the PII data for the event to go downstream. But, at the same time, we want to know the PII ID to make it easier to identify records that may need to be checked for accuracy and integrity. We can solve this by:
    • Copying the event and performing the masking with a one-way hash
    • Creating a second event with the PII data, which is limited in its propagation and written to a data store sufficiently secured for PII data, such as a dedicated database

In both scenarios provided, the underlying theme is creating a version of the event to make things downstream easier to handle.

Implementing the solution

The key to this is understanding how the record construct is processed as it gets passed back and forth. When the Lua script receives an event, it arrives in our script as a table construct (Java developers, this approximates a HashMap), with the root elements of the record representing the event payload.

Typically, we’d manipulate the record and return it with a flag saying the structure has changed, while it remains a single table. But we can instead return an array of tables, and each element (array entry) will then be processed as its own log event.

A Note on how Lua executes copying

When splitting up the record, we need to understand how Lua handles its data. If we tried to create the array with code like this:

record1 = record
record2 = record
newRecord = {record1, record2}

and then manipulated newRecord[1], we would still impact both records; this is because Lua, like its C underpinnings, assigns tables by reference rather than making deep copies of objects. So we need to ensure we perform a deep copy before manipulating the records. You can see this in our example configuration (here on GitHub), or look at the following Lua code fragment:

-- Recursively deep-copies a table; non-table values are returned as-is
function copy(obj)
  if type(obj) ~= 'table' then return obj end
  local res = {}
  for k, v in pairs(obj) do res[copy(k)] = copy(v) end
  return res
end

The proof

To illustrate the behavior, we have created a configuration with a single dummy plugin that emits just one event. That event is then picked up by a filter running our Lua script, and after the filter we have a simple stdout output plugin. As a result of creating two records, we should see two output entries. To make it easy to compare, the Lua script has a flag called deepCopy; when set to true, the records are cloned (deep-copied) before the payload values are modified and the split is performed.

[SERVICE]
  flush 1

[INPUT]
    name dummy
    dummy {   "time": "12/May/2023:08:05:52 +0000",   "remote_ip": "10.4.72.163",   "remoteuser": "-",   "request": {     "verb": "GET",     "path": " /downloads/product_2",     "protocol": "HTTP",     "version": "1.1"   },   "response": 304}
    samples 1
    tag dummy1

[FILTER]
    name lua
    match *
    script ./advanced.lua
    call cb_advanced
    protected_mode true

[OUTPUT]
    name stdout
    match *
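For reference, a minimal sketch of what the cb_advanced callback in advanced.lua could look like; it relies on the copy function shown earlier, and the keys added to each record are purely illustrative:

function cb_advanced(tag, timestamp, record)
  -- deep copy so the two records can be modified independently
  local record1 = copy(record)
  local record2 = copy(record)
  record1["copy"] = "first"
  record2["copy"] = "second"
  -- returning an array of tables makes Fluent Bit emit each entry as its own event
  return 1, timestamp, {record1, record2}
end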

Limitations and solutions

While we can easily split events up and return multiple records, we can’t use different tags or timestamps. Using the same timestamp is pretty sensible, but different tags may be more helpful if we want to route the different records in other ways.

As long as the record contains the value we want to use as a tag, we can add the rewrite_tag filter to the pipeline and point it at the attribute to parse with a regex. To keep things efficient, if we create an element holding just the tag value when building the new record, the regex becomes a very simple expression to match.
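As a sketch, assuming the Lua script adds a record_type element when it builds the new record (an illustrative name), the rewrite_tag rule then only needs a trivial regex:

[FILTER]
    name rewrite_tag
    match dummy1
    # rule format: $key regex new_tag keep-original
    rule $record_type ^(metrics)$ metrics.$0 false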

Conclusion

We’ve seen a couple of practical examples of why we might want to spin out new observability events based on what we get from our system. An important aspect of the process is understanding how Lua handles memory – references versus deep copies.


Moby at the O2 London


I don’t blog about gigs very often, usually because I can never remember the set list by the end of the evening, and I’m on a euphoric buzz (no chemicals involved).

This evening wasn’t that much different. There was a euphoric buzz, and I loved the music. But as the tour is celebrating Play’s 25th anniversary, we’ve had 25 years to put titles to the songs.

Moby had what looked a lot like a fifty-something audience (some with their teenage and twenty-something children with them) immediately on their feet. The vibe was as if everyone had shed 20+ years and was clubbing again, with DJ smoothness as songs transitioned into each other.

The slower tracks performed have been spiced up a bit to keep things moving, and tracks like Bodyrock went all out on the rock.

When Moby originally toured Play, he worked pretty hard behind the keyboards and occasionally thrashed at his guitar. This time out, he was willing to lean on a very talented band, two singers, and guest appearances from Lady Blackbird (who initially performed with Moby for tracks like Dark Days). This meant Moby could dash around the stage and play his guitar and take the occasional turn with a keyboard and congas.

Visually, the lighting, etc., hadn’t really moved on in 25 years. While it would be naive to think he would compete with the likes of Peter Gabriel, the lighting did look dated against the likes of Elbow, who aren’t known for visual spectacle. This didn’t diminish the live energy, though – and chances are he was controlling costs so the charities who got the profits from the shows saw more money.

The set finished in the traditional Moby way, acknowledging his rave roots with Feel So Real and Thousand. For Thousand, the imp of a man would once have climbed on top of his keyboards and launched himself off at the climax of the song. Today, it is a bit more sedate, with the stage crew rolling on a flight case for him to climb onto and no spectacular leaping.

Overall, it was great to see him live again, but I suspect we’ll not see him tour again. By his own confession, he loves simply performing in his garden with friends in LA.

Sharing a monitor

Like many techies, I have a personal desktop machine that packs a bit of grunt and runs multiple monitors. I also have a company laptop—for me, it’s a Mac (which, to be honest, I’m not a fan of, despite loving my iPhone and iPad).

Like many, my challenge is that when you’re used to two large screens, a laptop monitor just doesn’t cut it. So, I want to share one of the screens between two machines. The question I’ve wrestled with is how to do that with a simple button or key press. No messing with cables, or monitor settings etc.

I initially tried to solve this with a KVM—easy, right? Wrong. It turns out Macs don’t play nice with KVM switches. I went through buying several switches and sending them back before discovering it was a MacBook Pro quirk.

For a while, I used a travel monitor, which I had acquired to solve the same issue when I traveled extensively for my previous employer. It’s an improvement, but the travel screen is still pretty small, not to mention it takes up more desk space (my main monitors and laptop are all on arms, so I can push them back if I want more desk space).

As most decent monitors allow multiple inputs, we’ve resorted to using two inputs; the only problem is that the controls to switch between them aren’t easy to use – but then most people don’t need to toggle back and forth several times a day (VPN-related tasks are done on the Mac, everything else on the desktop).

But this week, we had a breakthrough; the core of it is discovering ControlMyMonitor and the VCP features newer monitors have. ControlMyMonitor provides a simple UI that allows us to identify an ID for each monitor, plus a command-line capability that sends instructions to connected monitors, letting us do things like switch input, change contrast, etc. With the tool, we can issue commands such as: ControlMyMonitor.exe /SetValue "\\.\DISPLAY2\Monitor0" 60 15. This tells the app to send display monitor 2 (as known to my desktop) the VCP (Virtual Control Panel) code 60 (change input source) with the value 15 (the input to select). I can switch the monitor back to the desktop by supplying the input number for the connection to the desktop.

So now I can toggle between screens without feeling around the back of the monitor to navigate the menu for switching inputs. Using a command is pretty cool, but still not good enough. I could put links on my desktop to two scripts that run the relevant command, but I’d also come across AutoHotkey (aka AHK). This lets us create scripts using AHK’s syntax, which can be bound to specific key combinations. So, creating a config file with a key combo to run the command shown made it really convenient: Windows+Shift+Left arrow and the monitor switches to the desktop; Windows+Shift+Right arrow and it displays the laptop. The script looks something like this:

#Requires AutoHotkey v2.0
; # is the Windows key, + is Shift
#+Right::Run "monitor-left.bat"
#+Left::Run "monitor-right.bat"

We could embed the command directly into the AHK script, but the syntax is unusual and would require escaping quotes. By referencing a shell script, we can easily extend the process without needing to master the AHK syntax.
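For completeness, the batch files are about as small as scripts get – a minimal sketch, noting that the monitor ID and the input value 15 are specific to my setup:

:: monitor-left.bat - switch the shared monitor's input source via DDC/CI
ControlMyMonitor.exe /SetValue "\\.\DISPLAY2\Monitor0" 60 15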

The only remaining problem is getting the AHK app to start and load the configuration/script when the desktop boots up. We can do this by creating a shortcut to our script (which the AHK runtime will recognize and run) and putting that shortcut in the startup folder. The startup folder is likely to be somewhere such as C:\Users\<username>\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup, but we can get File Explorer to open in the right place by pressing Windows+R and entering the command shell:startup.


Think Distributed Systems


One of the benefits of being an author with a publisher like Manning is being given early access to books in development and being invited to share my thoughts. Recently, I was asked if I’d have a look at Think Distributed Systems by Dominik Tornow.

Systems have been becoming increasingly distributed for years, and the growth has been accelerating, enabled by technologies like CORBA, SOAP, REST frameworks, and microservices. However, some distribution challenges manifest themselves even in multithreaded applications. So, I was very interested to see what new perspectives could be offered that might help people, and Dominik has given us a valuable one.

I’ve been fortunate that my career started with working on large multi-server, multithreaded, mission-critical systems, using Ada and working with a mentor who challenged me to work through such issues. How does this relate to the book? That work and the mentor meant I built some good mental models of distributed development early in my career. Dominik calls out that having good mental models for understanding distributed systems and the challenges they bring is key to success. It’s this understanding that equips you to deal with challenges such as resource locking, mutual deadlock, transaction ordering, the pros and cons of optimistic locking, and so on.

As highlighted early on in this book, most technical books come from the perspective of explaining tools, languages, or patterns, and to make them easy to follow, the examples tend to be fairly simplistic. This is completely understandable; these books aim to teach the features of the language, not how to bring those features to bear in complex real-world use cases. As a result, we don’t necessarily get the fullest insight into problems such as those that come with optimistic locking.

Rather than explaining through programming features, the book takes a language-agnostic approach to the ideas and complexities of distributed solutions, favoring examples, analogies, and mathematics to illustrate its points. The mathematics is great at showing the implications of different aspects of distributed systems, but for readers like me who are more visual and less comfortable with numeric abstraction, it does mean some parts of the book require more effort – it is worth it, though. You can’t deny that hard numeric proofs can really land a message, and if you know which variables can change a result, you’re well on your way.

For anyone starting to design and implement distributed and multi-threaded applications for the first time, I’d recommend looking at this book. From what I’ve seen so far, the lessons you’ll take away will help keep you from walking into some situations that can be very difficult to overcome later or, worse, only manifest themselves when your system starts to experience a lot of load.

Fluent Bit with Chat Ops


My friend and Fluent Bit committer Patrick Stephens will present at the Open Source Monitoring Conference in Germany later this year. Unfortunately, I won’t be able to make it, as my day job is closing in on its MVP product release.

The idea behind the presentation is to improve the ability to detect and respond to Observability events, as the time between detection and action is the period during which your application is experiencing harm, such as lost revenue, data corruption, and so on.

The stable configuration and code base version is in the Fluent GitHub repository; my upstream version is here. We first discussed the idea back in February and March, when we applied simpler rules to determine whether a log event was critical.

Advancing the idea

Now that my book is principally in the hands of the publishers (copy editing and print preparation, etc.), we can revisit this and exploit features in more recent releases to make it slicker and more effective. For example:

  • The stream processor means a high frequency of smaller issues could trigger a single notification (see the sketch after this list).
  • We can also use the stream processor to provide a more elegant way of avoiding notification storms.
  • The new processors will make it easier to interact with metrics, so applications that natively produce metrics can be brought into the same flow.
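As a very rough sketch of the first idea (the tag names, the window size, and the assumption that critical events arrive on a critical.* tag are all illustrative; the SQL follows Fluent Bit’s stream processing syntax):

[STREAM_TASK]
    # emit a count of critical.* events every 60 seconds onto the chatops.alert tag
    name critical_burst
    exec CREATE STREAM critical_burst WITH (tag='chatops.alert') AS SELECT COUNT(*) FROM TAG:'critical.*' WINDOW TUMBLING (60 SECOND);

An output (or a Lua filter feeding the chat integration) can then match chatops.alert and decide whether the count warrants a notification.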

Other tooling

With the book’s copy editing done, we have a bit more time to turn to our other Fluent Bit project: the Fluent Bit configuration converter, covering both classic to YAML and a first-stage Fluentd to Fluent Bit converter. You can see this on GitHub here and here.

Two weeks of Fluent Bit


The last couple of weeks have been pretty exciting. Firstly, Fluent Bit 3.1 has been released, bringing further feature development and making it even more capable, particularly in its handling of OpenTelemetry (OTel).

The full details of the release are available at https://fluentbit.io/announcements/v3.1.0/

Fluent Bit classic to YAML

We’ve been progressing the utility, testing and stabilizing it, and making several releases accordingly. The utility is packaged as a Docker image, and the regression test tool also runs as a Docker image.

Moving forward, we’ll start branching to develop significant changes to keep the trunk stable, including experimenting with the possibility of extending the tool to help port Fluentd to Fluent Bit YAML configurations. The tools won’t be able to do everything, but I hope they will help address the core structural challenges and flag differences needing manual intervention.

Book

The Fluent Bit book has moved into its last phase with the start of copy editing. We have also had a shift in the name to Logs and Telemetry using Fluent Bit, Kubernetes, streaming, and more, or just Logs and Telemetry using Fluent Bit. The book fundamentally hasn’t changed. There is still a lot of Kubernetes-related content, but the new name helps focus on what Fluent Bit is all about rather than positioning it as just another Kubernetes book.

Logs and Telemetry using Fluent Bit, Kubernetes, streaming and more

Fluent Bit config from classic to YAML


Fluent Bit supports both a classic configuration file format and a YAML format. The support for YAML reflects industry direction. But if you’ve come from Fluentd to Fluent Bit or have been using Fluent Bit from the early days, you’re likely to be using the classic format. The differences can be seen here:

#
# Classic Format
#
[SERVICE]
    flush 5
    log_level debug

[INPUT]
    name dummy
    dummy {"key" : "value"}
    tag blah

[OUTPUT]
    name stdout
    match *

#
# YAML Format
#
service:
    flush: 5
    log_level: debug

pipeline:
    inputs:
        - name: dummy
          dummy: '{"key" : "value"}'
          tag: blah
    outputs:
        - name: stdout
          match: "*"

Why migrate to YAML?

Beyond having a consistent file format, the driver is that some new features are not supported by the classic format. Currently, this is predominantly for Processors; it is fair to assume that any other new major features will likely follow suit.
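For instance, processors can only be attached to inputs and outputs in the YAML format. Here is a minimal sketch, assuming a recent Fluent Bit 3.x release; the content_modifier processor and the key it inserts are purely illustrative choices:

pipeline:
    inputs:
        - name: dummy
          dummy: '{"key" : "value"}'
          tag: blah
          processors:
              logs:
                  # adds a source key to every record before filters and outputs see it
                  - name: content_modifier
                    action: insert
                    key: source
                    value: dummy-input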

Migrating from classic to YAML

The process for migrating from classic to YAML has two dimensions:

  • Change of formatting
    • YAML indentation and plugins as array elements
    • addressing any quirks such as wildcard (*) being quoted, etc
  • Addressing constraints such as:
    • Using include is more restrictive
    • Ordering of inputs and outputs is more restrictive – therefore match attributes need to be refined.

None of this is too difficult, but doing it by hand can be laborious and error-prone. So, we’ve built a utility that can help with the process. At the moment, the solution is in an MVP state, but we hope to beef it up over the coming weeks. What we plan to do and how to use the utility are covered in the GitHub readme.

The repository link (fluent-bit-classic-to-yaml-converter)

Update 4th July 24

A quick update to say that we now have a container configuration in the repository to make the tool very easy to use. All the details will be included in the readme, along with some additional features.

Update 7th July

We’ve progressed past the MVP state now. The detected include statements get incorporated into a proper include block but commented out.

We’ve added an option to convert the attributes to use Kubernetes idiomatic form, i.e., aValue rather than a_value.

The command line has a help option that outputs details such as the control flags.

Update 12th July

In the last couple of days, we pushed a little too quickly to GitHub and discovered we’d broken some cases. We’ve been testing the development a lot more rigorously now, and it helps that we have the regression container image working nicely. The Javadoc is also generating properly.

We have identified some edge cases that need to be sorted, but most scenarios have been correctly handled. Hopefully, we’ll have those edge scenarios fixed tomorrow, so we’ll tag a release version then.

Logging Frameworks that can communicate directly with Fluent Bit


While the norm is for applications to write their logs to file or to stdout (the console), this isn’t the most efficient way to handle logs (particularly given the I/O performance of storage devices). Many logging frameworks have addressed this by providing more direct outputs to commonly used services such as Elasticsearch and OpenSearch. This is fine, but the downside is that there is no intermediary layer to preprocess, filter, and route (potentially to multiple services). These constraints can be overcome by using an intermediary service such as Fluent Bit or Fluentd.

Many logging frameworks can work with Fluentd by supporting the HTTP or Forward protocols that Fluentd supports out of the box. As Fluent Bit and Fluentd are interchangeable with these protocols, any logging framework that supports Fluentd, by implication, also supports Fluent Bit – not to mention that Fluent Bit supports OpenTelemetry.

The following table identifies a range of frameworks that can support communicating directly with Fluent Bit. It is not exhaustive but does provide broad coverage. We’ll update the table as we discover new frameworks that can communicate directly.


| Language | Framework / Library | Protocol(s) | Commentary |
|---|---|---|---|
| Java | Log4j2 | HTTP (Appender) | Sends JSON payloads over HTTP (use the HTTP input plugin) |
| Java | fluent-logger-java | Forward | |
| Python | Core language (logging) | HTTP (HTTPHandler) | Provides the means to send logs over HTTP, which a Fluent Bit HTTP input can receive |
| Python | fluent-logger-python (Fluent Logger) | Forward | Uses the Forward protocol, meaning it can gain efficiencies from msgpack. Maintained by the Fluent community |
| Node.js | fluent-logger-node | Forward | Uses the Forward protocol, meaning it can gain efficiencies from msgpack. Maintained by the Fluent community |
| Node.js | Winston | HTTP, Forward | Winston is designed as a simple and universal logging library supporting multiple transports. HTTP transport is included in its core; there is also a transport for native Fluent: https://github.com/sakamoto-san/winston-fluent |
| Node.js | Pino (Pino-fluent extension) | | Logger integrated into the Pino logging framework |
| Go (Golang) | fluent-logger-golang | Forward | Uses the Forward protocol, meaning it can gain efficiencies from msgpack. Maintained by the Fluent community |
| .Net (C#, VB.Net, etc.) | NLog (NLog.Targets.Fluentd) | | An NLog target – works with .Net |
| .Net (C#, VB.Net, etc.) | Log4Net | | Log4Net Appender |
| .Net | Serilog (Fluent Sink) | Forward, HTTP | Supports both HTTP and native Fluentd/Fluent Bit |
| Ruby | fluent-logger-ruby | Forward | Uses the Forward protocol, meaning it can gain efficiencies from msgpack. Maintained by the Fluent community |
| PHP | fluent-logger-php | Forward | Uses the Forward protocol, meaning it can gain efficiencies from msgpack. Maintained by the Fluent community |
| Perl | fluent-logger-perl | Forward | Uses the Forward protocol, meaning it can gain efficiencies from msgpack. Maintained by the Fluent community |
| Scala | fluent-logger-scala | Forward | Uses the Forward protocol, meaning it can gain efficiencies from msgpack. Maintained by the Fluent community |
| Erlang | fluent-logger-erlang | Forward | Uses the Forward protocol, meaning it can gain efficiencies from msgpack. Maintained by the Fluent community |
| OCaml | fluent-logger-ocaml | Forward | Uses the Forward protocol, meaning it can gain efficiencies from msgpack. Maintained by the Fluent community |
| Rust | Rust logging framework extension for Fluent Bit | | Rust crate for logging to Fluent Bit |
| Delphi | Quicklogger | HTTP | |