Many of the most potent features being added to Fluent Bit are only available in the newer YAML format. At the same time, Fluent Bit is celebrating its 10th anniversary, so we can be sure there are some established deployments still being maintained with the classic format.
The transition between the two Fluent Bit formats can be a bit fiddly. Over a year ago, we built a tool in Java to help, as described in my post Fluent Bit config from classic to YAML. While it addressed many of the needs, I wasn’t entirely happy with the solution, for several reasons:
Java is the language I’m most at home with, but as an implementation choice it increases the workload for using and extending the tool – ideally needing a container, etc. Not unreasonable, but as a small tool project we’re not working with a nice enterprise build pipeline, so going from a code tweak to a test is just fiddlier. I’m sure any Java devs reading this will be shouting, “Setting up Maven, Jenkins, and JUnit isn’t difficult” – and they’re right, but it’s a time-consuming distraction from working on the real problem.
While it works, the code didn’t feel as clean as it could or should.
Over the last couple of years, we’ve become more comfortable with Python, and messing around with LLMs and code generation has helped accelerate our development skills. Python offers a potentially better solution: many environments that use Fluentd or Fluent Bit are Linux-based and include Python by default, so you can use the tool without messing with containers if you don’t want to. For a tool intended to support the development and maintenance of configurations, rather than to run in production, being able to use, modify, or extend the code easily is attractive.
We’ve started afresh and built a new implementation that is more extensible and maintainable. The new Python implementation is under the same Apache 2 license, but in a different repo so that existing use and forks of the Java implementation aren’t disrupted.
We can wrap Python with all the production-class build processes, but they aren’t necessary to get a tool up and running or to enable people to tinker as needed, so that is what we have focused on. Over time, we’ll play with Poetry or other packaging mechanisms to make it even easier.
Fluent Bit supports both a classic configuration file format and a YAML format. The support for YAML reflects industry direction. But if you’ve come from Fluentd to Fluent Bit or have been using Fluent Bit from the early days, you’re likely to be using the classic format. The differences can be seen here:
[SERVICE]
flush 5
log_level debug
[INPUT]
name dummy
dummy {"key" : "value"}
tag blah
[OUTPUT]
name stdout
match *
#
# Classic Format
#
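The same pipeline expressed in the YAML format looks like this (note the indentation, the inputs and outputs becoming array entries under a pipeline section, and the wildcard match now needing quotes):
service:
  flush: 5
  log_level: debug

pipeline:
  inputs:
    - name: dummy
      dummy: '{"key" : "value"}'
      tag: blah

  outputs:
    - name: stdout
      match: '*'
#
# YAML Format
#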
Beyond having a consistent file format, the driver is that some new features are simply not supported by the classic format. Currently, this predominantly means Processors, and it is fair to assume that other major new features will follow suit.
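For example, a processor attaches directly to an input or output in YAML, with no classic-format equivalent. A minimal sketch based on the Fluent Bit documentation (the content_modifier processor here inserts an illustrative key into each log record):
pipeline:
  inputs:
    - name: dummy
      tag: blah
      processors:
        logs:
          - name: content_modifier
            action: insert
            key: source
            value: demo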
Migrating from classic to YAML
The process for migrating from classic to YAML has two dimensions:
- Change of formatting:
  - YAML indentation and plugins as array elements
  - addressing quirks such as the wildcard (*) needing to be quoted, etc.
- Addressing constraints, such as:
  - the use of include being more restrictive
  - the ordering of inputs and outputs being more restrictive – therefore match attributes may need to be refined
None of this is too difficult, but doing it by hand is laborious and error-prone. So, we’ve built a utility that can help with the process. At the moment, the solution is in an MVP state, but we hope to beef it up over the coming weeks. What we plan to do and how to use the utility are all covered in the GitHub readme.
A quick update: we now have a container configuration in the repository to make the tool very easy to use. All the details are in the readme, along with some additional features.
Update 7th July
We’ve progressed past the MVP state now:
- Detected include statements get incorporated into a proper include block, but commented out (see the sketch after this list).
- We’ve added an option to convert the attributes to the Kubernetes idiomatic form, i.e., aValue rather than a_value.
- The command line has a help option that outputs details such as the control flags.
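To illustrate the include handling: a classic @INCLUDE statement would surface in the generated YAML along these lines (an illustrative sketch; the exact comment text emitted by the tool may differ):
# TODO: review the converted include before re-enabling it
# includes:
#   - extra-inputs.conf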
Update 12th July
In the last couple of days, we pushed a little too quickly to GitHub and discovered we’d broken some cases. We’re now testing the development a lot more rigorously, and it helps that we have the regression container image working nicely. The Javadoc is also generating properly.
We have identified some edge cases that still need to be sorted, but most scenarios are now handled correctly. Hopefully, we’ll have those edge scenarios fixed tomorrow, and we’ll tag a release version then.
When it comes to observability, particularly logs and traces, there is a historical tendency to process things in a batch manner, or even only once there is a need to determine the root cause of an outage, often with nothing more than a metric hinting that something might not be right. This misses a real opportunity: Fluent Bit can capture observability events in near real-time, whether that is a log, metric, or trace indicating something unhealthy. So why not present the issue to those performing an ops role as soon as Fluent Bit recognizes it, rather than only once the data has been processed by a back end?
While we have solutions like PagerDuty, they tend to be integrated with back-end event analytics tools. Fluent Bit can talk to social channels such as Slack – so why not direct critical events to Slack and interact with the Ops team more directly? After all, if we’re told about an imminent issue quickly, or as soon as possible after something goes wrong, the impact and the effort involved in remediation and recovery are smaller. This is the basis of a presentation that Patrick Stephens (from Chronosphere and a committer to the Fluent Bit project) and I have put together. Patrick will be leading the session at the Cloud Native Rejekts conference in Paris (the ‘B-side’ to KubeCon Europe), which takes place on the two days before KubeCon itself.
The session looks at the idea of what has been called ChatOps, and why and how it can bring value, illustrated with a demo that uses Fluent Bit to detect and share an event with Slack, and also to pick up and handle directions from the Ops team in the Slack channel.
We hope you’ll see from the session why we think the approach is worthy of consideration and how the potential security concerns can be mitigated. The MVP code is currently here but may, in due course, be migrated to the Fluent repos here.
We’ve bundled readme content and scripts to build, and to help test, the additional functionality created to facilitate parts of the operation.
We don’t want to spoil the presentation, so we won’t share too much here. But it will be worth checking back on the blog, as we’ll record a video and eventually a session explaining the ins and outs of the MVP.
Development trends have shown a shift away from interpreted or Just-In-Time (JIT) compiled languages like Java and Ruby towards precompiled languages like Go and Rust, as this removes the startup time of the language virtual machine and the JIT compiler, and gives a smaller memory footprint. These are all desirable characteristics when you’re scaling containerized solutions, where percentage-point savings can really add up.
Oracle has been leading the way with its work on GraalVM for some years now; as a result, GraalVM can not only produce native binary images from Java code, it also supports TruffleRuby and GraalPy, among others. As TruffleRuby is an open-source project, Oracle isn’t the only vendor contributing to it; work effort has also come from Shopify.
Helping Ruby move forward isn’t new for the Shopify engineering team, and as part of that investment they have just announced the open-sourcing of a toolchain called Ruvy. Ruvy takes Ruby code and creates a WebAssembly (WASM) module from it, building on the existing ruby.wasm project. In doing so, they’ve addressed the Ruby startup overhead of the language VM we mentioned. They have also simplified the deployment process by eliminating the need for WebAssembly System Interface (WASI) arguments, and overcome the constraints on loading classes by reading files: the code is bundled within the assembly, and the content is then accessed using WASI-VFS, a simple virtual file system.
The published benchmarks show a massive performance boost over executing the Ruby code with the packaged JIT. For me, this is interesting because one of the related cloud-native trends is the shift from Fluentd to Fluent Bit. Fluentd was built with Ruby and has a huge portfolio of third-party extensions, while Fluent Bit is built in C to get the performance gains previously described, but it does support plugins through WASM. This raises an interesting question: can we take existing Ruby plugins and wrap them so the required interfacing works? The wrapping should be minimal, and is more likely to be impacted by the fact that Fluent Bit v2 refined the internal data structure that was common to both Fluentd and Fluent Bit, allowing Fluent Bit to engage more easily with OpenTelemetry.
If the extra bit of wrapping code isn’t complex, then applying Ruvy should mean the core plugin can work with Fluent Bit. If this can be templated, then Fluent Bit will make a big leap forward in the number of available plugins.
Today, Java 21 has reached General Availability (GA) with some important new features in the language mainstream (i.e., not requiring preview flags to be enabled), and Oracle will be supporting Java 21 as a Long-Term Support (LTS) release – guaranteed at least 3 years of free support (2 years to the next LTS plus 1 year of overlap), and then at least an additional 5 years under a support subscription. Everyone is talking about virtual threads. Interestingly, virtual threads mean that, in the majority of cases, we no longer need to handle the complexities of reactive programming – not my point of view, but a view expressed earlier today by Tomas Langer, the architect for Helidon. For old hands like myself, this is a blessing, as the old-style threading comes more naturally. There are a lot of other smaller features coming through with this release, such as record patterns, the generational Z Garbage Collector, and better support for Key Encapsulation Mechanisms. All the fine details can be found on the Oracle Java blog.
Dev.java has a new Playground, which allows you to write some Java code in the browser and run it – no local JDK or IDE needed. Great for trying out code, like pattern matching for switch statements.
GraalVM gets a new release along with Java 21, bringing some other cool features, including being able to deploy GraalVM’s polyglot support with just the languages you want, meaning that the GraalVM footprint is kept as small as you need in containers. This decoupling is supported with Maven and Gradle configurations. With this come enhancements to Just-in-Time (JIT) and Ahead-of-Time (AOT) performance – read more about this in Alina Yurenko’s blog.
When building Node solutions, even if you’re not going to publish the code to a public repository, you’re likely to be using package.json to declare the dependencies for your app. Doing this makes it easier to build and deploy a utility. But if you’re conversant with several languages, there is a tendency to just adapt your existing skills to work with the others. The downside of this is that small tooling nuances can catch you off guard and consume time while you figure them out. The handling of package licenses with NPM (as shown below) is one possible case.
If you create the package.json using npm init, it is fairly common to accept the default values for the initial version of the file. In the case of the license, the default is the ISC license. This is easily forgotten. The problem here is twofold:
Does the license set reflect the constraints of the dependencies and their licenses?
Does the default license reflect the position you want?
Looking at the latter point first: this is important, as organizations have matured (and tooling has greatly improved) when it comes to understanding how open source licensing can impact them. It is particularly important for any organization leveraging open source as part of its revenue-generating activities, whether ‘as a service’ or by selling software solutions. If you put the wrong license here, the license-checking tools that often protect code repositories may reject your code, even in internal-only use cases (yes, this tripped me up).
To help overcome this issue, you can install a tool that will analyze the dependencies (and optionally their dependencies) and report back on your license exposure. The tool is called license-report. Once installed (npm install -g license-report), we just need to point the tool at the package.json file, e.g., license-report package.json. We can make the results a lot more consumable by outputting the content in a number of formats, for example, as simple text:
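For instance (flag names as per the license-report readme; run license-report --help to confirm them for your version):
# text table of each dependency and its declared license
license-report --package=./package.json --output=table

# the same data as JSON for further processing
license-report --package=./package.json --output=json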
From this, you could set your license declaration in package.json or validate that your preferred license won’t conflict.
I’ve designed a variety of GraphQL schemas and developed microservice backends, but I hadn’t done much with configuring the Apollo implementation of a GraphQL server until recently. This may reflect the fact that my understanding of JavaScript doesn’t extend into the world of Node.js as much as I’d like (the problem with being a multi-language developer is that you’re likely to find your way around many languages but never be a master of one). Anyway, the following content is about the implementation of the GraphQL server part of a solution. These pointers may be just for my benefit, but you might find them helpful as well.
To make it easy to reference the code, we’ve added markers (n) into the code, where n is a number. These are not part of the code; they are there to make the different lines referenceable. Where code should go but is not relevant to the point being made, we’ve added an ellipsis (…).
Dynamic loading and server configuration
import { ApolloServer } from 'apollo-server';
import { loadFilesSync } from '@graphql-tools/load-files';
import { resolvers } from './resolvers.js'; (1)
import ProviderInternalAPI from './ProviderInternalAPI.js'; (1)
import EventsInternalAPI from './EventsInternalAPI.js'; (1)
const server = new ApolloServer({
  debug: true, (2)
  typeDefs: loadFilesSync('./schema.graphql'), (3)
  resolvers,
  dataSources: () => {
    return {
      eventsInternalAPI: new EventsInternalAPI(), (4)
      providerInternalAPI: new ProviderInternalAPI() (4)
    };
  }
});
There is the potential to dynamically load the resolvers rather than importing each JavaScript file, as we see on the lines marked (1). The mechanics for doing this are documented here; it would be cool if an opinionated implementation were provided. As shown by (3), we can load the schema from an independent file. The Apollo example approach for this didn’t seem to work for us, although both approaches make use of graphql-tools in a synchronous manner.
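As a sketch of what that dynamic loading could look like (the ./resolvers folder layout is an assumption for illustration; mergeResolvers comes from @graphql-tools/merge):
import { loadFilesSync } from '@graphql-tools/load-files';
import { mergeResolvers } from '@graphql-tools/merge';

// gather every resolver module in the folder rather than importing each file by hand
const resolverModules = loadFilesSync('./resolvers/**/*.js');
export const resolvers = mergeResolvers(resolverModules);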
We can switch on debugging (2) for the GraphQL server, although the level of information published doesn’t appear to be significant. Ideally, this setting would be changed for production.
Defining the resolvers
The prefix for each resolver (1) must correlate to the name in the schema of the mutation or query (not the type, as you might expect coming from Java). Often we don’t need all the parameters for the resolver. The documentation describes replacing each unused parameter with one or more underscores (i.e., _, __), the underscore denoting a field not in use. However, we can satisfy the indication of not being used but keep the meaning of each position by using an underscore followed by a name (i.e., _parent, _args), as shown in (2).
By taking the response into a variable (3), we can optionally log the content and then return it. Trying to return directly from the invocation line would result in the handler object rather than the payload itself.
The use of the backtick is a JavaScript feature (template literals). It allows us to incorporate variables into a string by referencing them within ${}, as shown at (4).
We need to supply the GraphQL server with instances of a layer of code that the resolvers will interact with (the data sources). We can instantiate the instances in the server declaration. The naming of each object is important (4), as it must match what the resolvers.js declarations reference.
import { useLogger } from "@graphql-yoga/node";
...
latestEvent (1): async (_parent, _args, { dataSources }, _info) (2) => {
  if (log) { console.log("resolvers - get latest event"); }
  let responseValue = await dataSources.eventsInternalAPI.getLatestEvent(); (3)
  if (log) { console.log(`Resolver response for latest event:\n ${responseValue}`); } (4)
  return responseValue;
},
To handle the use of resolvers within a larger resolver, we need to declare the resolution outside of the Query and Mutation blocks (but inside the whole declaration block) (1). The name provided needs to match the parent entity that the query resolver contributes to.
To then provide values from the outer resolution to the chained resolution, we use the naming as represented in the GraphQL schema, as shown by (2) in the sketch below. The GraphQL engine will resolve the mapping of values.
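As a minimal sketch of that structure (the Event type and its provider field are illustrative names, not taken from the real schema):
export const resolvers = {
  Query: {
    // ... query resolvers such as latestEvent sit here
  },
  Event: { (1)
    // declared outside the Query and Mutation blocks, named after the parent type
    provider: async (parent, _args, { dataSources }, _info) => {
      // (2) parent carries the values already resolved by the outer query
      return dataSources.providerInternalAPI.getProvider(parent.code);
    }
  }
};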
Web resolver URL
// GET
async getProvider(code) {
  console.log("getProvider (%s) directing to %s", code, this.baseURL);
  return this.get(`provider?code=${code}`); (1)
}
The URL parameters need to be appended to the base URL path for the parent class to use in the invocation, as shown by (1). The Apollo examples showed a setter option, but we didn’t see the URI being addressed properly; this approach produces the required behavior.