Back in 2018, Manning published Chris Richardson's Microservices Patterns book. In many respects, this book is the microservices version of the famous Gang of Four patterns book. The exciting news is that Chris is working on a second edition.
One key difference between the GoF book and this one is that applying patterns like Inversion of Control and Factories isn't really impacted by considerations of architecture, organization, and culture, whereas microservices patterns are.
While the foundational ideas of microservices are established, the techniques for designing and deploying them have continued to evolve and mature. If you follow Chris on social media, you'll know that, in the years since the book's first edition, he has worked with numerous organisations, training them and helping them engage effectively with microservices. As a result, many of the processes and techniques Chris has identified and developed with customers are grounded in real practical experience.
As the book is in its early access phase (MEAP), not all chapters are available yet, so there is plenty to look forward to.
So even if you have the 1st edition and work with microservice patterns, the updates will, I think, offer insights that could pay dividends.
If you're starting your software career or considering the adoption of microservices (and Chris will tell you it isn't always the right answer), I highly recommend getting a copy; like the 1st edition, the 2nd will become a must-read book.
So, there is a new book published by Manning with me named as the author (Logging Best Practices), and yes, the core content was written by me. But was I involved with the book? Sadly, not. So what has happened?
Background
To introduce the book, I need to share some background. A tech author's relationship with their publisher can be a little odd and potentially challenging (the editors are looking at commerciality – what will ensure people consider your book – as well as readability; as an author, you're looking at what you think is important from a technical practitioner's perspective).
It is becoming increasingly common for software vendors to sponsor books. Book sponsorship involves the sponsor’s name on the cover and the option to give away ebook copies of the book for a period of time, typically during the development phase, and for 6-12 months afterwards.
This, of course, comes with a price tag for the sponsor and guarantees the publisher an immediate return. There is still a gamble for the publisher, as they're trading possible sales revenue for an upfront guaranteed fee. However, for a title that isn't guaranteed to be a best seller because it focuses on a more specialized area, a sponsor is effectively taking the majority of the investment risk from the publisher (yes, the publisher still has some risk, but it is a lot smaller).
When I started on the Fluent Bit book (Logs and Telemetry), I introduced friends at Calyptia to Manning, and they struck a deal. Subsequently, Calyptia was acquired by Chronosphere (Chronosphere acquires Calyptia), so they inherited the sponsorship. It was an agreement I had no issue with; as I've written before, I write because it is a means to share what I know with the broader community. It also meant my advance would be immediately settled (the advance, which comes pretty late in the process, is a payment that the publisher recovers by keeping the author's share of each book sale).
The new book…
How does this relate to the new book? Well, the sponsorship of Logs and Telemetry is coming to an end. As a result, it appears that Chronosphere and Manning have reached a new commercial marketing agreement. Unfortunately, in this case, the agreement over publishing content wasn't shared with me as the author, or with the commissioning editor at Manning I have worked with. So we had no input on the content or on who would contribute a foreword (usually someone the author knows).
Manning is allowed to do this; it is the most extreme application of the agreement with me as an author. But that isn’t the issue. The disappointing aspect is the lack of communication – discovering a new title while looking at the Chronosphere website (and then on Manning’s own website) and having to contact the commissioning editor to clarify the situation isn’t ideal.
Reading between the lines (and possibly coming to 2 + 2 = 5), Chronosphere has launched a new log management product and was presumably interested in sponsoring content that ties in with it. My first book with Manning (Logging in Action), which focused on Fluentd, includes chapters on logging best practices and using logging frameworks. As a result, a decision was made to combine chapters from both books to create the new title.
Had we been in the loop during the discussion, we could have looked at tweaking the content to make it more cohesive and perhaps incorporated some new content – a missed opportunity.
If you already have the Logging in Action and Logs and Telemetry titles, then you already have all the material in Logging Best Practices. While the book is on the Manning site, if you follow the link or search for it, you'll see it isn't available there. Today, the only way to get a copy is to go to Chronosphere and give them your details. If you only have one of the books, I'd recommend considering buying the other one (yes, I'll get a small single-digit percentage of the money you spend), but more importantly, you'll have details covering the entire Fluent ecosystem, and plenty of insights that will help even if you're currently only focused on one of the Fluent tools.
Going forward
While I’m disappointed by how this played out, it doesn’t mean I won’t work with Manning again. But we’ll probably approach things a little differently. At the end of the day, the relationship with Manning extends beyond commercial marketing.
Manning has a tremendous group of authors, and aside from writing, the relationship allows me to see new titles in development.
Working with the development team is an enriching experience.
It is a brand with a recognized quality.
The social/online marketing team(s) are great to interact with – not just to help with my book, but with opportunities to help other authors.
As to another book, if there was an ask or need for an update on the original books, we’d certainly consider it. If we identify an area that warrants a book and I possess the necessary knowledge to write it, then maybe. However, I tend to focus on more specialized domains, so the books won’t be best-selling titles. It is this sort of content that is most at risk of being disrupted by AI, and things like vibe coding will have the most significant impact, making it the riskiest area for publishers. Oh, and this has to be worked around the day job and family.
The latest release of Fluent Bit is only considered a patch release (based on SemVer naming). But given the enhancements included, it would have been reasonable to call it a minor release. There are some really good enhancements here.
Character Encoding
As all mainstream programming languages have syntaxes that lend themselves to English or Western-based languages, it is easy to forget that a lot of the global population use languages that don't share this heritage, and whose text is often stored using encodings other than UTF-8. For example, according to the World Factbook, 13.8% of the world's population speak Mandarin Chinese. While this doesn't immediately translate into written communication or language use with computers, it is a clear indicator that, when logging, we need to support log files encoded for such languages, for example, Simplified Chinese, and recognized encodings such as GBK and Big5. However, internally, Fluent Bit transmits the payload as JSON, so the encoding needs to be handled. This means log file ingestion with the Tail plugin ideally needs to support such encodings. To achieve this, the plugin now features a native character encoding engine that can be directed using a new attribute called generic.encoding, which specifies the encoding the file uses.
The Win**** encodings are Windows-based formats that predate the adoption of UTF-8 by Microsoft.
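As a rough sketch (the attribute name follows the release description, while the path and the encoding value shown are illustrative assumptions to check against the plugin documentation), a Tail input reading a GBK-encoded log might look something like this in YAML:

pipeline:
  inputs:
    - name: tail
      path: /var/log/app/*.log    # illustrative path
      generic.encoding: GBK       # assumed value naming; verify against the docs
  outputs:
    - name: stdout
      match: '*'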
Log Rotation handling
The Tail plugin has also seen another improvement. Working with remote file mounts has been challenging, as it is necessary to ensure that file rotation is properly recognized. To improve file rotation recognition, Fluent Bit has been modified to take full advantage of fstat. From a configuration perspective, we'll not see any changes, but from the viewpoint of handling edge cases, the plugin is far more robust.
Lua scripting for OpenTelemetry
In my opinion, the Lua plugin has been an underappreciated filter. It provides the means to create customized filtering and transformations with minimal overhead and effort. Until now, Lua has been limited in its ability to interact with OpenTelemetry payloads. This has been rectified by introducing a new callback signature with an additional parameter, which allows access to the OTLP attributes, enabling them to be examined and, if necessary, a modified set returned. The new signature does not invalidate existing Lua scripts using the older three or four parameters, so backward compatibility is retained.
The most challenging aspect of using Lua scripts with OpenTelemetry is understanding the attribute values. Given this, let’s just see an example of the updated Lua callback. We’ll explore this feature further in future blogs.
function cb(tag, ts, group, metadata, record)
    -- Promote the OTel resource attribute service.name into the log record
    if group['resource']['attributes']['service.name'] then
        record['service_name'] = group['resource']['attributes']['service.name']
    end
    -- Upgrade INFO (severity number 9) events to WARN (severity number 13)
    if metadata['otlp']['severity_number'] == 9 then
        metadata['otlp']['severity_number'] = 13
        metadata['otlp']['severity_text'] = 'WARN'
    end
    -- Return code 1 indicates the record has been modified
    return 1, ts, metadata, record
end
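For context, here is a minimal sketch of how such a callback might be wired into a YAML pipeline using the Lua filter; the OpenTelemetry input settings and the script file name (otel_severity.lua) are illustrative assumptions rather than details from the release notes.

pipeline:
  inputs:
    - name: opentelemetry
      port: 4318                    # assumed OTLP/HTTP listener port
  filters:
    - name: lua
      match: '*'
      script: otel_severity.lua     # hypothetical file containing the cb() above
      call: cb
  outputs:
    - name: stdout
      match: '*'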
Other enhancements
With nearly every release of Fluent Bit, you can find plugin enhancements to improve performance (e.g., OpenTelemetry) or leverage the latest platform enhancements, such as AWS services.
Just about any web-based application will have cookies, even if they are only being used as part of session management. And if you're in the business-to-consumer space, you'll likely also use tracking cookies to help understand your users.
Understanding what is required depends on which part of the world your application is being used in. For the European Union (EU) and the broader European Economic Area (EEA), this is easy as all the countries have ratified the GDPR and several related laws like the ePrivacy Directive.
For North America (the USA and Canada), the issue is a bit more complex as it is a patchwork of federal and state/provincial laws. But the strictest state legislation, such as California's, aligns closely with European demands, so as a rule of thumb, meet EU legislation and you should be in pretty good shape in North America (from a non-lawyer's perspective).
The problem is that the EEA accounts for 30 countries (see here), plus the USA and Canada, and we have 32 of the UN’s recognized 195 states (note there is a difference between UN membership and UN recognition). So, how do we understand what the rules are for the remaining 163 countries?
I’m fortunate to work for a large multinational company with a legal team that provides guidelines for us to follow. However, I obviously can’t share that information or use it personally. Not to mention, I was a little curious to see how hard it is to get a picture of the global landscape and its needs.
It turns out that getting a picture of things is a lot harder than I’d expected. I’d assumed that finding aggregated guidance would be easy (after all, there are great sites like DLA Piper’s and the UN Trade & Development that cover the more general data protection law). But, far from it. I can only attribute this to the fact that there is a strong business in managing cookie consents.
The resources that I did find, which looked comprehensive on the subject:
Like many developers and architects, I track the news feeds from websites such as The New Stack and InfoQ. I've even submitted articles to some of these sites and seen them published. However, in the last week, something rather odd occurred: articles appeared in The New Stack (TNS) attributed to me, although I had no involvement in the publication process with TNS; yet the content is definitely mine. So what appears to be happening?
To help answer this, let me provide a little backstory. Back in October and November last year, we completed the publication of my book about Fluent Bit (called Logs and Telemetry, with a working title of Fluent Bit with Kubernetes), a follow-up to Logging In Action (which covered Fluent Bit's older sibling, Fluentd). During the process of writing these books, I have had the opportunity to get to know members of the team behind these CNCF projects and consider them engineering friends. Through several changes, the core team has primarily come to work for Chronosphere. To cut a long story short, I connected the Fluent Bit team to Manning, and they sponsored my book (giving them the privilege of giving away a certain number of copies of the book, cover branding, and so on).
It appears that, as part of working with Manning’s marketing team, authors are invited to submit articles and agree to have them published on Manning’s website. Upon closer examination, the articles appear to have been sponsored by Chronosphere, with an apparent reference to Manning publications. So somewhere among the marketing and sales teams, an agreement has been made, and content has been reused. Sadly, no one thought to tell the author.
I don't, in principle, have an issue with this; after all, I wrote the book, and I blog on these subjects because I believe enabling an understanding of technologies like Fluent Bit is valuable and my way of contributing to the IT community (yes, I do see a little bit of money from sales, but the money-to-time-and-effort ratio works out to be less than minimum wage).
The most frustrating bit of all of this is that one of the articles links to a book I’ve not been involved with, and the authors of Effective Platform Engineering aren’t being properly credited. It turns out that Chronosphere is sponsoring Effective Platform Engineering (Manning’s page for this is here).
Oracle has an intern programme. While the programmes differ around the world because of the way relationships with educational establishments work and the number of interns that can be supported within a particular part of the organization, there is a common goal—transitioning (under)graduates from a world of theory into productive junior staff (in the cases I work with, that’s developers).
This blog summarizes the steps I have taken with my mentees and elaborates on how I personally approach the mentor role. It serves as a checklist for myself so I don’t have to recreate it as we embark on a new journey.
Interns typically have several lines of reporting – the intern programme leadership, an engineering manager, and a technical mentor. The technical mentor is typically a senior engineer or architect with battle-hardened experience who can explain the broader picture and why things are done in particular ways. The mentor and engineering manager roles can often overlap, but the two points of contact exist because that is how we run our product teams.
Each of the following headings covers the different phases of an intern engagement.
Introduction conversation
When starting the intern programme, as with any mid-sized organisation, there is a standard onboarding process, which covers details such as corporate policy and possibly mandatory skills development. While this is happening, I'll reach out to briefly introduce myself and ask the intern to let me know when they think they'll have completed that initial work. We use that as the point at which we have an initial, wide-ranging conversation covering …
expectations, goals, and rules of engagement
I have a couple of simple rules, which I ask my team and interns to work by:
Don't say you understand when you don't – as a mentor, I treat your learning as being as much about my ability to communicate as about your attention
No question is stupid
Your questions help me understand you better, pitch my communication better, appreciate what needs to be explained, and point out the best resources to help.
The more you ask, the more I share, which will help you find your path.
Mistakes are fine (and often, you can learn a lot from them), but we should own them (no deflection, or hiding – if there is a mistake or problem, let’s address it as soon as possible) and never repeat the same mistake.
We discuss the product’s purpose, value proposition, and definition of success. How do the architecture and technologies being used contribute to the solution? This helps, particularly when some of the technologies may not be seen as cool. It also provides context for the stories the intern will pick up.
Ongoing dialogue
During the internship, we have a weekly one-to-one call. Initially, the focus is on discussing progress, but as things move along, I encourage the intern to use the session to discuss anything they wish: the technologies, what they enjoy, how they're progressing, what is good, what could be better, resources available to learn from, and things to try.
Importantly, I emphasize that the interns are part of the team and never need to wait for these weekly calls if they have concerns, questions, requests, or need help. That way, we get a grip on issues early, before things start to go badly wrong.
Tasks & backlog
While the interns may not (at least to start with) be working on the product, or at least not be focused on immediate delivery tasks, we adopt normal working processes and practices. So, we manage tasks through JIRA, and the development processes are the same.
The major goals during the internship need to support a narrative for the intern’s degree defence. At the same time, they need to get a taste of the technologies being used across the product, such as the front-end presentation tier and the persistence and integration tech stack. The work needs to ultimately contribute to the product development programme.
In the stories early on in the internship, we keep well off the critical path, which means we can take the time to learn and understand why things are the way they are without any pressure. As the internship progresses, we start to bring stories in that are linked to specific deliverable milestones.
Throughout, we try to ensure the work builds a narrative for their degree/post-grad defence.
Being part of the team
A mentor is only one part of the intern's education journey. Ideally, learning can come from every interaction, so we need to facilitate:
It’s important that the interns feel part of the team, so they’re included in all the stand-ups and sprint planning. The intern tasks are managed as stories, just like everyone else’s. Being part of a team will help ease the tensions that can be experienced if someone is working with someone they also know has to evaluate their progress.
This gives the interns the chance to build relationships with others with whom they can talk and learn from those who are closer to where they are in their career journey.
Helping their Learning
As a mentor supporting technical development, when questions are asked, I take the time to not only answer the immediate question but also talk about the context and rationale. We look at where sources of information can be found – we don't want to get into spoon-feeding people, otherwise they'll never stand up and figure things out for themselves. It is better that people get some direction, figure things out, and then come back and present what they think the right answer is. This way, initiative is embedded, different perspectives can be seen, and life is easier when you're told of a problem and offered a solution at the same time.
Feedback
Feedback is important; if they're doing well, then it reassures them to know it, and you'll see more of that behaviour if you go on to hire them. If there are problems, it is best to have quiet, informal one-to-one conversations: things aren't bad, but we all can be better. This positioning is constructive, and as a mentor, I'm there to help the intern find ways to overcome any weaknesses, or to recognize that their strengths may be better suited to other roles. The outcome of an internship should never be a surprise, but simply a formalized ceremony.
When we started our IT careers (depending on how long ago that was), the idea of software being subject to legislation seemed pretty remote; the only rules you might have had to contend with were your local development standards. As an architect today, that is far from the case; as the saying goes, you need to be a 'Jack of all Trades'. You don't need to be a lawyer, but you have to have a grasp of the legislation and agreements that can impact your solutions, and recognise when it is time to talk to the legal eagles.
I thought it worthwhile calling out the different things we need to have a handle on, based on my experience. There will always be domain-specific laws, but the following are largely universal.
Software licenses—Today, we rarely build a solution without using a library, package, utility, or even a full application we haven’t written ourselves.
But what we can and can't do with that third-party asset, or reasonably expect from it, is dictated by a license, explicit or implicit. Consider the implications of an Apache license compared to a Creative Commons Share-Alike. In terms of negative impact, open source licenses can at worst…
Prevent code from being used commercially or to provide commercial services (several software vendors, such as Elastic and HashiCorp, have adopted licenses with such restrictions).
Require you to share whatever you develop using open-source libraries
Require you to declare your use of libraries (remember, such information can provide clues on possible attack vectors).
Fortunately, licenses for software solutions under several organizational umbrellas, such as the Linux Foundation (and its subsidiary organizations, such as the CNCF), require the projects to adopt a permissive licensing model.
Commercial licenses can come into play as well. The open source model often involves the key contributing organizations offering services such as support and training, or extended features. This is attractive for larger organizations so that they have a fallback and access to specialist resources. However, we also have products that only exist commercially. Understanding the licensing position of these tools is essential – for example, Oracle database, where you pay for production deployments by the number of CPUs, but non-production deployments are free. Such licensing may have a material impact on the architecture, for example, minimizing the amount of non-database compute taking place on those nodes, or sizing your solution so that you have more CPUs, each with less power, to provide better resilience. In terms of negative impacts…
You can become exposed to unplanned license costs.
Undermine the solution’s cost-benefit
GDPR – There are many national variations on data protection law, but most have taken the General Data Protection Regulation (GDPR) as a foundation. Covering the concepts of the right to know and correct data held about an individual, disclosure of how personal data is used, and the right to be forgotten is essential. There are resources available that cover which laws apply where. The negative impacts…
Additional development processes and administration to create evidence of compliance (e.g., audit of access to data)
Additional costs to satisfy compliance, e.g., regular mandatory training for all developers that could be impacted
Several acts, such as the US Cloud Act, can also impact the choices of service providers when using hosting, such as cloud providers. This highlights an interesting factor to keep in mind: legislation from other countries can still impact the situation even if the solution will not be used in that country. Impacts could be…
Using sovereign cloud and any associated costs.
Solution options are controlled by the availability of sovereign cloud services.
Limit the use of managed services to make the solution portable to different sovereign clouds.
AI and ML are rapidly evolving areas of legislation. The EU has been proactive in this space with the AI Act. However, secondary legislative factors exist, such as intellectual property law. While we may not all be directly involved in training LLMs, we still need to understand the ramifications and the data we work with. Possible impacts can include…
Data source assurance processes.
PCI—While the Payment Card Industry (PCI) standards do not have legal standing, their impact is broad and substantial, so we might as well treat them as such. The exact rules PCI requires depend on whether you're an organization accepting and storing card details or a service provider.
In areas like PCI, while not strictly legislation, certain domains demand compliance with various standards. Perhaps the most pervasive of these is ISO 27001, which covers information security across the spectrum of business/commercial considerations, but extends to infrastructure, software, and its development. Standards such as SOC 1, SOC 2, and SSAE16 (now 18 and 22) are also essential to understand, as you need to determine whether they are important to you, particularly when considering cloud and SaaS services. Things have improved over time, but we have encountered specialist managed/cloud services where the providers are unaware of such standards and have no position or evidence of addressing some of the expectations set out by SOC 1 and SOC 2.
If you work for a software vendor, export control law can impact your business, particularly when the solution involves complex algorithms such as those used in encryption.
These points primarily focus on 'universal truths', but there are domain-specific laws and expected standards that can be considered in the same or a similar light. For example, specialist legislation such as the Digital Operational Resilience Act (DORA) impacts financial businesses, and Consumer Protection (Distance Selling) regulations impact e-tail.
The rapid development of generative AI in traditional code development (third-generation language use) has had a lot of impact, with claims of massive productivity improvements. Given that developer productivity has historically been the domain of low-code tooling, this has led me to wonder whether the gap is shrinking and whether we are approaching a point where the benefits of low-code tools are being eroded for mainstream development.
To better understand this, let’s revisit how both technologies help.
AI-supported development
It has delivered value in several ways:
Code refactoring and optimization
Code documentation generation
Unit test generation
Next generation of auto-complete
This can include creating code in a green field context. If you’ve been following reports on the value of services like Copilot, AWS Q Developer, and Code Assist, you’ll see that these tools are delivering a significant productivity boost. A recent ACM article pointed to benefits as high as a threefold boost for more routine activities, tapering off as tasks became more complex.
Low Code
Low-code tools have been around for a long time; they have evolved and progressed, and come in a number of forms, such as:
UI applications that map databases to screens.
Business processes defined with visual BPM tooling.
Connecting different data sources by using visual notations to leverage representations of sources and sinks and link them together.
The central value proposition of low-code development is speed and agility. This speed comes with the constraint that your development has to fit into the framework, which may limit how the solution can scale, its elasticity for rapid scaling, and the scope for performance optimization. ACM conducted some research into the productivity gains here.
Development acceleration narrowing
Low-code/no-code tools are often associated with the idea of citizen developers, where people with primarily a business background and a broad appreciation of IT are able to develop applications (personal experience points to more developers being able to focus less on code, and more on usability of apps). KPMG shares a view on this here.
Evolution of AI that could change low-code?
It would be easy to be a doom-monger and say that this will be the end of highly paid software engineering jobs. But we have said this many times over the last twenty or thirty years (e.g., Future of Development).
Looking at the figures, the gains of generative AI for code development aren't going to invalidate low/no-code tooling. Where it really benefits is where a low-code tool does not offer a good fit for the needs being addressed, such as a complex graphical UI.
What if …
If Low-Code and Generative AI assistive technologies coalesce, then we’ll see a new generation of citizen developers who can accomplish a lot more. Typical business solutions will be built more rapidly. For example, I can simply describe the UI, and the AI generates a suitable layout that incorporates all the UX features, supporting the W3C guidelines. Furthermore, it may also be able to escape the constraints of low-code frameworks.
The work of developing very efficient, highly scalable UI building blocks, along with libraries to use them, will still demand talented developers. Such work is likely to also involve AI model and agent development skills, so the AI can work out how to use such building blocks.
To build such capabilities, we’re going to need to help iron out issues of hallucination from the models. Some UX roles could well be impacted as well, as how we impose consistency in a user’s experience probably needs to be approached differently to defining templates.
Merging of assistive technologies
To truly leverage AI for low-code development, we will likely need to bring multiple concepts together, including describing UIs, linking application logic to leverage other services, and defining algorithms. Bringing these together will require work to harmonize how we communicate with the different AI elements so they can leverage a common context and interact with the user as if with a single voice.
Conclusion
So the productivity gap between traditional development and low/no-code has shrunk a bit. I suspect we'll see it open up again quickly if generative AI can be harnessed and applied to low-code, not just as a superficial enhancement, but through a ground-up revisit of how the low-code tooling works. Although the first wave, like everywhere else, will be superficial in the rush for everyone to say their service or tool is AI-enabled.
I’ve been a long-time fan of mind maps, as a means to take notes while reading books, and to help organize thoughts and ideas in my day job. If you’ve explored the content of this blog, you’ll have seen I have a page of mind maps (here) covering various subjects. I’ve published them, as much to share them freely, as to provide a quick access back to them for myself.
For a long time, I've been using the excellent iThoughts tool from ToketaWare (great UX, and it works across multiple platforms – iOS, Mac, and Windows, with support for cloud storage). Sadly, in 2023, iThoughts reached end-of-life. I resisted moving tools for a long time as I couldn't settle on a solution I liked that wasn't crazy expensive or loaded with bells and whistles I didn't want. I finally settled on Wondershare's EdrawMind, which has all the core capabilities, plus one additional feature I'd always wished for with iThoughts – interactive navigation of the maps online.
I’ve finally found the time to migrate a lot of my shared documents and add a couple I hadn’t shared. You can access and interact with the mindmaps I’ve migrated by clicking on the icon on the Mindmaps Index page.
With the announcement of Fluent Bit v4 at Kubecon Europe, we thought it worthwhile to take a look at what it means, aside from celebrating 10 years of Fluent Bit.
Firstly, under Semantic Versioning, a major version change would normally suggest breaking (or, to use SemVer wording, incompatible) changes. The good news is that, like all the previous version changes for Fluent Bit, the numbering change only reflects the arrival of major new features.
This is good news for me as the author of Logs and Telemetry with Fluent Bit, as it means the book remains entirely relevant. The book obviously won’t address the latest features, but we’ll try to cover those here as supplemental content.
Let’s reflect upon the new features, their benefits, and their implications.
More flexible support for TLS (v1.3, choosing ciphers to enable)
New language for custom plugins in the form of Zig
Security Improvements
While security is not something that will get most developers excited, there are things here that will make a CSO (Chief Security Officer) smile. A developer who implements security behaviors because it is a good thing, rather than because they have been told to, makes a CSO happy, and that goodwill buys some leniency when there is a need to do something that would otherwise get the CSO hot under the collar. Given this, we can now win those points with CSOs by using new Fluent Bit configuration options that control which TLS versions (1.1 – 1.3) and ciphers are in use.
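As an illustration only, an output using the new controls might look roughly like the following; the tls.min_version, tls.max_version, and tls.ciphers option names reflect my reading of the release notes and should be checked against the documentation.

pipeline:
  outputs:
    - name: http
      match: '*'
      host: logs.example.com        # illustrative endpoint
      port: 443
      tls: on
      tls.verify: on
      tls.min_version: TLSv1.2      # assumed option name
      tls.max_version: TLSv1.3      # assumed option name
      tls.ciphers: TLS_AES_256_GCM_SHA384   # assumed option name and value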
But even more fundamental than that are the improvements around basic credentials management. Historically, credentials and tokens had to appear explicitly in a configuration file or be referenced via an environment variable. Now, such values can come from a file, so they no longer need to appear explicitly in the configuration; file security can manage access to and visibility of such details. This will also make credentials rotation a lot easier to implement.
Processor Improvements
The processor improvements are probably the most exciting changes. Processors allow us to introduce additional activities within the pipeline as part of a process such as an input, rather than requiring additional buffer fetch and return which we see in standard plugin operations.
Of course, the downside is that if the processor introduces a lot of work, we can create unexpected problems, such as back pressure resulting from a processor working hard on an input.
The other factor extended processors bring is that they are not supported in the classic configuration format, meaning that to exploit these features, you do need to define your configuration using YAML. The only thing I'm not a fan of is that the configuration for these features does make me feel like I'm reading algorithms expressed in Backus-Naur form (BNF).
Trace Sampling
Firstly, the processors supporting OpenTelemetry tracing can now sample. This was probably Fluent Bit's only weakness in the OpenTelemetry domain until now. Sampling is essential here, as traces can become significant as you track application executions through many spans. When combined with each new transaction creating a new trace, traces can become voluminous. To control this explosion of telemetry data, we want to sample traces, collecting a percentage of typical traces (good performance, low latency, no errors, etc.) plus the outliers, where tracing will show us where a process is suffering, e.g., an end-to-end process slowing because of a bottleneck. We can dictate how the sampling is applied based on the values of existing attributes, the trace status, status codes, latencies, the number of spans, etc.
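To give a flavour, a probabilistic sampler attached to an OpenTelemetry input might be sketched as below; the processor name and setting keys reflect my understanding of the sampling processor and should be treated as assumptions to verify against the docs.

pipeline:
  inputs:
    - name: opentelemetry
      port: 4318
      processors:
        traces:
          - name: sampling                  # assumed processor name
            type: probabilistic             # keep a fixed percentage of traces
            sampling_settings:
              sampling_percentage: 20       # assumed setting name; keep ~20% of traces
  outputs:
    - name: stdout
      match: '*'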
Conditionality in Processors
Conditionality makes it easier to respond to specific characteristics of logs. For example, only when the logging payload has several attributes with specific values do we want to single the event out for more attention – say, an application reporting that it is starting up while its logs are classified as errors – then we may want to add a tag to the event so it can be easily filtered and routed to an escalation process.
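As a rough sketch of the idea (the field names, values, and content_modifier settings here are purely illustrative and should be checked against the documentation), a processor might only tag matching events like this:

pipeline:
  inputs:
    - name: tail
      path: /var/log/app/*.log
      processors:
        logs:
          - name: content_modifier
            action: insert
            key: route_to
            value: escalation
            condition:
              op: and                        # all rules must match
              rules:
                - field: '$message'          # illustrative field
                  op: regex
                  value: 'starting'
                - field: '$level'            # illustrative field
                  op: eq
                  value: 'error'
  outputs:
    - name: stdout
      match: '*'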
Plugins with Zig
The enablement of Zig for plugin development (input, output, and filter) is strictly an experimental feature. The contributors are confident they have covered all the typical use cases, but the innate flexibility of supporting a full language always brings potential edge cases that were never considered and may require some additional work to address.
Let’s be honest: Zig isn’t a well-known language. So, let’s start by looking briefly at it and why the community has adopted it for custom plugin development as an alternative to the existing options with Lua and WASM.
So Zig has a number of characteristics that align with the Fluent Bit ethos better than Lua and WASM, specifically:
It is a compiled rather than interpreted language, meaning we avoid the runtime overheads of an interpreter or JIT compiler, as with Lua, and the proxy layer of WASM. This aligns with Fluent Bit's goal of being very fast with minimal compute overhead – ideal for IoT and for minimising the cost of sidecar container deployments.
The footprint for the Zig executable is very, very small—smaller than even a C-generated binary! As with the previous point, this lends itself to common Fluent Bit deployments.
The language is formally defined, compact, and freely available. This means you should be able to take a toolchain from anyone, and it is easy for specialist chip vendors to provide compilers.
Based on the experience of those who have tried it, cross-compiling is far easier to deal with than working with GCC, MSVC, etc., making it a lot easier to develop with while offering the benefits we want from Go. Unlike Go, connecting to the C binary of Fluent Bit doesn't require a translation layer.
One of Zig's characteristics that differs from C is its stronger typing and its approach of working to prevent you from entering edge-case conditions, such as null pointers, rather than prescribing how those cases are handled.
Zig has been around for a few years (the first pre-release was in 2017, and the first non-pre-release was in August 2023). This is long enough for the supporting tooling to be pretty well fleshed out with package management, important building blocks such as the HTTP server, etc.
Asking a large enterprise with a more conservative approach to development (particularly where IT is seen as an overhead and a source of risk, rather than a differentiator/revenue generator) to consider adopting Zig could be challenging compared to adopting, say, Go. But the potential benefits here make for some interesting possibilities.
Not Only, but Also
While the headline features are significant, each Fluent Bit release also brings a variety of improvements to its plugins. For example, improvements to working with eBPF, the HTTP output supporting more compression techniques, such as Snappy and ZSTD, and the Exit plugin gaining a configurable delay.
Plus, versions of library dependencies are being updated to exploit new capabilities or ensure Fluent Bit isn't using libraries with vulnerabilities.