JMESPath is a mature syntax for traversing and manipulating JSON objects. The syntax is also supported with multiple language implementations available through GitHub (and other implementations exist). As a result, it has been very widely adopted; just a few examples include:
AWS CLI and Lambda
Oracle Cloud WAF
As the syntax is very flexible and recursive in its use following the documented notation can be a little tricky to start with. So following the syntax can be rather tricky. The complete definition runs to 97 lines, of which 32 lines focus on the syntactical structure. The others describe the base types such as numbers, characters, accepted escaped characters, and so on. Nothing wrong with this, as the exhaustive definition is necessary to build parsers. But for the majority of the time it is those 32 lines that we need to understand.
As the expression goes – ‘a picture says a thousand words’, there might not be a thousand words, but there is enough to suggest a visual representation will help. Even if the visual only helps us traverse the use of the detailed syntax. So we’ve use our favoured visual representation – the railroad diagram and the tool produced by Tab Akins to create the representation. We’ve put the code and created images for the syntax in my GitHub repository here, continuing the pattern previously adopted.
Here is the resulting diagram …
To make it easy to trace back to the original syntax document we’ve included groupings on the diagram that have names from the original speciofication.
Parts of the diagram make the expressions look rather simple, but you’ll note that it is possible for the sections to be iterative which allows for the expression to traverse a JSON object of undefined depth. But what can be really challenging is that an in many areas it is possible to nest expressions within expressions. Visually there is no simple way to represent the expression possibilities of this in a linear manner. Other than be clear about where the nesting can take place.
Those who have been using my Logging in Action book will know that to help test the configuration of monitoring tools including Fluentd we have built a LogGenerator that can very easily play and replay logging events into a variety of destinations and formats. all written in Groovy to make the utility easy to run as a script and extend without needing to set up a proper Java development environment.
With the number of different destinations built into the script and the logic to load the source log events and format them the utility is getting rather large for a single file. Rather than letting it continue to grow as we add more destinations to pump log events too, I’ve extended the implementation so you can point to a Groovy file that implements the logic to send the log events. It only requires three simple methods to be implemented.
To demonstrate the feature we have created a custom extension and fully documented it. The extension allows you to send log events to the OCI Logging service. This includes an optional crude aggregation mechanism as sending individual log events is a little inefficient over REST. By doing this we can send synthetic or playback logs as if we’re an application in real-life to ensure that any alerting or routing for the logging works properly before we get anywhere production and do not need to run the application and induce error events.
Beyond this, we’re also thinking about creating a plugin to fire log events at Prometheus so we can send events using the Prometheus pushgateway. As a result, we can tune Prometheus’ configuration.
More improvements – refactoring the existing code
We will refactor the existing code to use the same approach which should make the code more maintainable, but the changes won’t stop the utility from working as it always has (so we won’t break out the existing output channels from the core).
We have also started to improve the code commenting – so hopefully it will make the code a bit more navigable.
Following my article on Software Engineering Daily, here are some practical things that will help you if you’re considering taking on a technical book project.
Identifying a Publisher
While it is easy to self-publish today. The recognition comes from having worked with a traditional publisher as they have processes that ensure a level of quality. Not all publishers are equal, and some publishers are attributed with more prestige than others. In addition to this, some publishers are willing to take a risk on a subject and/or author. Have a look at the titles already published, and whether there are any publishers you can connect to.
When comes to contacting the publishers, most of their websites will have a page for recruiting authors. Some are easier to find than others. Here are a couple:
If, or when you get to talk to a publisher it is worth ensuring you understand how their editorial process works and what is expected from you? Plus what happens if you find yourself in the position of not being able to work to the original schedule. Day-to-day work can get in the way which you hadn’t expected.
For a long time, I’ve tracked and read articles on Software Engineering Daily. We’ll day represents what is hopefully the first of many articles that we will write for them. The article is about the kind of people that make technical book authors, and the perception we have of authors – so if you’re interested check it out here.
The 12 Factor App definition is now ten years old. In the world of software that is a long time. So perhaps it’s time to revisit and review what it says. As I have spent a lot of time around Logging – I’ve focussed on Factor 11 – Logging.
I have been fortunate enough to present at the hybrid JAX London conference on this subject. It was great to get out and see people at a conference rather than just with a screen and a chat console of online-only events.
Having previously blogged about being a fan of Railroad diagrams (here) as a means to communicate language syntax, I have been asked about some of the ways details should be represented. I’ve looked around and not actually found an easy-to-read for ‘newbies’ guide to reading railroad diagrams. A lot of content either focuses on the generator tools or how the representation compares to BNF (Backus Naur Form) – all very distracting. So, I thought as an advocate, I should help address the gap (the documentation with TabAkins tool does a good job of explaining things, but its focus is on the features provided rather than understanding the notation).
Reading Railroad Diagrams
The following table provides all the information to help you interpret the Railroad Diagrams, and create syntax representations.
If you’re only interested in understanding the notation, read just the first two columns.
If you want to see examples of how to create the diagram, then the 3rd column will help.
Note: Clicking on the diagrams will result in a bigger version of the image being displayed.
The start and end of an expression are shown with vertical bars. The expression is read from left to right following a path defined by the line(s).
Diagram( Terminal('This '), Terminal(' is '), Terminal(' a '), Terminal(' mp3monster '), Terminal(' blog ') )
Larger diagrams may need to be read both across and down the page to make it sensibly fit a page like this.
Diagram( Stack( Terminal('This '), Terminal(' is '), Terminal(' a '), Terminal(' mp3monster '), Terminal(' blog ')) )
We can differentiate between literal values and ‘variables’ by shape. A square shape should denote a variable, and a lozenge represents a literal. I have to admit to sometimes getting these the wrong way around, but as long as the notation is used consistently in an expression, it isn’t too critical. In this example, we have replaced mp3monster with the name of a variable called domain-name. So if I set the variable domain-name = mp3monster then I’d read the same result.
Diagram( Stack( Terminal('This '), Terminal(' is '), Terminal(' a '), NonTerminal(' domain-name'), Terminal(' blog ')) )
We can make parts of an expression optional. This can be seen by following an alternate path around the optional element. In this case, we’ve made the domain-name optional. Assuming domain-name = mp3monster, we could get either: – This is a mp3monster blog OR – This is a blog
Diagram( Stack( Terminal('This '), Terminal(' is '), Terminal(' a '), Optional( NonTerminal(' domain-name')), Terminal(' blog ')) )
We can represent optionality by having the flow split to multiple values. Those values could be either literal values or variables. In this case, we now have several possible names in the blog with the choices being mp3monster, phil-wilkins, cloud-native, or the contents of the variable domain-name. So the expression here could resolve to (assuming domain-name = something-else): – This is a mp3monster blog OR – This is a phil-wilkins blog OR – This is a cloud-native blog OR – This is a something-else blog It is typical for the option most likely to be selected to be the value that is directly in line with the choice. Here that would mean the default would be phil-wilkins
Diagram( Stack( Terminal('This '), Terminal(' is '), Terminal(' a '), Choice(1, ' mp3monster', ' phil-wilkins ', ' cloud-native ', NonTerminal('domain-name')), Terminal(' blog ')) )
We can have variations on a choice where we can express the choice as being any of the options (one or more e.g. mp3monster and cloud-native) or all of the choices. These scenarios are differentiated by the additional symbols before and after the choice.
Diagram( Stack( Terminal('This '), Terminal(' is '), Terminal(' a '), MultipleChoice(1, 'any',' mp3monster', ' phil-wilkins ', ' cloud-native ', NonTerminal('domain-name')), Terminal(' blog ')) )
We can represent the looping with the inclusion of an additional literal(s) or variable(s) by having a second line from the right (exit) of a literal or variable and flowing back into the left (entry) of a literal or variable. Then in the loop of the flow below are the variable(s) or literal(s) that go around between each occurrence of the loop. If our variable was a list now i.e. domain-name = [‘ mp3monster’, ‘cloud-native’] then this would resolve to : This is a mp3monster and cloud-native blog
Diagram( Stack( Terminal('This '), Terminal(' is '), Terminal(' a '), OneOrMore( NonTerminal('domain-name'), [' and '])), Terminal(' blog ') )
We can group literals and variables together with a label. These groupings are denoted with a dashed line and usually include some sort of label. In our example, we’ve grouped two literal values together and called that group the blog name.
Diagram( Stack( Terminal('This '), Terminal(' is '), Terminal(' a '),), Group(Sequence( Terminal(' mp3monster '), Terminal(' blog ')), ['blog name']) )
All of these constructs can be combined so you can do things like making choices optional, and iterate over multiple variables and literals.
Sometimes the number of options in a choice can get impractically large to show in a Railroad diagram. One way to overcome this is to show the first and last options and an ellipsis. We can then wrap it with a comment that directs the reader to the complete list of choices. This way the diagram continues to be readable – the most valuable part of the diagram and easy to locate the specific fine details.
Diagram( Stack( Terminal(‘This ‘), Terminal(‘ is ‘), Terminal(‘ a ‘), Group(Choice(1, ‘ mp3monster’, ‘ … ‘, ‘ cloud-native ‘), [‘ See Section X for full list’]), Terminal(‘ blog ‘)) )
One of the areas I present publicly is the use of Fluentd. including the use of distributed and multiple nodes. As many events have been virtual it has been easy to demo everything from my desktop – everything is set up so I can demo things very easily. While doing this all on one machine does point to how compact and efficient Fluentd is as I can run multiple instances concurrently it does undermine distributed capabilities somewhat.
Add to that I now work for Oracle it makes sense to use OCI resources. With that, I have been developing the scripts to configure Ubuntu VMs to set up the demo environments installing Ruby, Fluentd, and various gems needed and pulling the relevant configurations in. All the assets can be found in the GitHub repository https://github.com/mp3monster/logging-demos. The repository readme includes plenty of information as well.
While I’ve been putting this together using OCI, the fact that everything is based on Ubuntu should mean it can be run locally on VMs, WSL2, and adaptable for MacOS as well. The environment has been configured means you can still run on Ubuntu with a single node if desired.
Additional Log Destinations
As the demo will typically be run on OCI we can not only run the demo with a multinode setup, we have extended the setup with several inclusion files so we can utilize OCI services OpenSearch and OCI Log Analytics. If you don’t want to use these services simply replace the contents of several inclusion files including files with the contents of the dummy_inclusion.conf file provided.
The configuration works by each destination having one or two inclusion files. The files with the postfix of label-inclusion.conf contains the configuration to direct traffic to the respective service with a configuration that will push log events at a very high frequency to the destination. The second inclusion file injects the duplication of log events to each service. The inclusion declarations in the main node Fluentd config file references an environment variable that should provide the path to the inclusion file to use. As a result, by changing the environment variable to point to a dummy file it becomes possible o configure out the use of one of the services. The two inclusions mean we can keep the store declarations compact and show multiple labels being used. With the OpenSearch setup, we have a variant of the inclusion file model where the route inclusion can reference the logic that we would use in the label directly within the sore declaration.
The best way to see the use of the inclusions is to experiment with setting the different environment variables to reference the different files and then using the Fluentd dry-run feature (more on this in the book).
The setup script performs a number of tasks including:
Pulling from Git all the resources needed in terms of configuration files and folders
Retrieving the necessary plugins against the possibility of their use.
Setting up the various environment variables for:
environment variables to reference inclusion files
shortcut environment variables and aliases
network (IP) address for external services such as OpenSearch
Setting up a folder for OCI tokens needed.
Setting up temp folders to be used by OCI Plugins as a file-based cache.
Feeding the log analytics service is a more complex process to set up as the feeds need to have metadata about the events being ingested. The downside is the configuration effort is greater, but the payback is that it becomes easier to extract meaningful information quickly because the service has a greater understanding of the content. For example, attributing the logs to a type of source means the predefined or default log formats are immediately understood, and maximum meaning can be retrieved from the log event.
Going to OCI Log Analytics does cut out the need for the Connections hub, which would allow rules and routing to be defined to different OCI services which functionally can help such as directing log events to PagerDuty.
Infrastructure as Code (IaC) should be treated the same way as any other code. That is to say that we should be considering configuration management, testing, regression, code quality, and coverage. We should be addressing these points for the same reasons we address them with our application code. Such as ensuring that we don’t introduce bugs as things evolve and develop, ensuring that the code is maintainable over a prolonged period etc.
The problem is that the only real way to test IaC is to run it. Particularly with the likes of Terraform where it is largely declarative rather than containing a lot of logic. This point is nicely conveyed by Yevgeniy Brikman’s presentation (below)
The presentation goes on to illustrate Terratest which has the look and feel of JUnit or any other xUnit test framework. Terratest is implemented in Golang, But to be honest, given the nature of Terraform ( largely declarative meaning it enables ideas of composition and not sophisticated logic) it means the writing of the tests isn’t going to demand anything clever like how to achieve polymorphic behavior through Go’s type structures.
While Yevgeniy focussed on testing by invoking an application on the infrastructure deployed something we’ve described though our Platform Test logic (more here). You may wish to test things further by interrogating infrastructure components. For example, do I have the right number of nodes in a dynamic group or are container or server logs going into the cloud monitoring services.
Performing such checks is very easy with OCI as it provides a Golang SDK making it very easy to write tests that can call the OCI APIs and interrogate the setup. Better still when considering whether the Terraform configuration will behave correctly to support dynamic/auto-scaling can be done easily without modifying the Terraform configurations as part of the Terratest logic can include Go API calls to temporarily modify scaling triggers or invoking code that can stimulate OCI dynamic features.
Testing App Configuration
There is an interesting question to be considered. There is a point of separation between when to use Terraform (or Pulumi and others for that matter) and tools better suited to application deployment and configuration like Ansible and Chef. Therefore should we separate the testing of these details? Maybe I am too purist but seeing local and remote execs in Terraform as these actions are very opaque and can be used to conceal things or unwittingly depend on the way Terraform handles its dependency graph.
Of course, Ansible has its test framework ansible_test and has the means to measure test coverage. So one possibility is to treat Ansible as a separate module, independently test it, and then integrate its use in the wider picture of deploying infrastructure.
When building Node solutions, even if you’re not going to publish the code to a public repository you’re likely to be using package.json to declare the dependencies for your app. Doing this makes it easier to build and deploy a utility. But if you’re conversant with several languages there is a tendency to just adapt your existing skills to work with others. The downside of this is small tooling nuances can catch you off guard and consume time while figuring them out. The workings of packages with NPM (as shown below) is one possible case.
If you create the package.json using npm init to create the initial version of the file, it is fairly common to set values to default. In the case of the license, this is an ISC license. This is easily forgotten. The problem here is twofold:
Does the license set reflect the constraints of the dependencies and their licenses
Does the default license reflect the position you want?
Looking at the latter point first, This is important as organizations have matured (and tooling greatly improved) when it comes to understanding how open source licensing can impact. This is particularly important for any organizations leveraging open source as part of their revenue generating activities either ‘as a service’ but also selling software solutions. If you put the wrong license here the license checking tools often protecting code repositories may reject your code, even in internal only use cases (yes this tripped me up).
To help overcome this issue you can install a tool that will analyze the dependencies and optionally their dependencies and report back on your license exposure. This tool is called license-report. Once installed (npm install -g license-report) we just need to point the tool to the package.json file. e.g. license-report package.json. We can make the results a lot more consumable by outputting the content in a number of formats. For example a simple text value:
From this, you could set your license declaration in package.json or validate that your preferred license won’t conflict,
You must be logged in to post a comment.