We haven’t blogged much recently, as we have been busy producing content for my employer Oracle, working with Software Engineering Daily, and developing a collaborative book. So, I thought I’d pull together some links to these new resources.
One of the areas I present on publicly is the use of Fluentd, including distributed, multi-node configurations. As many events have been virtual, it has been easy to demo everything from my desktop – everything is set up so I can demo things very easily. While doing it all on one machine does show how compact and efficient Fluentd is (I can run multiple instances concurrently), it does somewhat undermine the demonstration of the distributed capabilities.
Given that I now work for Oracle, it makes sense to use OCI resources. So I have been developing scripts to configure Ubuntu VMs for the demo environments – installing Ruby, Fluentd, and the various gems needed, and pulling in the relevant configurations. All the assets can be found in the GitHub repository https://github.com/mp3monster/logging-demos. The repository readme includes plenty of information as well.
While I’ve been putting this together using OCI, the fact that everything is based on Ubuntu should mean it can also be run locally on VMs or WSL2, and adapted for macOS. The environment has been configured so you can still run everything on a single Ubuntu node if desired.
Additional Log Destinations
As the demo will typically be run on OCI, we can not only run it with a multi-node setup, we have also extended the setup with several inclusion files so we can utilize the OCI OpenSearch and OCI Log Analytics services. If you don’t want to use these services, simply replace the contents of the relevant inclusion files with the contents of the dummy_inclusion.conf file provided.
Representation of the Demo setup
The configuration works by each destination having one or two inclusion files. The files with the postfix label-inclusion.conf contain the configuration that directs traffic to the respective service, pushing log events to that destination at a high frequency. The second inclusion file injects the duplication of log events to each service. The inclusion declarations in the main node’s Fluentd config file reference an environment variable that provides the path to the inclusion file to use. As a result, by changing the environment variable to point to a dummy file, it becomes possible to configure out the use of one of the services. The two inclusions mean we can keep the store declarations compact and show multiple labels being used. With the OpenSearch setup, we have a variant of the inclusion file model, where the route inclusion can reference the logic that we would use in the label directly within the store declaration.
The best way to see the use of the inclusions is to experiment with setting the different environment variables to reference the different files and then using the Fluentd dry-run feature (more on this in the book).
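To make that concrete, here is a minimal sketch (in Python; the environment variable names and config file name are illustrative rather than the exact ones used in the repository) of flipping one destination to the dummy inclusion and validating the result with Fluentd’s dry-run flag:

```python
import os
import subprocess

# Illustrative environment variable and file names - check the repository's
# README for the actual names used by the demo configuration.
inclusions = {
    "OPENSEARCH_INCLUSION": "./opensearch-label-inclusion.conf",  # real destination
    "LOGANALYTICS_INCLUSION": "./dummy_inclusion.conf",           # configured out
}
env = {**os.environ, **inclusions}

# Fluentd's --dry-run flag parses the config (and the resolved inclusions)
# without starting the workers, so configuration mistakes surface immediately.
subprocess.run(["fluentd", "--dry-run", "-c", "master-node.conf"], env=env, check=True)
```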
Setup script
The setup script performs a number of tasks, including the following (a rough sketch of some of these steps follows the list):
Pulling from Git all the resources needed in terms of configuration files and folders
Retrieving the necessary plugins in case they are needed.
Setting up the various environment variables for:
Slack token
environment variables to reference inclusion files
shortcut environment variables and aliases
network (IP) address for external services such as OpenSearch
Setting up a folder for the OCI tokens needed.
Setting up temp folders to be used by OCI Plugins as a file-based cache.
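The real work is done by the shell scripts in the repository, but as a rough illustration of the environment preparation steps above, here is a Python sketch (all names, addresses and paths are placeholders):

```python
import os
from pathlib import Path

# Illustrative values only - the real scripts in the logging-demos repository
# drive these from OCI details and user-supplied values.
settings = {
    "SLACK_TOKEN": "xoxb-replace-me",
    "OPENSEARCH_INCLUSION": str(Path("./opensearch-label-inclusion.conf").resolve()),
    "LOGANALYTICS_INCLUSION": str(Path("./dummy_inclusion.conf").resolve()),
    "OPENSEARCH_HOST": "10.0.0.10",  # network (IP) address of the external service
}

# Folders for the OCI tokens and the file-based cache used by the OCI plugins.
for folder in ("~/.oci", "/tmp/oci-la-buffer", "/tmp/oci-os-buffer"):
    Path(folder).expanduser().mkdir(parents=True, exist_ok=True)

# Append the exports to .bashrc so the shortcut variables survive a new session.
with open(Path("~/.bashrc").expanduser(), "a") as rc:
    for name, value in settings.items():
        rc.write(f'export {name}="{value}"\n')
```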
Feeding the Log Analytics service is a more complex process to set up, as the feeds need metadata about the events being ingested. The downside is that the configuration effort is greater, but the payback is that it becomes easier to extract meaningful information quickly, because the service has a greater understanding of the content. For example, attributing the logs to a type of source means the predefined or default log formats are immediately understood, and maximum meaning can be extracted from the log event.
Going directly to OCI Log Analytics does cut out the need for the Service Connector Hub, which would allow rules and routing to be defined to different OCI services – functionality that can help with, for example, directing log events to PagerDuty.
As a consultant working with clients, we always need to address security considerations for clients, their networks and data. Typically this might mean ensuring I can connect to the correct network through a VPN with the secure client software installed, and then working through a Citrix setup for the tools we’re allowed to use.
Since the start of the pandemic, there seems to have been a marked shift towards issuing consultants with customer-provided laptops that have been configured and locked down. This means I can’t use the client laptop to connect to my employer’s network and interact with our own systems – which would make it easy to leverage our existing resources to support the customer – and conversely there is no trust or contractual position that might allow our company devices to connect to a VPN or ring-fenced part of the client network.
Interestingly there seems to have been a drift away from the idea of BYOD (Bring Your Own Device), which may come from the fact that, outside of smaller, very tech-savvy organizations, BYOD can be seen as challenging to support.
As this Google Trends report shows, interest over the last five years has been generally downward until the last couple of months. Not authoritative proof, but it hints that BYOD hasn’t accelerated as you might expect given remote working.
By supplying a laptop, the customer makes an effort to control intrusion and other security risks. But the problem is, I now have a device that I could easily take offline and work to defeat the security setup with the client none the wiser – or worse, it is another laptop that could ‘get lost’ or be ‘stolen’, with a greater chance of holding sensitive material. Every new device is without a doubt an elevated risk for the client and a cost to support (this, of course, is also an argument against BYOD).
Today was the first run of some new presentation material looking at the use of GitHub Actions with Runners deployed on the OCI Free Tier. The presentation was physical rather than virtual, which, after 2 years of virtual presenting, was rather refreshing. Not to mention that the UKOUG hosted the event at the Oval cricket ground, which made for an interesting venue. The example configuration is included in our GitHub OCI Utilities repository (as we use this solution to help validate and test our development work).
The presentation itself (which includes screenshots of the setup of a simple Action and runner) is here. Note that I have disconnected my runners, so while you will be able to see the Action configuration, nothing will happen if you try to trigger activity through my repository.
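If you recreate the setup against your own repository, one quick way to confirm your runner has registered and is online is GitHub’s REST API for self-hosted runners. A small sketch – the owner/repo values are placeholders, and the token needs administrative access to the repository:

```python
import os
import requests

# Placeholders - substitute your own repository and a suitably privileged token.
OWNER, REPO = "your-org", "your-repo"
headers = {
    "Authorization": f"token {os.environ['GITHUB_TOKEN']}",
    "Accept": "application/vnd.github+json",
}

# Lists the self-hosted runners registered against the repository.
resp = requests.get(
    f"https://api.github.com/repos/{OWNER}/{REPO}/actions/runners",
    headers=headers,
    timeout=30,
)
resp.raise_for_status()
for runner in resp.json()["runners"]:
    print(runner["name"], runner["status"])  # e.g. "oci-runner-1 online"
```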
While there is a deserved amount of publicity around the introduction of Arm compute on OCI with the Ampere CPU offering, and the amazing level of Always Free compute being provided (24GB of memory and 4 cores, which can be used in any combination of servers), there have been some interesting announcements that perhaps haven’t drawn as much attention as they deserve. These include OCI support for GitHub Actions, plus several new DevOps services and an Artifact Registry. We’ll come back to the new services in another post. Today, let’s look at GitHub Actions.
Oracle’s data integration product landscape outside of GoldenGate has, since the arrival of Oracle Cloud, been confusing at times. This has meant finding the right product documentation can be challenging, and knowing which product belongs in your own technology road-map harder to work out. I believe the landscape is starting to settle now. But to understand the position, let’s look at the causes of the disturbance and the changes that have occurred.
Why the complexity?
This has come, I think, from a couple of key factors. The first is the organizational change triggered by Thomas Kurian’s departure, which resulted in the product organization going from essentially three parts aligning roughly to Infrastructure, Platform and Applications, to two: Infrastructure and Apps. Add to this that Oracle’s cloud has gone through two revolutions. Generation 1, now called Classic, was essentially a recognition that they needed an answer to Microsoft, Google and AWS quickly (Oracle is now migrating customers off Classic). Then came Generation 2, a more considered strategy that leverages not just the lowest level of the virtualization stack (network and compute), but drives changes all the way through the internals of applications by having them use common technologies such as microservices, along with a raft of software services such as monitoring, logging, metering, events, notifications, FaaS and so on. Essentially all the services they offer are also integral to their own offerings. The nice thing about Gen2 is you can see a strong alignment to the CNCF (Cloud Native Computing Foundation) along with other open public standards, formal or de facto, such as MicroProfile with Helidon, and Apache. As a result, despite the perceptions of Oracle, modern apps built this way stand a better chance of portability.
Impact on ODI
Oracle’s data integration capabilities, cloud or otherwise, have been best known as Oracle Data Integrator, or ODI. The original ODI was the data equivalent of SOA Suite, implementing Extract Load Transform (ELT) rather than ETL so that the Oracle DB was fully leveraged. It was built on WebLogic Server.
Along Came Cloud
When Oracle Cloud came along, there was a natural need for ODI capabilities. Like SOA Suite, the first evolution was to provide ODI Cloud Service, just as SOA Suite had SOA Cloud Service. Both are essentially the same on-premises product with UIs to manage deployment and configuration.
ODI’s transformation for the cloud led to ODI CS evolving into DIPC (Data Integration Platform Cloud). Very much an evolution, but with a more web-centered experience for designing the integrations. However, DIPC is no longer available (except possibly to customers already using it).
Whilst DIPC was evolving, the requirement to continue with on-premises ODI capabilities remained. Whilst we don’t know for sure, we can speculate that divergent development was creating overhead as long as on-prem ODI remained. We then saw the arrival of ODI Marketplace, which provides an easier transition, including taking licensing considerations into account. DIPC has been superseded by ODI Marketplace.
Marketplace
Oracle has developed a Marketplace, just like the other major players, so that third-party vendors can offer their technologies on the Oracle cloud, just as you can with Azure and AWS. But Oracle has also exploited it to offer, in the cloud, their traditional products normally associated with on-premises deployments. As a result we saw ODI Marketplace – a smart move, as it offers the possibility of carrying on-prem licensing into the cloud along with portability.
So far, ODI in its different forms has continued to leverage its WebLogic foundations. But by this time the Gen2 Oracle Cloud and the organizational structures behind it had become well established and were working up the value stack. Products in the middleware space have been impacted by both the technology strategies and the organization. As a result, the API products, for example, have been aligned to the OCI-native space, while Integration Cloud has been moved towards the Apps space. In many respects this reflects a low-code vs code-native model.
OCI ODI
Earlier this year (2020) Oracle launched a brand new ODI product, to use its full name Oracle Cloud Infrastructure Data Integration. This is an OCI-native (i.e. Gen2) solution leveraging microservices technologies.
This new product appears to be very much a ground-up build, as it exploits Apache Spark and Functions as a Service (FaaS) as foundational elements. Being a ground-up build, it doesn’t inherit all the adapters the original ODI can offer. This means that, as a solution today, it is very focused on specific needs around supporting data movement between the various Oracle Cloud storage and Database as a Service solutions, rather than general ingestion and extraction processes.
Conclusion
Products are evolving, but the direction of travel appears to be resolving. We are still in a period where there are capability gaps between the Gen2-native solution and the traditional ODI via Marketplace. As a result, the question becomes less which product to use and more, if I have to invest in ODI Marketplace now, when and how do I migrate once the native product catches up.
A few weeks ago Oracle announced a new tool for all Oracle Cloud users, including the Always Free tier. Cloud Shell provides a Linux (Oracle Linux 7.7) environment to use freely (within your tenancy’s monthly limits) – no paying for a VM, no using up your limited set of VMs (for free-tier users), or anything like that.
As you can see, the Shell can be started using a new icon at the top right (highlighted). When you open the shell for the first time, it takes a few moments to instantiate, and you’ll see the message at the top of the console window (also highlighted). The window provides a number of controls that allow you to expand to full screen and back again, etc.
The shell comes preconfigured with a number of tools, such as Terraform with the Oracle extensions, the OCI CLI, Java and Git, so linking to Developer Cloud or GitHub, for example, to manage your scripts is easy (as long as you know your Git CLI – cheat sheet here). The info for these can be seen in the following screenshots.
In addition to the capabilities illustrated, the Shell is set up with:
Python (2 and 3)
SQL*Plus
kubectl
helm
maven
Gradle
The benefit of all of this is that you can work from pretty much any device you like. It removes the need to manage and refresh security tokens locally to run scripts.
A few things to keep in mind whilst trying to use the Shell:
Access is controlled through IAM, so you can of course grant or block the use of the tool. Even with access to the shell, users will obviously still need access to the other services to use the shell effectively.
The capacity of the home folder is limited to 5GB – more than enough for executing scripts and a few CLI-based tools and plugins, but that will be all.
If the shell goes unused for 6 months, the tenancy admin will be warned; if it remains unused, the storage will be released. You can, of course, reactivate the Shell at a future date, but it will be a blank canvas again.
For security reasons, access to the shell over SSH is blocked.
The shell makes for a great environment from which to manage and perform infrastructure development, and will be a dream for hard-core Linux users. For those who prefer the comfort of a visual IDE, there are ways around it (e.g. edit in GitHub and sync), but power users will be more than happy with vim or vi.
There are circumstances in which notifications from the Oracle API Platform CS could be desirable. For example, if you wish to ensure that developers are defining good APIs and not accidentally implementing APIs that hit the OWASP Top 10 for APIs, you will probably configure things such that developer users can design the APIs and configure the policies, but can only request that an API be deployed.
However, notifications through mechanisms such as email or collaboration platforms such as Slack aren’t presently available. But implementing a solution isn’t difficult. For the rest of this blog we’ll explore how this might be implemented, complete with a Slack implementation.
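As a flavour of what the Slack side involves, here is a minimal Python sketch. The Slack incoming-webhook call is standard; the API Platform management endpoint, payload fields and state value shown are purely illustrative placeholders, not the documented API:

```python
import os
import requests

# Posting to a Slack incoming webhook - the URL comes from your Slack app configuration.
SLACK_WEBHOOK = os.environ["SLACK_WEBHOOK_URL"]

def notify(message: str) -> None:
    """Send a simple text notification to the configured Slack channel."""
    requests.post(SLACK_WEBHOOK, json={"text": message}, timeout=10).raise_for_status()

# Hypothetical check against the API Platform management API - the endpoint path,
# field names and state value here are illustrative only.
mgmt = "https://your-apipcs-instance/apiplatform/management/v1"
resp = requests.get(f"{mgmt}/apis",
                    auth=(os.environ["APIP_USER"], os.environ["APIP_PASS"]),
                    timeout=30)
for api in resp.json().get("items", []):
    if api.get("state") == "deploymentRequested":   # illustrative state value
        notify(f"API '{api['name']}' has a pending deployment request")
```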
Over the last couple of years, we have seen growing references to multi-cloud. That is to say, people are recognizing that organisations, particularly larger ones, are ending up with cloud services from many different vendors. This has at least in part come from departments within an organization being able to buy meaningful resources within their local budgets.
There is a competitive benefit to the recent partnership agreement between Microsoft and Oracle, given the market share AWS has in comparison to everyone else. But irrespective of the positioning against AWS, this agreement has arisen because of the adoption of multi-cloud. It also solves the problem of making highly resilient Oracle database setups using RAC, Data Guard etc. available to Azure without risk to security or to the all-important network performance that is essential to DB operation. Likewise, Oracle’s SaaS offerings are sector leaders, if not best in class – something Microsoft can’t compete with. At the other end, regardless of Oracle’s offerings, organisations will often prefer the Microsoft development ecosystem because of its alignment to office tooling and the ease of building solutions quickly.
Multi-cloud, even with agreements like the Microsoft and Oracle one (see here), doesn’t mean there won’t be higher costs in crossing clouds. Let’s see where the costs reside …
Data egress (and in some cases ingress as well) from clouds costs money. Ingress costs have largely been eliminated, because they can be seen as a barrier to selling services, particularly around big data; data egress, however, can be an issue. Oracle has made this cost low enough to be almost negligible, but others have not necessarily done the same, as the following comparison shows …
Establishing the high-performance connections between Azure and Oracle Cloud that the agreement supports (the same technology used for cloud-to-ground connectivity) does incur a cost. In Oracle’s case, there is a fee for the connection (not a large cost, but one that exists nonetheless) plus any traffic fees from the provider of the network connection spanning the data centre locations – you are, after all, leasing capacity on someone’s dedicated fibre or MPLS services. This should prove to be small, as part of what enables this offering is that the Oracle and Microsoft cloud data centres are often physically provided by the same provider, or at least the centres are physically close, a result of both companies gravitating to locations offering optimal highly available infrastructure (power, telecommunications), favourable legal and commercial factors, and the specialist skills needed.
If data egress is the key cost challenge, what drives the data egress beyond the obvious content for user interfaces? …
Obviously, you have the business data flows; some of these will be understood by the business community, but not all. This comes down to the way data from one cloud is exposed to another. For example, inefficient services with APIs that require frequent polling and don’t express the request efficiently – rather than using HTTP header attributes and other efficiencies, or even utilizing frameworks such as webhooks so data can be pushed.
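To illustrate the polling point, a conditional GET means a poll that finds nothing new only transfers headers rather than the full payload. A minimal sketch (the endpoint URL is a placeholder):

```python
import requests

url = "https://api.example.com/orders"  # placeholder endpoint
etag = None

def poll():
    """Fetch the resource only if it has changed since the last poll (saving egress)."""
    global etag
    headers = {"If-None-Match": etag} if etag else {}
    resp = requests.get(url, headers=headers, timeout=30)
    if resp.status_code == 304:          # unchanged - only headers crossed the wire
        return None
    etag = resp.headers.get("ETag")      # remember the validator for the next poll
    return resp.json()
```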
The need for high-speed data access often drives data replication – databases in multiple clouds with mirror-image data in each location, even if the majority of the data isn’t needed there. The same can happen with technologies such as Kafka, where, for non-compacted topics, every event may be replicated even if it has only a short useful lifetime.
One of the hidden costs is the operational task of gathering logs into a combined view so end-to-end insights can be obtained. Detailed logs can actually yield more ‘data’ by volume than the business flows themselves, because they are semi-structured, intended to be very readable, and, at the most granular level, there to help debug and test.
In addition to the data flows, you need to consider how other linkages beyond the Oracle-Azure connection are involved. Per the detailed documentation, it is not possible to connect your on-premises location to one of the clouds (e.g. with Oracle FastConnect) and then assume your traffic can hop to Azure via the bridge. To get performance to your solution parts in both Azure and Oracle Cloud, you still need both FastConnect and ExpressRoute configured to your on-premises location. This, of course, may affect how bulk data for lift-and-shift app use cases such as EBS is handled. For example, if you regularly bulk-transfer data between on-premises and EBS via the app/middleware tier rather than directly via the DB, and that mid-tier is running in Azure, you will need both routes established.
Conclusion
There is no doubt that the Oracle-Azure cloud-to-cloud linkage is a step forward, but ‘the devil is in the detail’, as the saying goes. To optimize the benefits and savings, we’d suggest that you:
think through your use cases – understand data flow and volume (is someone bulk-syncing application data with a data warehouse?),
define a cloud data strategy – to lay out principles and approaches and identify compliance needs; this is particularly helpful for custom solution development, so that the right level of log data is consolidated with the important details, and data retention addresses compliance requirements without ratcheting up unnecessary costs (there is a tendency to hoard data just in case – if this is really wanted, think about how it’s stored),
based on common business usage models, define a simple forecasting formula – being able to quantify data costs will always make it easier to challenge data-hoarding tendencies (a small sketch follows this list),
confirm the inter-cloud network vendor charges when working with multi-cloud.
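For that forecasting formula, something very simple is usually enough to make the conversation concrete. A sketch, with placeholder free-tier allowances and per-GB prices that you should replace with your providers’ current price lists:

```python
def monthly_egress_cost(gb_out: float, free_gb: float, price_per_gb: float) -> float:
    """Estimate a monthly egress charge for one provider: billable volume x unit price."""
    return max(gb_out - free_gb, 0) * price_per_gb

# Placeholder figures only - substitute the current price list for each provider.
flows_gb = 2_000          # forecast monthly data volume leaving the cloud
print(monthly_egress_cost(flows_gb, free_gb=10_000, price_per_gb=0.0085))  # provider A
print(monthly_egress_cost(flows_gb, free_gb=100, price_per_gb=0.09))       # provider B
```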
Another spring means another excellent Oracle EMEA PaaS Forum for Oracle partners. Every year Juergen Kress organizes the event, finding really nice venues to host several hundred people over four and a half days.
The event is split into several parts. Monday afternoon normally involves Oracle ACEs presenting on best practices, insights on applying the various technologies, and so on. For me, this meant presenting on the London Developer Meetup, looking at how it worked, what has been successful, and what hasn’t. Those who have read my blogs on the subject (here) will know about our drone initiative.
Then Tuesday is a single-stream day where Juergen has managed to pull in SVPs and senior product managers from around the globe to provide a high-level view of what has been going on with their products. For anyone consulting in the Oracle domain, this is incredibly useful. For example, there is a clear strategy coalescing around AI and machine learning, both as a service proposition to users and in how these technologies are being made available and used within other products. Other areas such as OIC and SOA CS have stability and maturity, and the roadmap is about maximising connectivity with the newer products.
But before the sessions start, Juergen opens with some remarks and demos something engaging. In previous years this has been things like digital assistants/chatbots and so on. This year, we were fortunate to be an active contributor, demoing the drone being controlled through APIs and talking about the ideas behind it. The dry runs of the demo on Monday went without a problem, but when it came to the main show, the drone was a little uncooperative – we think because the air-con had really kicked in. But importantly, even without achieving the desired result, the message about engagement got home.
Wednesday is split into streams with in-depth sessions from the different product managers. The amount of insight gained from these sessions is tremendous, some of it very much protected by safe harbour statements or not for public disclosure, such is the honesty and openness of the discussions. The day closes with an ACE Director initiative, which demonstrates the application of Oracle Cloud products to a plausible use case, and which Luis Weir (Capgemini Oracle CTO) is part of. This session has become something of a tradition now.
The day’s business concludes with awards, and for a second year the UK Capgemini team has taken home two awards, for APIs and for PaaS contribution.
Luis Weir with his API award
The final two days offer a choice of a hackathon or half-day training sessions on different products with the relevant product managers – an excellent opportunity to pick the presenters’ brains as well as get hands-on experience with the different products.
The week isn’t without its social and networking activities, of course …