While there is a deserved amount of publicity around the introduction of ARM compute onto OCI with the ARM Ampere CPU offering, and the amazing level of always free compute being provided (24GB of memory and 4 cores which can be used in any combination of servers). There have been some interesting announcements that perhaps haven’t drawn as much attention that they deserve. This includes OCI support for GitHub Actions, plus several new DevOps services and an Artifact Registry. We’ll comeback to the new services in another post. Today, let’s look at GitHub Actions.
Oracle’s data integration product landscape outside of GoldenGate has with the arrival of Oracle Cloud been confusing at times. This has meant finding the right product documentation can be challenging, and knowing which product to use in your own technology road-map can be harder to formulate. I believe the landscape is starting to settle now. But to understand the position, let’s look at the causes of disturbance and the changes that have occurred.
Why the complexity?
This has come from I think a couple of key factors. The organizational challenges triggered by Thomas Kurian’s departure which has resulted in rather than the product organization being essentially in three parts aligning roughly to Infrastructure, Platform and Applications to being two Infrastructure and Apps. Add to this Oracle’s cloud has gone through two revolutions. Generation 1 now called Classic was essentially a recognition that they needed an answer to Microsoft, Google and AWS quickly (Oracle are now migrating customers off classic). Then came Generation 2, which is a more considered strategy which is leveraging not just the lowest level stack of virtualization (network and compute), but driving changes all the way through the internals of applications by having them leverage common technologies such as microservices along with a raft of software services such as monitoring, logging, metering, events, notifications, FaaS and so on. Essentially all the services they offer are also integral to their own offerings. The nice thing about Gen2 is you can see a strong alignment to CNCF (Cloud Native Compute Foundation) along with other open public standards (formal or de-facto such as Microprofile with Helidon and Apache). As a result despite the perceptions of Oracle, modern apps are standard a better chance of portability.
Impact on ODI
Oracle’s Data Integration capabilities, cloud or otherwise have been best known as Oracle Data Integrator or ODI. The original ODI was the data equivelent of SOA Suite implementing Extract Load Transform (ELT) rather than ETL as it meant the Oracle DB was fully leveraged. This was built on the WebLogic server.
Along Came Cloud
Oracle cloud came along, and there is a natural need for ODI capabilities. Like SOA Suite, the first evolution was to provide ODI Cloud Service just like SOA Suite had SOA Cloud Service. Both are essentially the same on-premises product with UIs to manage deployment and configuration.
ODI’s cloud transformation for the cloud lead ODI CS evolving into DIPC (Data Integration Platform Cloud). Very much an evolution, but with a more web centered experience for designing the integrations. However, DIPC is no longer available (except possible to customers already using it).
Whilst DIPC had been evolving the requirement to continue with on-premises ODI capabilities is needed. Whilst we don’t know for sure, we can speculate that there was divergent development happening creating overhead as ODI as an on-prem solution remained. We then saw the arrival of ODI Marketplace, this provides an easier transition, including taking into account licensing considerations. DIPC has been superseded by ODI Marketplace.
Oracle has developed a Marketplace just like the other major players so that 3rd party vendors could offer their technologies on the Oracle cloud, just as you can with Azure and AWS. But Oracle have also exploited it to offer their traditional products normally associated with on-premise deployments in the cloud. As a result we saw ODI Marketplace. A smart move as it offers the possibility of exploiting on-prem licensing into the cloud along with portability.
So far the ODI capabilities in its different forms continued to leverage its WebLogic foundations. But by this time the Gen2 Oracle Cloud and the organizational structures behind it has been well established, and working up the value stack. Those products in the middleware space have been impacted by both the technology strategies and organization. As a result API for example have been aligned to the OCI native space, but Integration Cloud has been moved towards the Apps space. in many respects this reflects a low code vs code native model.
This new product appears to be a very much ground up build as it exploits Apache Spark and Function as a Service (FaaS) as foundational elements. As a ground up build, it doesn’t inherit all the adapters the original ODI can offer. This does mean as a solution today it is very focused on some specific needs around supporting the data movement between the various Oracle Cloud storage and Database as a Service solutions rather than general ingestion and extraction processes.
Products are evolving, but the direction of travel appears to be resolving. But we are still in that period where there are capability gaps between the Gen2 native solution and the traditional ODI via Marketplace solution. As a result the question becomes less which product, but when and if I have to invest in using ODI Marketplace how to migrate when the native product catches up.
A few weeks ago Oracle announced a new tool for all Oracle cloud users including the Always Free tier. Cloud Shell provides a Linux (Oracle v7.7) environment to freely use ( (within your tenancy’s monthly limits) – no paying for VM or using your limited set of VMs (for free-tier users) or anything like that.
As you can see the Shell can be started using a new icon at the top right (highlighted). When you open the shell for the 1st time, it takes a few moments to instantiate – and you’ll see the message at the top of the console window (also highlighted). The window provides a number of controls which allows you to expand to full screen and back again etc.
The shell comes preconfigured with a number of tools, such as Terraform with the Oracle extensions, OCI CLI, Java and Git, so linking to Developer Cloud or GitHub for example to manage your scripts etc is easy (as long as you know you GIT CLI – cheat sheet here). The info for these can be seen in the following screenshots.
In addition to the capabilities illustrated, the Shell is set up with:
Python (2 and 3)
The benefit of all of this is that you can work from pretty much any device you like. It removes the need to manage and refresh security tokens locally to run scripts.
A few things to keep in mind whilst trying to use the Shell:
It is access controlled through IAM, so you can of course grant or block the use of the tool. Even with access to the shell, users will need to obviously have to have access to the other services to use the shell effectively.
The capacity of the home folder is limited to 5GB – more than enough for executing scripts and a few CLI based tools and plugins – but that will be all.
If the shell goes unused for 6 months then the tenancy admin will be warned, but if not used, then the storage will be released. You can, of course, re-activate the Shell features at a future date, but of course, it will be a blank canvas again.
For reasons of security access to the shell using SSH is blocked.
The shell makes for a great environment to manage and perform infrastructure development from and will be a dream for Linux hard code users. For those who like to be lazy with a visual IDE, there are ways around it (e.g. edit in GitHub) and sync. But power users will be more than happy with vim or vi.
There are circumstances in which notifications from the Oracle API Platform CS could be seen as desirable. For example, if you wish to ensure that the developers are defining good APIs and not accidentally implementing APIs that hit the OWASP Top 10 for APIs. Then you will probably configure things such that developer users can design the APIs, configure the policies, but only request an API to be deployed.
However, presently notifications through mechanisms such as email or via collaboration platforms such as Slack aren’t available. But implementing a solution isn’t difficult. For the rest of this blog we’ll explore how this might be implemented, complete with a Slack implementation.
Over the last couple of years, we have seen growing references to multi-cloud. That is to say, people are recognizing that organisations, particularly larger ones are ending up with cloud services for many different vendors. This at least in part has come from where departments within an organization can buy meaningful resources within their local budgets.
Whilst there is a competitive benefit of the recent partnership agreement between Microsoft and Oracle given the market margin AWS has in comparison to everyone else. Irrespective of the positioning with AWS, this agreement has arisen because of the adoption of multi-cloud. It also provides a solution to the problem of running highly resilient Oracle database setups using RAC, DataGuard etc can be made available to Azurewithout risk to security and the all-important network performances that are essentially to DB operation. Likewise, Oracle’s SaaS offerings are sector leaders if not best in place, something Microsoft can’t compete with. But at the other end, regardless of Oracle’s offerings, often organisations will prefer Microsoft development ecosystem because of the alignment to office tooling, the ease of building solutions quickly.
Multi-cloud even with the agreements like the Microsoft and Oracle one (See here), doesn’t mean there won’t be higher costs in crossing clouds. Let’s see where the costs reside …
Data egress (and in some cases ingress as well) from clouds costs. Whilst the ingress costs have been eliminated because it can be seen as a barrier to selling services, particularly big data. Data egress can, however, be an issue. Oracle has made this cost very low to be almost negligible, but not necessarily others as the following comparison shows …
Establishing the high-performance connections That the agreement supports needed between Azure and Oracle cloud is the same tech for the cloud to the ground do incur a cost. In Oracle’s case, there is a fee for the connection (not a large cost, but one that exists none the less) plus any traffic fees the provider of the network connection spanning the data centre locations. This is because you’re leasing capacity on someone’s dedicated fibre or MPLS services. Now, this should prove to be small as part of the enabler of this offering is that both Oracle and Microsoft cloud DCs are often actually physically provided by the same provider or at-least the centres a physically pretty close, as a result of both companies gravitating to locations close together because of the optimal highly available infrastructure (power, telecommunications) legal and commercial factors along with the specialist skills needed.
If data egress is the key challenge to costs, what drives the data egress beyond the obvious content for user interfaces? …
Obviously, you have the business data flows, some of these flows will be understood by the business community. But not all, this is down to the way data from the cloud can be exposed to another. For example inefficient services with APIs that require frequent polling and not using expressing the request efficiently, rather than perhaps express the request using HTTP header attributes and other efficiencies or even utilize frameworks such as webhooks so data can be pushed.
High-speed data access, often drives data replication having databases in multiple clouds with mirror image data in each location even if the majority of the data is not necessarily needed. This can happen with technologies such as Kafka which for non compacted topics will mean every event can be replicated even if that event has a short lifetime.
One of the hidden costs is the operational tasks of gathering logs to a combined view so end to end insights can be obtained. A detailed log can actually yield more ‘data’ by volume than the business flows themselves because it is semi-structured, and intended to be very readable and at the most granular level there to help debug and test.
In addition to the data flows, you need to consider how other linkages in addition to the Oracle-Azure connection are involved. In the detailed documentation, it is not possible to get your on-premises location connected to one of the clouds (e.g. Oracle FastConnect, and then assume your traffic can hop to Azure via the bridge using FastConnect and Azure’s ExpressRoute. To add performance to your solution parts in both Azure and Oracle Cloud, you still need to have FastConnect and ExpressRoute configured to your on-premises location. This, of course, may impact how bulk data for lift and shift app use cases such as EBS may be applied. For example, if you choose to regularly bulk data transfer between on-premise and EBS via the app/middleware tier rather than via direct DB, and that mid-tier is running in Azure – you will need both routes established.
There is no doubt that the Oracle-Azure cloud to cloud linkage is a step forward, but ‘the devil is in the details‘ as the saying goes. To optimize the benefits and savings we’d suggest that you;
you’ll need to think through your use cases – understand data flow and volume (someone bulk syncing application data with a data warehouse?),
define a cloud data strategy – to layout principles, approaches and identify compliance needs, this is particularly helpful for custom solution development, so the right level of log data is consolidated with the important details, data retention addresses compliance requirements and doesn’t ratchet up unnecessary costs (there is a tendency to horde data just in case – if this is really wanted, think about how its stored),
based on business common usage models define a simple forecasting formula – being able to quantify data costs will always make it easier to challenge back data hoarding tendency,
confirm the inter-cloud network vendor charges when working with multi-cloud.
Another Spring means another excellent Oracle EMEA PaaS Forum for Oracle partners. Every Year Juergen Kress organizes the event, finding really nice venues to host several hundred people over four and half days.
The event is split into several parts, Monday afternoon normally involves Oracle Ace’s presenting on best practices, insights on applying the various technologies etc. For me this meant presenting on the London Developer Meetup, looking at how it worked, what has been successful, and what hasn’t. For those know have read my blogs on the subject (here) will know about our Drone initiative.
Then Tuesday is a single stream day where Juergen has managed to pull in SVPs and Senior Product Managers from around the globe to provide a high-level view of what has been going on with their products. For anyone consulting in the Oracle domain, this is incredibly useful. For example, there is a clear strategy coalescing around AI and Machine Learning both as a service proposition to users, but also how these technologies are being made available and used within other products. Other areas such as OIC and SOA CS have stability and maturity, and the road map is about maximising connectivity with the newer products.
But before the sessions start, Juergen starts with opening remarks, and demos’ something engaging. In previous years this has been things like Digital Assistants/Chatbots and so on. This year, we have been fortunate to be an active contributor by demoing the drone through the use of APIs and talking about the ideas. The dry runs of the demo on Monday went without a problem, but when it came to the main show, the drone was a little uncooperative – we think because the air-con had really kicked in. But importantly, even not achieving the desired result, the message of engagement made it home.
Wednesday is split into streams with in-depth sessions from the different Product Managers, he amount of insight gained from these sessions is tremendous, some of which is very much protected by safe harbour statements or not for public disclosure such is the honest and open discussions. The day closes with an Ace Director initiative which demonstrates the application of Oracle Cloud products to a plausible use case, and Luis Weir (Capgemini Oracle CTO) is part of. This session has become something of a tradition now.
The day’s business concludes awards, and for a second year the UK Capgemini team have taken home two awards for APIs and PaaS Contribution.
The final two days are then a choice of Hackerthon or 1/2 day training sessions on different products with the relevant Product Managers, and an excellent opportunity to pick the brains of the presenters as well as get hands-on experience with the different products.
The week isn’t without it’s social and networking activities of course …
The Oracle API Platform takes a different licensing model to many platforms, rather than on CPU it works by the use of Logical Gateways and blocks of 25 million successful API calls per month. This means you can have as many actual gateway nodes as you like within a logical group to ensure resilience as you like, essentially how widely you deploy the gateways is more of a maintenance consideration (i.e. more nodes means more gateways to take through a maintenance process from the OS through to the gateway itself).
In our book (here) we described the use of logical gateways (groups of gateway nodes operating together) based on the classic development model, which provides a solid foundation and can leverage the gateway based routing policy very effectively.
But, things get a little trickier if you move into the cloud and elect to distribute the back end services geographically rather than perhaps have a single global instance for the back-end implementation and leverage technologies such as Content Delivery Networks to cache data at the cloud edge and their rapid routing capabilities to offset performance factors.
Classic Global split of geographies
Some of the typical reasons for geographically distributing solutions are …
The low hit rate on data meaning caching solutions like CDNs are unlikely to yield performance benefits wanted and considerable additional work is needed to ‘warm’ the cache,
Different regions require different back end implementations ordering of products in one part of the world may be fulfilled using a partner, but in another, it is directly satisfied,
Data is subject to residency/sovereignty rules – consider China for example. But Germany and India also have special considerations as well.
So our Global splits start to look like:
Global Split now adding extra divisions for India, China, Russia etc
The challenge that comes, is that the regional routing which may be resolved on the Internet side of things through Geo Routing such as the facilities provided by AWS Route53 and Oracle’s Dyn DNS as a result finding nearest local gateway. However Geo DNS may not be achievable internally (certainly not for AWS), as a result, routing to the nearest local back-end needs to be handled by the gateway. Gateway based routing can solve the problem based on logical gateways – so if we logically group gateways regionally then that works. But, this then conflicts with the use of gateway based routing for separation of Development, Test etc.
So, what are the options? Here are a few …
Make you Logical divisions both by the environment and by region – this is fine if you’re processing very high volumes i.e. hundreds of millions or more so the cost of additional Logical gateways is relatively small it the total budget.
Taking the geo split and applying the traditional layers as well has increased the number of Logical gateways
This problem can be further exacerbated, if you consider many larger organisations are likely to end up with different cloud vendors in the same part of the world, for example, AWS and Azure, or Oracle and Google. So continuing the segmentation can become an expensive challenge as the following view helps show:
It is possible to contract things slightly by only have development and test cloud services where ever your core development centre is based. Note that in the previous and next diagrams we’ve removed the region/country-specific gateway drivers.
Don’t segment based on environment, but only on the region – but then how do you control changes in the API configuration so they don’t propagate immediately into production?
Keep the existing model but clone APIs for each region – certainly the tooling we’ve shared (Managing API Policy Versioning in Oracle API Platform) makes this possible, but it’s pretty inelegant and error-prone as it be easy to forget to clone a change, and the cloning logic needs to be extended to take into account the bits that must be region-specific.
Assuming you have a DNS address for the target, you could effectively rewrite the resolution of the address by changing its meaning in each gateway node’s host file. Inelegant, but effective if you have automated deployment and configuration of your gateway servers.
Header based routing with the region and environment as header attributes. This does require either the client to set the values (not good as you’re revealing to your API consumer traits of the implementation), or you apply custom policies before the header-based routing that insert those attributes based on the gateway’s location etc.
Build a new type of gateway based routing which allows both the environment (dev, test etc) and location (region) to inform the routing,
Or, and the point of this blog, use gateway based routing and leverage some intelligent DNS naming and how the API Platform works with a little bit of Groovy or a custom Java policy.
The Oracle API Platform provides the means to examine statistics and slice and dice the numbers by application, gateway, duration and so on resulting in visually appealing graphical representations. The way the analytics works means you can book mark specific views, so you can return the same report view with the relevant features as often as you like. However, presently there is no data export option.
The question why would I want to export the information comes down to several possible use cases, all of which relate to cost management. The API Platform will eventually have all the desired data views, but now something to help address the following:
money-tization, we can see which consumer has been using the services by how much and then send the data to a companies accounting systems to invoice the users
Ability to examine demand and workload over time to create a projection of the likely infrastructure – to achieve this the API statistics need to be overlaid with infrastructure and performance details so we can extrapolate API growth against server workload.
To address these kinds of requirements, we have taken advantage of the fact the API Platform has drunk its own Champagne as they say and made many of the analytics querying APIs publicly available. As with the other API Platform tools, the logic has been written in Groovy, and freely available for use – we’ve covered the code through a Create Common license.
Tool includes a range of parameters to allow the data retrieved into a CSV file having filtered in a number of different ways – which logical gateways to examine, which API or Application(s) to report on. Finally, just to help some basic stats are produced with a count of logical gateways, API calls, APIs defined and Application definitions. The first three factors inform your cloud costs. Together the stats can help Oracle understand your use case. Note that the parameters which impact the CSV generation can also materially impact the reporting numbers.
The 1st three values must always be provided and in the order shown here
user name to access the source management cloud
password for the source management cloud
The server address without any attributes e.g. https://188.8.131.52
All the following values are optional
-h or -help – provides this information
-g – Logical gateway to retrieve numbers from e.g. production or development. using ALL with this parameter will result in ALL gateways being examined
-f – the file to target the CSV data should be written to. If not set then the default of
-t – indicates whether the data provided should be taken from an APPS perspective or from an API view by passing either APPS | API
-d – will get script to report more information about what is happening
-p – reporting period which is defined by a number as follows:
0 – Last 365 days – data is given as per month
1 – Last 30 days – this is the default if no information is provided – data is given as per day
2 – Last 7 days – data is given as per day
3 – Last day – data is given as per hour
NB – still testing the utility at this moment – will remove this comment once happy
An extract of our new book Implementing API Platform has been made available by the publishers Packt here. Of course you could enjoy all the content by buying the book directly from Packt (go here) or from book retails such as Amazon (here).