The first part of a two part article about the sort of things an Ace Associate, or anyone else in a Technology Advocacy programme such as the Ace & Groundbreakers could approach social media has been published. Go check it out (http://www.oraworld.org/home/ – page 15) along with other articles in the latest edition of OraWorld covering subjects as diverse as Open World, Apex and Spam (read and you’ll understand).
I’d like to thank my colleagues, particularly James Neate for the inspiration behind this article.
When it comes to deployment of API Gateways, there are a couple of well-known patterns, that of the Internal Gateway and External Gateway (described in several resources including here).
These two deployments essentially reflect the considerations of offering endpoints up to less secure network segments such as the internet (external gateways) and trusted network zones (internal gateways). But in addition to the physical deployment within a network, these deployments are likely to host APIs with different characteristics, reflecting levels of trust, and emphasis on enterprise decoupling/abstraction (internal) – the reason why APIs are sometimes associated with the idea of SOA 2.0. Compared with security sensitivity, and potentially monetization or at least usage metrics to help protect specific attack vectors.
These deployment patterns can be seen in the following diagram.
Both the internal and external gateways are reflective of interest in the origin of the API traffic. However a rarer 3rd pattern does exist.
This pattern of crops up when you need to consider the ability to manage how internal solutions connect to outside services, for reasons such as:
When it comes to development, we have had coding standards for almost as long as we have been coding. We tend to look at coding standards for purposes of helping to promote good quality code and reduce the likelihood of bugs and so on. But they also help with readability, making it easy to navigate a code base and so on. This is sufficiently important that there is a vast choice of tools to help us ensure we align with good practices.
That readability etc, when it comes to code interfaces lends to making it easier to use an interface as it promotes consistency and as Don Norman would say avoids ‘cognitive load‘, in other words, the effort involved in performing actions with the interface. Any Java Developer will tell you, want to print out an object (any object) you get a string representation using the .toString() method and direct it using the io packages.
That consistency and predictability are important not just for code if you look at any API best practises documents you’ll encounter directly or indirectly the need to use conventions that drive consistency – use of singular or plural for the name of entities, application of case – camel case, snake case etc. Good naming etc and we’ll see related things appear together in the documentation. Products such as Apiary and SwaggerHub include tooling to help police this in our API design work.
But what about policies that we use to define how an API Gateway handles the receipt and routing of API invocations? Well yes, we should have standards here as well. Some might say, governance gone mad. But gateways are often shared services, so making it easy to see and logically group APIs together at very least by using a good naming convention will help as a minimum. If API management is being administered in a more DevOps fashion, then information security professionals will probably want assurance that developers are applying policies in a recommended manner.
The API Platform when you configure IDCS to provide the option to authenticate users against a corporate Identity Provider such as Active Directory will automatically update the Management Portal Login screen accordingly. However today it doesn’t automatically update the Developer Portal login page. Whilst perhaps an oversight, it is very easy to fix manually when you know how. As result you can have a login that looks like:
The rest of this blog will show what’s needed to fix the problem.
There are circumstances in which notifications from the Oracle API Platform CS could be seen as desirable. For example, if you wish to ensure that the developers are defining good APIs and not accidentally implementing APIs that hit the OWASP Top 10 for APIs. Then you will probably configure things such that developer users can design the APIs, configure the policies, but only request an API to be deployed.
However, presently notifications through mechanisms such as email or via collaboration platforms such as Slack aren’t available. But implementing a solution isn’t difficult. For the rest of this blog we’ll explore how this might be implemented, complete with a Slack implementation.
The news about Oracle offering some free cloud services ‘for life’ is making an impact. But, the free services don’t end there. The pricing of some other native cloud services includes some free bands. So it’s worth keeping an eye on the fine print. I wouldn’t be surprised if we see limited capacity access in other areas.
Oracle Functions – whilst the core of this service is built on the open-source Fn Project (also largely driven by Oracle) the managed service has a free tier allowing up to 2 million invocations that can consume 400, 000 gigabytes of memory per second use (details can be seen here). Plenty enough to experiment with the concepts behind Serverless aka FaaS capabilities.
Oracle Notifications whilst focussed on the technical side of gathering key event data from OCI and its services, as the document states “sending notifications to numerous interested parties, or even synchronizing the moving parts of a distributed application” – this obviously means a service with characteristics a bit like AWS’ SNS. Like SNS it can be hooked up to email and other HTTPS services using Oracle Events which also has free use. Events is particularly interesting as it is bases the event structure on the CNCF CloudEvents spec. There is an excellent illustration of such a use case in the Oracle blogs here.
It will be interesting to see if we a similar trend with other Oracle cloud-native services. A new take on the now-defunct Application Container Cloud Service (ACCS) would be an ideal vehicle – whether there is sufficient demand for such a capability is not clear (it would in effect be an always live service like a Kubernetes solution, but the simpler, smaller footprint more like Functions in a multi-tenant environment. At the same time, it doesn’t have potential latency of a Function being activated).
This is my blog post as part of the Oracle Ground Breakers Appreciation Day (more about this with oracle-base) isn’t about a specific product or feature but an approach or possibly two approaches that exist with many of the PaaS services available from Oracle.
One of the key things that many of Oracle’s products such as Integration Cloud, API Platform and the foundation of Functions (Fn) and Containers is the recognition that many organisations are not so fortunate to be cloud-born, or even working with a cloud-native model for IT. For those organisations who would rather have across location unifying approach, Oracle cloud is not a closed capability like AWS, whilst products like Integration Cloud are at their best on Oracle Cloud Infrastructure, they can be executed in your data centre, or even another cloud.
Whilst the teams I work with experiment and build our service offerings ‘on Oracle’, when we engage with customers to help them with their specific problem spaces, we are more often than not operating in a multi-cloud or on-premises hybrid model.
This hybrid story is helped with a renewed vigour for open source both contributing to but also leading the development of open source. In addition to providing free tiers to some of their stack such as Functions, IaaS and Database (here). Many do forget the Oracle JVM is free as long as you keep up to date, you have got a small footprint Oracle database for free (XE), MySQL is part of the Oracle family. Then many of the modern development technologies are true to the core open-source, Blockchain, Container Engine meaning that the solutions on these layers are portable, can be run on-prem. Yes, Oracle adds value by wrapping these cores with tooling and features that make easier rather than diverging with proprietary Ingress controllers for example.
The irony is that organisations that tend to be associated with a low cost or being faithful to open source goals actually can end up locking you in and appear to be moving away from the original open-source ideals. Consider RedHat, the champion for a lot of open source-based enablement have removed Kubernetes from the official RedHat downloads for their Linux in-favour of a single node license of OpenShift, to get Kubernetes of RHEL you have to go outside of the normal binary source channels (other challenges are documented here).
At the time of writing the Oracle API Platform doesn’t support the use of Socket connections for handling API data flows. Whilst the API Platform does provide an SDK as we’ve described in other blogs and our book it doesn’t allow the extension of how connectivity is managed.
The use of API Gateways and socket-based connectivity is something that can engender a fair bit of debate – on the one hand, when a client is handling a large volume of data, or expects data updates, but doesn’t want to poll or utilize webhooks then a socket strategy will make sense. Think of an app wanting to listen to a Kafka topic. Conversely, API gateways are meant to be relatively lightweight components and not intended to deal with a single call to result in massive latency as the back-end produces or waits to forward on data as this is very resource-intensive and inefficient. However, a socket-based data transmission should be subject to the same kinds of security controls, and home brewing security solutions from scratch are generally not the best idea as you become responsible for the continual re-verification of the code being secure and handling dependency patching and mitigating vulnerabilities in other areas.
So how can we solve this?
As a general rule of thumb, web sockets are our least preferred way of driving connectivity, aside from the resource demand, it is a fairly fragile approach as connections are subject to the vagaries of network connections, which can drop etc. It can be difficult to manage state (i.e. knowing what data has or hasn’t reached the socket consumer). But sometimes, it just is the right answer. Therefore we have developed the following pattern as the following diagram illustrates.
How it works …
The client initiates things by contacting the gateway to request a socket, with the details of the data wanted to flow through the socket. This can then be validated as both a legitimate request or (API Tokens, OAuth etc) and that the requester can have the data wanted via analyzing the request metadata.
The gateway works in conjunction with a service component and will if approved acquire a URI from the socket manager component. This component will provide a URL for the client to use for the socket request. The URL is a randomly generated string. This means that port scans of the exposed web service are going to be difficult. These URLs are handled in a cache, which ideally has a TTL (Time To Live). By using Something like Redis with its native TTL capabilities means that we can expire the URL if not used.
With the provided URL we could further harden the security by associating with it a second token.
Having received the response by the client, it can then establish the socket-based connection which gets routed around the API Gateway to the Socket component. This then takes the randomly-generated part of the URL and looks up the value in the cache, if it exists in the cache and the secondary token matches then the request for the socket is legitimate. With the socket connection having been accepted the logic that will feed the socket can commence execution.
If the request is some form of malicious intent such as a scan, probe or brute force attempt to call the URL then the attempt should fail because …
- If the socket URL has never existed in or has been expired from the Cache and the request is rejected.
- If a genuine URL is obtained, then the secondary key must correctly verify. If incorrect again the request is rejected.
- Ironically, any malicious attack seeking to overload components is most likely to affect the cache and if this fails, then a brute access tempt gets harder as the persistence of all keys will be lost i.e. nothing to try brute force locate.
You could of course craft in more security checks such as IP whitelisting etc, but every-time this is done the socket service gets ever more complex, and we take on more of the capabilities expected from the API Gateway and aside from deploying a cache, we’ve not built much more than a simple service that creates some random strings and caches them, combined with a cache query and a comparison. All the hard security work is delegated to the gateway during the handshake request.
Over the last couple of years, we have seen growing references to multi-cloud. That is to say, people are recognizing that organisations, particularly larger ones are ending up with cloud services for many different vendors. This at least in part has come from where departments within an organization can buy meaningful resources within their local budgets.
Whilst there is a competitive benefit of the recent partnership agreement between Microsoft and Oracle given the market margin AWS has in comparison to everyone else. Irrespective of the positioning with AWS, this agreement has arisen because of the adoption of multi-cloud. It also provides a solution to the problem of running highly resilient Oracle database setups using RAC, DataGuard etc can be made available to Azure without risk to security and the all-important network performances that are essentially to DB operation. Likewise, Oracle’s SaaS offerings are sector leaders if not best in place, something Microsoft can’t compete with. But at the other end, regardless of Oracle’s offerings, often organisations will prefer Microsoft development ecosystem because of the alignment to office tooling, the ease of building solutions quickly.
Multi-cloud even with the agreements like the Microsoft and Oracle one (See here), doesn’t mean there won’t be higher costs in crossing clouds. Let’s see where the costs reside …
- Data egress (and in some cases ingress as well) from clouds costs. Whilst the ingress costs have been eliminated because it can be seen as a barrier to selling services, particularly big data. Data egress can, however, be an issue. Oracle has made this cost very low to be almost negligible, but not necessarily others as the following comparison shows …
- Establishing the high-performance connections That the agreement supports needed between Azure and Oracle cloud is the same tech for the cloud to the ground do incur a cost. In Oracle’s case, there is a fee for the connection (not a large cost, but one that exists none the less) plus any traffic fees the provider of the network connection spanning the data centre locations. This is because you’re leasing capacity on someone’s dedicated fibre or MPLS services. Now, this should prove to be small as part of the enabler of this offering is that both Oracle and Microsoft cloud DCs are often actually physically provided by the same provider or at-least the centres a physically pretty close, as a result of both companies gravitating to locations close together because of the optimal highly available infrastructure (power, telecommunications) legal and commercial factors along with the specialist skills needed.
If data egress is the key challenge to costs, what drives the data egress beyond the obvious content for user interfaces? …
- Obviously, you have the business data flows, some of these flows will be understood by the business community. But not all, this is down to the way data from the cloud can be exposed to another. For example inefficient services with APIs that require frequent polling and not using expressing the request efficiently, rather than perhaps express the request using HTTP header attributes and other efficiencies or even utilize frameworks such as webhooks so data can be pushed.
- High-speed data access, often drives data replication having databases in multiple clouds with mirror image data in each location even if the majority of the data is not necessarily needed. This can happen with technologies such as Kafka which for non compacted topics will mean every event can be replicated even if that event has a short lifetime.
- One of the hidden costs is the operational tasks of gathering logs to a combined view so end to end insights can be obtained. A detailed log can actually yield more ‘data’ by volume than the business flows themselves because it is semi-structured, and intended to be very readable and at the most granular level there to help debug and test.
In addition to the data flows, you need to consider how other linkages in addition to the Oracle-Azure connection are involved. In the detailed documentation, it is not possible to get your on-premises location connected to one of the clouds (e.g. Oracle FastConnect, and then assume your traffic can hop to Azure via the bridge using FastConnect and Azure’s ExpressRoute. To add performance to your solution parts in both Azure and Oracle Cloud, you still need to have FastConnect and ExpressRoute configured to your on-premises location. This, of course, may impact how bulk data for lift and shift app use cases such as EBS may be applied. For example, if you choose to regularly bulk data transfer between on-premise and EBS via the app/middleware tier rather than via direct DB, and that mid-tier is running in Azure – you will need both routes established.
There is no doubt that the Oracle-Azure cloud to cloud linkage is a step forward, but ‘the devil is in the details‘ as the saying goes. To optimize the benefits and savings we’d suggest that you;
- you’ll need to think through your use cases – understand data flow and volume (someone bulk syncing application data with a data warehouse?),
- define a cloud data strategy – to layout principles, approaches and identify compliance needs, this is particularly helpful for custom solution development, so the right level of log data is consolidated with the important details, data retention addresses compliance requirements and doesn’t ratchet up unnecessary costs (there is a tendency to horde data just in case – if this is really wanted, think about how its stored),
- based on business common usage models define a simple forecasting formula – being able to quantify data costs will always make it easier to challenge back data hoarding tendency,
- confirm the inter-cloud network vendor charges when working with multi-cloud.