The latest edition of OraWorld has become available to today. With its blend of insight into the Oracle community, and Oracle technologies from database to modern apps. I have to own up and say, I mention the magazine not only because of the beautifully crafted independent insights, but also it includes an article from myself. Taking a look at GraphQL what it is and how recent new Oracle product features could make a big difference to the GraphQL adoption opportunities.
The next edition should include a follow up article to this focussing on API security considerations.
A couple of years ago I got to discuss some of the design ideas behind API Platform Cloud Service. One of the points we discussed was how API Platform CS kept the configuration of APIs entirely within the platform, which meant some version management tasks couldn’t be applied like any other code. Whilst we’ve solved that problem (and you can see the various tools for this here API Platform CS tools). The argument made that your API policies are pretty important, if they get into the public domain then people can better understand to go about attacking your APIs and possibly infer more.
Move on a couple of years, Oracle’s 2nd generation cloud is established an maturing rapidly (OCI) and the organisational changes within Oracle mean PaaS was aligned to SaaS (Oracle Integration Cloud, Visual Builder CS as examples) or more cloud native IaaS. The gateway which had a strong foot in both camps eventually became aligned to IaaS (note that this doesn’t mean that the latest evolution of the API platform (Oracle Infrastructure API) will lose its cloud agnostic capabilities, as this is one of unique values of the solution, but over time the underpinnings can be expected to evolve).
Any service that has elements of infrastructure associated with it has been mandated to use Terraform as the foundation for definition and configuration. The Terraform mandate is good, we have some consistency across products with something that is becoming a defacto standard. However, by adopting the Terraform approach does mean all of our API configurations are held outside the product, raising the security risk of policy configuration is not hidden away, but conversely configuration management is a lot easier.
This has had me wondering for a long time, with the use of Terraform how do we mitigate the risks that API CS’s approach was trying to secure? But ultimately the fundamental question of security vs standardisation.
Any security expert will tell you the best security is layered, so if one layer is found to be vulnerable, then as long as the next layer is different then you’re not immediately compromised.
What this tells us is, we should look for ways to mitigate or create additional layers of security to protect the security of the API configuration. These principles probably need to extend to all Terraform files, after all it not only identifies security of not just OCI API, but also WAF, networks that are public and how they connect to private subnets (this isn’t an issue unique to Oracle, its equally true for AWS and Azure). Some mitigation actions worth considering:
- Consider using a repository that can’t be accidentally exposed to the net – configuration errors is the OWASP Top 10. So let’s avoid the mistake if possible. If this isn’t an option, then consider how to mitigate, for example …
- Strong restrictions on who can set or change visibility/access to the repo
- Configure a simple regular check that looks to see if your repos have been accidentally made publicly visible. The more frequent the the check the smaller the potential exposure window
- Make sure the Terraform configurations doesn’t contain any hard coded credentials, there are tools that can help spot this kind of error, so use them. Tools exist to allow for the scanning of such errors.
- Think about access control to the repository. It is well known that a lot of security breaches start within an organisation.
- Terraform supports the ability to segment up and inject configuration elements, using this will allow you to reuse configuration pieces, but could also be used to minimize the impact of a breach.
- Of course he odds are you’re going to integrate the Terraform into a CI/CD pipeline at some stage, so make sure credentials into the Terraform repo are also secure, otherwise you’ve undone your previous security steps.
- Minimize breach windows through credentials tokens and certificate hanging. If you use Let’s Encrypt (automated certificate issuing solution supported by the Linux Foundation). Then 90 day certificates isn’t new.
This may sound a touch paranoid, but as the joke goes….
Just because I’m paranoid, it doesn’t mean they’re not out to get me
Fundamental Security vs Standardisation?
As it goes the standardisation is actually a dimension of security. (This article illustrates the point and you can find many more). The premise is, what can be ensured as the most secure environment, one that is consistent using standards (defacto or formal) or one that is non standard and hard to understand?
I presented at an online Meetup on today (Thursday 16th April) with a shortened version of my API technology overview (A quick look at gRPC, GraphQL, REST APIs – Which way to go?). Aside from an early interruption to the event, the evening was an excellent series of speakers covering a number of API centric subjects.
More about the event and future events – https://www.meetup.com/TechItaliaTuscany/events/269621146/
When it comes to deployment of API Gateways, there are a couple of well-known patterns, that of the Internal Gateway and External Gateway (described in several resources including here).
These two deployments essentially reflect the considerations of offering endpoints up to less secure network segments such as the internet (external gateways) and trusted network zones (internal gateways). But in addition to the physical deployment within a network, these deployments are likely to host APIs with different characteristics, reflecting levels of trust, and emphasis on enterprise decoupling/abstraction (internal) – the reason why APIs are sometimes associated with the idea of SOA 2.0. Compared with security sensitivity, and potentially monetization or at least usage metrics to help protect specific attack vectors.
These deployment patterns can be seen in the following diagram.
Both the internal and external gateways are reflective of interest in the origin of the API traffic. However a rarer 3rd pattern does exist.
This pattern of crops up when you need to consider the ability to manage how internal solutions connect to outside services, for reasons such as:
The Oracle API Platform adopted an intelligent pricing model by basing costs on API call volumes and Logical gateway node groupings per hour. In our book about the API Platform (more here). We suggested that a good logical grouping would be to reflect the development, test, pre-production and production model. This makes it nice and easy to use gateway based routing to different environments without needing to change the API policy configuration as you promote your solution through environments.
We have also leveraged naming and Role/Group Based Access Controls to make it easy to operate the API Platform as a shared service, rather than each team having its own complete instance. In doing so the number of logical gateways needed is limited (I.e. not logical gateway divisions on per team models needed). Group management is very easy through the leveraging of Oracle’s Identity Cloud Service – which is free for managing users on the Oracle solutions, and also happens to a respected product in its own right.
Most organisations are not conducting development and testing 365 days a year, for 24 hours a day (yes in an ideal world prolonged soak and load tests would be run to help tease out cumulative issues such as memory leaks, but even then it isn’t perpetual). As a result, it would be ideal to not be using logical gateways for part of the day such as outside the typical development day, and weekends.
Whilst out of the out of hours traffic may drop to zero calls and we may even shutdown the gateway nodes, this alone doesn’t effectively reduce the number of logical gateways as the logical gateway aspect of the platform counts as soon as you create the logical group in the management portal. This in itself isn’t a problem as the API Platform drinks it’s own Champagne as the saying goes, and everything in the UI is actually available as a published REST endpoint. Something covered in the book, and in previous blog posts (for example Making Scripts Work with IDCS Deployed PaaS and Analytics and Stats for APIs). Rather than providing all the code, you can see pretty much all the calls necessary in the other utilities published.
Before defining the steps, there are a couple of things to consider. Firstly, the version of the API deployed to a specific logical gateway may not necessarily be the latest version (iteration) and when to delete the logical gateway this information is lost, so before deleting the logical gateway we should record this information to allow us to reinstate the logical gateway later.
As deleting logical gateways will remove the gateway from the system, when recreating the gateway we can use the same name, but the gateway is not guaranteed to get the same Id as before, as a result, we should when rebuilding always discover the Id from the name to be safe.
A logical gateway can not be deleted until all the physical nodes are reallocated, so we need to iterate through the nodes removing them. When it comes to reconnecting the nodes, this is a little more tricky as reconnecting the gateway appears to only be achievable with information known to the gateway node. Therefore the simplest thing is when bringing the node back online we take the information from the gateway-props.json file and run a script that determines whether the management tier knows about the node. If not then just re-run the create, start, join cycle., otherwise just run the start command.
As with the logical gateway, re-running the create, deploy, start cycle will result in the node having a new Id. This does mean that whilst the logical gateway name and even the node names will remain the same, the analytics data is likely to become unavailable, so you may wish to extract the analytics data. But then, for development and test, this data is unlikely to provide much long term value.
So based on this, our sequence for releasing the logical gateway needs to be…
- Capture the deployed APIs and the iteration numbers,
- Ideally shutdown the gateway node process itself,
- Delete all the gateway nodes from the logical gateway,
- Delete the logical gateway,
Recover would then be …
- Construct the logical gateway,
- Redeploy the APIs with the correct iteration numbers to the logical gateway using the recorded information- if no nodes are connected at this stage, the UI will provide a warning
- As gateway nodes, come back online, determine if it is necessary to execute the create, start, join or just start
Of course, these processes can be all linked to scheduling such as a cron job and/or server startup and shutdown processes.
It’s been a quiet month for this blog, but I’ve been pretty busy with a raft of other activities…
- a recent article on our sister site – oracle-integration.cloud on RPA.
- I also appear in an interview with K21 Academy here.
- Reviewing a new book on Enterprise API Management for Packt which we would very highly recommend if you want to understand the more Enterprise perspectives of adopting APIs, particularly if you’re considering APIs as a potential new revenue stream.
- UK Oracle User Group committees for TechFest (having been reviewing the paper submissions it looks like it’s going to be an excellent conference in December) and Southern Summit (next week).
- Just launched a number of sessions for the Oracle London Developer Meetup, with another to be announced soon (Blockchain) and potentially two more before the end of the year (we’re working on the speakers now).