This week the big Developer Week Europe conference is running online. I got to present today. It was a relatively short session, with an unfortunate brief interruption from a smoke alarm. My presentation is here …
The latest edition of OraWorld has become available today, with its blend of insight into the Oracle community and Oracle technologies from database to modern apps. I have to own up and say I mention the magazine not only because of the beautifully crafted independent insights, but also because it includes an article from myself, taking a look at GraphQL: what it is, and how recent new Oracle product features could make a big difference to the GraphQL adoption opportunities.
The next edition should include a follow-up article focussing on API security considerations.
A couple of years ago I got to discuss some of the design ideas behind API Platform Cloud Service. One of the points we discussed was how API Platform CS kept the configuration of APIs entirely within the platform, which meant some version management tasks couldn’t be applied as they would be to any other code. Whilst we’ve since solved that problem (you can see the various tools for this here: API Platform CS tools), the argument made was that your API policies are pretty important; if they get into the public domain then people can better understand how to go about attacking your APIs, and possibly infer more.
Move on a couple of years: Oracle’s 2nd generation cloud (OCI) is established and maturing rapidly, and the organisational changes within Oracle mean PaaS has been aligned either to SaaS (Oracle Integration Cloud and Visual Builder CS, as examples) or to more cloud native IaaS. The gateway, which had a strong foot in both camps, eventually became aligned to IaaS. Note that this doesn’t mean that the latest evolution of the API platform (Oracle Infrastructure API) will lose its cloud agnostic capabilities, as this is one of the unique values of the solution, but over time the underpinnings can be expected to evolve.
Any service that has elements of infrastructure associated with it has been mandated to use Terraform as the foundation for definition and configuration. The Terraform mandate is good: we have some consistency across products with something that is becoming a de facto standard. However, adopting the Terraform approach does mean all of our API configurations are held outside the product. That raises the security risk, as the policy configuration is no longer hidden away, but conversely configuration management is a lot easier.
This has had me wondering for a long time: with the use of Terraform, how do we mitigate the risks that API CS’s approach was trying to protect against? Ultimately this is the fundamental question of security vs standardisation.
Any security expert will tell you the best security is layered, so if one layer is found to be vulnerable then, as long as the next layer is different, you’re not immediately compromised.
What this tells us is that we should look for ways to mitigate, or create additional layers of security, to protect the API configuration. These principles probably need to extend to all Terraform files; after all, they describe the security of not just the OCI API, but also the WAF, which networks are public and how they connect to private subnets (this isn’t an issue unique to Oracle, it’s equally true for AWS and Azure). Some mitigation actions worth considering:
- Consider using a repository that can’t be accidentally exposed to the net – configuration errors are in the OWASP Top 10, so let’s avoid the mistake if possible. If this isn’t an option, then consider how to mitigate, for example …
- Strong restrictions on who can set or change visibility/access to the repo
- Configure a simple regular check that looks to see if your repos have been accidentally made publicly visible. The more frequent the check, the smaller the potential exposure window
- Make sure the Terraform configurations don’t contain any hard coded credentials. There are tools that can help spot this kind of error, so use them.
- Think about access control to the repository. It is well known that a lot of security breaches start within an organisation.
- Terraform supports the ability to segment up and inject configuration elements. Using this will allow you to reuse configuration pieces, but it could also be used to minimize the impact of a breach.
- Of course the odds are you’re going to integrate the Terraform into a CI/CD pipeline at some stage, so make sure the credentials for the Terraform repo are also secure, otherwise you’ve undone your previous security steps.
- Minimize breach windows by regularly changing credentials, tokens and certificates. If you use Let’s Encrypt (an automated certificate issuing solution supported by the Linux Foundation) then 90 day certificates aren’t new.
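To make the hard coded credentials point a little more concrete, here is a minimal sketch in Python of the kind of check a scanning tool performs. The pattern list and the function names are my own illustrative assumptions, not those of any particular tool, and a real scanner carries far more rules than this.

```python
import re
from pathlib import Path

# Illustrative rule only (an assumption, not taken from any real tool):
# a credential-like attribute name assigned a quoted literal. Values
# containing "$" are skipped, as they are interpolations, not literals.
SUSPECT_ASSIGNMENT = re.compile(
    r'(?i)\w*(password|secret|token|private_key|api_key)\w*\s*=\s*"[^"$]+"'
)


def find_hardcoded_credentials(text):
    """Return (line_number, line) pairs that look like hard coded credentials."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Ignore Terraform comment lines.
        if stripped.startswith("#") or stripped.startswith("//"):
            continue
        if SUSPECT_ASSIGNMENT.search(stripped):
            findings.append((number, stripped))
    return findings


def scan_directory(root):
    """Scan every .tf file below root and map file path to its findings."""
    results = {}
    for path in Path(root).rglob("*.tf"):
        hits = find_hardcoded_credentials(path.read_text(encoding="utf-8"))
        if hits:
            results[str(path)] = hits
    return results
```

Something like scan_directory(".") could run as a pre-commit hook or an early pipeline step, failing the build when anything is returned. It will throw up false positives (a token_ttl setting, for example), which is one reason the established tools are the better choice for real use.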
This may sound a touch paranoid, but as the joke goes….
Just because I’m paranoid, it doesn’t mean they’re not out to get me
Fundamental Security vs Standardisation?
As it happens, standardisation is actually a dimension of security. (This article illustrates the point and you can find many more). The premise is: which can be made the more secure environment, one that is consistent and uses standards (de facto or formal), or one that is non-standard and hard to understand?
I presented at an online Meetup today (Thursday 16th April) with a shortened version of my API technology overview (A quick look at gRPC, GraphQL, REST APIs – Which way to go?). Aside from an early interruption to the event, the evening was an excellent series of speakers covering a number of API centric subjects.
More about the event and future events – https://www.meetup.com/TechItaliaTuscany/events/269621146/
When it comes to deployment of API Gateways, there are a couple of well-known patterns, that of the Internal Gateway and External Gateway (described in several resources including here).
These two deployments essentially reflect the considerations of offering endpoints up to less secure network segments such as the internet (external gateways) and trusted network zones (internal gateways). But in addition to the physical deployment within a network, these deployments are likely to host APIs with different characteristics. Internally, that reflects levels of trust and an emphasis on enterprise decoupling/abstraction, which is why APIs are sometimes associated with the idea of SOA 2.0. Externally, the emphasis is on security sensitivity, and potentially monetization, or at least usage metrics to help protect against specific attack vectors.
These deployment patterns can be seen in the following diagram.
Both the internal and external gateways are reflective of interest in the origin of the API traffic. However a rarer 3rd pattern does exist.
This pattern crops up when you need to consider the ability to manage how internal solutions connect to outside services, for reasons such as:
The Oracle API Platform adopted an intelligent pricing model by basing costs on API call volumes and logical gateway node groupings per hour. In our book about the API Platform (more here) we suggested that a good logical grouping would be to reflect the development, test, pre-production and production model. This makes it nice and easy to use gateway based routing to different environments without needing to change the API policy configuration as you promote your solution through environments.
We have also leveraged naming and Role/Group Based Access Controls to make it easy to operate the API Platform as a shared service, rather than each team having its own complete instance. In doing so the number of logical gateways needed is limited (i.e. no logical gateway divisions on a per-team model are needed). Group management is very easy through the leveraging of Oracle’s Identity Cloud Service – which is free for managing users on the Oracle solutions, and also happens to be a respected product in its own right.
Most organisations are not conducting development and testing 365 days a year, for 24 hours a day (yes in an ideal world prolonged soak and load tests would be run to help tease out cumulative issues such as memory leaks, but even then it isn’t perpetual). As a result, it would be ideal to not be using logical gateways for part of the day such as outside the typical development day, and weekends.
Whilst out of hours the traffic may drop to zero calls and we may even shut down the gateway nodes, this alone doesn’t effectively reduce the number of logical gateways, as the logical gateway aspect of the platform counts as soon as you create the logical group in the management portal. This in itself isn’t a problem, as the API Platform drinks its own Champagne, as the saying goes, and everything in the UI is actually available as a published REST endpoint. This is something covered in the book, and in previous blog posts (for example Making Scripts Work with IDCS Deployed PaaS and Analytics and Stats for APIs). Rather than providing all the code, you can see pretty much all the calls necessary in the other utilities published.
Before defining the steps, there are a couple of things to consider. Firstly, the version of the API deployed to a specific logical gateway may not necessarily be the latest version (iteration), and when we delete the logical gateway this information is lost. So before deleting the logical gateway we should record this information to allow us to reinstate the logical gateway later.
As deleting a logical gateway removes it from the system, when recreating the gateway we can use the same name, but the gateway is not guaranteed to get the same Id as before. As a result, when rebuilding we should always discover the Id from the name to be safe.
A logical gateway cannot be deleted until all the physical nodes are reallocated, so we need to iterate through the nodes removing them. When it comes to reconnecting the nodes, this is a little more tricky, as reconnecting the gateway appears to only be achievable with information known to the gateway node. Therefore the simplest thing when bringing the node back online is to take the information from the gateway-props.json file and run a script that determines whether the management tier knows about the node. If not, then just re-run the create, start, join cycle; otherwise just run the start command.
As with the logical gateway, re-running the create, start, join cycle will result in the node having a new Id. This does mean that whilst the logical gateway name and even the node names will remain the same, the analytics data is likely to become unavailable, so you may wish to extract the analytics data first. But then, for development and test, this data is unlikely to provide much long term value.
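The node decision described above is small enough to sketch in Python. To be clear about the assumptions: the node_name_key default below is a guess at how the node name might appear in gateway-props.json (check your own file), and the set of known node names is assumed to have been fetched from the management tier separately.

```python
import json


def plan_node_startup(props, known_node_names, node_name_key="nodeName"):
    """Decide which gateway actions to run as a node comes back online.

    props            -- dict loaded from the node's gateway-props.json
    known_node_names -- node names the management tier currently knows about
    node_name_key    -- ASSUMED key holding the node name; adjust to your file
    """
    node_name = props[node_name_key]
    if node_name in known_node_names:
        # The management tier still knows this node, so a start is enough.
        return ["start"]
    # The node was removed when the logical gateway was deleted.
    return ["create", "start", "join"]


def load_props(path):
    """Read gateway-props.json from the gateway node's install location."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

The match is made on the node name rather than the Id, because the Id changes each time the node is recreated.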
So based on this, our sequence for releasing the logical gateway needs to be…
- Capture the deployed APIs and the iteration numbers,
- Ideally shutdown the gateway node process itself,
- Delete all the gateway nodes from the logical gateway,
- Delete the logical gateway.
Recovery would then be …
- Construct the logical gateway,
- Redeploy the APIs with the correct iteration numbers to the logical gateway using the recorded information (if no nodes are connected at this stage, the UI will provide a warning),
- As gateway nodes come back online, determine whether it is necessary to execute the create, start, join cycle or just start.
Of course, these processes can be all linked to scheduling such as a cron job and/or server startup and shutdown processes.
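As a rough illustration of the two sequences, here is a Python sketch that runs against an in-memory stand-in for the management tier. Every class and method name here is hypothetical; in a real script each method would wrap the corresponding published REST endpoint. Shutting down the node process itself happens on the node host, so it isn’t shown.

```python
class InMemoryManagementTier:
    """Hypothetical stand-in for the management service, for illustration."""

    def __init__(self):
        self._next_id = 100
        self.gateways = {}  # id -> {"name", "nodes", "deployments"}

    def create_gateway(self, name):
        gateway_id = self._next_id
        self._next_id += 1  # a recreated gateway never reuses an old Id
        self.gateways[gateway_id] = {"name": name, "nodes": set(), "deployments": {}}
        return gateway_id

    def find_gateway_id(self, name):
        for gateway_id, gateway in self.gateways.items():
            if gateway["name"] == name:
                return gateway_id
        raise KeyError(name)

    def list_deployments(self, gateway_id):
        return dict(self.gateways[gateway_id]["deployments"])

    def list_nodes(self, gateway_id):
        return sorted(self.gateways[gateway_id]["nodes"])

    def remove_node(self, gateway_id, node):
        self.gateways[gateway_id]["nodes"].discard(node)

    def delete_gateway(self, gateway_id):
        if self.gateways[gateway_id]["nodes"]:
            raise RuntimeError("nodes must be removed before deletion")
        del self.gateways[gateway_id]

    def deploy(self, gateway_id, api, iteration):
        self.gateways[gateway_id]["deployments"][api] = iteration


def release_gateway(tier, name):
    """Capture deployed iterations, remove the nodes, delete the gateway."""
    gateway_id = tier.find_gateway_id(name)
    snapshot = tier.list_deployments(gateway_id)  # persist this somewhere safe
    for node in tier.list_nodes(gateway_id):
        tier.remove_node(gateway_id, node)
    tier.delete_gateway(gateway_id)
    return snapshot


def recover_gateway(tier, name, snapshot):
    """Recreate the gateway and redeploy the recorded iterations."""
    tier.create_gateway(name)
    gateway_id = tier.find_gateway_id(name)  # always rediscover the Id by name
    for api, iteration in snapshot.items():
        tier.deploy(gateway_id, api, iteration)
    return gateway_id
```

The snapshot needs to survive between the two runs, so a scheduled job would write it to a file or store rather than hold it in memory.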
It’s been a quiet month for this blog, but I’ve been pretty busy with a raft of other activities…
- a recent article on our sister site – oracle-integration.cloud on RPA.
- I also appear in an interview with K21 Academy here.
- Reviewing a new book on Enterprise API Management for Packt which we would very highly recommend if you want to understand the more Enterprise perspectives of adopting APIs, particularly if you’re considering APIs as a potential new revenue stream.
- UK Oracle User Group committees for TechFest (having been reviewing the paper submissions it looks like it’s going to be an excellent conference in December) and Southern Summit (next week).
- Just launched a number of sessions for the Oracle London Developer Meetup, with another to be announced soon (Blockchain) and potentially two more before the end of the year (we’re working on the speakers now).
I’ve started to subscribe to the APISecurity.io newsletter. The newsletter includes analysis of recent API based security breaches along with other useful API related news. Some of the details of the breaches make for interesting reading and provide some good examples of what not to do. It is rather surprising how regularly the cause is a failure to apply good practices, including:
- Checking the payload is valid to the definition,
- Checking the payload size to ensure it is in the expected bounds,
- Using strong typing on the content received, as it will help validate the content and limit the chances of poisonous content like injected SQL,
- Ensuring the API has mitigations against the classic OWASP Top 10 – SQL Injection, poor authentication implementation.
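As a small illustration of the first three checks, here is a Python sketch that validates an incoming JSON payload against a typed definition. The OrderRequest shape, the field limits and the size cap are all invented for the example; the point is the order of the checks rather than the specific values.

```python
import json
from dataclasses import dataclass

MAX_PAYLOAD_BYTES = 2048  # assumed limit for this example


class PayloadError(ValueError):
    """Raised when a payload fails any of the checks."""


@dataclass(frozen=True)
class OrderRequest:
    # Hypothetical payload definition used only for illustration.
    customer_id: int
    quantity: int
    note: str


def parse_order(raw: bytes) -> OrderRequest:
    # 1. Size check before doing any parsing work.
    if len(raw) > MAX_PAYLOAD_BYTES:
        raise PayloadError("payload too large")
    try:
        data = json.loads(raw)
    except ValueError as exc:  # covers malformed JSON and bad encodings
        raise PayloadError("payload is not valid JSON") from exc
    # 2. Payload must match the definition exactly: no missing or extra fields.
    if not isinstance(data, dict) or set(data) != {"customer_id", "quantity", "note"}:
        raise PayloadError("payload does not match the definition")
    # 3. Strong typing: reject strings (or booleans) where integers are expected.
    for name in ("customer_id", "quantity"):
        value = data[name]
        if isinstance(value, bool) or not isinstance(value, int):
            raise PayloadError(f"{name} must be an integer")
    if not isinstance(data["note"], str) or len(data["note"]) > 200:
        raise PayloadError("note must be a string of at most 200 characters")
    if not 1 <= data["quantity"] <= 1000:
        raise PayloadError("quantity out of bounds")
    return OrderRequest(**data)
```

With this in place, a customer_id of "7 OR 1=1" is rejected for being a string long before it gets anywhere near a query.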
More broadly, we see that people recognise the need for penetration testing and look to external organisations to perform it. But when such work is commissioned, those commissioning the tests often don’t understand what the pen tester actually does (SANS paper of security scoping), and therefore can’t know whether all the risks are checked. Add to that the temptation to keep such costs down, which can result in the service provider not necessarily probing your APIs to the fullest extent. Not all penetration test services are equal, so simply working to a budget isn’t wise; yes, there is a need for pragmatism, but only when you understand the cost/risk trade-off.
But also remember that application logic, API definitions and the security controls in place change over time, as does the discovery of new vulnerabilities on the stack you’re using, along with evolving compliance requirements. All of this means that a penetration test at the initial go-live is not enough; testing should be an inherent part of an API’s lifecycle.
When it comes to payload checks etc, products like Oracle’s API Platform provide out of the box checks for factors such as size limits and make it easy to implement payload checks, so it is better to use them.
If you ever need to be reminded of why best practices are needed and should be implemented: a mindset of when, not if, a breach will happen will ensure you’re prepared and the teams are motivated to put the good practices in.