A Brave New World - Dealing with security in AWS

By Russ Spitler, VP of Product Strategy for AlienVault. 

Posted Monday, 4th May 2015 by Phil Alsop

When we move to a world where we virtualize everything, automate as much as we can through an API, and rent compute power by the hour, our whole operational model starts to change. I am sure we are all familiar with the opportunity afforded us when we can back up our databases with a simple API call and scale our web servers to handle bursts of traffic without any intervention, but this new world also brings a new set of responsibilities for us to address from a security standpoint. In EC2, we make a tradeoff for the automation we love: we now share the responsibility for security with someone else - Amazon.

The Shared Security Model
Security in Amazon EC2 is the shared responsibility of the user and Amazon. This makes perfect sense in the grand scheme of things - there are things only an end user can do and there are things that only Amazon can do in this environment. The user is responsible for securing the operating systems running on their instances, as well as the applications running on those operating systems. Amazon is responsible for physical security as well as the security of the hypervisor. The responsibility for the network is shared between the user and Amazon, but dealing with the implications of this model is left in the hands of the end user.

When we relinquish our control of the full stack we gain huge operational efficiencies, but we also lose the flexibility to do things completely as we wish. We have seen the biggest security issues arise as organizations struggle to handle this change. Our approach to threat detection and incident response needs to adjust as we work in this new environment.

Loss of Traditional Security Controls
When we share responsibility for the network layer, our ability to use some of the tried-and-true security controls like IDS and vulnerability scanners is limited. In EC2, Amazon is responsible for the network routing and segmentation between customers - making sure all your traffic gets to your boxes and preventing one customer from seeing another’s traffic.

In implementing this restriction, Amazon has prevented end users from easily gaining access to all network traffic in their EC2 environment (traditionally captured from a SPAN or TAP). The implication is that the ability to deploy any security controls that rely on network traffic - network IDS, NetFlow analysis, etc. - is severely limited. Some attempt to replicate this by capturing all network traffic locally on the hosts running in their environment and analyzing it in a centralized location. This approach is error prone and has severe implications for the network load of the environment, because all traffic is replicated as it is sent to the centralized location for analysis.

Another problem arises with the use of vulnerability or asset discovery scanners. The nature of these scanners is to replicate the traffic of malicious activity in order to confirm the presence of vulnerabilities in the systems we run. In environments we fully control, we can easily determine when these tools are being run by employees performing routine security checks and differentiate that from when the tools are being used maliciously. However, in EC2, Amazon monitors the network layer for this activity to help detect malicious behavior. This means that in order for us to leverage these technologies in EC2 we must first notify Amazon when and from where the tools will be used. I hope that this process will be automated in the future, but for now you have to fill out a PDF. This makes the use of these technologies incredibly inconvenient at best and practically infeasible at worst.

New Security Features
Across the board, as new capabilities are introduced, new security features are also required. While new security features are an opportunity to improve the inherent security of an environment, the end user must understand and appropriately leverage them. The introduction of new features always leaves a gap in security while end users become familiar with and acclimated to their use.

The most obviously misunderstood security feature in Amazon AWS is EC2 Security Groups (and of course their other forms as well - ElastiCache, RDS, etc.). These security groups are a very powerful feature for controlling port-level network access to any of your running instances. The issue arises from their seemingly familiar nature: with very little effort, a user can expose services to the public internet. In a traditional environment the effort it would take to put a database onto the internet is substantial - punch through one or two routers and a firewall. With security groups, this can be done with a single configuration update.
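One way to cross-check security group usage is to scan the rules for sensitive ports opened to the whole internet. The following is a minimal sketch, not a complete audit: the function name, the list of sensitive ports, and the example group IDs are assumptions made for illustration. The input dictionaries follow the shape of the EC2 DescribeSecurityGroups response.

```python
# Hypothetical audit helper: the port list and names below are illustrative
# assumptions, not part of any AWS tooling.
SENSITIVE_PORTS = {22: "ssh", 3306: "mysql", 5432: "postgresql", 6379: "redis"}
WORLD_CIDRS = {"0.0.0.0/0", "::/0"}


def find_world_open_rules(security_groups):
    """Return (group_id, port, service) for sensitive ports open to any address."""
    findings = []
    for group in security_groups:
        for perm in group.get("IpPermissions", []):
            protocol = perm.get("IpProtocol")
            # Only port-based rules matter here; skip ICMP and other protocols.
            if protocol not in ("tcp", "udp", "-1"):
                continue
            cidrs = {r.get("CidrIp") for r in perm.get("IpRanges", [])}
            cidrs |= {r.get("CidrIpv6") for r in perm.get("Ipv6Ranges", [])}
            if not cidrs & WORLD_CIDRS:
                continue
            # IpProtocol "-1" means all protocols and all ports.
            if protocol == "-1":
                low, high = 0, 65535
            else:
                low, high = perm.get("FromPort", 0), perm.get("ToPort", 65535)
            for port, service in sorted(SENSITIVE_PORTS.items()):
                if low <= port <= high:
                    findings.append((group["GroupId"], port, service))
    return findings


# Example data with made-up group IDs.
example_groups = [
    {"GroupId": "sg-web", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
    {"GroupId": "sg-db", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 3306, "ToPort": 3306,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
]
print(find_world_open_rules(example_groups))  # [('sg-db', 3306, 'mysql')]
```

In practice the input could come from a call such as boto3's `describe_security_groups()`, reading the `SecurityGroups` key of the response. A check like this only catches the simplest mistake; it does not reason about which instances a group is attached to.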

Dynamic Environment
Whether we want it to be or not, Amazon EC2 is a very dynamic environment. Some users design their systems for this in order to elastically scale with demand; other users find that their systems simply require restart and redeployment to operate effectively in EC2. The implication of this for security monitoring and incident response is quite substantial. In traditional environments, identifiers such as IP addresses can be relied upon for forensic analysis and systems are relatively static, meaning that an incident that started weeks ago will likely have evidence resident on systems still operating. In a dynamic environment these assumptions cannot be made. Our security monitoring must provide a concrete relation between captured security data and the instances running in the environment, and must dynamically collect data for use in incident response.
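To make the point about IP addresses concrete, here is a small sketch of the kind of bookkeeping this implies: recording which instance held an address during which time window, so that an IP seen in old log data can be resolved to the instance that actually had it. The class names, instance IDs and addresses are hypothetical, and a real system would persist this data rather than keep it in memory.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Assignment:
    ip: str
    instance_id: str
    start: datetime
    end: Optional[datetime] = None  # None means the address is still held


class AssignmentLog:
    """Hypothetical in-memory record of IP-to-instance assignments over time."""

    def __init__(self) -> None:
        self._records: List[Assignment] = []

    def release(self, ip: str, when: datetime) -> None:
        # Close any open assignment for this address.
        for rec in self._records:
            if rec.ip == ip and rec.end is None:
                rec.end = when

    def assign(self, ip: str, instance_id: str, when: datetime) -> None:
        # An address can only belong to one instance at a time.
        self.release(ip, when)
        self._records.append(Assignment(ip, instance_id, when))

    def instance_at(self, ip: str, when: datetime) -> Optional[str]:
        # Find the assignment whose window [start, end) contains the timestamp.
        for rec in self._records:
            if rec.ip == ip and rec.start <= when and (rec.end is None or when < rec.end):
                return rec.instance_id
        return None


log = AssignmentLog()
log.assign("10.0.0.5", "i-aaa", datetime(2015, 4, 1, 9, 0))
log.assign("10.0.0.5", "i-bbb", datetime(2015, 4, 20, 9, 0))

# The same address maps to different instances depending on when it was seen.
print(log.instance_at("10.0.0.5", datetime(2015, 4, 10)))  # i-aaa
print(log.instance_at("10.0.0.5", datetime(2015, 5, 1)))   # i-bbb
```

The design choice worth noting is that the instance identifier, not the IP address, is treated as the stable key for an investigation.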

API
The last consideration in this environment is a major one. All actions taken in this environment are controlled by the Amazon API. While this provides us the automation we need, it also means that a malicious user of this API could cause substantial damage in very short order. In traditional environments we address these concerns by restricting physical access to our machines and if we use things like IPMI we (hopefully) restrict its access to a dedicated management network. It is with the same seriousness that we must protect, monitor and control access to the Amazon API.
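As a rough illustration of monitoring API access, the sketch below flags sensitive API calls made from outside a trusted network range. It assumes the API activity is available as records with `eventTime`, `eventName` and `sourceIPAddress` fields, as found in AWS CloudTrail logs; CloudTrail itself, the choice of sensitive actions, and the trusted range are assumptions for the example, not something the article prescribes.

```python
import ipaddress

# Illustrative assumptions: which actions count as sensitive, and which
# network (a documentation range here) counts as trusted.
SENSITIVE_EVENTS = {
    "AuthorizeSecurityGroupIngress",
    "TerminateInstances",
    "CreateAccessKey",
    "DeleteTrail",
}
TRUSTED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_trusted_source(source):
    """Return True only if source is an IP address inside a trusted network."""
    try:
        addr = ipaddress.ip_address(source)
    except ValueError:
        # Not an IP literal (for example a service name); treat as untrusted
        # in this sketch so that a person reviews it.
        return False
    return any(addr in net for net in TRUSTED_NETWORKS)


def suspicious_api_calls(records):
    """Return (time, action, source) for sensitive calls from untrusted sources."""
    alerts = []
    for rec in records:
        source = rec.get("sourceIPAddress", "")
        if rec.get("eventName") in SENSITIVE_EVENTS and not is_trusted_source(source):
            alerts.append((rec.get("eventTime"), rec.get("eventName"), source))
    return alerts


# Made-up records for demonstration.
example_records = [
    {"eventTime": "2015-05-04T10:00:00Z", "eventName": "DescribeInstances",
     "sourceIPAddress": "198.51.100.7"},
    {"eventTime": "2015-05-04T10:01:00Z", "eventName": "TerminateInstances",
     "sourceIPAddress": "203.0.113.10"},
    {"eventTime": "2015-05-04T10:02:00Z", "eventName": "TerminateInstances",
     "sourceIPAddress": "198.51.100.7"},
]
for alert in suspicious_api_calls(example_records):
    print(alert)
```

Only the third record is reported: it is a sensitive action and it comes from outside the trusted range. A source-address check is just one signal; restricting credentials and permissions in the first place matters at least as much.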

We have some substantial new considerations for effective threat detection and incident response in environments such as Amazon AWS. There is a huge opportunity for an inherently more secure world, but we need to adapt our approaches and learn some new tricks in order to realize this promise. Without alternative approaches to address the limitations of our operational access to the environment, we are left blind to entire classes of threats. Without cross-checking the use of the new security features, we are left to trust that our operators are fully aware of all of the security implications of every AWS feature. Without collecting data with an understanding of the dynamic nature of the EC2 environment, we are left with too little too late when it comes time to investigate an incident. And most importantly, without addressing the critical nature of the API, we are left exposed to remote attackers in ways we have never been in the past. All of these considerations come into play when designing an effective threat detection system for Amazon EC2. We have a great opportunity for a more secure future in these types of environments, but we cannot rely on the same approaches we used in the past.