Truth and lies about latency in the Cloud

By Jelle Frank van der Zwet, who manages the pan-European marketing and product development programme for Interxion’s sizeable and fast-growing cloud community.

  • Monday, 22nd April 2013, posted by Phil Alsop

When it comes to measuring performance across the local enterprise network, we think we know what network latency is and how to calculate it. But when we move these applications off premises and into the cloud, there are a lot of subtleties that can affect latency in ways we don’t immediately realise. Jelle Frank van der Zwet, Cloud Marketing Manager at Interxion, examines what latency means for deploying and/or migrating applications to the cloud, and how you can track and measure it.


For years, latency has bedevilled application developers who have taken for granted that packets could easily traverse a local network with minimal delays. It didn’t take long to realise the folly of this course of action: when it came time to deploy these applications across a wide area network, many applications broke down because of networking delays of tens or hundreds of milliseconds. But these lessons learned decades ago have been forgotten and now, as applications are migrating to the cloud, latency becomes more important than ever before.


Defining cloud latency is not simple
In the days before the ubiquitous internet, understanding latency was relatively simple. You looked at the number of router hops between you and your application, and the delays the packets incurred getting from source to destination. Those days seem as dated now as the original DOS-based, dual-floppy-drive IBM PCs.


With today’s cloud applications, the latency calculations are not so simple. First off, the endpoints aren’t fixed. The users of our applications can be anywhere in the world, and with the flexibility the cloud offers, the applications themselves can also be located pretty much anywhere. That is the beauty and freedom of the cloud, but the flexibility comes at a price: the resulting latencies can be horrific and unpredictable.


We also need to consider the location of the ultimate end users and the networks that connect them to the destination networks. Furthermore, we need to understand how the cloud infrastructure is configured, where the particular pieces of network, applications, servers and storage fabrics are deployed, and how they are connected. Finally, it depends on who the ultimate “owners” and “users” of our applications are, because latency ultimately shapes the end-user experience of an enterprise’s applications.


There have been studies that examine overall website performance with respect to latency. One study shows that reducing latency has a tremendous effect on page load times, even more so than increasing bandwidth: every 20ms reduction in network latency results in a 7-15% decrease in page load times.


This study isn’t just an academic exercise: both Amazon and Google found big drops in sales and traffic when pages took longer to load. A half-second delay will cause a 20% drop in Google’s traffic, and a tenth of a second of delay can cause a one percent drop in Amazon’s sales. It’s not just individual sites that have an interest in speeding up the web.


Understanding the true effect of latency
Traditionally, latency has been characterised by three measures: round-trip time (RTT), jitter and endpoint computational speed, with traceroute as a supporting diagnostic tool. Each of these is important to understanding the true effect of latency, and only after examining all of them can you get the full picture.
RTT measures the time it takes a packet to travel across the network from source to destination and back again, or the time it takes to establish an initial server connection. This is useful in interactive applications, and also in app-to-app situations, such as measuring how a web server and a database server interact and exchange data.
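
As a rough illustration of how RTT can be sampled in practice, the sketch below times a TCP handshake from the client’s point of view. It is only one possible approach, assuming Python 3 and a reachable host and port; the host name is purely an example, not part of any particular tool.

```python
import socket
import time

def tcp_rtt(host: str, port: int = 443, timeout: float = 3.0) -> float:
    """Approximate round-trip time (ms) as the duration of a TCP handshake."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # the handshake itself is the measurement; nothing is sent
    return (time.perf_counter() - start) * 1000.0

# Take a few samples so a single outlier doesn't mislead
samples = [tcp_rtt("example.com") for _ in range(5)]
print(f"min {min(samples):.1f} ms, max {max(samples):.1f} ms")
```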


Jitter is the variation in packet transit delay caused by queuing, contention and serialisation effects on the path through the network. It can have a large impact on interactive applications such as video or voice. Endpoint computational speed is the speed of the computers at the core of the application: their configuration determines how quickly they can process the data. While this seems simple, it can be difficult to calculate once we start using cloud-based compute servers.
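
Jitter can be quantified in several ways; a minimal sketch, assuming you already have a series of delay samples and using purely hypothetical values, is the mean absolute difference between consecutive samples:

```python
def mean_jitter(delays_ms):
    """Mean absolute difference between consecutive delay samples (ms)."""
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    return sum(diffs) / len(diffs)

# Hypothetical delay samples: a steady path vs. one with queuing spikes
print(mean_jitter([20.1, 20.3, 19.9, 20.2, 20.0]))   # low jitter (~0.3 ms)
print(mean_jitter([20.1, 45.7, 19.9, 60.2, 20.0]))   # high jitter (~33 ms)
```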


Finally, traceroute is a popular command that examines the individual hops, or network routers, that a packet traverses to get from one place to another. Each hop can introduce more or less latency. The path with the fewest and quickest hops may or may not correspond to what we would commonly think of as geographically the shortest link: for example, the lowest-latency, fastest path between a computer in Singapore and one in Sydney, Australia might go through San Francisco.
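
If you want the hop count itself rather than the full per-hop timings, it can be pulled out of traceroute’s output programmatically. This is a rough sketch that assumes a Unix-like system with the traceroute utility installed and an example host name; it is not a replacement for reading the per-hop latencies.

```python
import subprocess

def hop_count(host: str) -> int:
    """Rough hop count: traceroute prints one numbered line per hop."""
    out = subprocess.run(
        ["traceroute", "-n", host],   # -n skips reverse DNS lookups
        capture_output=True, text=True,
    ).stdout
    return sum(1 for line in out.splitlines() if line.strip()[:1].isdigit())

print(hop_count("example.com"))
```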


With the rise of Big Data applications built using tools such as Hadoop, the nature of applications has changed and become a lot more distributed. These applications employ tens or even thousands of compute servers that may be located all over the world, each with a different degree of latency on its internet connection. And depending on when these applications are running, the latencies can be better or worse as other internet traffic waxes and wanes, competing for the same infrastructure and bandwidth.


Virtualisation adds another layer of complexity
The virtualised network infrastructure introduces yet another layer of complexity, since it can add its own series of packet delays before any data even leaves the rack. Many corporations have begun deploying virtualised desktops, which brings yet another source of latency: if not designed properly, users experience significant delays just logging into the network, let alone running applications on these desktops.
It isn’t just poor latency in the cloud but the unpredictable nature of the various network connections between on-premises applications and cloud providers that can cause problems. What is needed is some way to reduce these daily or even minute-by-minute variations so you can have a better handle on what to expect.
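
One simple way to get that better handle is to look at the spread of repeated measurements rather than any single figure: the gap between the median and a high percentile is a first proxy for how unpredictable a path is. The sketch below uses purely hypothetical RTT samples gathered over a day and standard-library statistics functions.

```python
import statistics

# Hypothetical RTT samples (ms) collected over a day to the same cloud endpoint
rtts = [21.4, 22.0, 20.9, 58.3, 21.7, 95.1, 21.2, 22.4, 61.8, 21.0, 20.8, 22.1]

median = statistics.median(rtts)
p95 = statistics.quantiles(rtts, n=20)[-1]   # crude 95th-percentile estimate
print(f"median {median:.1f} ms, p95 {p95:.1f} ms, spread {p95 - median:.1f} ms")
```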


Clearly, the best available option is to connect directly to a public cloud platform. Direct connect options from providers such as Amazon and Windows Azure make it possible to build a hybrid solution that uses both on-premises and cloud-based resources. Application code and data can be stored in an appropriate on-premises location according to regulations, privacy concerns and a measurement of acceptable risk, while solution components requiring the features and pricing model of cloud computing can be migrated to the cloud.


Conclusion
Relying on the internet for application connectivity in the cloud introduces a degree of variability and uncertainty around bandwidth, speed and latency. This can be unacceptable to many large and medium-sized enterprises, which are increasingly putting the emphasis on end-to-end quality of service management. Using dedicated connectivity to cloud providers overcomes this, and hooking up via carrier-neutral data centres and connectivity providers can also have benefits in terms of cost, monitoring, troubleshooting and support.


As you can see, cloud latency isn’t just about running traceroute and reducing router hops. It has several dimensions and complicating factors. Latency in itself does not have to be an issue; it’s the unpredictability of latency that really causes the problems. Hopefully we have given you some food for thought, along with some direction for exploring the specific issues involved in measuring and reducing latency for your own cloud applications, and some ideas on how to better architect your applications and networks.