Using Auto-scaling to maximise resource optimisation and reduce public cloud costs

By Terry Storrar, Managing Director at Leaseweb UK.

Posted on Sunday, 2nd March 2025 by Phil Alsop

Today’s fast-paced digital businesses depend on scalable and flexible public cloud infrastructures to adapt fast to changing demands. However, managing these cloud resources effectively is crucial for organisations that need to maximise the benefits of cloud computing and minimise poor workload performance or unnecessary usage.

When it comes to cloud management, organisations have a number of options available to them. One such option is auto-scaling, an automated approach to increasing or decreasing public cloud resources with no need for manual intervention.

With auto-scaling, organisations can dynamically adjust to real-time demand and assure the high availability and performance of their applications. Meanwhile, organisations can achieve significant cloud cost savings by only paying for the resources they actually use.

Auto-scaling: a brief overview

Before the introduction of auto-scaling, managing cloud workloads was a complex proposition that involved the manual addition or removal of resources. Since predicting demand change is difficult, this often led to errors that resulted in service outages or poor cost management due to the overprovisioning of resources.

Auto-scaling addresses these issues by automatically adjusting workloads based on fluctuating demand, a move that both assures optimal performance and availability and minimises the risk of cloud costs ballooning. Let’s take a look at the key advantages to be gained by implementing auto-scaling for public cloud.

Firstly, the ability to auto-scale resources up or down according to demand and workload enables organisations to avoid instances of over- or under-utilisation. Essentially, this gives organisations greater cost management capabilities, as they are able to avoid charges for unused capacity.

Auto-scaling also enables organisations to avoid bottlenecks or latency issues and ensure their cloud application workloads and services perform optimally. This enhanced reliability ensures that applications are always available and responsive, even during sudden real-time surges in demand or workloads. This consistent performance is essential for delivering a seamless and responsive end-user experience.

Scaling up or scaling out?

Organisations can take advantage of two auto-scaling methods: horizontal scaling (scaling out) or vertical scaling (scaling up). Compared to vertical scaling, horizontal scaling is a faster method, but not every application or workload can be scaled horizontally. For example, workloads or applications that have strong data dependencies – such as highly coupled systems, relational databases with complex join operations and applications with large, shared data stores – are not ideal candidates for horizontal auto-scaling.

With horizontal scaling, the number of instances in the public cloud participating in each workload is increased or decreased without creating downtime. Typically, auto-scaling will be configured to respond to specific events such as demand spikes or new feature launches, or to set metric thresholds created to ensure zero impact on performance. Auto-scaling can also be configured for a predetermined schedule, something that is particularly valuable for companies or services with predictable seasonal demands, as capacity can be proactively scaled based on anticipated needs.
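As a rough illustration of schedule-based scaling, the sketch below returns a target instance count for a given time of day. It is a minimal example, not any provider's API: the `ScheduledWindow` type, the `scheduled_capacity` function and the 08:00 to 20:00 window with its capacities are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class ScheduledWindow:
    """A daily time window with a target capacity (hypothetical type)."""
    start: time
    end: time
    capacity: int


def scheduled_capacity(now: datetime, windows: list[ScheduledWindow], baseline: int) -> int:
    """Return the capacity of the first window containing `now`, else the baseline."""
    current = now.time()
    for window in windows:
        if window.start <= current < window.end:
            return window.capacity
    return baseline


# Assumed schedule: 10 instances during the day, 2 otherwise.
windows = [ScheduledWindow(time(8, 0), time(20, 0), capacity=10)]

daytime = scheduled_capacity(datetime(2025, 3, 3, 9, 30), windows, baseline=2)
night = scheduled_capacity(datetime(2025, 3, 3, 23, 0), windows, baseline=2)
print(daytime, night)
```

In practice a schedule like this would usually sit alongside metric-based rules, so that an unexpected spike outside the planned window is still handled.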

Similar to adding more lanes to a motorway to handle more traffic at rush hour, horizontal auto-scaling dynamically adds instances of a resource, such as servers, to handle fluctuating workloads. For instance, an auto-scaling group in the public cloud might be configured with a minimum and maximum number of instances, together with a scaling threshold of 60 percent CPU utilisation sustained for more than five minutes.

This setup allows for the automatic launch of additional instances each time the CPU threshold is reached, up to the predefined maximum. Additional instances are typically registered with a load balancer, which distributes incoming traffic evenly across all instances. Conversely, when utilisation stays below the threshold for more than five minutes, an instance is removed and the load balancer stops directing traffic to it. All of this enhances the application’s availability.
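The threshold behaviour described above can be sketched as a small decision function. This is a simplified model under stated assumptions rather than a real cloud provider's implementation: `ScalingPolicy` and `next_instance_count` are hypothetical names, the minimum and maximum of 2 and 10 are invented for the example, and "sustained for five minutes" is approximated as five consecutive one-minute CPU samples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical policy mirroring the example: 60% CPU over five samples."""
    min_instances: int = 2        # assumed lower bound
    max_instances: int = 10       # assumed upper bound
    cpu_threshold: float = 60.0   # percent CPU utilisation
    sustain_samples: int = 5      # e.g. five one-minute samples


def next_instance_count(current: int, cpu_samples: list[float], policy: ScalingPolicy) -> int:
    """Decide the next instance count from the most recent CPU samples."""
    recent = cpu_samples[-policy.sustain_samples:]
    if len(recent) < policy.sustain_samples:
        return current  # not enough data yet to call the load "sustained"
    if all(sample > policy.cpu_threshold for sample in recent):
        # Sustained high load: add one instance, capped at the maximum.
        return min(current + 1, policy.max_instances)
    if all(sample < policy.cpu_threshold for sample in recent):
        # Sustained low load: remove one instance, never below the minimum.
        return max(current - 1, policy.min_instances)
    return current  # mixed samples: leave the group unchanged


policy = ScalingPolicy()
print(next_instance_count(3, [72, 75, 68, 80, 71], policy))  # scales out
print(next_instance_count(3, [35, 40, 30, 42, 38], policy))  # scales in
```

Many production setups use a lower threshold for scaling in than for scaling out, so that the group does not oscillate around a single value; the single threshold here simply follows the example in the text.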

Less commonly used than horizontal auto-scaling, vertical automated scaling adjusts the compute capabilities of an instance by increasing or decreasing resources such as memory or CPU power. For example, an instance provisioned with 4 vCPUs and 16 GiB of memory could be upgraded to 64 vCPUs and 64 GiB of memory on request. Typical use cases for vertical scaling include allocating more resources to virtual machines as application demands grow, or boosting CPU or RAM resources for a relational database as the number of transactions or amount of data grows.
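One way to picture vertical scaling is as choosing the smallest instance size that satisfies the current CPU and memory requirement. The sketch below assumes a hypothetical three-step size ladder; only the smallest and largest entries reuse the figures from the example above, while the size names and the middle step are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceSize:
    """A hypothetical instance size; names and the middle tier are assumptions."""
    name: str
    vcpus: int
    memory_gib: int


SIZES = [
    InstanceSize("small", vcpus=4, memory_gib=16),
    InstanceSize("medium", vcpus=16, memory_gib=32),
    InstanceSize("large", vcpus=64, memory_gib=64),
]


def pick_size(required_vcpus: int, required_memory_gib: int,
              sizes: list[InstanceSize] = SIZES) -> Optional[InstanceSize]:
    """Return the smallest size meeting both requirements, or None if none does."""
    for size in sorted(sizes, key=lambda s: (s.vcpus, s.memory_gib)):
        if size.vcpus >= required_vcpus and size.memory_gib >= required_memory_gib:
            return size
    return None


print(pick_size(4, 16))    # fits the smallest size
print(pick_size(32, 48))   # needs the largest size
print(pick_size(128, 64))  # exceeds the ladder, so None
```

The `None` case highlights a practical limit of scaling up: once the largest available size is reached, further growth has to come from scaling out instead.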

Auto-scaling in action

Auto-scaling is ideal for managing fluctuating demand on e-commerce sites: organisations can configure their online ordering and verification systems to scale up during the day and scale down at night. During seasonal peaks, such as Black Friday, auto-scaling can be used to ensure e-commerce systems perform optimally to meet demand while minimising unnecessary hosting costs.
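To make the cost effect of a day and night pattern concrete, the following back-of-the-envelope sketch compares instance-hours for a fleet fixed at peak size with one that scales down overnight. All figures (10 instances by day, 2 at night, a 12-hour split) are assumptions chosen for illustration, and real bills depend on pricing details not modelled here.

```python
def instance_hours(segments: list[tuple[int, int]]) -> int:
    """Sum instance-hours over (hours, instance_count) segments of one day."""
    return sum(hours * count for hours, count in segments)


# Assumed figures: peak-sized fleet all day versus scaling down at night.
fixed = instance_hours([(24, 10)])
scaled = instance_hours([(12, 10), (12, 2)])
saving = 1 - scaled / fixed

print(f"fixed: {fixed} instance-hours")
print(f"scaled: {scaled} instance-hours")
print(f"saving: {saving:.0%}")
```

Under these assumed numbers the scaled fleet uses 144 instance-hours a day instead of 240, a reduction of 40 percent.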

Similarly, auto-scaling is ideal for gaming or streaming services that need to be able to adapt fast to handle sudden surges in user activity. It can be of particular value during new game launches or live events when the pressure is on to dynamically manage anticipated demand surges.

Quick to set up, automated, and requiring no long-term commitments, auto-scaling is the ideal solution for organisations that need immediate, flexible and reliable extra capacity that can instantly adapt to shifts in demand. Because organisations pay only for what they use, auto-scaling also helps them keep their public cloud costs under control.
