Why is autoscaling so important?

by Aether Engine, Hadean Platform

Autoscaling is a crucial aspect of modern cloud deployments: it lets an application use resources in a more elastic and dynamic way by adjusting the number of computational resources allocated to it based on its load. More specifically, scaling is the process of adding or removing network, storage, and compute resources in response to workload demands, so that performance and availability are maintained as usage grows. The main advantage of autoscaling in the cloud is that a workload always gets the computational resources it requires. The demand for computational resources for a job-based workload in the cloud is normally defined by: 

  • The number of jobs in the server queue
  • The time jobs have waited in the server queue
  • The number of incoming requests
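These queue metrics can drive a simple scaling decision. A minimal sketch of the idea in Python (the function name, thresholds, and per-worker capacity here are illustrative assumptions, not part of any specific cloud API):

```python
# Illustrative queue-based scale-out decision. All thresholds are hypothetical.

def desired_workers(queue_length, max_wait_seconds, incoming_rate,
                    jobs_per_worker=10, max_acceptable_wait=30.0):
    """Estimate how many workers a job queue needs right now."""
    # Enough workers to drain the backlog at the target per-worker capacity.
    by_backlog = -(-queue_length // jobs_per_worker)  # ceiling division
    # If jobs have waited too long, add headroom proportional to the overshoot.
    if max_wait_seconds > max_acceptable_wait:
        by_backlog += int(max_wait_seconds / max_acceptable_wait)
    # Never scale below what the incoming request rate alone requires.
    by_rate = -(-incoming_rate // jobs_per_worker)
    return max(1, by_backlog, by_rate)

print(desired_workers(queue_length=45, max_wait_seconds=12.0, incoming_rate=20))
# 5: ceil(45/10) covers both the backlog and the incoming rate
```

A real autoscaler would smooth these signals over time to avoid thrashing, but the inputs are exactly the three metrics listed above.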

However, not all parallel computational work can easily be expressed as a job queue, so other autoscaling solutions are required. Aether Engine is specifically designed for that second type of workload: work that is not a simple queue, and that is continuous yet variable. All the major cloud providers, including AWS, Google Cloud, and Microsoft Azure, offer autoscaling capabilities (Auto Scaling groups, managed instance groups, and Virtual Machine Scale Sets respectively). The advantages of an autoscaling system over a statically sized cloud deployment are:

  • Responsiveness: being policy-driven and automatic, autoscaling is activated only when needed, providing a much more efficient approach compared to a slower manual method, where the user gets a notification when the service is failing and needs to log in and spin up more servers.
  • Lower cost: autoscaling gives businesses a way to reduce cloud costs, since organisations consume additional resources only when strictly necessary.
  • Service availability: during a traffic spike, autoscaling keeps the service available, preventing the situation where cloud services cannot handle requests and customers are left without service.
  • Reliable performance levels: thanks to autoscaling policies, developers are able to define and achieve the desired performance levels and ensure they are maintained.
  • Enhanced fault tolerance: autoscaling constantly monitors the health of the cloud system, and can promptly replace faulty resources when needed.

For example, the Amazon EC2 Auto Scaling tool can distribute instances across Availability Zones. Availability Zones are groups of AWS data centers in different physical locations, used to provide greater reliability and scalability. EC2 Auto Scaling lets the user span an Auto Scaling group across several Availability Zones, so that when one becomes unavailable, autoscaling automatically launches replacement instances in an unaffected zone. When the unhealthy Availability Zone recovers, autoscaling redistributes the application instances uniformly across all the zones. Autoscaling is also widely used on container orchestration platforms such as Kubernetes, where it simplifies resource management that would otherwise require extensive human effort. A Kubernetes cluster can increase the number of nodes when demand for the service increases, and decrease them as demand falls.
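The uniform-spread behaviour described above can be sketched as a small simulation. This is an illustration of the policy only, with made-up zone names; real Auto Scaling groups implement the rebalancing internally:

```python
# Illustrative sketch of spreading a fixed number of instances uniformly
# across the currently healthy availability zones, as an Auto Scaling
# group does when a zone fails and later recovers.

def spread_instances(total, healthy_zones):
    """Distribute `total` instances as evenly as possible over the zones."""
    base, extra = divmod(total, len(healthy_zones))
    return {zone: base + (1 if i < extra else 0)
            for i, zone in enumerate(sorted(healthy_zones))}

zones = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
print(spread_instances(6, zones))      # all zones healthy: 2 instances each
print(spread_instances(6, zones[:2]))  # one zone down: 3 each in the others
```

When the third zone comes back, calling the function with all three zones again yields the original even spread, which is the rebalancing step the paragraph describes.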

What is dynamic autoscaling?

As the virtual three-dimensional space grows, an Aether Engine application needs to be able to split the space accordingly and distribute resources where the application needs them most. Sometimes the available resources won’t be enough, which is when Aether Engine can take advantage of dynamic autoscaling. 

Dynamic autoscaling is a special type of autoscaling. Typically, autoscaling assumes that the workload on a node will increase only if new work is added to it, usually in the form of incoming requests. For example, when the load on the existing servers goes over a threshold, web services create another server to reduce the load, and incoming requests are distributed over n+1 servers instead of the previous n. In this case, the workload already underway is not redistributed; only new incoming traffic is sent to the server with the lightest load. This assumes a steady stream of incoming requests that can be redistributed as they come in.
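That conventional request-driven pattern can be sketched as follows. The server model, capacity, and threshold values are all hypothetical, chosen only to make the behaviour visible:

```python
# Illustrative sketch of conventional threshold-based autoscaling: work
# already underway stays where it is; only NEW requests are routed, to the
# lightest-loaded server, and a server is added when fleet load crosses
# a threshold.

class Fleet:
    def __init__(self, servers=1, scale_threshold=0.8, capacity=100):
        self.loads = [0] * servers   # in-flight requests per server
        self.threshold = scale_threshold
        self.capacity = capacity     # requests one server can hold

    def route(self, requests):
        for _ in range(requests):
            # Scale out when the fleet as a whole is too busy; existing
            # requests are NOT moved -- only new traffic sees the new server.
            if sum(self.loads) / (len(self.loads) * self.capacity) > self.threshold:
                self.loads.append(0)
            # Send the new request to the least-loaded server.
            lightest = self.loads.index(min(self.loads))
            self.loads[lightest] += 1

fleet = Fleet()
fleet.route(250)
print(len(fleet.loads), fleet.loads)
```

Note that once a server fills up, nothing drains it: the scheme only works because each request is short-lived and the stream of new requests can be steered elsewhere, which is exactly the assumption dynamic autoscaling drops.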

With dynamic autoscaling, long-running processes increase their workload over time, and the work already underway on those servers must be explicitly moved and reallocated. Hadean’s Aether Engine is an example of dynamic autoscaling: it can create new servers and relocate the existing workload onto them. The Hadean platform scales dynamically when spawning new processes and eliminates excessive middleware, orchestration, and over-engineering, enabling developers to build their applications quickly and cost-effectively. Unlike existing approaches, Hadean’s system does not require the resources to lie on a single server; instead it provides developers with abstractions over the entire cloud. Hadean’s dynamic scalability ensures that problems such as under- and over-provisioning are avoided and that developers can build, ship, and scale their applications with confidence. Hadean implements this scalability primarily in Aether Engine, providing the power to create games of immense complexity and detail, or to run virtual events with a vast number of connections and unprecedented performance.
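The contrast with the previous sketch can be illustrated by a version in which existing work is explicitly repartitioned when a new server appears. The even-split partitioning and capacity figure here are hypothetical simplifications; Aether Engine’s actual spatial partitioning is far more sophisticated:

```python
# Illustrative sketch of dynamic autoscaling: when a long-running workload
# outgrows its servers, a new server is added and the EXISTING work is
# repartitioned across the whole fleet -- not just new incoming traffic.

def rebalance(partitions, capacity=100):
    """Split the total in-flight workload evenly over enough servers."""
    total = sum(partitions)
    # Grow the fleet until each server's share fits within its capacity.
    servers = max(len(partitions), -(-total // capacity))  # ceiling division
    base, extra = divmod(total, servers)
    # Existing workload is moved: every server ends up with an even share.
    return [base + (1 if i < extra else 0) for i in range(servers)]

# Two servers whose simulation regions have grown past capacity:
print(rebalance([130, 90]))   # -> [74, 73, 73]: a third server is added
```

The key difference from the threshold sketch above is the relocation step: the 130-unit workload does not stay on its original server, part of it is migrated to the newly created one.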