How Aether Engine helps developers overcome rubber-banding
What rubber-banding is, why it’s a problem and how Aether Engine can fix it
Those of you who are familiar with online gaming, either as a player or a developer, will also be familiar with a problem known as ‘rubber-banding’. This is the name given to the experience when your character or vehicle is travelling along a path, only to be pulled backwards for no apparent reason.
Rubber-banding is a tricky problem for multiplayer game developers to solve, and it’s something that we believe we can help them avoid with Aether Engine. But first, an explanation of what’s happening under the surface when rubber-banding takes place.
How multiplayer online games work
Modern online games are made up of lots of interconnected pieces, but can be thought of as having a client that the player directly interacts with — whether that’s via a download from a platform like Steam, a disc inserted into a console, or something loaded into a browser — and a server that talks to all of the clients.
This client–server architecture is incredibly common in all aspects of modern computing and has been around for more than 50 years. It’s how most of the internet works. In the (simplified) case of multiplayer online games, the server’s job is to keep track of the state of the overall game and settle arguments with the clients where they disagree. The client’s job is to show the player what’s going on in the game and take input from the player ⬆️⬆️⬇️⬇️⬅️➡️⬅️➡️🔠 to send back to the server.
When client and server disagree
The client and the server are always going to be some distance from each other, possibly in different countries, and the connection between them isn’t always going to be reliable or as fast as everyone would like. That can mean that messages between the two don’t always arrive in time and the client and server end up disagreeing on what has just happened.
In the majority of cases when this happens, the server will win the argument, usually as a safety net against cheating. When the server and client can’t agree where a player is located, and the server wins, the player is returned to the position that the server says they should be at, which is usually ‘behind’ where the client thinks they are.
This results in rubber-banding.
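To make the server-wins rule concrete, here is a minimal Python sketch. It is an illustration rather than code from any real game or from Aether Engine: the `ClientView` class, the `apply_server_update` function, the one-dimensional track and the 0.5 m tolerance are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ClientView:
    """What the player's client currently believes (hypothetical structure)."""
    position: float  # metres along a straight track


def apply_server_update(view: ClientView, server_position: float,
                        tolerance: float = 0.5) -> float:
    """Snap the client to the server's position if they disagree too much.

    Returns how far the player was ahead of the server before the
    correction (0.0 if the two were close enough to leave alone).
    """
    error = view.position - server_position
    if abs(error) <= tolerance:
        return 0.0
    view.position = server_position  # the server wins the argument
    return error


# The client has predicted ahead to 12.0 m, but the inputs that reached
# the server only take the player to 9.0 m.
view = ClientView(position=12.0)
pulled_back_by = apply_server_update(view, server_position=9.0)
print(view.position, pulled_back_by)  # 9.0 3.0
```

The player sees their character yanked back by three metres in a single frame, which is the rubber-band effect. Real clients usually smooth the correction over several frames, but the underlying disagreement is the same.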
What happens when your game gets really big
There’s another increasingly common case where this happens, even when the player has a great connection to the server, and this is where the problem becomes much harder to fix.
As modern games become bigger and far more complex, the server has had to become servers. A common way to split up a game world is to have each server look after an area or volume of its own. Players move through these game areas, and as they cross a boundary the old server hands over control and the player’s details to the new server, which takes a little bit of time.
In the worst case scenario, you end up with a situation where a player unknowingly crosses the boundary of multiple areas and thus multiple servers, in a very short space of time, and the servers can’t hand over details quickly enough to prevent rubber-banding taking place.
As the ship moves around, server A hands its details to server B, which hands it to C and then to D. And this is only in 2 dimensions…
In our previous case the server was always authoritative over the client, but in this case, which server should have the final say? There’s no obvious answer, so you end up having to write complex, fragile logic to cope with various edge cases, and when that fails: PIIIIIIIING! So you write some more logic for the cases you missed.
We built a demo called Aether Wars
In March 2019 we released a technology demo using the Hadean platform called EVE: Aether Wars, in collaboration with CCP Games, creators of EVE Online. It was a very ambitious thing for us to do, as we are an engineering company with no gaming experience, but we wanted to demonstrate what our technology was capable of.
We managed to create a demo that could support over 14,000 players with millions of entities running in real-time. To support that many entities we ran the simulation on Microsoft Azure, using five 64-core virtual machines. With a simulation of that complexity there was a lot of movement of entities between cores and machines, and we needed to ensure that the behaviour of those entities was predictable and obvious for the players. There would be little point scaling a demo to that sort of size if we couldn’t prevent odd behaviour.
Reconciliation is hard
Any distributed simulation engine, such as an MMO game engine, must ultimately split its calculations over multiple processes; Aether Engine chooses to do this spatially. However, a process cannot be limited to knowing only about entities in the space it is computing: imagine if you couldn’t see or interact with any entities outside of your current area or volume! Engines differ in how they handle this.
If you don’t have much control over how the simulation is happening in your processes, there’s only one thing you can do: perform overlapping parts of the simulation independently on each of the processes, and when you need to decide on a canonical state of an entity, perform a ‘reconciliation’ process that takes information about each state and decides an authoritative state.
A disadvantage of this approach is that reconciliation is really hard. As each entity is being run independently in the different processes, it has access to different information (after all, the point of having separate processes is for each one to be able to work in parallel on local information) and so the entity’s state may evolve differently in each.
Even if you were to run the same simulation on multiple processes, simulations typically make assumptions about floating-point arithmetic that only hold up to a certain margin of error. As these errors compound, substantial differences in the final state can be introduced even if the initial state is identical in both processes!
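A small Python example shows how this can happen. Floating-point addition is not associative, so two processes that accumulate the same values in a different order can end up with different totals. The `accumulate` function and the idea that each process visits the forces in a different order are assumptions made for illustration.

```python
def accumulate(forces):
    """Sum the forces in the order given; the order is the only variable."""
    total = 0.0
    for force in forces:
        total += force
    return total


forces = [0.1, 0.2, 0.3]

# Two hypothetical processes receive the same forces in a different order.
on_process_a = accumulate(forces)                   # (0.1 + 0.2) + 0.3
on_process_b = accumulate(list(reversed(forces)))   # (0.3 + 0.2) + 0.1

print(on_process_a == on_process_b)  # False
print(abs(on_process_a - on_process_b) < 1e-12)  # True: tiny, but not zero
```

A single discrepancy like this is invisible to a player. The trouble comes when the result feeds into the next tick, and the one after that, so the two copies of the entity drift apart a little more on each step.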
If states are sufficiently different between two processes, and reconciliation causes authority over that entity to be transferred from one to the other, it becomes visible to the user as ‘rubber-banding’, where the entity jumps suddenly from its state on one process to its state on the other.
Here’s how Aether Engine deals with rubber-banding
For this reason and others, Aether Engine currently doesn’t perform a reconciliation step. Information about entities in neighbouring regions is inert, represented by ‘ghost entities’, and used only as information for local calculations. As a consequence of this, there is clear ownership of each entity: an entity belongs only to one particular region, and is simulated only by that region’s worker, which pushes the updates to nearby cells synchronously.
Aether Engine is sufficiently frugal with communication that this does not slow down the tick rate of the overall simulation. How many neighbouring regions that information is pushed to is determined by the size and shape of the entity’s axis-aligned bounding box (AABB). Once the AABB crosses over into another region, that region will have information about the entity pushed to it by the controlling region and will create a ‘ghost entity’ as a local representation of the remote entity.
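As a rough sketch of the idea, the Python below works out which regions an AABB touches and therefore which neighbours would need a ghost entity. It is not Aether Engine’s implementation: the uniform square grid, the `cell_size` parameter, and the names `AABB`, `overlapped_cells` and `ghost_targets` are all assumptions made for the example.

```python
import math
from typing import NamedTuple, Set, Tuple

Cell = Tuple[int, int]


class AABB(NamedTuple):
    """Axis-aligned bounding box in 2D world coordinates."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def overlapped_cells(box: AABB, cell_size: float) -> Set[Cell]:
    """Return every grid cell the box touches (assumes a uniform square grid)."""
    x0 = math.floor(box.min_x / cell_size)
    x1 = math.floor(box.max_x / cell_size)
    y0 = math.floor(box.min_y / cell_size)
    y1 = math.floor(box.max_y / cell_size)
    return {(x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)}


def ghost_targets(box: AABB, owner_cell: Cell, cell_size: float) -> Set[Cell]:
    """Cells other than the owner that should hold a ghost of this entity."""
    return overlapped_cells(box, cell_size) - {owner_cell}


# A ship owned by cell (0, 0) whose box straddles the boundary at x = 100.
ship = AABB(min_x=95.0, min_y=40.0, max_x=105.0, max_y=60.0)
print(ghost_targets(ship, owner_cell=(0, 0), cell_size=100.0))  # {(1, 0)}
```

A small entity well inside its region produces an empty set, so nothing is sent at all, while a large entity near a corner may need ghosts in several neighbours.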
As a result, no reconciliation is ever needed: each entity has a single process that decides on its state, but may have many processes that know about its state. As the entity moves through space it may be handed over to a neighbouring process (one that manages a neighbouring region), which then takes ownership of it. In the new process the entity goes from being a ‘ghost’ to being real, and the process goes from receiving the state of the entity to sending that state to the relevant neighbours. In its original process the opposite happens: it goes from being real to being a ‘ghost entity’, and control is relinquished.
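The handover can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Aether Engine code: the `Region` class, its `owned` and `ghosts` dictionaries, and the `push_ghost`, `hand_over` and `owners` helpers are hypothetical names, and the two regions live in one process rather than exchanging messages.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Region:
    """One spatial region and the worker that simulates it (hypothetical)."""
    name: str
    owned: Dict[str, dict] = field(default_factory=dict)   # entities simulated here
    ghosts: Dict[str, dict] = field(default_factory=dict)  # inert copies of remote entities


def push_ghost(owner: Region, neighbour: Region, entity_id: str) -> None:
    """The owner sends the current state; the neighbour stores an inert copy."""
    neighbour.ghosts[entity_id] = dict(owner.owned[entity_id])


def hand_over(src: Region, dst: Region, entity_id: str) -> None:
    """Transfer ownership: the ghost becomes real in dst, the real one a ghost in src."""
    state = src.owned.pop(entity_id)
    dst.ghosts.pop(entity_id, None)
    dst.owned[entity_id] = state
    src.ghosts[entity_id] = dict(state)


def owners(regions: List[Region], entity_id: str) -> List[str]:
    """Names of the regions that currently own the entity."""
    return [r.name for r in regions if entity_id in r.owned]


a, b = Region("A"), Region("B")
a.owned["ship"] = {"x": 99.0}
push_ghost(a, b, "ship")    # B learns about the ship but does not simulate it
print(owners([a, b], "ship"))  # ['A']
hand_over(a, b, "ship")     # the ship crosses the boundary
print(owners([a, b], "ship"))  # ['B']
```

At every point exactly one region owns the ship, so there is never a second opinion about its state to reconcile.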