Unreal Engine - The magic of Network Managers (Networking optimization)
In this post, we’ll be touching on Network Managers and exploring how we can use them to minimise the inevitable server load that comes with cases requiring the replication of hundreds of actors across our World.
What is a Network Manager?
A Network Manager is a replicated Actor that handles the data replication of a given set of Actors, negating the need to replicate the Actors themselves.
Each Network Manager tracks a set of Actors in a list (on the order of hundreds) and sends an event to the appropriate Actor when a variable requires replicating.
While Network Managers handle data replication, we should note that Actors which require RPCs to work still need to be replicated. In these cases, it’s recommended to reduce the NetUpdateFrequency of the tracked Actors, as variable replication is handled by the manager.
Why are Network Managers useful?
Quoting my dear friend Zlo#1654 from Slackers:
“Replicating several hundred moving Actors is doable, however, replicating several hundred moving Actors while you have several thousand non moving replicated Actors cluttering the NetBroadcastTick is not.”
NetBroadcastTick is a Stat that measures performance data from the BroadcastTickFlush function, which is part of UWorld::Tick. This function is in charge of flushing networking and ticking our net drivers (UNetDriver::TickFlush). In essence, it is responsible for managing the replication of all of our replicated Actors.
This means that the more replicated Actors we have, the more processing time BroadcastTickFlush will need. Thus, the main objective in our quest to reduce server CPU processing time is to keep the values reported by NetBroadcastTick as low as possible.
CPU timing issues derived from bad NetBroadcastTick metrics are far more common than bandwidth issues in our Server instances.
Thus, the main use of Network Managers is to reduce the number of replicated Actors by turning off their replication and passing that responsibility on to a small set of Network Managers, a process that yields a direct NetBroadcastTick optimization.
We could argue that this same effect can be achieved by employing Dormancy and a reduced NetUpdateFrequency. However, this approach can result in less network-responsive Actors and complicated relevancy scenarios. To combat this, it would become necessary to boost the NetPriority and the NetUpdateFrequency of all your replicated Actors, a disastrous change for any multiplayer use case.
This can also be mitigated with the use of Network Managers. Because there are only a few Network Managers in our world, we can give them a higher NetPriority (e.g. 2.8) without causing bandwidth issues, which translates to an increase in the responsiveness of the tracked Actors.
Performance implications
This section showcases metrics from replicating 976 Actors with and without a Network Manager. The test was conducted with two players: a listen server host running a -nullrhi instance, and a -nullrhi client. The metrics were recorded from the server instance using Unreal Insights, and all the experiments were performed on the same hardware under equivalent conditions.
The first experiment consists of replicating 976 Actors with two replicated variables. Each Actor runs at a NetUpdateFrequency of 25 and has a NetPriority of 1:
In the second experiment, 720 of those 976 Actors use the Push Model with a reduced NetUpdateFrequency of 1 (leaving the remaining 256 Actors with default replication):
In the third experiment, we delegated the handling of 720 of those 976 Actors to a Network Manager running with a NetUpdateFrequency of 100 (leaving the remaining 256 Actors with default replication and a NetUpdateFrequency of 25):
As we can see, the performance metrics in the third experiment look more favorable, because we considerably reduce the number of replicated Actors by moving their data replication to a Network Manager (while keeping the same number of Actors in the Level). The Push Model experiment provides a speedup over the baseline setup, but the Network Manager is the clear winner for this use case.
Actors that need RPCs to work will still need to be replicated.
Implementation
The Network Manager is an AActor that uses one or more FFastArraySerializers to hold the data of all its tracked Actors.
In this Section, we’ll implement a simple Network Manager that handles the replication of a health component we designed for active Actors around the world.
Without the Network Manager, all the Actors with the health component would need to be replicated alongside the component.
This article focuses on Actors that were already placed in the level, so it is not necessary to replicate the Actors themselves. Future posts linked in this article will expand the concept to Actors spawned at runtime.
FFastArraySerializer
Each FFastArraySerializer in our manager handles one type of Actor. In this example, our Network Manager handles the replication of health components:
The FHealthCompItem corresponds to a health component from an Actor and contains a reference to its owner and the data to replicate. In this example the data is contained in a TArray<uint8> Data to demonstrate encoding and decoding of generic data, but it is possible (and also recommended in gameplay-intensive instances) to have it in independent properties to ease debugging.
PostReplicatedChange calls a custom function in the health component called PostReplication, which is responsible for receiving and decoding the data on the client’s end.
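To make the relationship between the item and its owner concrete, here is a minimal engine-free sketch in plain C++. It is not the article's original listing: HealthComponentModel, HealthCompItemModel and NotifyChangedItems are hypothetical stand-ins, and in engine code the item would derive from FFastArraySerializerItem and the change callback would be invoked by the fast array itself rather than by a helper like the one below.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the health component; not an engine type.
struct HealthComponentModel {
    std::vector<std::uint8_t> LastPayload;  // last payload received on the "client"
    int ReplicationCount = 0;               // how many times PostReplication ran

    // Mirrors the PostReplication function described above: receives the raw payload.
    void PostReplication(const std::vector<std::uint8_t>& Payload) {
        LastPayload = Payload;
        ++ReplicationCount;
    }
};

// Hypothetical stand-in for FHealthCompItem: a reference to the owner plus its data.
struct HealthCompItemModel {
    HealthComponentModel* Owner;
    std::vector<std::uint8_t> Data;

    HealthCompItemModel(HealthComponentModel* InOwner, std::vector<std::uint8_t> InData)
        : Owner(InOwner), Data(std::move(InData)) {}

    // Mirrors PostReplicatedChange: forward the new data to the owning component.
    void PostReplicatedChange() const {
        if (Owner != nullptr) {
            Owner->PostReplication(Data);
        }
    }
};

// Models the client after a delta arrives: only the changed entries are notified.
inline void NotifyChangedItems(const std::vector<HealthCompItemModel>& Items,
                               const std::vector<std::size_t>& ChangedIndices) {
    for (std::size_t Index : ChangedIndices) {
        assert(Index < Items.size());
        Items[Index].PostReplicatedChange();
    }
}
```

The point of the sketch is that a component whose entry did not change is never touched, which is the property that makes a fast array attractive for large sets of mostly idle Actors.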
The Network Manager class
The Network Manager uses the fast array declared in the previous step:
As we’ve seen above, the FHealthCompItem contains a copy constructor to ease registering/adding members in the FHealthCompContainer:
Network Managers can work with the relevancy system, but it’s not very convenient because we only have a few of them, so in this case I decided to make them always relevant. The following two functions describe the core functionalities we require in our Network Managers:
RegisterActor: Adds an element to our fast array by constructing an FHealthCompItem.
UpdateActor: Updates the data of an element in our fast array. For that, we find the owner Actor within the array and update the data on the found entry.
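The two functions above can be sketched without the engine as follows. This is an illustrative model under stated assumptions, not the author's implementation: NetworkManagerModel, TrackedItem, OwnerId and FlushDirty are hypothetical names, the owner reference is reduced to an integer id, and the bDirty flag only imitates what marking a fast array item dirty (MarkItemDirty in engine code) achieves.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical identifier standing in for the owner reference stored in each item.
using OwnerId = std::uint32_t;

struct TrackedItem {
    OwnerId Owner;
    std::vector<std::uint8_t> Data;
    bool bDirty;  // imitates a dirty mark: only dirty items would be sent
};

struct NetworkManagerModel {
    std::vector<TrackedItem> Items;

    // Linear search for the owner's entry, as described in the text above.
    TrackedItem* Find(OwnerId Owner) {
        auto It = std::find_if(Items.begin(), Items.end(),
                               [Owner](const TrackedItem& Item) { return Item.Owner == Owner; });
        return It == Items.end() ? nullptr : &*It;
    }

    // RegisterActor: construct an item for the owner and add it to the array.
    void RegisterActor(OwnerId Owner, std::vector<std::uint8_t> InitialData) {
        assert(Find(Owner) == nullptr && "owner registered twice");
        Items.push_back(TrackedItem{Owner, std::move(InitialData), true});
    }

    // UpdateActor: find the owner's entry and replace its data.
    // Returns false when the owner was never registered.
    bool UpdateActor(OwnerId Owner, std::vector<std::uint8_t> NewData) {
        TrackedItem* Item = Find(Owner);
        if (Item == nullptr) {
            return false;
        }
        Item->Data = std::move(NewData);
        Item->bDirty = true;
        return true;
    }

    // Models one replication pass: counts the dirty items and clears their flags.
    std::size_t FlushDirty() {
        std::size_t Count = 0;
        for (TrackedItem& Item : Items) {
            if (Item.bDirty) {
                ++Count;
                Item.bDirty = false;
            }
        }
        return Count;
    }
};
```

With hundreds of tracked Actors the linear search is usually acceptable, but if updates become frequent, a map from owner to array index would be a reasonable alternative.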
The health component
Our health component gets registered in BeginPlay:
And communicates with the Network Manager through events:
The PostReplication function handles the client data:
As we can see, Decode and Encode are being called as our Network Manager handles generic data in the form of a byte array:
These two functions have the following responsibilities:
Encode (Server): Writes the values stored in Health and Shield into a TArray<uint8> using FMemoryWriter. These values are then passed to the FHealthCompContainer to update the fast array values.
Decode (Client): Once the FHealthCompContainer replicates, PostReplicatedChange is called in the fast array, which calls PostReplication. PostReplication receives the payload and unpacks it back into the two variables using FMemoryReader.
And with that, Health and Shield are set appropriately in the client.
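The Encode and Decode round trip can be illustrated with a small engine-free sketch. It assumes that Health and Shield are both floats and swaps FMemoryWriter and FMemoryReader for plain std::memcpy into a std::vector<std::uint8_t>; HealthState is a hypothetical helper type. Unlike the engine archives, this version simply copies bytes in host order, so treat it as a model of the idea rather than a drop-in replacement.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical container for the two values the health component replicates.
struct HealthState {
    float Health;
    float Shield;
};

// Encode (server side): write Health, then Shield, as raw bytes.
inline std::vector<std::uint8_t> Encode(const HealthState& State) {
    std::vector<std::uint8_t> Payload(2 * sizeof(float));
    std::memcpy(Payload.data(), &State.Health, sizeof(float));
    std::memcpy(Payload.data() + sizeof(float), &State.Shield, sizeof(float));
    return Payload;
}

// Decode (client side): read the fields back in the same order they were written.
inline HealthState Decode(const std::vector<std::uint8_t>& Payload) {
    assert(Payload.size() == 2 * sizeof(float));
    HealthState State;
    std::memcpy(&State.Health, Payload.data(), sizeof(float));
    std::memcpy(&State.Shield, Payload.data() + sizeof(float), sizeof(float));
    return State;
}
```

The only contract between the two sides is the field order and size, which is why keeping the data in independent properties, as suggested earlier, is easier to debug than an opaque byte array.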
Conclusion
With just a few lines of code, we have reduced the CPU time the server spends processing hundreds of replicated Actors by routing their data through a Network Manager.
Network Managers are a great solution in cases where we have to handle lots of actors of the same type, both for performance and responsiveness.
Before I finish, I would like to thank Jambax and Zlo for introducing and explaining these concepts in the Unreal Slackers discord, which motivated me to do my own research and create this post.
Enjoy, vori.