Porting Unity Networking to Photon

jashan ✭✭
edited January 2010 in DotNet
I'm currently considering moving the networking in my current Unity project from Unity's built-in networking over to Photon. It's a LightCycle-style action game (Traces of Illumination). The setup is basically multiple "game groups" of up to twenty players each, which can enter the actual game and play one level at a time. Different game groups can play in the same level simultaneously without interacting with each other; all of this runs on an authoritative server. I'm only using unbuffered RPCs (=> also no Network.Instantiate) and managing scope (who gets which messages) myself.

The way I'd like to use Photon is having the Photon-server in between multiple authoritative game servers and the clients, handling all "lobby-related stuff", doing some simple game-logic checks and also handling persistence (I currently have a database layer using MS SQL Server, implemented in the Unity game, which is not quite ideal due to Unity using Mono 1.2.5 which is *really* old - so I'd be very happy to move this to a "proper .NET 3.5 approach" ;-) ).

So basically, the Photon-server would let player-clients log in, send the clients any persistent data (player-data and what the player has achieved), get together in "game groups" and teams and then start a game session. When a game session starts, the Photon-server would pick a game server (the one with least load) and then have this authoritative game server get involved to handle major parts of the game logic, collision detection and the like. This would be implemented as a headless Unity standalone (as it already is, using Unity's built-in networking). From the perspective of the Photon-server, the "game servers" would be clients ... but "special clients", of course (authoritative ones ;-) ). They would be located in the same local area network as the Photon-server.

When the game server was selected and an actual game session started the Photon-server would then manage distributing the messages, as well as some simple game-logic checks (like, "can this player use this power-up right now" - anything I can easily implement without the Unity-backend will go there to avoid as many round-trips between Photon-server and game-server as possible). However, for player movement, turns, and collision detection, the Photon-server would "check back" with the Unity standalone game server.

Would that be a setup that works well with the way Photon is designed (Photon being the "layer" in between game servers and clients) or would I have to put the complete game logic that's relevant for the servers into Photon?

Also, what would be the steps to getting from Unity's RPCs (lots of them) to something I can implement using Photon? Would every RPC become one Operation? Or some RPCs become Operations, other Events? Would that mean one class per RPC (I currently have 115 different RPCs in my game)? Or is the RPC-based approach of Unity so different from the way Photon is designed that I'd have to do things completely differently?

For instance, one thing I'm relying on a lot is that communication is "from game objects to game objects", so I guess I'd have to build my own kind of "NetworkViewIDs" and the relevant lookup tables and lookup logic.
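
A minimal sketch of what such a home-grown "NetworkViewID" lookup table could look like. It is written in Python purely for brevity (the real thing would live in C#), and every name in it (`ViewRegistry`, `allocate_id`, `dispatch`, `Vehicle`) is hypothetical. It uses no Unity or Photon API; it only illustrates routing a message to an object by numeric ID.

```python
from typing import Callable, Dict, Tuple


class ViewRegistry:
    """Maps (view_id, message_name) to a handler on some game object."""

    def __init__(self) -> None:
        self._next_id = 1
        self._handlers: Dict[Tuple[int, str], Callable[..., object]] = {}

    def allocate_id(self) -> int:
        """Hand out a fresh ID; in a real setup only the server would do this."""
        view_id = self._next_id
        self._next_id += 1
        return view_id

    def register(self, view_id: int, message: str, handler: Callable[..., object]) -> None:
        self._handlers[(view_id, message)] = handler

    def unregister(self, view_id: int) -> None:
        """Remove every handler of an object, e.g. when it is destroyed."""
        for key in [k for k in self._handlers if k[0] == view_id]:
            del self._handlers[key]

    def dispatch(self, view_id: int, message: str, *args: object) -> object:
        """Route an incoming message; unknown targets are ignored (returns None)."""
        handler = self._handlers.get((view_id, message))
        if handler is None:
            return None
        return handler(*args)


class Vehicle:
    """Stand-in for a game object that used to receive RPCs."""

    def __init__(self) -> None:
        self.heading = 0

    def turn(self, degrees: int) -> int:
        self.heading = (self.heading + degrees) % 360
        return self.heading


registry = ViewRegistry()
vehicle = Vehicle()
vehicle_id = registry.allocate_id()
registry.register(vehicle_id, "turn", vehicle.turn)
registry.dispatch(vehicle_id, "turn", 90)
```

With something like this, an incoming message only needs to carry the target ID, the message name and the arguments, which is roughly what an RPC on a NetworkView carries today.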


  • I could use some more information from you about this setup, as I'm not too deep into Unity itself:
    When you run a Unity Headless Server, would Photon be required to communicate with that by a network protocol? Even though both might run in the same local network, it sounds like a lot of overhead (serialization, synchronization). Or is there a way around this?

    In principle Photon can run as Lobby / Master Server of course. That would mean that it just selects servers for clients and directs them there. Your clients should then directly talk to the in-game server.

    If you want to get Photon in as layer for persistency, you could use Photon from your in-game servers where necessary. This would minimize turnaround for in-game communication and your authoritative server could contact Photon to store stuff after the action itself has been authorized. Your server would be turned into a Photon client as well and the respective operations can be on behalf of the players (piping IDs through).

    About the RPC to Operation mapping:
    It's difficult to say how your RPC calls can be mapped without knowing more. If all hundred RPC functions are very different, that might mean they require a hundred operations. However, Photon supports optional parameters (which might combine a few functions into one) and sending any event can be done within one operation. The server could still check the content and react to each player's event, where needed.
  • The reason Photon and the Unity headless server would have to communicate via network is because they would be hosted on different physical machines. That way, the processing of the "Unity only game logic" can be distributed and I can simply add machines for the Unity game server when there's more players. At least as long as the Photon-server is not becoming the bottleneck.(*)

    I think a direct connection between clients and game-server would probably complicate matters for me (even though it would probably be desirable in order to avoid the overhead that having Photon as a "proxy" in between will create - but I really want to avoid having much of the "networking information distribution stuff" done on those game servers). That's why I'd move as much logic as possible from the Unity game server over to Photon. So, for most things Photon would be authoritative and just let the server know what happened, like it lets the clients know. Only for logic which is significantly easier to implement directly in Unity, I'd fall back to the Unity game server. I'm mainly thinking of collision detection and movement which involves things like vehicles riding around on moving platforms ... which I just don't feel like implementing from scratch. ;-)

    Of course, this is all done on the clients anyways - but I just can't trust those clients to do it "right" ;-)

    I think the greater part of the communication with the Unity game server would also be just one-way from the Unity game server to Photon to the clients: Most of the time, the game server would just send position updates to Photon which Photon then would distribute to the clients so that the main state is in sync for everyone (as the positions are strictly deterministic almost all of the time, I don't really need the positions from the clients except for comparatively rare events). Every now and then, collision events would happen on the game server and go the same route as the position updates (of course, the collision events happen on the clients as well - but the clients wait for confirmation from the server before letting those collision events have any significant consequences).

    The only thing I can think of right now that requires the full path in both directions (client -> Photon -> Unity game server -> Photon -> clients) is turns initiated by the players (left/right ... and some more directions with advanced vehicles ;-) ). Those, unfortunately, are a rather complex beast because the Unity game server needs to check if a turn was still valid when the message arrives at the server after the vehicle theoretically has already crashed into a wall. Another unfortunate thing about those turns is that players can choose to do many of them in a very short time (usually, they won't - but sometimes, they will).

    The good thing is that even in the case of "game server intervention", *most* can be settled between the clients and Photon (client sends "player turned" to Photon, Photon checks if this kind of turn is permitted and distributes the info to everyone else - including the server). Only if after the turn (when the info finally gets to the game server), the server discovers that the player doing the turn must have cheated, it'll let everyone know that this client exploded, using its full authoritative power ;-) BAM!!!

    I'm expecting to have to do a rather significant amount of refactoring - this is really where it becomes obvious how convenient Unity's networking is to use. It's nice that I can at least keep the logic that I need inside Photon in C# - but I'll obviously have to pull a lot of existing logic out of MonoBehaviours into "plain classes" that can be used both by Photon (for cheat-prevention and persistence), the clients (for immediate user-feedback) and the game server (for movement and collision "authority" ;-) ). For the RPCs I'll probably have to look at each and every single one and decide how to treat them, reducing the number by combining some of them sounds like a very good idea (and should work in many cases). Well ... in the end it'll probably be a much cleaner software-design, so I guess it'll only hurt because of the time this will consume. But if I end up with something that works reliably and scales well, I'll be all happy ;-)

    (*)Speaking of Photon becoming the bottleneck: I'm currently thinking of supporting up to 500 concurrent players, for which I would hope to be able to use 3 Unity game servers (so that would be 4 physical machines altogether ... plus db-server and Web-Server, of course ... but that's a different story ;-) ).

    I know this is very hard to say - but would you consider this realistic (purely from Photon's point of view)? I know that you support many more players than that - but keep in mind this is an action-oriented game, not a typical MMO.

    Some details I'd consider relevant to this question: Most game-logic checks that would be done in the Photon-server are just simple state-checks, nothing too complex, no AI or anything like that. Any action related persistence is just pushing certain events into the database (asynchronously, in a separate thread with a queue); actual database queries are quite rare.

    So I think the "stuff" where it gets interesting will be the position updates: My position updates consist of a Vector3 (= 3 floats) and a bool. They are triggered by the game server and then sent to each player in the given "game group" (as players in one game group could usually see all other players in the same game group when they use the right camera angle, there's no point in trying to optimize by visibility - it simply won't work for this game).

    I'd like to do 10 position updates per second - but 5 would still be okay (with too few updates/second, jumps tend to get jerky - but I could easily increase the update-rate for this specific case). With 500 concurrent players, in the worst case 20 positions of vehicles in the game group need to be distributed to 20 clients in the game group, with 25 game groups - with 10 updates/second that would be 100,000 position updates per second. That does sound like a lot to me. But of course, this is a theoretical case that would only happen when everyone on the server is in the largest possible game group and starts a game session at the same time ... nevertheless it would be nice to know that - at least in theory - Photon wouldn't go down in such a case (on "standard" server hardware as you get it with usual root-server hosting packages).

    In a more realistic case, I'd have game groups with around 10 players each, and usually not all players are "active" and producing position changes (but everyone needs to receive all position changes - players that have "crashed" and therefore are inactive can follow other players that are still playing). So in that case it would be more like maybe 5 positions to 10 clients with 50 game groups (but with 10 updates/second, that would still be 25,000 position updates per second). As the game design favors collaboration over competition, and some levels require that all players "survive" until the end, it may actually be more realistic to think of 10 positions to 10 clients (so we're back up at 50,000).

    In a very realistic case, a few players will be completely idle, some players will be just chatting, others will be joining / leaving game groups and teams, and others playing in different sizes of game groups, some players might even spend their time just following other players (so they need position updates but never create any of their own). But that case is very difficult to put into numbers ;-)
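
For the cases that can be put into numbers, the arithmetic above can be checked with a few lines (Python only for brevity; the function name is made up). The payload figure assumes 4-byte floats and a 1-byte bool, i.e. 13 bytes per position update, and deliberately ignores any protocol or UDP overhead, which is not known here.

```python
def fanout_per_second(positions_per_group: int, clients_per_group: int,
                      groups: int, updates_per_second: int) -> int:
    """Individual position updates the server has to deliver per second."""
    return positions_per_group * clients_per_group * groups * updates_per_second


# 500 players in 25 full groups, everyone active, 10 updates/second
worst_case = fanout_per_second(20, 20, 25, 10)

# 50 groups of 10 players with 5 active vehicles each
realistic = fanout_per_second(5, 10, 50, 10)

# 50 groups of 10 players, all of them still alive
cooperative = fanout_per_second(10, 10, 50, 10)

# Same worst case at the "still okay" rate of 5 updates/second
worst_case_slow = fanout_per_second(20, 20, 25, 5)

# Raw payload only: Vector3 (3 x 4-byte float) + bool (1 byte) = 13 bytes
PAYLOAD_BYTES = 3 * 4 + 1
worst_case_payload_bytes = worst_case * PAYLOAD_BYTES

print(worst_case, realistic, cooperative, worst_case_slow)
print(worst_case_payload_bytes, "payload bytes/second in the worst case")
```

So even the worst case is about 1.3 MB/s of raw position data; how much the per-message overhead adds on top depends on how the updates are packaged.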
  • So Unity would be connected by network and be used as a physics backend. Very interesting!

    If players are not extremely active, this will be a little less, true. But almost every communication would be doubled, so maybe you end up with 200k packets. At least the communication with the Unity servers will be almost instant and I hope the position updates replace former updates, so they can all be unreliable. Still, depending on the game (client side), there will be some effort for hiding the "correction" updates.
    There are some ways to minimize traffic, like having Unity combine player updates into a single event. Each "game group" could be represented as a "room" in Photon, handling all incoming operations serially (rooms are single threaded but run in parallel).

    Photon should well be able to handle those position updates and more (depending, of course, on the machines you use and their bandwidth).

    I wonder if there is a way to test this without too much effort. Have a proof of concept where we could see how the roundtrip times affect the setup and tweak response times and such.

    What do you think?

    Long post. Did I miss something??
  • Thank you for the quick response to the admittedly long posting ;-)

    Hehe, yeah - in this setup I'm thinking of right now, I kind of see the Unity game server as some sort of service that is plugged into the main architecture. I like the idea of having certain aspects of a game built into services that are running outside the actual server as this creates a lot of flexibility while at the same time simplifying certain things. The price is some extra communication overhead ... and for this, that current game is a really nice test-case because I don't think it can get any worse than having a "physics" backend for an action based game (if that works smoothly, anything else should be really easy ;-) ).

    I'm not sure why the communication load would be doubled, though: The idea is to have the position updates only come from the Unity game server. And as you mentioned, they can be unreliable without a problem (I just need to be sure that the order is correct - which I think you already support). So, "worst case scenario": The Unity game server, acting as client, sends 20 positions to the Photon server (my guess would be that the Photon client would combine those!? ... if not, that's probably something I could take care of); Photon server receives this and then distributes those updates to a maximum of 20 clients (that's the "bad stuff"). Okay: This would add "one passive client" per game group (I omitted that in my calculation - it's actually not a "passive" client as it's the one that initiates the updates; but it's "passive" in the sense that it doesn't add another position that would have to be sent).

    Combining the player updates sounds like a must, so "game groups" will be represented by "rooms" (the concept really is the same anyways, it's just another name). In the end, I think those worst-case 20 position updates could end up in one packet per client (and could probably still carry any other events that occurred in that time-step ... I'm not sure if you can mix reliable/unreliable in one packet, though - but even if not: since what I need to transmit reliably occurs significantly less frequently that shouldn't play a significant role).

    One thing I also need is a very reliable server time for my predictions - the more exactly I can calculate the time the packet took altogether (time from Unity game server to Photon server + time from Photon server to client), the less likely it is that there will be any network induced jitter (I do have an implementation to smooth this out ... but so far that implementation caused more distortion than it solved ... and as I felt I didn't really need it, I didn't follow through with it - but if that need arises I could pick it up again).

    I already do have testing clients which can simulate one or multiple players per client (for testing the performance of the Unity game server, it makes sense to simulate 20 players on just one client - for testing networking aspects, that's probably not the most valid approach ;-) ). On a 4-core Mac Pro, I can run around 10 of those clients simultaneously; on my older Windows notebooks, the limit is around 4-6 instances of those clients. Those clients are automated, so basically all one has to do is start them. They can be a bit of a pest, though, because they very eagerly join game groups and teams (each game group needs one human player, though, to start the actual sessions ... it's also nice to have actual people experience the game in that kind of situation - surrounded by all these dark AI-bots ;-) ) ;-) ... I can also set up the servers for testing.

    For a really broad test I'd probably need a "bunch" of machines to run more test clients on, though. From my experiences so far, I really feel one needs to test as many clients as one really wants to support: When I went with 60-70 players, everything still looked totally easy - going from that to around 70-80, suddenly things broke down really badly.

    Obviously, the more challenging task will be porting the networking to use Photon (in other words: implementing the Photon server and making the current game client/server implementation use that Photon server). But from what I've read and seen so far, and from my gut feeling, that'll be the direction I'm going anyways. While I really do love Unity, and I certainly appreciate the comfort of using its built-in networking, that networking part has a few limitations that have been bothering me right from the start and that are unlikely to ever be removed given that the main design-goal of Unity's built-in networking was "simple slow-scale networking".
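
As a rough illustration of why the server time matters for the predictions mentioned above: if every position update carries the server time at which it was sampled, and the client knows the current server time, the total travel time (game server to Photon plus Photon to client) is simply the difference, and the position can be extrapolated along the vehicle's straight path. This is a sketch only, in Python for brevity; the `velocity` and `sent_at` fields are assumptions for the example and not part of the actual update format described earlier (Vector3 plus bool).

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class PositionUpdate:
    position: Vec3   # where the vehicle was when the game server sampled it
    velocity: Vec3   # units per second; assumed known or sent along
    sent_at: float   # server time (seconds) at sampling


def extrapolate(update: PositionUpdate, server_now: float) -> Vec3:
    """Estimate the current position, assuming straight-line constant velocity."""
    # Clamp at zero so a slightly-off clock never moves the vehicle backwards.
    elapsed = max(0.0, server_now - update.sent_at)
    x, y, z = (p + v * elapsed for p, v in zip(update.position, update.velocity))
    return (x, y, z)


update = PositionUpdate(position=(0.0, 0.0, 0.0),
                        velocity=(10.0, 0.0, 0.0),
                        sent_at=100.0)
# The update took 250 ms in total to reach the client
print(extrapolate(update, server_now=100.25))
```

Any error in the client's idea of the server time turns directly into a position error of velocity times that error, which is where the jitter would come from.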
  • I agree: physics are pretty difficult to get right and can easily cause a lot of workload.

    About doubling the traffic: It depends how this is solved. I thought the game server (Unity) is sending updates for all 20 players and the clients too. Photon would have to match / check this and send updates to either side. That would about double the data transferred (from client to game server).

    About saving bandwidth: The protocol will amass the "commands" but each will have its own overhead, as they are typed, numbered and so on. Depending on the send interval, the available commands will be put into UDP packets, which also introduces overhead, so it could sum up when repeated often enough. But this is something that can be changed later on as well...

    About testing: It's true that you need to test the expected number of players somehow. If you can simulate only 10 clients on a (fairly powerful) machine, it will become difficult to reach those numbers. At a certain point it also becomes necessary that you test in real-world conditions and that in turn means you should test through the internet and need the bandwidth to pass all users' data along. We had several failed tests where our office firebox killed more UDP packets than it let pass and that's pretty confusing.

    Maybe there is a way to set up an even simpler test case with less client-side simulation so you can simulate more users.
  • Tobias wrote:
    About doubling the traffic: It depends how this is solved. I thought the game server (Unity) is sending updates for all 20 players and the clients too. Photon would have to match / check this and send updates to either side. That would about double the data transferred (from client to game server).

    Ah, I see ... I mentioned that in one of the previous postings but I guess due to the length it was easy to miss: As the movement is deterministic most of the time, I only care about the clients' positions "every now and then", so I can do this "server controls most of the movement"-trick which results in clients not sending regular position updates.

    Only during turns, the client which initiated the turn sends its location to the server and stops listening to the server position updates for a moment until everything is synchronized again (technically speaking, it still listens but discards those updates as they are from an already invalidated past, anyways).

    This is really very different (and much simpler) compared to usual MMO scenarios where players "run around totally unpredictably". Vehicles in Traces of Illumination go in straight lines, with fixed velocity. Of course, there are exceptions (like when players are jumping around, or speeding up/slowing down for a moment) but those are just that: Exceptions ;-)
    Tobias wrote:
    About testing: It's true that you need to test the expected number of players somehow. If you can simulate only 10 clients on a (fairly powerful) machine, it will become difficult to reach those numbers.

    That's true ... so I'll look into further simplifying the test-clients. I'm sure there are ways this can be done. One of the tricky things about those clients is that they obviously need some sort of simple AI for every simulated player; and that AI involves a couple of physics checks (basically the AI "looking around" to figure out where it should go). I guess I can reduce complexity there without sacrificing much in respect to the objective "test load" (the original objective obviously was "be a smart opponent to the players" which has different requirements ;-) ).
    Tobias wrote:
    At a certain point it also becomes necessary that you test in real-world conditions and that in turn means you should test through the internet and need the bandwidth to pass all user's data along. We had several failed tests where our office firebox killed more UDP packages than it let pass and that's pretty confusing.

    That does indeed sound confusing. I think my setup is pretty much "real-world" as the test-servers are already located where they will be located in production use, and I'm testing from a "typical DSL-line" (actually, I have two connections now - one DSL, the other one "cable-TV"). You obviously need a couple of different "Internet-lines" to avoid congesting the lines for the test-clients.

    Do the Photon clients have metrics in place to easily find out how many packets are being lost? Like per-client sequence numbers which would make it easy to count? That sure would help to isolate the problem if significant (and untypical) amounts of packets are lost.
    Tobias wrote:
    Maybe there is a way to setup an even simpler testcase with less client-side simulation so you can simulate more users.

    I'll certainly look into that as soon as I'm done with the porting.
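
Regarding the per-client sequence numbers asked about above: whether or not the Photon clients expose such metrics, counting on the receiving side is simple if each sender numbers its unreliable packets. A minimal sketch (Python for brevity, `LossCounter` is a hypothetical name, no Photon API involved). It assumes sequence numbers start wherever the first received packet starts and never wrap around, and it treats late or duplicate packets as stale, since an outdated position update is useless anyway.

```python
from typing import Optional


class LossCounter:
    """Counts lost and stale packets from one sender via sequence numbers."""

    def __init__(self) -> None:
        self.first_seen: Optional[int] = None
        self.highest_seen: Optional[int] = None
        self.accepted = 0  # packets newer than everything seen before
        self.stale = 0     # late or duplicate packets that were discarded

    def on_packet(self, seq: int) -> bool:
        """Return True if the packet should be processed, False if it is stale."""
        if self.highest_seen is not None and seq <= self.highest_seen:
            self.stale += 1
            return False
        if self.first_seen is None:
            self.first_seen = seq
        self.highest_seen = seq
        self.accepted += 1
        return True

    @property
    def lost(self) -> int:
        """Packets in the seen range that never arrived in time."""
        if self.first_seen is None or self.highest_seen is None:
            return 0
        expected = self.highest_seen - self.first_seen + 1
        return expected - self.accepted


counter = LossCounter()
for seq in [1, 2, 4, 3, 7]:
    counter.on_packet(seq)
print(counter.accepted, counter.lost, counter.stale)
```

In this run packet 3 arrives after 4 and is discarded, while 5 and 6 never arrive, so three packets count as lost and one as stale.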