Huge lags on the client

Hi, sometimes huge lags happen on the client, and we are trying to investigate the issue. Sometimes our Photon server sends a packet and the client receives it more than 1 second later. Usually the round trip time between the server and the client is 200 ms. So we would like to understand whether a packet is waiting in the client's incoming queue or in the server's outgoing queue. Is there a way to do that? The server is Windows Server 2012 R2 x64 (4.0.29.11263). The client is Windows 7 and it uses the Photon Unity SDK (4.1.1.2).

Comments

  • hi, @nlisun
    you may use Wireshark to sniff packets and see whether they were received or not.

    could you also describe the case when you see this issue?

    best,
    ilya
  • Thanks for the quick answer!

    The server uses the UDP protocol. Usually 5-20 peers are connected to the server. Each peer sends/receives at most 10 packets per second. The packets are sent over the server's unreliable channel and are about 90-800 bytes in size.

    Sniffers are good but they don't give us information about network packets.
  • using a sniffer on the server and the clients you may see which packets were sent to the client and compare that with when they were received.

    you are using the UNRELIABLE protocol. this means that photon does not try to resend if it did not get acknowledgements from the client. and as you know, UDP may even lose your packet.
    this does not sound like a photon issue.

    best,
    ilya
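The capture comparison suggested above could be sketched roughly as follows. This is only a minimal sketch: it assumes each capture has already been exported to (timestamp, UDP payload) pairs, that the clocks of both machines are synchronised (for example via NTP), and that payloads are distinct enough to be matched by hash. The names `Record` and `one_way_delays` are hypothetical and are not part of Wireshark or Photon.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

# (capture timestamp in seconds, raw UDP payload); hypothetical export format
Record = Tuple[float, bytes]


def one_way_delays(sent: Iterable[Record],
                   received: Iterable[Record]) -> Tuple[List[float], int]:
    """Match datagrams by payload hash; return (delays in seconds, lost count)."""
    arrival: Dict[str, float] = {}
    for ts, payload in received:
        key = hashlib.sha256(payload).hexdigest()
        # Keep the first arrival time if the same payload shows up twice.
        arrival.setdefault(key, ts)

    delays: List[float] = []
    lost = 0
    for ts, payload in sent:
        key = hashlib.sha256(payload).hexdigest()
        if key in arrival:
            delays.append(arrival[key] - ts)
        else:
            lost += 1  # seen on the sender side, never on the receiver side
    return delays, lost
```

A delay well above half the usual round trip time would suggest the datagram was held up on the network; a normal delay combined with a late application callback would point at a queue on the receiving side instead.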
  • @nlisun we have discussed a little what is going on with your project. it looks like some packets are sent reliably in your case, because when you send big packets they are fragmented and sent reliably. our current implementation of unreliable supports only sequenced transmission. so, if transmission takes more than 1 second, this packet should be discarded given your current packets-per-second rate.

    another issue can reside on the client side. it may happen that you do not call peer.Service often enough. maybe there are spikes on the client. then you will not be able to receive anything either

    best,
    ilya
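To illustrate why a sequenced unreliable packet that arrives late would be dropped rather than delivered, here is a minimal sketch. It is not Photon's actual implementation; the class name `SequencedReceiver` is hypothetical, and sequence-number wraparound is ignored for brevity.

```python
class SequencedReceiver:
    """Delivers only packets newer than the newest one already delivered."""

    def __init__(self):
        self.last_seq = -1   # highest sequence number delivered so far
        self.dropped = 0     # count of stale packets discarded

    def receive(self, seq, payload):
        # A packet that arrives after a newer one is stale and is discarded.
        if seq <= self.last_seq:
            self.dropped += 1
            return None
        self.last_seq = seq
        return payload
```

At 10 packets per second, a packet delayed by a full second would normally be overtaken by about ten newer ones, so under this model it would be discarded instead of showing up late.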
  • @chvetsov What size of the packet is big? Could you please write more information how Photon works on low level? It would really help us understand what we should do. Does SendParameters.Flush influence on sending time and packets fragmentation?

    The client tries to call peer.Service every 20 ms. So I don't think the issue is in the client.

    Thanks,
    Nikolay
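One way to check the 20 ms assumption is to record the real gap between consecutive service calls and flag spikes. The sketch below is language-agnostic in spirit and written in Python only for illustration; `ServiceGapTracker`, `tick`, and the 50 ms threshold are hypothetical choices, and `tick()` is assumed to be called right before each `peer.Service` call.

```python
import time


class ServiceGapTracker:
    """Records gaps between consecutive tick() calls and flags long ones."""

    def __init__(self, threshold_s=0.05, clock=time.monotonic):
        self._clock = clock          # injectable clock, useful for testing
        self._threshold = threshold_s
        self._last = None
        self.max_gap = 0.0
        self.spikes = []             # list of (time, gap) for gaps over threshold

    def tick(self):
        now = self._clock()
        if self._last is not None:
            gap = now - self._last
            self.max_gap = max(self.max_gap, gap)
            if gap > self._threshold:
                self.spikes.append((now, gap))
        self._last = now
```

If `max_gap` stays near 20 ms while packets still arrive a second late, the client loop is unlikely to be the cause.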
  • hi, @nlisun

    >What size of the packet is big?
    packets bigger than ~1150 bytes will be split. this is not an exact value, but close enough

    >Could you please write more information how Photon works on low level?
    this is too big a question.

    >Does SendParameters.Flush influence on sending time and packets fragmentation?
    it may have an influence on sending time but not on packet fragmentation. you should check PhotonServer.config to see whether you are using nagling or not. Nagling may add a ~15 ms delay before sending in order to collect enough packets. you may switch it off if, in your case, you send packets from the server on a timed basis.

    what version of PhotonSocketServer are you using?

    best,
    ilya
  • @chvetsov I'm using v4.0.29.11263
  • Hello @chvetsov,
    So, if we update the client state 10 times a second and the server replies to each message unreliably, then what is the best strategy in terms of server reply size?
    KPIs: minimal delay 1st, unreliable delivery % 2nd
    a) keep server reply under 1150 bytes and spread data pieces across several replies, keeping in mind 10 pps frequency
    b) reply all the data at once regardless of reply message size, because this improves client world representation

    Thanks,
    Alexander
  • hi, @aefimov
    well, i think a) is the way to go, because if your data is bigger than the MTU minus the header, it will be sent reliably. this may cause quite big delays in your case, i mean with RTT ~ 200 ms.
    you may override the OnSent method for Peer and track the data size, in order to check whether this is really your case

    best,
    ilya
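Option a) could look something like the sketch below. It assumes the reply is built from independently serialized items (bytes) and uses the approximate ~1150-byte limit mentioned earlier in the thread as the budget; the real limit depends on the MTU and on Photon's headers. `MAX_PAYLOAD` and `pack_replies` are hypothetical names, not Photon API.

```python
from typing import List

MAX_PAYLOAD = 1150  # approximate value from the thread, not an exact limit


def pack_replies(items: List[bytes], limit: int = MAX_PAYLOAD) -> List[List[bytes]]:
    """Greedily group items into batches whose total size stays within limit."""
    batches: List[List[bytes]] = []
    current: List[bytes] = []
    size = 0
    for item in items:
        if len(item) > limit:
            # A single oversized item cannot be kept under the budget.
            raise ValueError("single item exceeds the payload limit")
        if current and size + len(item) > limit:
            batches.append(current)
            current, size = [], 0
        current.append(item)
        size += len(item)
    if current:
        batches.append(current)
    return batches
```

Each batch would then go out as its own unreliable reply, spread over the following ticks so the 10 pps rate is kept.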
  • Hi @chvetsov,

    Yes, a) seems reasonable. That's what we originally had in the game. However, we ran several tests with variable pps and variable server reply size. With just one client the result seems to be the same, regardless of the parameters. Please check the log below from our test client app; I think the text charts of unreliable round trip time are self-explanatory.

    If there are 20 clients or more then we expect delays of over a second in server replies on the client side. I guess the cause might be the server host VM, which is a low-tier Azure instance, but CPU load is low and bandwidth is reasonable for 10-20 clients.

    So we'll get back under the MTU size to see if it helps. I'll report here about the result.

    Regards,
    Alexander

    Time:10s Freq:10p/s ServerReplySize:1400b
    Sent/Received:100/100 Lost:0.00%, In:11.73kB/s Out:0.07kB/s, RTT:137ms
    | 080ms
    |
    |
    |
    |************************ 24%
    |***************************************** 41% 130ms
    |********************** 22%
    |************* 13%
    |
    |
    | 180ms
    ======================================================================================

    Time:10s Freq:10p/s ServerReplySize:2800b
    Sent/Received:100/100 Lost:0.00%, In:23.39kB/s Out:0.07kB/s, RTT:137ms
    | 080ms
    |
    |
    |
    |**************** 16%
    |************************************************ 48% 130ms
    |********************** 22%
    |*********** 11%
    |*** 3%
    |
    | 180ms
    ======================================================================================



    Time:10s Freq:25p/s ServerReplySize:1400b
    Sent/Received:251/251 Lost:0.00%, In:29.44kB/s Out:0.17kB/s, RTT:139ms
    | 080ms
    |
    |
    |
    |**************** 16%
    |******************************************* 43% 130ms
    |************************* 25%
    |************* 13%
    |** 2%
    |* 1%
    | 180ms
    ======================================================================================

    Time:10s Freq:25p/s ServerReplySize:2800b
    Sent/Received:251/251 Lost:0.00%, In:58.71kB/s Out:0.17kB/s, RTT:139ms
    | 080ms
    |
    |
    |
    |*************** 15%
    |*************************************** 39% 130ms
    |*************************** 27%
    |***************** 17%
    |** 2%
    |
    | 180ms
    ======================================================================================



    Time:10s Freq:50p/s ServerReplySize:2800b
    Sent/Received:501/501 Lost:0.00%, In:117.19kB/s Out:0.33kB/s, RTT:134ms
    | 080ms
    |
    |
    |
    |************************** 26%
    |************************************************ 48% 130ms
    |************************* 25%
    |* 1%
    |
    |
    | 180ms
    ======================================================================================
  • @aefimov to be honest, maybe it is somehow related to server logic, but for photon 20 users is just nothing. they should work as fast as one. of course, if we do not send megabytes :)

    best,
    ilya
  • The unreliable bandwidth is reasonable, ~ 20k per second per player. So for 20 players it is ~ 400k per second. Reliable traffic is a minor fraction of that.
    We track the delay from server send to player receive, so game server logic does not affect that time.
    We've tested packets under the MTU size and there are still delays of over a second. We continue experimenting.

    Regards,
    Alexander
  • @aefimov thank you for the report. keep me posted about the results

    Best,
    ilya
  • @aefimov is there any news, guys?

    best,
    ilya
  • Hi @chvetsov,
    There was no change; we still experience delays of over a second for some small unreliable packets. We're going to upgrade the Azure VM to the next tier. I'll keep you posted.

    Best regards,
    Alexander