Disconnected with error 1039

benjaml
benjaml
edited February 2020 in Native
Hello,

I'm investigating on disconnection of a client. When I try to enter my game session room, I do get quite a lot of information which means a lot of messages. I do get a warning 1033 (QUEUE_INCOMING_RELIABLE_WARNING), so I do get a lot of incoming messages which are expected. in warningReturn of my LoadBalancing Listener, I log 'getIncomingReliableCommandsCount()' and 'getQueuedIncomingCommands()', they both have the value of 100. On another side, I do call 'services()' in another thread to avoid the main thread to block this. services() is called every 20ms. Then I get the error message 1039 (INTERNAL_RECEIVE_EXCEPTION) which disconnects my client.

From this point, I assume that I get disconnected by the overflow of incoming messages. What I don't understand is that when I log 'getIncomingReliableCommandsCount()' and 'getQueuedIncomingCommands()' from the thread used for services(), I only get values like 0 or 1.

I tried to separate the services() call to have more control over the incoming messages but I did not succeed to make it work so I think that I should have misunderstood something
instance->load_balancing_client.serviceBasic();
while (instance->load_balancing_client.getIncomingReliableCommandsCount() > 0)
{
	if (instance->load_balancing_client.getQueuedIncomingCommands() <= 50)
	{
		break;
	}
	LOG("[photon] incomming : %d and queued : %d", instance->load_balancing_client.getIncomingReliableCommandsCount(), instance->load_balancing_client.getQueuedIncomingCommands());
	instance->load_balancing_client.dispatchIncomingCommands();
}
while (instance->load_balancing_client.getQueuedOutgoingCommands() > 0)
{
	LOG("[photon] outgoing queue : %d ", instance->load_balancing_client.getQueuedOutgoingCommands());
	instance->load_balancing_client.sendOutgoingCommands();
}

PS: I'm using the TCP protocol

Comments

  • Hi @benjaml.
    I do get quite a lot of information which means a lot of messages
    How much data are we talking about?
    If your are transferring more than 500kb in a short amount of time, then you run into the size-limit of the server side per-client buffer and the server might disconnect the client because of a buffer overflow (see https://doc.photonengine.com/en-us/realtime/current/troubleshooting/faq#can_i_send_a_huge_message_using_photon_).
    In that case it would help to transfer the data at a slower pace.
    On self-hosted Photon server instance on in Photon enterprise Cloud you could set the server side per client buffer size limit to a bigger value, but this is not possible with Photon Public/Premium Cloud.
  • I've tried to log the ByteIn and ByteOut and it does not exceed 200kb in total
  • Hi @benjaml.

    Then from all the info that I have it should work and I don't see why you would get that disconnect.

    My suggestion is to try to reproduce it with one of our demo apps. Modify the demo to become as similar to your app in its photon usage as possible without having to add lots of non-Photon related code/assets: just send around some dummy data in comparable amounts to your real apps data and block the main thread for a comparable time, etc.
    If it is is not reproducible with the modified demo, track down what is different in your app than in the demo.
    If it is reproducible with the modified demo, then send us the the modified demo as a reproduction case and we could look into why it gets disconnected.
  • benjaml
    benjaml
    edited March 2020
    Hi @Kaiserludi,

    I've made some logs and I do not undestand from what this error 1039 does come from.
    Here is a piece of the logs
    [Photon] Received 60183 bytes of data
       [Photon] Received 4380 bytes of data
       [photon] Incoming reliable commands : 4 and Queued incoming reliable commands : 4
       [photon] Incoming reliable commands : 3 and Queued incoming reliable commands : 3
       [photon] Incoming reliable commands : 2 and Queued incoming reliable commands : 2
       [photon] Incoming reliable commands : 1 and Queued incoming reliable commands : 1
       [Photon] Received 75864 bytes of data
     E [photon] received connection error 1039
    

    It seems to happened when I get a large chunk of data.

    from this I have multiple question :
    - What is the difference between getIncomingReliableCommandsCount() and getQueuedIncomingCommands() ?
    - How is generated this error 1039 ? and with which conditions ?
    - Does the client have a buffer size limit as the server ?
    - Is error 1039 generated by the client or the server ?

    PS : as I did already encounter this error when I used the local server and I forgot to launch it, I tried to increase the client timeout as I thougth it could be related to a timeout on the client but it does not seem to change it.
  • Hi @benjaml.
    - What is the difference between getIncomingReliableCommandsCount() and getQueuedIncomingCommands() ?
    With UDP the latter returns the sum of the incoming reliable and the incoming unreliable commands. As with TCP everything is always reliable, the number of incoming unreliable commands is always 0 and hence with TCP there is no difference between them at all.
    - How is generated this error 1039 ? and with which conditions ?
    This error happens when the sockets receive function returns an error.
    - Does the client have a buffer size limit as the server ?
    No, it doesn't.
    - Is error 1039 generated by the client or the server ?
    By the client.
    PS : as I did already encounter this error when I used the local server and I forgot to launch it, I tried to increase the client timeout as I thougth it could be related to a timeout on the client but it does not seem to change it.
    When the clients timeout kicks in, the error code is 1040 and not 1039.

    I investigated a bit further and found the likely cause:
    Besides the 500kb limit for the server side outgoing buffer there is also a 50kb limit for data in transit, which means data that has been sent by the server, but for which no ack has been received from the client yet, and there is also a 256kb/s transmit rate limit, which is checked in 200ms intervals (leading to an effective 51.2kb per 200ms transmit rate limit).
    So if you send 20kb every 100ms, then you will be fine.
  • Hi @Kaiserludi

    Once again a perfect answer :smile:

    Just to make sure I understand it 100%
    Besides the 500kb limit for the server-side outgoing buffer there is also a 50kb limit for data in transit, which means data that has been sent by the server

    So I cannot send packets larger than 50kb to the server, which can store up to 500kb data. And the server will never send packets larger than 50kb ?

    So the thing that I do not understand quite well right now is that the only way I know to send data to the server is through services() and more particularly through sendOutgoingCommands().

    From what I have understood, it will send all the outgoing commands at once, and from what my logs said, It can happen that I send big chunks of data
    [photon] sent data of 123290 byte
    

    This log comes from the difference of ByteOut before and after sendOutgoingCommands()

    Is it possible to limit the size of the data to be sent ? can it be related to the problem ?

    On the other side, it does not seem that the host (the one sending data) get disconnected, it is the client (the one receiving data) that does get disconnected.
    instance->load_balancing_client.serviceBasic();
    while (instance->load_balancing_client.getIncomingReliableCommandsCount() > 0)
    {
    	instance->load_balancing_client.dispatchIncomingCommands();
    	if (CurrentByteIn < instance->load_balancing_client.getBytesIn())
    	{
    		LOG("[Photon] Number of incoming commands : %d", instance->GetPhotonClient()->getIncomingReliableCommandsCount());
    		dispatched += instance->load_balancing_client.getBytesIn() - CurrentByteIn;
    		LOG("[Photon] Dispached %d bytes of data", instance->load_balancing_client.getBytesIn() - CurrentByteIn);
    		CurrentByteIn = instance->load_balancing_client.getBytesIn();
    	}
    }
    

    corresponding logs
    [Photon] Number of incoming commands : 1
       [Photon] Dispached 40122 bytes of data
       [Photon] Total dispached 40122 bytes of data
       [Photon] Number of incoming commands : 3
       [Photon] Dispached 80244 bytes of data
       [Photon] Total dispached 80244 bytes of data
       [Photon] Number of incoming commands : 5
       [Photon] Dispached 128803 bytes of data
       [Photon] Total dispached 128803 bytes of data
     E [photon] received connection error 1039
       [photon] disconnected
    
  • Hi @benjaml.

    It's not possible to around packets with Photon that are larger than 1200 bytes, anyway as Photon uses 1200byte as maximum transfer unit for UDP. Messages bigger than that get split up into fragmented and transferred in multiple UDP packets.

    TCP on the other hand is a streaming protocol: It does not have anything like packets.

    Hence in both cases the maximum message size is not affected by the 50kb limit for the data in transit.
    I am not sure if this limit is even relevant for TCP.
    For reliable UDP it works like this:
    When client 1 sends 3 messages of 20kb each to the server that should be forwarded to player 2, then the server will immediately forward the first 2 messages to client 2, but wait with transferring the 3rd message as well until it has received an acknowledgement by client 2 for receiving the 1st one.

    What is kicking in in your case is likely the transmit rate limit.

    sendOutgoingCommands() will serialize and pass the whole message to the socket and the socket will send it out as fast as it can, and you have no control over this, so if the available upload speed of the senders internet connection is higher than 256kb/s (2mbit/s), then you will run into this limit, when you pass messages that are bigger than 25.6kb to opRaiseEvent() and connect to the public cloud, where you can't set the transfer rate to a higher value.

    The obvious solution to this is to split your message into smaller chunks of not more than 25.6kb each and pass one of them to Photon every 200ms. Then re-assemble the message on the receiving side.

  • Hi @Kaiserludi
    The obvious solution to this is to split your message into smaller chunks of not more than 25.6kb each and pass one of them to Photon every 200ms. Then re-assemble the message on the receiving side.

    I have set the maximum message size for my game at 20kb so this part should be ok. So I can send all I want, up 256kb/s or the server will not be able to send the data to the client and then fill the 500kb buffer ?
  • benjaml
    benjaml
    edited March 2020
    Ok so I changed a bit my code to make sure not to send more than 256kb/s but I still get error 1039 and disconnection from the client.

    I have logged the In and Out bytes, so the host does not send more than 256bk/s but the client does receive way more than 256kb/s

    I do not understand what could lead to this error 1039 on the client that is only receiving data

    Also maybe this information is important but I did not have those issues before the TCP issues that happened on the 27th February
  • Kaiserludi
    Kaiserludi admin
    edited March 2020
    I have logged the In and Out bytes, so the host does not send more than 256bk/s but the client does receive way more than 256kb/s.

    It works fine for me if I send a byte-array with 20k elements in demoLoadBalancing in NetworkLogic::sendEvent() and set the sendInterval to 100 and reduce the SLEEP() time of the run loop to 10ms (the 25.6kb every 200ms recommendation was a typo - its should have been 51.2kb every 200m or 25.6kb every 100ms).
  • benjaml
    benjaml
    edited March 2020
    Hi @Kaiserludi

    ok so I was able to reproduce the error on the demoLoadBalancing

    The issue is when I load the level I got a big burst of data send (200-400kb), so I add this piece of code in the STATE_JOINED case. Create a room with a host and then make a client join, then send the message burst and you should be able to see that the client does get disconnected with error 1039

    also I have set the sendInterval to 20ms to match my game
    case INPUT_3: // Burst message
    {
    	for (int index = 0; index < 20; index++)
    	{
    		static nByte burstMessage[20000];
    		ExitGames::LoadBalancing::RaiseEventOptions options;
    		// player 2 to separate the sender and the receiver
    		int targetPlayer = 2;
    		options.setTargetPlayers(&targetPlayer, 1);
    		bool success = mLoadBalancingClient.opRaiseEvent(false, &burstMessage[0], 20000, 0, options);
    	}
    }
    

    I was able to avoid the disconnection by storing the messages into a queue and before sendOutgoingCommands() use opRaiseEvent() up to 3 messages at the time (60kb) and it does the job. It seems that the problem was that I was sending too much data at once (sometimes sendOutgoingCommands was sending 150-300kb)
  • Hi @benjaml.

    Well, in your snippet you send 20k every 20ms. That results in 1000k per second or 200k per 200ms, which is nearly 4 times the 256kb/s transfer rate limit.

    As I have said you need to stay below 51.2kb per 200ms.
  • benjaml
    benjaml
    edited March 2020
    Hi @Kaiserludi

    yes, sorry I forgot to say that I was not sending anything in NetworkLogic::sendEvent(). I'm only sending the burst of message
  • Hi @benjaml.

    Oh wait, I just realized it's even worse. You send all 20 20k messages at once. So 400k will go out at once within the same 20ms window. That means that you send 400k within one 20ms window, so even if you don't send anything for the other 180ms, that still means 400k with 200ms, which is 8 times the transfer rate limit.
  • Hi @Kaiserludi

    Yes, the problem is that I have a big data burst to send. For now, I send maximum 50kb at once and when I send data I "pause" the send of data and only send acks for a time depending on the data just send (Data_size_send / 256kb/s) to make sure not to exceed 256kb/s.

    If you have another (more elegant) way to do so I'm open to suggestions
  • Hi @benjaml.

    I have just talked to @Philip and he informed me that the config values that I referred to are only relevant for UDP, but do not affect TCP. Furthermore he told me that there is another config value for the server side buffer for TCP, which coincidentally also was set to 500kb.
    We recently changed the logic behind when a TCP message will be buffered on the server side, before sending it to the receiving client. While doing so my colleagues also reviewed the config values for TCP and thought that 500kb were excessive, so they changed that value to 50kb, assuming that that was working fine for everyone. This change was rolled out on the Public Cloud on 27.2.
    After I have reported your issue to our server side engineers and dev-ops, they have rolled back that change today and the buffer is again at 500kb on 96.3% of the EU-region game server.
    They have kept it on 50kb on the other 3.7% for now. Those should follow in the next couple of days, so you may still occasionally experience the issue, but most of the time it should now work as before again.
  • Hi @Kaiserludi

    Nice ! I think that the user that send too much data should receive an error to tell him that he sent too much data
  • @benjaml:
    I agree with you on that, but unfortunately due to specifics of the server architecture this is not currently possible.