Reliable message lost???
Options
Strange behaviour observed with the messages that exceed UDP package size. Sometimes they just get lost.
The code looks like this:
immediately followed by
To check the delivery I attach the message ID to each message and print it to trace dump.
On the receiving side I also print the ID of the received message.
Under the heavy load (several messages that exceed UDP package size in row), a couple of messages get lost.
What could be the reason and how I could debug this further?
The code looks like this:
Result = mLoadBalancingClient.opRaiseEvent(true , data , size , Code , ExitGames::LoadBalancing::RaiseEventOptions().setReceiverGroup(Lite::ReceiverGroup::OTHERS) );
immediately followed by
mLoadBalancingClient.serviceBasic(); mLoadBalancingClient.sendOutgoingCommands())
To check the delivery I attach the message ID to each message and print it to trace dump.
On the receiving side I also print the ID of the received message.
Under the heavy load (several messages that exceed UDP package size in row), a couple of messages get lost.
What could be the reason and how I could debug this further?
0
Comments
-
To add on this - no warning or errors received from Photon.0
-
Update - sometimes I get this warning:
PHOTON: WARNING code: 1037 (PhotonPeer sent queue is filling up. Check, why the server does not acknowledge your sent reliable commands.)
0 -
When I send a lot of less than udp size packages, they also get lost.
Also heavily depends on the regional server that I am connecting to - some are better, some are worse, but sending something like 60-100 packages per second, each of 1024 bytes or so eventually leads to the packets not arriving to clients. But the connection is kept, no player gets the "disconnect" callback.
Sometimes arrives WARNING enetpeer.cpp ExitGames::Photon::Internal::EnetPeer::queueSentReliableCommand() 959 WARNING! There are 100 sent reliable messages waiting for their acknowledgement!
followed by PHOTON: WARNING code: 1037, but not necessarily. Quite often the messages are just being lost without any warning.
No warnings or errors at the client (receiving) side.0 -
Hi @ufnv.
I can't reproduce the missing messages. Either the messages arrive or my client gets disconnected.
Could you provide us with a minimal self-contained reproduction project, please?
0 -
I will try. It is 100% reproducible for me. Maybe I'll just record the packets and the timing to be able to reproduce it.0
-
@Kaiserludi
Uploaded the code here:
https://drive.google.com/file/d/1usIQ25APvW_dM1FPHwIUTiWw57OJVvD1/view?usp=sharing
It's a VS 2017 project that builds the "player" of the recorded traffic.
The record itself is PhotonLostMessages/server_dump.bin file.
This file needs to be in the same directory as the "server" executable (for example, x64/Release/).
You need to start the PhotonLostMessages.exe, then when it prints "Waiting for Player 0 to join...", start the client.exe.
The server application then will be sending the recorded packages in the same order and with (approximately) the same timing as it was recorded inside the game.
The client application will be reading the received packages and count the markers received. When some package is lost, it will notify with the "MESSAGE LOST!!!" note.
Both applications also print the Photon output.
I've recorded a very stressful trace just to be sure the lost package will appear, so finally the forced disconnect will arrive, but much too late - already a lot of packages will be lost and in the real game this can easily lead to crashes, as the game does not expect any package to be lost. It expects them either to be delivered in order or to get forced disconnect in the worse case scenario.
This example is a little extreme - throwing 50 messages per second, 5-10K each for about 3-5 seconds - it is considered a too heavy load already?0 -
Yes, this amount of load is too much.
You want to avoid sending chunks > MTU (1200 bytes which also must contain the headers / type info). Else, the message becomes fragmented and sent reliable.
If any of the messages fail to send entirely, that will lead to a timeout.
Please share the Wireshark capture. We would like to look at that before running any exe.0 -
What is Wireshark capture?
The problem is with this kind of load the reliable packets start disappearing long before the timeout. Or there can be no time-out at all if the load is more moderate. Here I've used the extra load just to make sure the problem is 100% reproducible.
Still for the real game there can be situation when the reliably sent packet is lost and there is no timeout nor any warning/error.0 -
The trace output looks like this:
Note the first lost message is nearly two hundred messages before the disconnect finally happened.
If to stop the extra load right after the first lost message, the disconnect will never happen.
Also, no need to run the provided exe's, just compile and run the provided code - it's the safest way.Connecting...done Joining Room...done received message 1 received message 2 received message 3 received message 4 received message 5 received message 6 received message 7 received message 8 received message 9 received message 10 received message 11 received message 12 received message 13 received message 14 received message 15 received message 16 received message 17 received message 18 received message 19 received message 20 received message 21 received message 22 received message 23 received message 24 received message 25 received message 26 received message 27 received message 28 received message 29 received message 30 received message 31 received message 32 received message 33 received message 34 received message 35 received message 36 received message 37 received message 38 received message 39 received message 40 received message 41 received message 42 received message 43 received message 44 received message 45 received message 46 received message 47 received message 48 received message 49 received message 50 received message 51 received message 52 received message 53 received message 54 received message 55 received message 56 received message 57 received message 58 received message 59 received message 60 received message 61 received message 62 received message 63 received message 64 received message 65 received message 66 received message 67 received message 68 received message 69 received message 70 received message 71 received message 72 received message 73 received message 74 received message 75 received message 76 received message 77 received message 78 received message 79 received message 80 received message 81 received message 82 received message 83 received message 84 received message 85 received message 86 received message 87 received message 88 received message 89 received message 90 received message 91 received message 92 received message 93 received message 94 received message 95 received message 96 received message 97 received message 98 received message 99 received message 100 received message 101 received message 102 received message 103 received message 104 received message 105 received message 106 received message 107 received message 108 received message 109 received message 110 received message 111 received message 112 received message 113 received message 114 received message 115 received message 116 received message 117 received message 118 received message 119 received message 120 received message 121 received message 122 received message 123 received message 124 received message 125 received message 126 received message 127 received message 128 received message 129 received message 130 received message 131 received message 132 received message 133 received message 134 received message 135 received message 136 received message 137 received message 138 received message 139 received message 140 received message 141 received message 142 received message 143 received message 144 received message 145 received message 146 received message 147 received message 148 received message 149 received message 150 received message 151 received message 152 received message 153 received message 154 received message 155 received message 156 received message 157 received message 158 received message 159 received message 160 received message 161 received message 162 received message 163 received message 164 received message 165 received message 166 received message 167 received message 168 received message 169 received message 170 received message 171 received message 172 received message 173 received message 174 received message 175 received message 176 received message 177 received message 178 received message 179 received message 180 received message 181 received message 182 received message 183 received message 184 received message 185 received message 186 received message 187 received message 188 received message 189 received message 190 received message 191 received message 192 received message 193 received message 194 received message 195 received message 196 received message 197 received message 198 received message 199 received message 200 received message 201 received message 202 received message 203 received message 204 received message 205 received message 206 received message 207 received message 208 received message 209 received message 210 received message 211 received message 212 received message 213 received message 214 received message 215 received message 216 received message 217 received message 218 received message 219 received message 220 received message 221 received message 222 received message 223 received message 224 received message 225 received message 226 received message 227 received message 228 received message 229 received message 230 received message 231 received message 232 received message 233 received message 234 received message 235 received message 236 received message 237 received message 238 received message 239 received message 240 received message 241 received message 242 received message 243 received message 244 received message 245 received message 246 received message 247 received message 248 received message 249 received message 250 received message 251 received message 252 received message 253 received message 254 received message 255 received message 256 received message 257 received message 258 received message 259 received message 260 received message 261 received message 262 received message 263 received message 264 received message 265 received message 266 received message 267 received message 268 received message 269 received message 270 received message 271 received message 272 received message 273 received message 274 received message 275 received message 276 received message 277 received message 278 received message 279 received message 280 received message 281 received message 282 received message 283 received message 284 received message 285 received message 287 MESSAGE LOST 286!!! received message 288 received message 289 received message 290 received message 291 received message 292 received message 304 MESSAGE LOST 293!!! MESSAGE LOST 294!!! MESSAGE LOST 295!!! MESSAGE LOST 296!!! MESSAGE LOST 297!!! MESSAGE LOST 298!!! MESSAGE LOST 299!!! MESSAGE LOST 300!!! MESSAGE LOST 301!!! MESSAGE LOST 302!!! MESSAGE LOST 303!!! received message 305 received message 306 received message 307 received message 308 received message 318 MESSAGE LOST 309!!! MESSAGE LOST 310!!! MESSAGE LOST 311!!! MESSAGE LOST 312!!! MESSAGE LOST 313!!! MESSAGE LOST 314!!! MESSAGE LOST 315!!! MESSAGE LOST 316!!! MESSAGE LOST 317!!! received message 319 received message 320 received message 321 received message 322 received message 323 received message 332 MESSAGE LOST 324!!! MESSAGE LOST 325!!! MESSAGE LOST 326!!! MESSAGE LOST 327!!! MESSAGE LOST 328!!! MESSAGE LOST 329!!! MESSAGE LOST 330!!! MESSAGE LOST 331!!! received message 333 received message 334 received message 335 received message 336 received message 337 received message 344 MESSAGE LOST 338!!! MESSAGE LOST 339!!! MESSAGE LOST 340!!! MESSAGE LOST 341!!! MESSAGE LOST 342!!! MESSAGE LOST 343!!! received message 345 received message 346 received message 347 received message 348 received message 349 received message 358 MESSAGE LOST 350!!! MESSAGE LOST 351!!! MESSAGE LOST 352!!! MESSAGE LOST 353!!! MESSAGE LOST 354!!! MESSAGE LOST 355!!! MESSAGE LOST 356!!! MESSAGE LOST 357!!! received message 359 received message 360 received message 361 received message 362 received message 363 received message 371 MESSAGE LOST 364!!! MESSAGE LOST 365!!! MESSAGE LOST 366!!! MESSAGE LOST 367!!! MESSAGE LOST 368!!! MESSAGE LOST 369!!! MESSAGE LOST 370!!! received message 372 received message 373 received message 374 received message 375 received message 384 MESSAGE LOST 376!!! MESSAGE LOST 377!!! MESSAGE LOST 378!!! MESSAGE LOST 379!!! MESSAGE LOST 380!!! MESSAGE LOST 381!!! MESSAGE LOST 382!!! MESSAGE LOST 383!!! received message 385 received message 386 received message 387 received message 389 MESSAGE LOST 388!!! received message 391 MESSAGE LOST 390!!! received message 396 MESSAGE LOST 392!!! MESSAGE LOST 393!!! MESSAGE LOST 394!!! MESSAGE LOST 395!!! received message 397 received message 398 received message 399 received message 400 received message 402 MESSAGE LOST 401!!! received message 405 MESSAGE LOST 403!!! MESSAGE LOST 404!!! received message 407 MESSAGE LOST 406!!! received message 409 MESSAGE LOST 408!!! received message 410 received message 411 received message 412 received message 413 received message 414 received message 415 received message 416 received message 417 received message 418 received message 419 received message 420 received message 421 received message 422 received message 423 received message 424 received message 425 received message 426 received message 427 received message 428 received message 429 received message 430 received message 431 received message 432 received message 433 received message 434 received message 435
0 -
Hi @ufnv.What is Wireshark capture?
https://doc.photonengine.com/en-us/realtime/current/troubleshooting/analyzing-disconnects/#wireshark
and
https://www.wireshark.org/#download
I have had a look into your repro-project and I can reproduce the issue with it. However I have not yet found the time to debug why this is happening.
I will continue looking into this next week.0 -
If you can provide the wireshark captures, I can have a look asap and analyze if there is anything odd with it.
Ludi is our expert for everything "native", so he has to look into the lib itself.0 -
Yes, it's a good first step to repro!
Ludi has a few days off but we'll continue when he's back early next week.
As said: The capture can help analyze what happens. But if you don't have the time for this, we can do one next week, too.0 -
0
-
Ludi reproduced this.
We see that the server disconnects the clients due to internal buffering limits. You exceed the limits quite frequently with your appid.
This may look like loss but actually, it is just the precursor for a disconnect. This is correct behaviour, as either side becomes unable to push everything you enqueued.
You need to reduce the amount of data you send.0 -
@Tobias , I disagree, this cannot be a correct behaviour. The whole idea of the reliable message is that it is either delivered or "disconnect" happens. In the correct paradigm I could rely on the message is delivered and the only exception I need to handle is the "disconnect".
However with the existing behaviour we have the messages not delivered and the disconnect happens only after a while. All this time between the message being lost and the occasional disconnect is the "grey zone", where the game can get any kind of exception because of the missing data from the lost message.
Like, for example, in the message N there is the info on a new unit being created, message N+1 contains the orders for that newly created unit. If the message N is lost, the delivered message N+1 will be referencing the non-existent unit. In the best case this will flag the "resync needed", further increasing the network load, in the worst case this will be an instant crash.
Also I can easily make it so that the level of load will be "on the edge", where disconnect never happens but the messages are lost.
So in the current implementation I just cannot rely on the Photon "reliable" messages and need to duplicate the reliability mechanics.
Would it be fixed, or should I implement my own really reliable messaging?0 -
Ludi told me I missed the part where some messages pass but not all. You are right, that should not be possible and we will look into this with high prio. Ludi will update this thread.0
-
Thanks, looking forward to the update!0
-
Hi @ufnv.
After doing some debugging of this issue on the client side, to me it looks like a bug in the server code.
I have supplied our server-devs with a slightly modified version of your repro-product (changed the server URL to our dev-servers, fixed the x64 Debug config project settings and added additional logging) and am now waiting for them to look into the issue.0