Production Server Crashed - Running Version 2.4.7

JoshF47
edited January 2012 in Photon Server
I'm running server version 2.4.7 in production (32-bit Windows Server 2008), and it just crashed completely unexpectedly. The XML below is what I see in the Event Viewer for this "Application Error". Is there anything more I can provide that would help identify what happened and how to prevent it in the future? Is this possibly something that was fixed in version 2.6.1?

It appears to have crashed suddenly for no apparent reason. I have now changed the service to restart if it fails again, but as they say, failure is not an option. It of course needs to run without crashing. Perhaps something in my code is causing this (what, I can't imagine yet; the last thing I see happening in my code is a completed login, which updates the MySQL database), but I can't think of a way to find out what happened, or how to get more information if it happens again.

Any suggestions on how to get more information about what happened, either now or the next time it happens? And is this possibly a known issue that an upgrade to 2.6.1 would resolve?

Thanks,
Josh

The only thing of note in the instance log file is this:
3100: 03:05:00.310 - Purecall - Crash dump creation failed

Event Data:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Application Error" />
<EventID Qualifiers="0">1000</EventID>
<Level>2</Level>
<Task>100</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2011-12-15T03:05:00.000Z" />
<EventRecordID>36298</EventRecordID>
<Channel>Application</Channel>
<Computer>ip-XXXXXX7B</Computer>
<Security />
</System>
<EventData>
<Data>PhotonSocketServer.exe</Data>
<Data>0.0.0.0</Data>
<Data>4d5a614e</Data>
<Data>PhotonSocketServer.exe</Data>
<Data>0.0.0.0</Data>
<Data>4d5a614e</Data>
<Data>40000015</Data>
<Data>0010e242</Data>
<Data>408</Data>
<Data>01ccb65fd16e8086</Data>
</EventData>
</Event>


<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Service Control Manager" Guid="{555908D1-A6D7-4695-8E1E-26931D2012F4}" EventSourceName="Service Control Manager" />
<EventID Qualifiers="49152">7034</EventID>
<Version>0</Version>
<Level>2</Level>
<Task>0</Task>
<Opcode>0</Opcode>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2011-12-15T03:07:13.000Z" />
<EventRecordID>59501</EventRecordID>
<Correlation />
<Execution ProcessID="0" ThreadID="0" />
<Channel>System</Channel>
<Computer>ip-XXXXXX7B</Computer>
<Security />
</System>
<EventData>
<Data Name="param1">Photon Socket Server: Instance1</Data>
<Data Name="param2">1</Data>
</EventData>
</Event>
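
For anyone reading the "Application Error" event above, here is a minimal Python sketch that labels the ten positional `<Data>` values. The field order is an assumption based on the usual Event ID 1000 message text on this generation of Windows (application name, version, timestamp, then the same for the faulting module, then exception code, fault offset, process id and application start time); it is not something stated in this thread. The mapping of `40000015` to `STATUS_FATAL_APP_EXIT` is likewise an assumption, although it would be consistent with the "Purecall" line in the instance log. The names `FIELDS`, `KNOWN_CODES` and `decode_app_error` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Assumed positional meaning of the ten <Data> values (not stated in the thread).
FIELDS = [
    "app_name", "app_version", "app_timestamp",
    "module_name", "module_version", "module_timestamp",
    "exception_code", "fault_offset", "process_id", "app_start_time",
]

# Assumption: NTSTATUS name for the code seen in the crash reports.
KNOWN_CODES = {0x40000015: "STATUS_FATAL_APP_EXIT"}

# EventData copied from the first crash report in this thread.
SAMPLE_EVENT = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<EventData>
<Data>PhotonSocketServer.exe</Data>
<Data>0.0.0.0</Data>
<Data>4d5a614e</Data>
<Data>PhotonSocketServer.exe</Data>
<Data>0.0.0.0</Data>
<Data>4d5a614e</Data>
<Data>40000015</Data>
<Data>0010e242</Data>
<Data>408</Data>
<Data>01ccb65fd16e8086</Data>
</EventData>
</Event>"""


def decode_app_error(event_xml):
    """Return the <Data> values of an Application Error event as a labeled dict."""
    root = ET.fromstring(event_xml)
    values = [d.text for d in root.findall("e:EventData/e:Data", NS)]
    rec = dict(zip(FIELDS, values))

    # Binary timestamps are hex-encoded seconds since 1970-01-01 UTC.
    for key in ("app_timestamp", "module_timestamp"):
        rec[key] = datetime.fromtimestamp(int(rec[key], 16), tz=timezone.utc)

    # The start time is a hex FILETIME: 100 ns ticks since 1601-01-01 UTC.
    ticks = int(rec["app_start_time"], 16)
    rec["app_start_time"] = (
        datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=ticks // 10)
    )

    rec["exception_name"] = KNOWN_CODES.get(int(rec["exception_code"], 16), "unknown")
    rec["process_id"] = int(rec["process_id"], 16)  # hex in the event, decimal here
    return rec


record = decode_app_error(SAMPLE_EVENT)
for name, value in record.items():
    print(f"{name}: {value}")
```

Decoding the process id to decimal makes it easier to match the event against other logs, and the decoded start time shows how long the process had been running before it failed.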

Comments

  • Tobias
    Hello Josh,
    This looks like a category of issues that we solved over time.
    I will have to investigate if 2.6.x has this fixed.
  • JoshF47
    Thanks Tobias, I appreciate the quick response and investigation. We will work on upgrading to 2.6.11, the latest version, but it will take us a little while to get everything up and tested. Hopefully it will solve the problem.
    Tobias wrote:
    Hello Josh,
    This looks like a category of issues that we solved over time.
    I will have to investigate if 2.6.x has this fixed.
  • Tobias
    Sorry for the late follow-up.
    Without knowing the exact cause of your crash we can't be fully sure, but we found at least 2 changes that might be related. So the issue is most likely fixed in the last v2 version.
    If not, we expect that version to be able to write dump files, which apparently wasn't possible before. That would help in the worst case.

    How far are you with the upgrade? I hope you didn't run into major roadblocks?
  • JoshF47
    Tobias wrote:
    ... How far are you with the upgrade? I hope you didn't run into major roadblocks?

    We were able to update without much of an issue (our code is extended from the examples), so hopefully we will have time to finish the testing in the next week or so and go live with it. We are kind of anal about testing, and so we take our time particularly with server upgrades.

    Thanks,
    Josh
  • JoshF47
    Well, we updated to the very latest server version 2-6-11-1647, and just yesterday/early this morning it crashed again. The results in the event data and logs are basically the same, but I'll include them again below just for reference.

    Event Data:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
    <System>
      <Provider Name="Application Error" /> 
      <EventID Qualifiers="0">1000</EventID> 
      <Level>2</Level> 
      <Task>100</Task> 
      <Keywords>0x80000000000000</Keywords> 
      <TimeCreated SystemTime="2012-01-28T05:13:14.000Z" /> 
      <EventRecordID>291979</EventRecordID> 
      <Channel>Application</Channel> 
      <Computer>ip-XXXXXXX</Computer> 
      <Security /> 
    </System>
    <EventData>
      <Data>PhotonSocketServer.exe</Data> 
      <Data>2.6.7.525</Data> 
      <Data>4dde7592</Data> 
      <Data>PhotonSocketServer.exe</Data> 
      <Data>2.6.7.525</Data> 
      <Data>4dde7592</Data> 
      <Data>40000015</Data> 
      <Data>00115402</Data> 
      <Data>9cc</Data> 
      <Data>01cccadd3f03ffe7</Data> 
    </EventData>
    </Event>
    

    The instance log says basically nothing again:
    3952: 05:13:14.542 - Failed to create crash dump
    3952: 05:13:14.542 - Purecall
    

    When it starts back up it says this:
    2476: 05:15:08.733 - Config File: C:\photon\deploy\bin_Win32\PhotonServer.config
    2476: 05:15:08.733 - Will NOT produce crash dumps
    

    So apparently I had not set it to create a crash dump as I thought. I have now set ProduceDumps="true" in the config for the instance and will restart it later so the setting takes effect. No idea why that was not set... but it is now.

    Thanks,
    Josh
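
To confirm the `ProduceDumps` setting before the next restart, the config file can be checked with a few lines of Python. This is only a sketch under stated assumptions: it assumes that each instance is a direct child element of the config's root element and carries `ProduceDumps` as an attribute, as the post above implies. The root element name `Configuration`, the second instance `Instance2`, and the function name `dump_settings` are hypothetical and used purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed config used only for this example.
SAMPLE_CONFIG = """<Configuration>
  <Instance1 ProduceDumps="true" />
  <Instance2 />
</Configuration>"""


def dump_settings(config_xml):
    """Map each top-level element name to its ProduceDumps value (None if unset)."""
    root = ET.fromstring(config_xml)
    return {child.tag: child.get("ProduceDumps") for child in root}


for instance, value in dump_settings(SAMPLE_CONFIG).items():
    print(f"{instance}: ProduceDumps={value}")
```

For the real file, the text of `C:\photon\deploy\bin_Win32\PhotonServer.config` would be read and passed to `dump_settings`. The "Will NOT produce crash dumps" line in the startup log remains the authoritative check of what the server actually picked up.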
  • Tobias
    Hey.
    Any news on this?
    This is sadly one of the issues we can't locate without a dump file, so has the server created one yet?
  • JoshF47
    Tobias wrote:
    Any news on this?..
    No updates today. It does not seem to happen very often, as there was about a month's gap before the last one. The instance is definitely set to create a crash dump now, so if/when it happens again I will let you know for sure - just don't expect it to be "soon" (which I suppose is a good thing in one sense and not in another). I completely understand that you need the crash dump in order to do anything with this.

    Thanks for the great support, as always!
    Josh