DCS Server crashes on player connection #19

Closed
rurounijones opened this issue Aug 21, 2021 · 5 comments

Comments

@rurounijones
Contributor

rurounijones commented Aug 21, 2021

The DCS dedicated server will crash with the current DCS-gRPC master branch code on player connection:

Steps to replicate.

  1. Run a mission, with DCS-gRPC enabled, on a DCS dedicated server.
  2. Connect to the running server in DCS.
  3. Server will instantly crash with the below exception.

Note that this does not happen:

  • when running the mission in single-player, so it must have something to do with a multiplayer client connecting.
  • when no one connects: the dedicated server will happily run, even with complicated AI actions going on.

I am not sure when this started happening. At least one version of DCS-gRPC is running on the Hoggit Syria At War server with no known issues, so this commit is working correctly.

If you cannot think of anything obvious then I will do a git bisect, although that will take a bit of time.

Stack trace below:

2021-08-21 02:27:34.651 INFO    EDCORE: signal caught: 'Abnormal termination'
2021-08-21 02:27:34.651 INFO    EDCORE: try to write dump information
2021-08-21 02:27:34.653 INFO    EDCORE: # -------------- 20210821-022734 --------------
2021-08-21 02:27:34.654 INFO    EDCORE: DCS/2.7.5.10869 (x86_64; Windows NT 10.0.19042)
2021-08-21 02:27:34.656 INFO    EDCORE: C:\Windows\System32\KERNELBASE.dll
2021-08-21 02:27:34.657 INFO    EDCORE: # 00000000 STATUS_SUCCESS at 73F44ED9 00:00000000
2021-08-21 02:27:34.663 INFO    EDCORE: SymInit: Symbol-SearchPath: '.;C:\Program Files\Eagle Dynamics\DCS World OpenBeta Server;C:\Program Files\Eagle Dynamics\DCS World OpenBeta Server\bin;C:\Windows;C:\Windows\system32;SRV*C:\websymbols*https://msdl.microsoft.com/download/symbols;', symOptions: 528, UserName: 'CowServer'
2021-08-21 02:27:34.665 INFO    EDCORE: OS-Version: 10.0.19042 () 0x100-0x1
2021-08-21 02:27:35.025 INFO    EDCORE: 0x0000000000034ED9 (KERNELBASE): RaiseException + 0x69
2021-08-21 02:27:35.025 INFO    EDCORE: 0x00000000000BB677 (edCore): ed::core::InitVFS::~InitVFS + 0x6B7
2021-08-21 02:27:35.025 INFO    EDCORE: 0x0000000000071881 (ucrtbase): raise + 0x1E1
2021-08-21 02:27:35.026 INFO    EDCORE: 0x0000000000072851 (ucrtbase): abort + 0x31
2021-08-21 02:27:35.026 INFO    EDCORE: 0x000000000000360D (VCRUNTIME140): _is_exception_typeof + 0x136D
2021-08-21 02:27:35.026 INFO    EDCORE: 0x000000000000BB64 (VCRUNTIME140): __C_specific_handler + 0x334
2021-08-21 02:27:35.026 INFO    EDCORE: 0x0000000000003029 (VCRUNTIME140): _is_exception_typeof + 0xD89
2021-08-21 02:27:35.026 INFO    EDCORE: 0x000000000000C021 (VCRUNTIME140): __CxxFrameHandler3 + 0x71
2021-08-21 02:27:35.027 INFO    EDCORE: 0x00000000000A21FF (ntdll): __chkstk + 0x19F
2021-08-21 02:27:35.027 INFO    EDCORE: 0x0000000000030939 (ntdll): RtlUnwindEx + 0x339
2021-08-21 02:27:35.027 INFO    EDCORE: 0x000000000000C80A (VCRUNTIME140): __report_gsfailure + 0x26A
2021-08-21 02:27:35.027 INFO    EDCORE: 0x0000000000006ED1 (lua): lua_getinfo + 0x1291
2021-08-21 02:27:35.028 INFO    EDCORE: 0x0000000000006CE2 (lua): lua_getinfo + 0x10A2
2021-08-21 02:27:35.028 INFO    EDCORE: 0x0000000000006950 (lua): lua_getinfo + 0xD10
2021-08-21 02:27:35.028 INFO    EDCORE: 0x0000000000017BB5 (lua): luaS_newlstr + 0x4075
2021-08-21 02:27:35.028 INFO    EDCORE: 0x0000000000002AA0 (lua): lua_concat + 0x40
2021-08-21 02:27:35.028 INFO    EDCORE: 0x000000000035A82A (dcs_grpc_server): luaopen_dcs_grpc_server + 0x1FCFBA
2021-08-21 02:27:35.028 INFO    EDCORE: 0x0000000000357DD3 (dcs_grpc_server): luaopen_dcs_grpc_server + 0x1FA563
2021-08-21 02:27:35.029 INFO    EDCORE: 0x0000000000007AC5 (lua): luaD_growstack + 0x845
2021-08-21 02:27:35.029 INFO    EDCORE: 0x0000000000018D94 (lua): luaS_newlstr + 0x5254
2021-08-21 02:27:35.029 INFO    EDCORE: 0x0000000000007E04 (lua): luaD_growstack + 0xB84
2021-08-21 02:27:35.029 INFO    EDCORE: 0x0000000000006F7F (lua): lua_getinfo + 0x133F
2021-08-21 02:27:35.029 INFO    EDCORE: 0x000000000000812E (lua): lua_yield + 0x9E
2021-08-21 02:27:35.030 INFO    EDCORE: 0x000000000001EEAC (lua): luaL_newstate + 0x21FC
2021-08-21 02:27:35.030 INFO    EDCORE: 0x0000000000007AC5 (lua): luaD_growstack + 0x845
2021-08-21 02:27:35.030 INFO    EDCORE: 0x0000000000018D94 (lua): luaS_newlstr + 0x5254
2021-08-21 02:27:35.030 INFO    EDCORE: 0x0000000000007E04 (lua): luaD_growstack + 0xB84
2021-08-21 02:27:35.030 INFO    EDCORE: 0x0000000000006F7F (lua): lua_getinfo + 0x133F
2021-08-21 02:27:35.030 INFO    EDCORE: 0x000000000000812E (lua): lua_yield + 0x9E
2021-08-21 02:27:35.031 INFO    EDCORE: 0x0000000000002576 (lua): lua_pcall + 0x66
2021-08-21 02:27:35.031 INFO    EDCORE: 0x00000000000F2DF9 (edCore): ED_lua_pcall + 0x59
2021-08-21 02:27:35.031 INFO    EDCORE: 0x00000000000EE8F4 (edCore): Lua::Config::call_func + 0xF4
2021-08-21 02:27:35.031 INFO    EDCORE: 0x00000000007D3C37 (DCS): SW + 0x4E3107
2021-08-21 02:27:35.031 INFO    EDCORE: 0x00000000008886C3 (DCS): SW + 0x597B93
2021-08-21 02:27:35.032 INFO    EDCORE: 0x000000000088125D (DCS): SW + 0x59072D
2021-08-21 02:27:35.032 INFO    EDCORE: 0x000000000087E612 (DCS): SW + 0x58DAE2
2021-08-21 02:27:35.032 INFO    EDCORE: 0x00000000000C6A57 (Flight): woATC::updateCoalition + 0x687
2021-08-21 02:27:35.032 INFO    EDCORE: 0x000000000009D2DA (Flight): woATC::Control + 0x51A
2021-08-21 02:27:35.032 INFO    EDCORE: 0x00000000000576CC (Flight): wAirdrome::Control + 0xCC
2021-08-21 02:27:35.032 INFO    EDCORE: 0x000000000005B3E3 (Flight): wAirdrome::Init + 0xF33
2021-08-21 02:27:35.033 INFO    EDCORE: 0x0000000000003AD6 (World): wSimTrace::CommandsTraceDiscreteIsOn + 0x466
2021-08-21 02:27:35.033 INFO    EDCORE: 0x0000000000003F3D (World): wSimCalendar::DoActionsUntil + 0x1FD
2021-08-21 02:27:35.033 INFO    EDCORE: 0x00000000007ECA22 (DCS): SW + 0x4FBEF2
2021-08-21 02:27:35.033 INFO    EDCORE: 0x00000000007EC78E (DCS): SW + 0x4FBC5E
2021-08-21 02:27:35.033 INFO    EDCORE: 0x00000000008022EB (DCS): SW + 0x5117BB
2021-08-21 02:27:35.033 INFO    EDCORE: 0x00000000007D3194 (DCS): SW + 0x4E2664
2021-08-21 02:27:35.034 INFO    EDCORE: 0x00000000007D356D (DCS): SW + 0x4E2A3D
2021-08-21 02:27:35.034 INFO    EDCORE: 0x0000000001C7332F (DCS): AmdPowerXpressRequestHighPerformance + 0xE7D32B
2021-08-21 02:27:35.034 INFO    EDCORE: 0x0000000000A0B23E (DCS): SW + 0x71A70E
2021-08-21 02:27:35.034 INFO    EDCORE: 0x0000000000017034 (KERNEL32): BaseThreadInitThunk + 0x14
2021-08-21 02:27:35.034 INFO    EDCORE: 0x0000000000052651 (ntdll): RtlUserThreadStart + 0x21
2021-08-21 02:27:35.243 INFO    EDCORE: Minidump created.
2021-08-21 02:27:35.243 INFO    Lua::Config: stack traceback:

rurounijones changed the title from "DCS Server crash" to "DCS Server crashes on player connection" on Aug 21, 2021
@rurounijones
Contributor Author

After some investigation with Binary we have determined the following:

The crash seems to require a combination of a release build of the Rust library, a dedicated server, and a complex mission.

Commenting out world.addEventHandler(eventHandler) (that is, changing it to --world.addEventHandler(eventHandler)) seems to stop the crash from occurring.

@rkusa
Collaborator

rkusa commented Aug 21, 2021

I've tracked the cause of the crash down to https://github.com/khvzak/mlua/blob/01714d2510dfd7ee86d6beb7e1aefab3ccdd9abf/src/ffi/compat53.rs#L710. This line is part of the code responsible for creating a stack trace when we return an error to Lua. Our code encounters an error and correctly returns it to Lua, but the code that prepares the stack trace of this error for logging/display leads to the crash. Basically, we have an error in the error handler.

After a lot of debugging, I think the cause is not within mlua though. I've checked it multiple times and the code looks sound. While debugging I was also able to receive a proper error instead of a crash once in a while. The error was:

Error in event handler: attempt to concatenate a userdata value

This leads me to believe that this must be an issue of code accessing the Lua stack concurrently and thus polluting it with non-strings before the linked code above calls lua_concat on a stack it believes to consist only of strings.
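
As a rough illustration of the failure mode described above, here is a minimal sketch in plain Rust. It is not the mlua or Lua implementation: the `Value` enum and the `concat` function are hypothetical stand-ins for stack slots and for a concatenation over them, and the model only distinguishes strings from userdata. It shows how a single foreign value on a stack that is assumed to hold only strings turns a traceback concatenation into an error.

```rust
// Hypothetical model of a Lua stack slot; not the real Lua or mlua API.
#[derive(Debug, Clone)]
enum Value {
    Str(String),
    Userdata,
}

// Simplified stand-in for concatenating stack slots:
// succeeds only if every slot is a string, fails on the first userdata.
fn concat(stack: &[Value]) -> Result<String, String> {
    let mut out = String::new();
    for value in stack {
        match value {
            Value::Str(s) => out.push_str(s),
            Value::Userdata => {
                return Err("attempt to concatenate a userdata value".to_string());
            }
        }
    }
    Ok(out)
}

fn main() {
    // The traceback code assumes it pushed only strings.
    let clean = vec![
        Value::Str("stack traceback:".to_string()),
        Value::Str("\n\t[C]: in ?".to_string()),
    ];
    println!("{:?}", concat(&clean));

    // Something else pushed a userdata in between (the suspected pollution).
    let mut polluted = clean.clone();
    polluted.insert(1, Value::Userdata);
    println!("{:?}", concat(&polluted));
}
```

In this sketch the failure surfaces as an ordinary `Err`. In the crash described here, the same kind of failure happens while an error is already being handled, which is presumably why it usually ends in an abort rather than a readable message.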

I don't think that the concurrent access is introduced by our module, I could be wrong though.

I'll thus push a commit that is going to work around the bug by logging the error directly, instead of returning it to Lua.
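
The shape of such a workaround can be sketched as follows. This is an assumption-laden illustration, not the actual commit: `handle_event` and `handle_event_logged` are hypothetical names, the `unit:` prefix check is an invented stand-in for deserialization, and a `Vec<String>` stands in for the real logger. The point is only that the error is consumed at the boundary, so no error value (and therefore no traceback) is ever handed back to Lua.

```rust
// Hypothetical stand-in for the module's event handler; not the real DCS-gRPC code.
fn handle_event(raw: &str) -> Result<(), String> {
    if raw.starts_with("unit:") {
        Ok(())
    } else {
        Err(format!("failed to deserialize event: unexpected value {:?}", raw))
    }
}

// Boundary wrapper: logs the error instead of propagating it to the caller.
// The caller always sees a normal return, so it never builds a traceback.
fn handle_event_logged(raw: &str, log: &mut Vec<String>) {
    if let Err(err) = handle_event(raw) {
        log.push(format!("ERROR {}", err));
    }
}

fn main() {
    let mut log = Vec::new();
    handle_event_logged("unit:1", &mut log);
    handle_event_logged("ammo|347|2 object", &mut log);
    for line in &log {
        eprintln!("{}", line);
    }
}
```

The trade-off is that the Lua side no longer learns that the event failed; the failure is only visible in the module's own log.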

With the change, we are seeing the culprit of all of this:

2021-08-21T18:11:36.148756100+02:00 ERROR dcs_grpc_server - failed to deserialize event: deserialize error: invalid type: string "ammo|347|2 object", expected struct Unit
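
To make the log line above easier to read, here is a small sketch of the kind of type mismatch it reports. The `Unit` struct, the `Raw` enum, and `to_unit` are hypothetical; the real event and `Unit` definitions are not shown in this thread, and the real module uses a deserializer rather than a hand-written match. The sketch only assumes that the handler expects a structured unit and instead receives a plain string.

```rust
// Hypothetical shapes; the real definitions are not part of this thread.
#[derive(Debug, PartialEq)]
struct Unit {
    name: String,
}

// What the Lua side might hand over: a table describing a unit, or a bare string.
#[derive(Debug)]
enum Raw {
    Table { name: String },
    Str(String),
}

// Strict conversion: anything that is not a table is rejected.
fn to_unit(raw: &Raw) -> Result<Unit, String> {
    match raw {
        Raw::Table { name } => Ok(Unit { name: name.clone() }),
        Raw::Str(s) => Err(format!(
            "invalid type: string {:?}, expected struct Unit",
            s
        )),
    }
}

fn main() {
    let ok = Raw::Table { name: "Aerial-1".to_string() };
    let bad = Raw::Str("ammo|347|2 object".to_string());
    println!("{:?}", to_unit(&ok));
    println!("{:?}", to_unit(&bad));
}
```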

rkusa added a commit that referenced this issue Aug 21, 2021
rkusa added a commit that referenced this issue Aug 21, 2021
@rurounijones
Contributor Author

Confirmed that the mission that was previously crashing is no longer crashing. Not sure if you want to keep this issue open until you find a full fix / a bug report to report to the mlua folks to link to or not so I shall leave its status unchanged for the moment.

@rkusa
Collaborator

rkusa commented Aug 23, 2021

> Not sure if you want to keep this issue open until you find a full fix / a bug report to report to the mlua folks to link to or not so I shall leave its status unchanged for the moment.

I think the concurrent access is coming from DCS, so I don't think that there is anything wrong on mlua's side, which is why I am not planning to create a bug report in their repo. I'd like to keep the issue open as a reminder to myself in case I feel like looking into it again, to see if I can pinpoint the source of the concurrent access and confirm the assumptions I've made so far.

@rkusa
Collaborator

rkusa commented Oct 12, 2021

Actually, I'd rather close the issue as the crash doesn't happen anymore and if people see the issue they might think that it still happens.
