CRIU Unhandled exception in criu-portable-restore_test_0 on JDK11 xlinux #16807

Closed
llxia opened this issue Mar 1, 2023 · 2 comments

llxia commented Mar 1, 2023

Failure link

This is a portable CRIU test.

00:06:37.712  openjdk version "11.0.19-beta" 2023-04-18
00:06:37.712  IBM Semeru Runtime Open Edition 11.0.19+3-202302282304 (build 11.0.19-beta+3-202302282304)
00:06:37.712  Eclipse OpenJ9 VM 11.0.19+3-202302282304 (build master-049495700, JRE 11 Linux amd64-64-Bit Compressed References 20230228_655 (JIT enabled, AOT enabled)
00:06:37.712  OpenJ9   - 049495700
00:06:37.712  OMR      - 084e87a92
00:06:37.712  JCL      - e2754db261 based on jdk-11.0.19+3)

Optional info

The exception can be reproduced in Grinder by running criu-portable-restore_test_0.

Note: since this is a CRIU container test, the SDK inside the container is used. That is, the value of CUSTOMIZED_SDK_URL in Grinder is irrelevant for this test.

Failure output (captured from console output)

00:02:30.959  ===============================================
00:02:30.959  Running test criu-portable-restore_test_0 ...
00:02:30.959  ===============================================
...
00:03:20.774  sudo podman run --privileged  --tmpfs /run  --name restore-test --rm sys-rt-docker-local.artifactory.swg-devops.com/criu_image_upload/11-openj9-ubuntu-linux_x86-64-hw.arch.x86.skylake:latest
00:03:20.774  Restore tests from Checkpoint
00:03:20.774  export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
00:03:20.774  export LD_BIND_NOT=on
00:03:20.774  Error (criu/util.c:643): exited, status=3
00:03:20.774  Error (criu/util.c:643): exited, status=3
00:03:20.774  Warn  (criu/kerndat.c:1421): Can't get pidfd
00:03:20.774  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:03:20.774  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:03:20.774  Test cmdLineTester_criu_keepCheckpoint_0 passed
00:03:20.774  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:03:20.774  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:03:20.774  Test cmdLineTester_criu_keepCheckpoint_1 passed
00:03:20.774  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:03:20.774  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:03:20.774  Test cmdLineTester_criu_keepCheckpoint_2 passed
00:03:20.774  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:03:20.774  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:03:20.774  Test cmdLineTester_criu_keepCheckpoint_3 passed
00:03:20.775  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:03:20.775  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:03:20.775  Test cmdLineTester_criu_keepCheckpoint_4 passed
00:03:20.775  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:03:20.775  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:03:20.775  Test cmdLineTester_criu_keepCheckpoint_5 passed
00:03:20.775  Pulling image sys-rt-docker-local.artifactory.swg-devops.com/criu_image_upload/11-openj9-ubuntu-linux_x86-64-hw.arch.x86.amd:latest
00:03:20.775  Trying to pull sys-rt-docker-local.artifactory.swg-devops.com/criu_image_upload/11-openj9-ubuntu-linux_x86-64-hw.arch.x86.amd:latest...
00:03:20.775  Getting image source signatures
00:03:20.775  Copying blob sha256:9f11606e6a47250865e837ff79d8f88b9212198cfe75dba2d31bed927e0032f2
00:03:20.775  Copying blob sha256:2509959bbe69d49720b1c1bdf08fbd2a9ad5ae1c9368d312827b523b6cacc308
00:03:20.775  Copying blob sha256:0e347860f0a56ef83f76da2b8af41e3382de27b44f2b868a50655b74106c5555
00:03:20.775  Copying blob sha256:177bb53647b1daf9440791525ed3a4e27b7c38040c2fe2f42029e9c29f217b6d
00:03:20.775  Copying blob sha256:ab1ba2d494ce72a300e302d0661cf578ec7b41f832f66fb9eee1fee29d56092c
00:03:20.775  Copying blob sha256:f6889a73b0204cc779785f9320e062dc8bd4096e22f51664a5e1205eba0e118b
00:03:20.775  Copying blob sha256:7470190450e159c88a43bc4e13ff1f4c31f404c2a569fe93a43cde0485c77277
00:03:20.775  Copying blob sha256:12b4fc74a3724ae2db99cfdef427f632c007cfee716c0b018b889e2fc8dac92d
00:03:20.775  Copying blob sha256:4c5459d5193bafb42cc6a7d81195a3b38bb835c822788a9c93110a2cff5cc4b8
00:03:24.490  Copying blob sha256:b69443d059d67920fc63a1255b33ddd44e1634ff3f0beb762fcf18ac3328553d
00:03:24.937  Copying blob sha256:54a1e86a6e48215192abd7ebab65640fc516bf23dd004cd5bd28806706af1cc8
00:03:24.937  Copying blob sha256:381ea2d406171b43032c6b3962da046bfddaff849616a91282696ca9638a045c
00:03:25.384  Copying blob sha256:21ca441035f068245c7c1ed78ef2bf7a2ed03625ff3483bd93ae8848ac6fe19d
00:03:25.384  Copying blob sha256:44cdf6eb484cc1f8eb7a0d00d8cc221d38b20f45c6d77250d8105373911eb30a
00:05:06.598  Copying config sha256:e3f9f901030dc24a1842532388015a6614dd073517e6d8692dcf4b9947b7ab7b
00:05:06.598  Writing manifest to image destination
00:05:06.598  Storing signatures
00:05:06.598  e3f9f901030dc24a1842532388015a6614dd073517e6d8692dcf4b9947b7ab7b
00:05:06.598  sudo podman run --privileged  --tmpfs /run  --name restore-test --rm sys-rt-docker-local.artifactory.swg-devops.com/criu_image_upload/11-openj9-ubuntu-linux_x86-64-hw.arch.x86.amd:latest
00:05:06.598  Restore tests from Checkpoint
00:05:06.598  export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
00:05:06.598  export LD_BIND_NOT=on
00:05:06.598  Error (criu/util.c:643): exited, status=3
00:05:06.598  Error (criu/util.c:643): exited, status=3
00:05:06.598  Warn  (criu/kerndat.c:1421): Can't get pidfd
00:05:06.598  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:05:06.598  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:05:06.598  Test cmdLineTester_criu_keepCheckpoint_0 failed
00:05:06.598  Total checkpoint(s) 1:
00:05:06.598  Pre-checkpoint
00:05:06.598  Performing CRIUSupport.checkpointJVM(), current thread name: main, Wed Mar 01 15:44:51 UTC 2023, System.currentTimeMillis(): 1677685491251, System.nanoTime(): 3177605804453282
00:05:06.598  Unhandled exception
00:05:06.598  Type=Segmentation error vmState=0x00200003
00:05:06.598  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
00:05:06.598  Handler1=00007FA8EE791130 Handler2=00007FA8EE4EAB50 InaccessibleAddress=0000000000000068
00:05:06.598  RDI=0000000000000000 RSI=0000000000000000 RAX=00000000000000EE RBX=00007FA8EF34BA10
00:05:06.598  RCX=00007FA8EF97E65C RDX=0000000000000000 R8=0000000000000000 R9=0000000000000000
00:05:06.598  R10=0000000000000000 R11=0000000000000000 R12=00007FA8EF34BB18 R13=00007FA8E819D628
00:05:06.598  R14=00007FA8EF30A4E0 R15=00007FA8EF3093E0
00:05:06.598  RIP=00007FA8EF9C50D2 GS=0000 FS=0000 RSP=00007FA8EF34B660
00:05:06.598  EFlags=0000000000010206 CS=0033 RBP=0000000000000008 ERR=0000000000000004
00:05:06.598  TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=0000000000000068
00:05:06.598  xmm0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.598  xmm1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.598  xmm2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.598  xmm3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.598  xmm4 00ff000000000000 (f: 0.000000, d: 7.063274e-304)
00:05:06.598  xmm5 bff0000000000000 (f: 0.000000, d: -1.000000e+00)
00:05:06.598  xmm6 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.598  xmm7 726d6f2f6372732f (f: 1668444928.000000, d: 1.570148e+243)
00:05:06.598  xmm8 6170736b726f772f (f: 1919907584.000000, d: 2.312844e+161)
00:05:06.598  xmm9 ffff000000ffffff (f: 16777215.000000, d: -nan)
00:05:06.598  xmm10 2020000000202020 (f: 2105376.000000, d: 5.966673e-154)
00:05:06.598  xmm11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.598  xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.598  xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.598  xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.598  xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.598  Module=/lib64/ld-linux-x86-64.so.2
00:05:06.598  Module_base_address=00007FA8EF9B4000
00:05:06.598  Target=2_90_20230228_655 (Linux 4.18.0-425.10.1.el8_7.x86_64)
00:05:06.598  CPU=amd64 (32 logical CPUs) (0x1f605e7000 RAM)
00:05:06.598  ----------- Stack Backtrace -----------
00:05:06.598   (0x00007FA8EF9C50D2 [ld-linux-x86-64.so.2+0x110d2])
00:05:06.598   (0x00007FA8EF9CCC3E [ld-linux-x86-64.so.2+0x18c3e])
00:05:06.598  ---------------------------------------
00:05:06.598  JVMDUMP039I Processing dump event "gpf", detail "" at 2023/03/01 18:00:36 - please wait.
00:05:06.598  JVMDUMP032I JVM requested System dump using '/aqa-tests/TKG/output_16776854903670/cmdLineTester_criu_keepCheckpoint_0/core.20230301.180036.1589.0001.dmp' in response to an event
00:05:06.598  JVMDUMP010I System dump written to /aqa-tests/TKG/output_16776854903670/cmdLineTester_criu_keepCheckpoint_0/core.20230301.180036.1589.0001.dmp
00:05:06.598  JVMDUMP032I JVM requested Java dump using '/aqa-tests/TKG/output_16776854903670/cmdLineTester_criu_keepCheckpoint_0/javacore.20230301.180036.1589.0002.txt' in response to an event
00:05:06.598  JVMDUMP010I Java dump written to /aqa-tests/TKG/output_16776854903670/cmdLineTester_criu_keepCheckpoint_0/javacore.20230301.180036.1589.0002.txt
00:05:06.598  JVMDUMP032I JVM requested Snap dump using '/aqa-tests/TKG/output_16776854903670/cmdLineTester_criu_keepCheckpoint_0/Snap.20230301.180036.1589.0003.trc' in response to an event
00:05:06.598  JVMDUMP010I Snap dump written to /aqa-tests/TKG/output_16776854903670/cmdLineTester_criu_keepCheckpoint_0/Snap.20230301.180036.1589.0003.trc
00:05:06.598  JVMDUMP032I JVM requested JIT dump using '/aqa-tests/TKG/output_16776854903670/cmdLineTester_criu_keepCheckpoint_0/jitdump.20230301.180036.1589.0004.dmp' in response to an event
00:05:06.598  JVMDUMP051I JIT dump occurred in 'main' thread 0x0000000000030900
00:05:06.598  JVMDUMP010I JIT dump written to /aqa-tests/TKG/output_16776854903670/cmdLineTester_criu_keepCheckpoint_0/jitdump.20230301.180036.1589.0004.dmp
00:05:06.598  JVMDUMP013I Processed dump event "gpf", detail "".
00:05:06.598  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:05:06.598  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:05:06.598  Test cmdLineTester_criu_keepCheckpoint_1 failed
00:05:06.598  Total checkpoint(s) 1:
00:05:06.598  Pre-checkpoint
00:05:06.598  Performing CRIUSupport.checkpointJVM(), current thread name: main, Wed Mar 01 15:44:52 UTC 2023, System.currentTimeMillis(): 1677685492494, System.nanoTime(): 3177607048176184
00:05:06.598  Unhandled exception
00:05:06.598  Type=Segmentation error vmState=0x00200003
00:05:06.598  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
00:05:06.598  Handler1=00007FCD66C1C130 Handler2=00007FCD66975B50 InaccessibleAddress=0000000000000068
00:05:06.598  RDI=0000000000000000 RSI=0000000000000000 RAX=00000000000000EE RBX=00007FCD677D6A10
00:05:06.598  RCX=00007FCD67E0965C RDX=0000000000000000 R8=0000000000000000 R9=0000000000000000
00:05:06.598  R10=0000000000000000 R11=0000000000000000 R12=00007FCD677D6B18 R13=00007FCD60169C88
00:05:06.598  R14=00007FCD677954E0 R15=00007FCD677943E0
00:05:06.598  RIP=00007FCD67E500D2 GS=0000 FS=0000 RSP=00007FCD677D6660
00:05:06.599  EFlags=0000000000010206 CS=0033 RBP=0000000000000008 ERR=0000000000000004
00:05:06.599  TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=0000000000000068
00:05:06.599  xmm0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.599  xmm1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.599  xmm2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.599  xmm3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.599  xmm4 00ff000000000000 (f: 0.000000, d: 7.063274e-304)
00:05:06.599  xmm5 bff0000000000000 (f: 0.000000, d: -1.000000e+00)
00:05:06.599  xmm6 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.599  xmm7 726d6f2f6372732f (f: 1668444928.000000, d: 1.570148e+243)
00:05:06.599  xmm8 6170736b726f772f (f: 1919907584.000000, d: 2.312844e+161)
00:05:06.599  xmm9 ffff000000ffffff (f: 16777215.000000, d: -nan)
00:05:06.599  xmm10 2020000000202020 (f: 2105376.000000, d: 5.966673e-154)
00:05:06.599  xmm11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.599  xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.599  xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.599  xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.599  xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
00:05:06.599  Module=/lib64/ld-linux-x86-64.so.2
00:05:06.599  Module_base_address=00007FCD67E3F000
00:05:06.599  Target=2_90_20230228_655 (Linux 4.18.0-425.10.1.el8_7.x86_64)
00:05:06.599  CPU=amd64 (32 logical CPUs) (0x1f605e7000 RAM)
00:05:06.599  ----------- Stack Backtrace -----------
00:05:06.599   (0x00007FCD67E500D2 [ld-linux-x86-64.so.2+0x110d2])
00:05:06.599   (0x00007FCD67E57C3E [ld-linux-x86-64.so.2+0x18c3e])
00:05:06.599  ---------------------------------------
00:05:06.599  JVMDUMP039I Processing dump event "gpf", detail "" at 2023/03/01 18:00:37 - please wait.
00:05:06.599  JVMDUMP032I JVM requested System dump using '/aqa-tests/TKG/output_16776854903670/cmdLineTester_criu_keepCheckpoint_1/core.20230301.180037.1776.0001.dmp' in response to an event
00:05:06.599  JVMDUMP010I System dump written to /aqa-tests/TKG/output_16776854903670/cmdLineTester_criu_keepCheckpoint_1/core.20230301.180037.1776.0001.dmp
00:05:06.599  JVMDUMP032I JVM requested Java dump using '/aqa-tests/TKG/output_16776854903670/cmdLineTester_criu_keepCheckpoint_1/javacore.20230301.180037.1776.0002.txt' in response to an event
00:05:06.599  JVMDUMP010I Java dump written to /aqa-tests/TKG/output_16776854903670/cmdLineTester_criu_keepCheckpoint_1/javacore.20230301.180037.1776.0002.txt
00:05:06.599  JVMDUMP032I JVM requested Snap dump using '/aqa-tests/TKG/output_16776854903670/cmdLineTester_criu_keepCheckpoint_1/Snap.20230301.180037.1776.0003.trc' in response to an event
00:05:06.599  JVMDUMP010I Snap dump written to /aqa-tests/TKG/output_16776854903670/cmdLineTester_criu_keepCheckpoint_1/Snap.20230301.180037.1776.0003.trc
00:05:06.599  JVMDUMP013I Processed dump event "gpf", detail "".
00:05:06.599  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
00:05:06.599  Error (amdgpu_plugin.c:1931): amdgpu_plugin: failed to open kfd in plugin: No such file or directory
...
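
Both dumps above name the same module and the same InaccessibleAddress. One way to check that they fault at the same instruction is to subtract Module_base_address from RIP for each dump. The sketch below does that with the values copied from the log; the helper name `fault_offset` and the `DUMPS` table are hypothetical and not part of any OpenJ9 or CRIU tooling.

```python
# Values copied by hand from the two "Unhandled exception" dumps in the log.
DUMPS = {
    "cmdLineTester_criu_keepCheckpoint_0": {
        "RIP": "00007FA8EF9C50D2",
        "Module_base_address": "00007FA8EF9B4000",
        "InaccessibleAddress": "0000000000000068",
    },
    "cmdLineTester_criu_keepCheckpoint_1": {
        "RIP": "00007FCD67E500D2",
        "Module_base_address": "00007FCD67E3F000",
        "InaccessibleAddress": "0000000000000068",
    },
}


def fault_offset(dump):
    """Return the faulting instruction's offset inside the loaded module."""
    return int(dump["RIP"], 16) - int(dump["Module_base_address"], 16)


if __name__ == "__main__":
    for name, dump in DUMPS.items():
        bad_addr = int(dump["InaccessibleAddress"], 16)
        print(name, hex(fault_offset(dump)), hex(bad_addr))
```

Both tests come out at offset 0x110d2, which matches the first backtrace frame (ld-linux-x86-64.so.2+0x110d2). TRAPNO=0xE with ERR=0x4 indicates a user-mode read of an unmapped page, so the low address 0x68 is consistent with a read at a small offset from a null pointer inside the dynamic loader.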

llxia commented Mar 1, 2023

I talked with @LongyuZhang, and this may be the same issue as #16382 (comment). We will try updating glibc to see whether that fixes the issue.
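
To compare the glibc version before and after such an update, a few lines of Python can print it from inside a container. This is only a sketch: it assumes a Python interpreter is available in the image, which the thread does not confirm, and the function names are hypothetical. `os.confstr("CS_GNU_LIBC_VERSION")` is glibc-specific, so the helper returns `None` elsewhere.

```python
import os


def parse_glibc_version(text):
    """Parse a string such as 'glibc 2.28' into the tuple (2, 28)."""
    name, _, version = text.partition(" ")
    if name != "glibc":
        raise ValueError(f"not a glibc version string: {text!r}")
    return tuple(int(part) for part in version.split("."))


def current_glibc_version():
    """Return the running glibc version, or None on non-glibc systems."""
    try:
        return parse_glibc_version(os.confstr("CS_GNU_LIBC_VERSION"))
    except (ValueError, OSError, AttributeError):
        # confstr is missing, the name is unknown, or the value is not glibc.
        return None


if __name__ == "__main__":
    print(current_glibc_version())
```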

llxia commented Mar 6, 2023

Thanks, @LongyuZhang. This issue is resolved.

llxia closed this as completed Mar 6, 2023