Fix Criu test failures for JITServer SSL Tests #18225

Merged 1 commit on Oct 6, 2023

SajinaKandy
Contributor

@SajinaKandy SajinaKandy commented Oct 3, 2023

Recently added CRIU tests are failing in some cases because of the required condition, which can never be satisfied when no vlog is produced, for example when CRIU can't acquire the original thread IDs on restore. In such cases the output will have something similar to:

 [OUT] pie: 3450382: Error (criu/pie/restorer.c:1833): Unable to create a thread: 3450384
 [OUT] pie: 3450384: Error (criu/pie/restorer.c:598): Thread pid mismatch 3450384/3450383
 [OUT] pie: 3450384: Error (criu/pie/restorer.c:651): Restorer abnormal termination for 3450382
 [OUT] Error (criu/cr-restore.c:2536): Restoring FAILED.
 [OUT] Error (criu/cr-restore.c:1494): 3450382 exited, status=1

The test conditions need to be altered to account for this expected scenario and silently succeed in such cases.

Fixes: #18148
Fixes: #18140

@SajinaKandy
Contributor Author

SajinaKandy commented Oct 3, 2023

@dsouzai this is the proposed fix for the test failures reported in #18148 and #18140. Can you please review?

Contributor

@mpirvu mpirvu left a comment

LGTM

@dsouzai
Contributor

dsouzai commented Oct 4, 2023

Would it be better to not remove the output that was previously required and instead change it to success? The new failure strings are fine, but I'm just wondering if it's possible that neither "Connected to a server" nor "Could not connect to a server" gets printed (even in the case where the CRIU restore succeeds). Is there a guarantee that one of those strings will always get printed?

@SajinaKandy
Contributor Author

@dsouzai Ideally I would think that either "Connected to a server" or "Could not connect to a server" should be captured in the vlogs, depending on whether the connection to the server succeeded or failed. @mpirvu Please confirm if this is correct.
Also, my reasoning for changing the required condition to failure is that I wanted the tests to check whether the connection succeeded or failed under the different conditions, and I felt the success condition is more forgiving than either required or failure.

@mpirvu
Contributor

mpirvu commented Oct 5, 2023

I would think that either of the Connected to a server or Could not connect to a server should be captured in the vlogs based on connection success/failure to the server. @mpirvu Please confirm if this is correct.

As long as the JVM being tested works in client mode, one of the two messages will be printed.

@SajinaKandy
Contributor Author

@dsouzai I have added extra conditions which should take care of both cases.

@dsouzai
Contributor

dsouzai commented Oct 6, 2023

jenkins test sanity.functional xlinux,plinux,zlinux jdk17

@dsouzai
Contributor

dsouzai commented Oct 6, 2023

The Linux failures are due to existing issues:

[2023-10-06T18:08:05.926Z] ***[TEST INFO 2023/10/06 14:08:01] ProcessKiller detected a timeout after 300000 milliseconds!***
[2023-10-06T18:08:05.926Z] INFO: Cannot find '/usr/bin/gdb' using 'gdb' from the path.
[2023-10-06T18:08:05.926Z] ***[TEST INFO 2023/10/06 14:08:01] executing gdb -batch -x /tmp/debugger4185934162318569605.txt bash 3545025***
[2023-10-06T18:08:05.926Z] java.io.IOException: Cannot run program "gdb": error=2, No such file or directory
[2023-10-06T18:08:05.926Z] 	at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1143)
[2023-10-06T18:08:05.926Z] 	at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1073)
[2023-10-06T18:08:05.926Z] 	at java.base/java.lang.Runtime.exec(Runtime.java:594)
[2023-10-06T18:08:05.926Z] 	at java.base/java.lang.Runtime.exec(Runtime.java:453)
[2023-10-06T18:08:05.926Z] 	at Test$ProcessKiller.captureCoreForProcess(Test.java:697)
[2023-10-06T18:08:05.926Z] 	at Test$ProcessKiller.captureCoreForProcess(Test.java:646)
[2023-10-06T18:08:05.926Z] 	at Test$ProcessKiller.run(Test.java:596)
[2023-10-06T18:08:05.926Z] Caused by: java.io.IOException: error=2, No such file or directory
[2023-10-06T18:08:05.926Z] 	at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:314)
[2023-10-06T18:08:05.926Z] 	at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:244)
[2023-10-06T18:08:05.926Z] 	at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1110)
[2023-10-06T18:08:05.926Z] 	... 6 more
[2023-10-06T18:08:05.926Z] ***[TEST INFO 2023/10/06 14:08:01] executing kill -ABRT 3545025***

see #17844 (comment),

[2023-10-06T18:17:41.585Z]  [OUT] Pre-checkpoint
[2023-10-06T18:17:41.585Z]  [OUT] Performing CRIUSupport.checkpointJVM(), current thread name: main, Fri Oct 06 14:17:37 EDT 2023, System.currentTimeMillis(): 1696616257751, System.nanoTime(): 13577329179716472
[2023-10-06T18:17:41.585Z]  [OUT] Exception in thread "main" org.eclipse.openj9.criu.SystemCheckpointException: Could not dump the JVM processes, err=-52
[2023-10-06T18:17:41.585Z]  [OUT] 	at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.checkpointJVMImpl(Native Method)
[2023-10-06T18:17:41.585Z]  [OUT] 	at openj9.criu/org.eclipse.openj9.criu.CRIUSupport.checkpointJVM(CRIUSupport.java:811)
[2023-10-06T18:17:41.585Z]  [OUT] 	at org.openj9.criu.CRIUTestUtils.checkPointJVM(CRIUTestUtils.java:77)
[2023-10-06T18:17:41.585Z]  [OUT] 	at org.openj9.criu.OptionsFileTest.traceOptionsTest3(OptionsFileTest.java:235)
[2023-10-06T18:17:41.585Z]  [OUT] 	at org.openj9.criu.OptionsFileTest.main(OptionsFileTest.java:57)
[2023-10-06T18:17:41.585Z]  [OUT] Error (criu/protobuf.c:72): Unexpected EOF on (empty-image)
[2023-10-06T18:17:41.585Z]  [OUT] Removed test output files
[2023-10-06T18:17:41.585Z]  [OUT] finished script

see #16945

@dsouzai dsouzai merged commit bb64850 into eclipse-openj9:master Oct 6, 2023