Windows JDK8 jvmti tests failed with "NPT ERROR: Cannot open library" #2129

Closed
liqunl opened this issue Jun 8, 2018 · 20 comments

@liqunl
Contributor

liqunl commented Jun 8, 2018

https://ci.eclipse.org/openj9/job/PullRequest-Sanity-JDK8-win_x86-64_cmprssptrs-OpenJ9/36/tapResults/
https://ci.eclipse.org/openj9/job/PullRequest-Sanity-JDK8-win_x86-64_cmprssptrs-OpenJ9/48/tapResults/

Error message:

===============================================
Running test cmdLineTester_jvmtitests_extended_3 ...
===============================================
cmdLineTester_jvmtitests_extended_3 Start Time: Wed Jun  6 12:08:27 2018 Epoch Time (ms): 1528304907512
test with Mode154
NPT ERROR: Cannot open library

cmdLineTester_jvmtitests_extended_3_FAILED

Failing tests are

cmdLineTester_jvmtitests_extended_3
cmdLineTester_jvmtitests_extended_7
cmdLineTester_jvmtitests_extended_8
cmdLineTester_jvmtitests_extended_9
cmdLineTester_jvmtitests_extended_11
decompileAtMethodResolve_0
@liqunl
Contributor Author

liqunl commented Jun 11, 2018

Note that there are nightly builds containing newer commits than the PRs that failed the tests. I downloaded one nightly build and ran the same tests on my laptop, but could not reproduce the error. Since we don't archive builds for PRs, I have no way to test the failing builds.

When I ran the test on my laptop with npt.dll removed, the same error occurred. Thus, I believe the error is not caused by jit/vm/gc changes. If the same tests pass in the nightly build but fail in PR builds, I suspect there is a problem in the packaging process for Windows builds triggered by the PR Jenkins sanity test. @smlambert @llxia
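A quick way to compare a passing build with a failing one is to check whether the NPT library is present in the JDK image at all. The following is only a minimal sketch: the function name `find_npt_libraries` is hypothetical, the file names come from this thread (npt.dll on Windows, libnpt.so elsewhere), and the whole tree is searched so that no particular directory layout is assumed.

```python
import os

# File names mentioned in this thread: npt.dll on Windows, libnpt.so elsewhere.
NPT_NAMES = ("npt.dll", "libnpt.so")


def find_npt_libraries(jdk_root):
    """Return the sorted paths of every NPT library file found under jdk_root.

    Hypothetical helper for illustration; it walks the whole tree instead of
    assuming where a given build places the library.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(jdk_root):
        for name in filenames:
            if name.lower() in NPT_NAMES:
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Calling `find_npt_libraries(os.environ["JAVA_HOME"])` on an unpacked build returns an empty list when the library is missing from the package. A non-empty result means the file is present, so the loader is presumably looking in the wrong place.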

@pshipton
Member

pshipton commented Jun 11, 2018

A few builds are kept as attachments to the build jobs. You'll need to grab it quickly after a failure occurs.

I couldn't find a failed job that was still available, but as an example see the "Build Artifacts" of https://ci.eclipse.org/openj9/job/Build-JDK8-win_x86-64_cmprssptrs/84/

@liqunl
Contributor Author

liqunl commented Jun 11, 2018

According to @AdamBrousseau, we don't keep the builds for PR Jenkins tests. I didn't find any link to the failing build in the failing PRs. The "Build Artifacts" of this failing build contain only the test results: https://ci.eclipse.org/openj9/job/PullRequest-Extended-JDK8-win_x86-64_cmprssptrs-OpenJ9/3/

The latest nightly build is more advanced than the failing PRs, yet the failure is not reproducible on the nightly builds. Thus the failure is not caused by any jit/gc/vm commits. I think the problem is in the machine, the testing, or the packaging process, with the last one being the most likely.

If anyone can provide me a failing build, I'd love to test it on my laptop.

@pshipton
Member

You need to look at the "Build" job which created the JVM used by the Extended test job.

@pshipton
Member

Maybe I was looking at the wrong type of builds.

@smlambert
Contributor

To be clear, the jvmti_extended tests should be renamed (no test name should contain the level it belongs to), especially since they are tagged as sanity tests, so you'd look to see which JVM is used by the nightly sanity test job.

I believe some tests are failing because they are running with compressedrefs command-line parameters against a nocompressedrefs build.

@pshipton
Member

I believe some tests are failing because they are running with compressedrefs command-line parameters against a nocompressedrefs build.

This should result in an error from the VM that makes the problem obvious; see #1777 (comment).

@AdamBrousseau
Contributor

I think @liqunl was having trouble getting an SDK that could reproduce the error, since the nightly builds are the only ones we archive and the error was never seen in the nightly. If your PR is still seeing this error, we could hack the Jenkins files in your PR in order to archive the SDK for your build. PM me if you want to discuss going this route.

@liqunl
Contributor Author

liqunl commented Jun 12, 2018

My PR has been merged.

I want to clarify that this is not caused by any of the PRs that failed the test, for two reasons:

  1. This failure has been seen on PRs with completely different changes.
  2. Two of the PRs that failed the tests have been merged, and the nightly and continuous builds containing the changes from those two PRs have passed.

Correct me if I'm wrong, but it looks like the failing tests have never passed in a PR build since they were added.

I think we need to understand why this error only occurs in PR builds but not nightly/continuous builds or internal personal builds, i.e. what's different in the PR builds. It can be the machine, the testing environment, or the resulting jdk.

All in all, given its nature, I think we need someone from the test or infra team to take a look at this issue. I tried my best last Friday, but I can't help if it is not a jit issue.

@babsingh
Contributor

babsingh commented Jun 12, 2018

Possible solution: create a link in /usr/local/lib that points to the libnpt.so under $JAVA_HOME/jre/lib:
ln -s <JAVA_HOME>/jre/lib/amd64/libnpt.so /usr/local/lib/libnpt.so

The above solution works on FreeBSD. For Windows, Adam said that the nightly builds work fine, but NPT ERROR: Cannot open library is encountered in the PR builds.

NPT ERROR: Cannot open library is related to the option -Xrunjdwp:transport=dt_socket,address=8888,server=y,onthrow=no.pkg.foo,launch=echo and the absence of libnpt.so (npt.dll on Windows). We need to figure out why the required dependencies are missing in the PR builds.
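To try a given build outside the test harness, something along these lines might help. It is a rough sketch, not the harness's actual invocation: the option string and the error text are copied from this thread, while the function names and the use of `-version` as a short-lived command that still loads the agent are assumptions.

```python
import subprocess

# Agent option string quoted in this thread.
JDWP_OPTION = ("-Xrunjdwp:transport=dt_socket,address=8888,server=y,"
               "onthrow=no.pkg.foo,launch=echo")
# Error text reported by the failing tests.
NPT_ERROR = "NPT ERROR: Cannot open library"


def build_command(java_exe):
    """Build the argument list for a short run with the JDWP agent enabled.

    "-version" is an assumption: any quick invocation that loads the agent
    would serve the same purpose.
    """
    return [java_exe, JDWP_OPTION, "-version"]


def has_npt_error(output):
    """Return True if the captured output contains the NPT error line."""
    return NPT_ERROR in output


def reproduces_npt_error(java_exe):
    """Run the launcher at java_exe and report whether the NPT error appears."""
    result = subprocess.run(build_command(java_exe),
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            text=True,
                            timeout=60)
    return has_npt_error(result.stdout)
```

Running `reproduces_npt_error(...)` with the `java` launcher from a nightly build and from a PR build should show whether the difference lies in the SDK itself or in the machine running the tests.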

@llxia
Contributor

llxia commented Jun 13, 2018

Not related to the failure, but the test level is wrong: it should be extended, not sanity. Issue #2165 has been opened to address this.

@pshipton pshipton changed the title Windows jvmti tests failed with "NPT ERROR: Cannot open library" Windows JDK8 jvmti tests failed with "NPT ERROR: Cannot open library" Jun 17, 2018
@llxia
Contributor

llxia commented Jun 18, 2018

TestRefreshGCSpecialClassesCache_* has also been seen failing with the NPT ERROR: Cannot open library error in a PR build. https://ci.eclipse.org/openj9/job/PullRequest-Extended-JDK8-win_x86-OpenJ9/1/tapResults/

llxia added a commit to llxia/openj9 that referenced this issue Jul 23, 2018
- disable cmdLineTester_jvmtitests_extended on win until eclipse-openj9#2129 is fixed
- rename cmdLineTester_jvmtitests_extended to
cmdLineTester_jvmtitests_debug

Issue: eclipse-openj9#2129 eclipse-openj9#2147
[ci skip]

Signed-off-by: lanxia <[email protected]>
fengxue-IS pushed a commit to fengxue-IS/openj9 that referenced this issue Jul 24, 2018
- disable cmdLineTester_jvmtitests_extended on win until eclipse-openj9#2129 is fixed
- rename cmdLineTester_jvmtitests_extended to
cmdLineTester_jvmtitests_debug

Issue: eclipse-openj9#2129 eclipse-openj9#2147
[ci skip]

Signed-off-by: lanxia <[email protected]>
theresa-m added a commit to theresa-m/openj9 that referenced this issue Jul 25, 2018
Remove compressedrefs directory from sun.boot.library.path. Our extensions expect this property to have only one path. See full description in eclipse-openj9#2129.

Fixes: eclipse-openj9#2129

Signed-off-by: Theresa Mammarella <[email protected]>
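
The single-path expectation described in the commit message above can be illustrated with a small sketch. This is not the JDK's actual code. The function names, the example directories, and the `;` separator are hypothetical, chosen only to show why a library lookup that treats sun.boot.library.path as one directory breaks once a second directory is added to the property.

```python
def naive_lib_path(boot_library_path, lib_name, dir_sep="\\"):
    """Join the property value and the library name as if the value were ONE directory."""
    return boot_library_path + dir_sep + lib_name


def candidate_lib_paths(boot_library_path, lib_name, path_sep=";", dir_sep="\\"):
    """Split the property value first, then build one candidate per directory."""
    dirs = [d for d in boot_library_path.split(path_sep) if d]
    return [d + dir_sep + lib_name for d in dirs]


# Hypothetical values for illustration only.
single = "C:\\jdk\\jre\\bin"
double = "C:\\jdk\\jre\\bin\\compressedrefs;C:\\jdk\\jre\\bin"

print(naive_lib_path(single, "npt.dll"))       # a usable file path
print(naive_lib_path(double, "npt.dll"))       # two directories fused into one bogus path
print(candidate_lib_paths(double, "npt.dll"))  # one candidate per directory
```

With a single directory the naive join yields a usable path. With two directories it yields a string that names no real file, which would be consistent with a "Cannot open library" failure even though the library is present in the image.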
theresa-m added a commit to theresa-m/openj9-openjdk-jdk8 that referenced this issue Jul 27, 2018
@fjeremic
Contributor

Judging from the giant list of references above, it seems we are hitting this issue every single time we launch a build. This seems like a high-priority issue to me. @pshipton @AdamBrousseau any leads as to what is causing it, or any potential workarounds? If not, we should disable the tests until we get this fixed, IMO.

@AdamBrousseau
Contributor

Yes, this has a 100% failure rate in PR builds but 0% in the nightly build. I doubt we want to disable the tests that are failing, since they pass in the nightly. Theresa has indicated above why it is failing. I'm not sure if there is a status update on a possible solution. We could disable the Windows PR builds, but it's likely desirable to have Windows testing even if there's a known issue with some of the tests.

@pshipton
Member

The fix is ibmruntimes/openj9-openjdk-jdk8#106, but it's not ready to merge yet and won't be until next week unless somebody else wants to address the review comments.

@pshipton
Member

pshipton commented Aug 2, 2018

The immediate problem is fixed (for Windows) by ibmruntimes/openj9-openjdk-jdk8#107. Opened #2547 to handle remaining platforms and other jdk versions.

@pshipton pshipton closed this as completed Aug 2, 2018
keithc-ca added a commit to keithc-ca/openj9 that referenced this issue Aug 7, 2018