Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] Fix JVM tests on Windows #10404

Merged
merged 60 commits into from
Jun 15, 2024
Merged

[CI] Fix JVM tests on Windows #10404

merged 60 commits into from
Jun 15, 2024

Conversation

hcho3
Copy link
Collaborator

@hcho3 hcho3 commented Jun 10, 2024

@hcho3
Copy link
Collaborator Author

hcho3 commented Jun 10, 2024

Update: The xgboost4j.dll binary built by the CI pipeline is corrupted. I was able to reproduce the same error on my Windows machines.

Steps taken:

  1. Download xgboost4j.dll from https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds/fix_win/libxgboost4j/xgboost4j_6384bce6aae53b791187fd0db21e574a7a6c1100.dll
  2. Rename the file to xgboost4j.dll and put it under jvm-packages/xgboost4j/src/main/resources/lib/windows/x86_64.
  3. Open the Powershell and run:
$env:MAVEN_SKIP_NATIVE_BUILD=1
mvn test -B -pl :xgboost4j_2.12

@hcho3
Copy link
Collaborator Author

hcho3 commented Jun 12, 2024

I probably spent too much time on this. I will try to move the JVM tests to BuildKite and see if it works.

@hcho3
Copy link
Collaborator Author

hcho3 commented Jun 14, 2024

Given that 2.1.0 release is just around the corner, I am choosing the least disruptive option (Option 2).

@hcho3 hcho3 changed the title [WIP] [CI] Migrate Windows JVM tests to BuildKite [WIP] [CI] Fix JVM tests on Windows Jun 14, 2024
@hcho3
Copy link
Collaborator Author

hcho3 commented Jun 14, 2024

Even when I tried running Java and Python separately, I still get the same error (DLL initialization failed). If we cannot find a fix, we may need to consider excluding Windows binary in the upcoming release.

@trivialfis @Craigacp @jameslamb

@hcho3
Copy link
Collaborator Author

hcho3 commented Jun 14, 2024

As fate has it, the DLL error had nothing to do with duplication of core libraries. The DLL error was actually caused by improper mixing of two modes of linking for the C runtime.

Background: On Windows, you can choose to link with the C runtime dynamically (/MD build option) or statically (/MT). When building an application, it is imperative that all components are built with the same choice of runtime linking. You will get an undefined behavior when some components are built with /MD and others with /MT.
The current method of choosing between /MD and /MT was not sufficiently robust (#10344). So when the MSVC was updated, some of the components got the /MD flag when the correct default for XGBoost is /MT.

Fix. Use a more robust method to enforce the /MT option for all components. See #10344.

@hcho3 hcho3 changed the title [WIP] [CI] Fix JVM tests on Windows [CI] Fix JVM tests on Windows Jun 14, 2024
@hcho3 hcho3 requested a review from trivialfis June 14, 2024 21:54
@hcho3
Copy link
Collaborator Author

hcho3 commented Jun 15, 2024

@trivialfis Looks like the CI is unblocked now. Can you review?

Copy link
Member

@trivialfis trivialfis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huge thanks for the fix!

@hcho3 hcho3 merged commit 1ace9c6 into master Jun 15, 2024
53 checks passed
@hcho3 hcho3 deleted the fix_win branch June 15, 2024 07:21
hcho3 added a commit to hcho3/xgboost that referenced this pull request Jun 15, 2024
@hcho3 hcho3 mentioned this pull request Jun 15, 2024
14 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use CMAKE_MSVC_RUNTIME_LIBRARY to consistently handle linking mode for MSVC runtime
3 participants