Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use gpuci_mamba_retry on Java CI. #9983

Merged
merged 4 commits into from
Jan 10, 2022

Conversation

bdice
Copy link
Contributor

@bdice bdice commented Jan 6, 2022

Resolves #9976.

@bdice bdice added 5 - DO NOT MERGE Hold off on merging; see PR for details improvement Improvement / enhancement to an existing function gpuCI labels Jan 6, 2022
@bdice bdice self-assigned this Jan 6, 2022
@codecov
Copy link

codecov bot commented Jan 6, 2022

Codecov Report

Merging #9983 (a879033) into branch-22.02 (967a333) will decrease coverage by 0.03%.
The diff coverage is n/a.

❗ Current head a879033 differs from pull request most recent head 630b584. Consider uploading reports for the commit 630b584 to get more accurate results
Impacted file tree graph

@@               Coverage Diff                @@
##           branch-22.02    #9983      +/-   ##
================================================
- Coverage         10.49%   10.45%   -0.04%     
================================================
  Files               119      119              
  Lines             20305    20417     +112     
================================================
+ Hits               2130     2134       +4     
- Misses            18175    18283     +108     
Impacted Files Coverage Δ
python/custreamz/custreamz/kafka.py 29.16% <0.00%> (-0.63%) ⬇️
python/dask_cudf/dask_cudf/sorting.py 92.30% <0.00%> (-0.61%) ⬇️
python/cudf/cudf/__init__.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/frame.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/index.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/parquet.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/series.py 0.00% <0.00%> (ø)
python/cudf/cudf/utils/utils.py 0.00% <0.00%> (ø)
python/cudf/cudf/utils/dtypes.py 0.00% <0.00%> (ø)
python/cudf/cudf/utils/ioutils.py 0.00% <0.00%> (ø)
... and 17 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 23603d1...630b584. Read the comment docs.

ci/gpu/java.sh Outdated Show resolved Hide resolved
@ajschmidt8
Copy link
Member

Hmmm. The good news is that mamba doesn't seem to be downgrading the packages anymore as I mentioned here. Looks like we're still seeing that same Could NOT find JNI (missing: JAVA_JVM_LIBRARY) error though.

@github-actions github-actions bot added the Java Affects Java cuDF API. label Jan 7, 2022
@bdice
Copy link
Contributor Author

bdice commented Jan 7, 2022

CI is failing because CMake cannot find the JNI despite openjdk being installed.

The issue is arising because conda install "openjdk=8.*" runs the activation script included in the package upon installation, while mamba install "openjdk=8.*" is not able to run that activation script on installation. That means that conda install is able to set the environment variable JAVA_HOME, while mamba does not. To get this to work, we'll need to reactivate the environment or manually set JAVA_HOME=$CONDA_PREFIX.

Bash script to reproduce the issue:

#!/bin/bash

# Enable activating conda environments
. $CONDA_PREFIX/etc/profile.d/conda.sh

export MAMBA_NO_BANNER=1

echo "---------conda---------"
conda --version
conda create -y -q -n conda_jdk_test
conda activate conda_jdk_test
conda install -y -q "openjdk=8.*"
echo "'which java' returns $(which java)"
echo "JAVA_HOME=${JAVA_HOME}"

conda deactivate
conda env remove --name conda_jdk_test
echo "-------end conda-------"
echo

echo "---------mamba---------"
mamba --version
mamba create -y -q -n mamba_jdk_test mamba
conda activate mamba_jdk_test
mamba install -y -q "openjdk=8.*"
echo "'which java' returns $(which java)"
echo "JAVA_HOME=${JAVA_HOME}"  # This is the bug -- JAVA_HOME is not set.
conda deactivate

echo "Retry mamba after re-activating:"

conda activate mamba_jdk_test
mamba install -y -q "openjdk=8.*"
echo "'which java' returns $(which java)"
echo "JAVA_HOME=${JAVA_HOME}"  # This is the bug -- JAVA_HOME is not set.
conda deactivate

conda env remove --name mamba_jdk_test
echo "-------end mamba-------"

Simplified output:

---------conda---------
conda 4.11.0

[... creating / installing ...]
Executing transaction: ...working... done
'which java' returns /home/bdice/mambaforge/envs/conda_jdk_test/bin/java
JAVA_HOME=/home/bdice/mambaforge/envs/conda_jdk_test
Remove all packages in environment /home/bdice/mambaforge/envs/conda_jdk_test:
-------end conda-------

---------mamba---------
mamba 0.19.1
conda 4.11.0

[... creating / installing ...]
Executing transaction: ...working... done
'which java' returns /home/bdice/mambaforge/envs/mamba_jdk_test/bin/java
JAVA_HOME=
Retry mamba after re-activating:
'which java' returns /home/bdice/mambaforge/envs/mamba_jdk_test/bin/java
JAVA_HOME=/home/bdice/mambaforge/envs/mamba_jdk_test
Remove all packages in environment /home/bdice/mambaforge/envs/mamba_jdk_test:
-------end mamba-------

It appears that this PR is related: conda/conda#10783 and particularly this comment: conda/conda#10783 (comment)

The problem in mamba is the following:

  • the environment has to be re-activated after an "install" command to properly execute activation scripts, hash the binaries on the PATH etc.

@github-actions github-actions bot removed the Java Affects Java cuDF API. label Jan 7, 2022
@bdice bdice added 3 - Ready for Review Ready for review by team and removed 5 - DO NOT MERGE Hold off on merging; see PR for details labels Jan 7, 2022
@bdice bdice marked this pull request as ready for review January 7, 2022 22:33
@bdice bdice requested a review from a team as a code owner January 7, 2022 22:33
@bdice bdice added the non-breaking Non-breaking change label Jan 7, 2022
@bdice bdice requested a review from ajschmidt8 January 7, 2022 22:33
@bdice
Copy link
Contributor Author

bdice commented Jan 7, 2022

Thanks @jlowe @ajschmidt8 @raydouglass for the pointers -- Java CI is now passing, this is ready for review and merge. ☕ 🚀

cc: @shwina @sameerz

@bdice bdice changed the title [TST] Try gpuci_mamba_retry on Java CI. Use gpuci_mamba_retry on Java CI. Jan 7, 2022
Copy link
Member

@jlowe jlowe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @bdice for driving this!

@sameerz
Copy link
Contributor

sameerz commented Jan 8, 2022

Thank you @bdice !

Copy link
Member

@ajschmidt8 ajschmidt8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good find! I wonder if there's an upstream mamba bug about this that we can track so that we can eventually remove this workaround.

@bdice
Copy link
Contributor Author

bdice commented Jan 10, 2022

Good find! I wonder if there's an upstream mamba bug about this that we can track so that we can eventually remove this workaround.

@ajschmidt8 There’s a conda issue that I Iinked above: conda/conda#10783 The issue is that mamba cannot currently hook into conda to trigger the post-install activation.

@bdice
Copy link
Contributor Author

bdice commented Jan 10, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 0d5ec7f into rapidsai:branch-22.02 Jan 10, 2022
@bdice
Copy link
Contributor Author

bdice commented Jan 11, 2022

For the sake of completeness, here's an analysis of the CI runtime before/after this PR. Comparing the CI build logs, I see a significant improvement in run time by switching to mamba.

CI Log Before (conda, commit a4dc42d), CI log After (mamba, commit 630b584), CI log
Total Runtime 0:58:59 0:11:42
Install dependencies 0:22:06 0:00:50
Installing libcudf and libcudf_kafka 0:27:21 0:01:04

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Java CI times out trying to solve conda env
4 participants