Unpin or update many packages (mostly Python) in configs/common/packages.yaml, fix S4 site config #1384

Merged

Conversation

climbfuji
Collaborator

@climbfuji climbfuji commented Nov 17, 2024

Summary

In preparation for spack-stack-1.9.0, this PR unpins or updates several packages in configs/common/packages.yaml (mostly Python packages). Most notably, py-shapely (@ericlingerfelt FYI) and py-numpy (@DavidHuber-NOAA FYI) are updated.

The py-numpy update may require bug fixes with the Intel classic compiler that @DavidHuber-NOAA worked on and that are currently under review in spack develop (see #1276).

Included is an update of the S4 site config, which had several flaws that prevented building and testing this PR.

Testing

Applications affected

All.

Systems affected

None directly.

Dependencies

Issue(s) addressed

Resolves #1065
Working towards #1329

Checklist

  • This PR addresses one issue/problem/enhancement, or has a very good reason for not doing so.
  • These changes have been tested on the affected systems and applications.
  • All dependency PRs/issues have been resolved and this PR can be merged.

@climbfuji climbfuji changed the title WIP - Unpin or update many packages (mostly Python) in configs/common/packages.yaml Unpin or update many packages (mostly Python) in configs/common/packages.yaml Dec 6, 2024
@climbfuji climbfuji marked this pull request as ready for review December 6, 2024 18:59
@climbfuji
Collaborator Author

All, the NEPTUNE tests passed. Please start testing with UFS, JEDI, ... we need this PR in as soon as possible for spack-stack 1.9.0 (downstream PRs depend on it). Thanks!

@srherbener
Collaborator

@climbfuji I'll test JEDI/Skylab.

@climbfuji
Collaborator Author

Thanks very much @srherbener! You should expect problems with the shapely update (@ericlingerfelt may know more), but hopefully nothing else.

@srherbener
Collaborator

I'm getting concretize errors like this on several platforms:

[ue-intel] [sherbener@s4-submit ue-intel]$ spack concretize | tee log.concretize
==> Warning: Use of plain text `access_token` in mirror config is deprecated, use environment variables instead (access_token_variable)
==> Error: failed to concretize `jedi-geos-env%intel ^esmf@=8.6.1`, `ufs-srw-app-env%intel ^esmf@=8.6.1`, `global-workflow-env%intel ^esmf@=8.6.1`, `ewok-env%intel~cylc+ecflow`, ..., `gsi-env%intel` for the following reasons:
     1. cannot satisfy a requirement for package 'py-setuptools'.
==> Using cached archive: /data/users/sherbener/projects/spack-stack/cache/source_cache/blobs/sha256/3cc99d42a12e6c34bc187d5146cbdba71ca16d19163fe106f1a4cd3e59b01d2c
==> Using cached archive: /data/users/sherbener/projects/spack-stack/cache/source_cache/blobs/sha256/e8518de25baff7a74bdb42193e6e4b0496e7d0688434c42ce4bdc92fe4293a09
==> Installing "clingo-bootstrap@=spack%gcc@=10.2.1~docs+ipo+optimized+python+static_libstdcpp build_system=cmake build_type=Release generator=make patches=bebb819,ec99431 arch=linux-centos7-x86_64" from a buildcache

This particular message is from S4. Does this need another PR to be merged, or an update for the spack submodule commit hash?

Or perhaps pilot error. I did the following (on S4) before attempting to do the spack-stack build:

module purge
module load intel/2023.2 miniconda/3.8-s4

Is that the issue?

Thanks!

@climbfuji
Collaborator Author

Definitely no miniconda, those times are long gone

@srherbener
Collaborator

Sorry @climbfuji about all the questions. I'll spread out the questions elsewhere and I greatly appreciate the help and support that you provide. Unfortunately, I don't have the bandwidth for debugging all of these site configurations either. My real need is to just build somewhere so I can test this PR with skylab.

Hopefully S4 will be ready for you by the end of today so that you can test next week!

Thanks @climbfuji - much appreciated!

@climbfuji
Collaborator Author

> Sorry @climbfuji about all the questions. I'll spread out the questions elsewhere and I greatly appreciate the help and support that you provide. Unfortunately, I don't have the bandwidth for debugging all of these site configurations either. My real need is to just build somewhere so I can test this PR with skylab.
>
> Hopefully S4 will be ready for you by the end of today so that you can test next week!
>
> Thanks @climbfuji - much appreciated!

@srherbener /data/users/dheinzeller/spst-unpin-update/envs/ue-intel-2021.10.0/install/modulefiles/Core

@climbfuji climbfuji changed the title Unpin or update many packages (mostly Python) in configs/common/packages.yaml Unpin or update many packages (mostly Python) in configs/common/packages.yaml, fix S4 site config Dec 16, 2024
@srherbener
Collaborator

> @srherbener /data/users/dheinzeller/spst-unpin-update/envs/ue-intel-2021.10.0/install/modulefiles/Core

Thanks @climbfuji! I am building jedi-bundle now, and will run ctests and skylab.

@srherbener
Collaborator

Okay, I'll forge ahead with python 3.6.8.
Note that the documentation here https://spack-stack.readthedocs.io/en/latest/PreConfiguredSites.html#create-local-environment says to make sure that python 3.8+ is available and is the default.

That is from the old days when we forced spack to use an external Python in the environment. We should remove that. All we need is Python 3.6+ to drive spack and to build the environments.

Note, I just submitted a PR (#1420) to correct the documentation.
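
For reference, the Python 3.6+ requirement mentioned above can be checked with a short script before driving spack. This is only a sketch: the helper name `python_is_new_enough` and the `MIN_VERSION` constant are illustrative and are not part of spack or spack-stack.

```python
import sys

# Minimum interpreter version needed to drive spack, per the comment above.
# The constant name is an illustrative assumption.
MIN_VERSION = (3, 6)


def python_is_new_enough(version_info=None, minimum=MIN_VERSION):
    """Return True if the given (or the running) interpreter meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); the patch level does not matter here.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = "{}.{}.{}".format(*sys.version_info[:3])
    if python_is_new_enough():
        print("Python {} is sufficient".format(current))
    else:
        print("Python {} is too old; need {}.{}+".format(current, *MIN_VERSION))
```

With this check, a system Python such as 3.6.8 passes, while anything older than 3.6 is reported as too old.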

@srherbener
Collaborator

jedi-bundle looks good, I'm running skylab now.

Collaborator

@AlexanderRichert-NOAA AlexanderRichert-NOAA left a comment

Concretizes fine on personal machine, concretizes and builds UWM deps fine on Hera.

@climbfuji
Collaborator Author

> Concretizes fine on personal machine, concretizes and builds UWM deps fine on Hera.

Yay! Thanks for testing :-)

climbfuji added a commit that referenced this pull request Dec 18, 2024
…ustomization of common packages for Derecho (#1409)

1. Update esmf from 8.8.0b06 to 8.8.0b10
2. Simplify/clean up modification of common packages in Derecho site config (move from openblas to intel-oneapi-mkl, since openblas no longer builds - neither 0.3.24 nor 0.3.28)
3. Configure LLVM 19.1.4 for Bounty; this requires updating openblas from 0.3.24 to 0.3.28, which we cannot do globally yet (waiting for #1384); there is also a build issue for ip that we can't work around, for now one has to manually replace ip with sp in spack-ext/path/to/my/env/package.py.
@srherbener
Collaborator

Testing skylab on S4 has been difficult. There is something about the flow that keeps inviting the login resource monitor to kill ecflow. We tried several changes to essentially run the flow in series, but ecflow still gets killed. I think we found an actual issue with a couple of tasks based on matplotlib, but the matplotlib version didn't change (from 1.8.0). @fabiolrdiniz took a look and seemed to think that the matplotlib usage was not right, and felt that we need to look into that and address it by fixing the tasks.

It's not clear whether shapely has anything to do with the matplotlib tasks mentioned above, and it seems unlikely, but we also saw a plotting task fail (which is likely affected by shapely). This is expected given the issue with the old shapely version.

It seems that the forecasting and variational tasks are intact and working.

I think things are working well enough that it makes sense to go ahead and merge this PR and then move forward with fixing the issues we found. Do you agree @fabiolrdiniz?

@fabiolrdiniz
Contributor

Thanks for tagging me, @srherbener. I'm a little concerned about the tasks being killed on S4. What we were speculating about exceeding the number of tasks running on the login node seems not to be the problem (since we set the workflow to run serially). I wonder if the same test works fine with another spack-stack version.

I agree that the matplotlib-related crashes may reveal an issue with how we plot figures. However, what is not clear to me is how confident we are with the functionality of that package. Is it capable of creating simple plots like the one below?

#!/usr/bin/env python
import matplotlib.pyplot as plt
import numpy as np
x = np.array([1, 2, 3, 4])
y = x*2
plt.plot(x, y)
x1 = [2, 4, 6, 8]
y1 = [3, 5, 7, 9]
plt.plot(x1, y1, '-.')
plt.xlabel("X-axis data")
plt.ylabel("Y-axis data")
plt.title('multiple plots')
plt.tight_layout()
plt.savefig('test.png', format='png', dpi=300)
plt.close()

@srherbener
Collaborator

@fabiolrdiniz I just checked and the matplotlib version has been 3.7.4 since at least spack-stack-1.7.0. Perhaps something upstream from the plotting routine has changed (e.g., messages from the forecast or variational apps) and the tasks with matplotlib need to be adjusted?
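
For comparing environments, the installed versions of the packages under discussion can be listed from within each environment's Python. A minimal sketch, assuming Python 3.8+ (for `importlib.metadata`) inside the loaded environment; the helper names `installed_version` and `report_versions` are hypothetical:

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report_versions(dist_names):
    """Map each distribution name to its installed version (None if missing)."""
    return {name: installed_version(name) for name in dist_names}


if __name__ == "__main__":
    # The package list is illustrative; adjust it to the packages of interest.
    for name, version in report_versions(["matplotlib", "numpy", "shapely"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in two environments (for example, one from spack-stack-1.8.0 and one built from this PR) gives a quick side-by-side view of which packages actually changed.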

@srherbener
Collaborator

I'm building spack-stack with this PR feature branch, and corresponding spack submodule, on my Mac. I'll try testing jedi-bundle and skylab there and see what I get.

@srherbener
Collaborator

> Thanks for tagging me, @srherbener. I'm a little concerned about the tasks being killed on S4. What we were speculating about exceeding the number of tasks running on the login node seems not to be the problem (since we set the workflow to run serially). I wonder if the same test works fine with another spack-stack version.
>
> I agree that the matplotlib-related crashes may reveal an issue with how we plot figures. However, what is not clear to me is how confident we are with the functionality of that package. Is it capable of creating simple plots like the one below?
>
> #!/usr/bin/env python
> import matplotlib.pyplot as plt
> import numpy as np
> x = np.array([1, 2, 3, 4])
> y = x*2
> plt.plot(x, y)
> x1 = [2, 4, 6, 8]
> y1 = [3, 5, 7, 9]
> plt.plot(x1, y1, '-.')
> plt.xlabel("X-axis data")
> plt.ylabel("Y-axis data")
> plt.title('multiple plots')
> plt.tight_layout()
> plt.savefig('test.png', format='png', dpi=300)
> plt.close()

@fabiolrdiniz, I tried this script in my spack-stack-1.8.0 environment and my newly built environment from this PR on my Mac. In both cases the script worked and I got this result:
[Image: test]

@srherbener
Collaborator

With my new env based on this PR, jedi-bundle built successfully. I got 39 ctest failures, but that isn't too out of the ordinary for the Mac. Also, all of the fv3 and mpas tests pass, and there are only two soca test failures: one due to a floating point exception, the other due to a tolerance mismatch after the application completed successfully. All of the other test failures were ufo and saber tests.
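
To see at a glance how failures like these split across components, the failed test names can be tallied by prefix. A minimal sketch, assuming each test name starts with its component name followed by an underscore; the helper `tally_by_component` and the sample names are hypothetical and are not taken from the actual run:

```python
from collections import Counter


def tally_by_component(failed_tests):
    """Count failed tests by the prefix before the first underscore."""
    return Counter(name.split("_", 1)[0] for name in failed_tests)


if __name__ == "__main__":
    # Hypothetical test names for illustration only.
    failed = [
        "ufo_test_example_a",
        "ufo_test_example_b",
        "saber_test_example_c",
        "soca_test_example_d",
    ]
    # Print components from most to least failures.
    for component, count in tally_by_component(failed).most_common():
        print(f"{component}: {count}")
```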

I'm trying skylab now.

@fabiolrdiniz
Contributor

Thanks for checking, @srherbener! By any chance, have you tried to run one of the experiments with your laptop build? If yes, did it fail similarly to S4? Thanks again!

@fabiolrdiniz
Contributor

Sorry, only now I saw your previous message about SkyLab. We were writing messages at the same time. Thanks!

Collaborator

@srherbener srherbener left a comment

Skylab on my Mac is working! It successfully completed the gfs-3dfgat-c12 experiment and made it through the first cycle successfully in the skylab-atm-land-small experiment.

@climbfuji
Collaborator Author

> Skylab on my Mac is working! It successfully completed the gfs-3dfgat-c12 experiment and made it through the first cycle successfully in the skylab-atm-land-small experiment.

Wohoo, this is great news. Thanks very much for your efforts testing this PR. I'll merge now. This is going to unlock a lot of updates for spack-stack-1.9.0.

Thanks again everyone for testing and reviewing.

@climbfuji climbfuji merged commit 7e70899 into JCSDA:develop Dec 19, 2024
9 checks passed
@climbfuji climbfuji deleted the feature/unpin_update_common_packages branch December 19, 2024 00:24
@climbfuji climbfuji mentioned this pull request Dec 19, 2024
3 tasks
Development

Successfully merging this pull request may close these issues.

Update shapely from 1.8.0 to latest version 2.x.y
5 participants