Let jobs retweak easyconfigs themselves #4669

Open
wants to merge 6 commits into
base: 5.0.x

bartoldeman
Contributor

This can be accomplished by `tweak()` optionally also returning a dict which maps each tweaked easyconfig to its original version. The job can then run `eb ... <original_easyconfig.eb> --try-*`, and that original easyconfig will be retweaked in the job itself.

If the easyconfig passed to the job is not tweaked, then `--try-*` is *not* passed down (so, with `--robot`, some jobs will have `--try-*` and others will not).

This removes the requirement of a shared tmpdir with `--job --try-*`.

Fixes #1355
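
The mapping idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `tweak_sketch`, `job_command`, the `should_tweak` callback, the `return_map` parameter, and the `zlib-1.2.13.eb` file name are all hypothetical, and the real `tweak()` also rewrites the easyconfig contents and file name.

```python
import os


def tweak_sketch(paths, tweaked_dir, should_tweak, return_map=False):
    """Hypothetical stand-in for tweak(): return the easyconfig paths to build,
    optionally together with a dict mapping each tweaked path to its original."""
    result_paths = []
    tweak_map = {}
    for path in paths:
        if should_tweak(path):
            # Assumption: tweaked copies are written to a local temporary directory.
            new_path = os.path.join(tweaked_dir, os.path.basename(path))
            tweak_map[new_path] = path
            result_paths.append(new_path)
        else:
            result_paths.append(path)
    if return_map:
        return result_paths, tweak_map
    return result_paths


def job_command(ec_path, tweak_map, try_opts):
    """Build the eb command line for a single job (hypothetical helper)."""
    original = tweak_map.get(ec_path)
    if original is None:
        # Easyconfig was not tweaked, so --try-* is not passed down.
        return ["eb", ec_path]
    # Pass the original easyconfig plus --try-*; the job retweaks it itself.
    return ["eb", original] + list(try_opts)
```

With this shape, a job only ever receives paths that exist independently of the submitter's temporary directory, which is why the shared tmpdir is no longer required for `--job --try-*`.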

@bartoldeman bartoldeman marked this pull request as draft October 4, 2024 16:40
@bartoldeman bartoldeman added this to the 5.0 milestone Oct 4, 2024
@bartoldeman
Contributor Author

Moving this to draft because it needs a proper test combining `--job --robot --try-*`.

@boegel
Member

boegel commented Oct 9, 2024

@bartoldeman Seems like test was implemented, so this shouldn't be a draft PR anymore?

@bartoldeman bartoldeman marked this pull request as ready for review October 24, 2024 15:11
@bartoldeman
Contributor Author

Just wanted to test locally using `eb HPL-2.3-foss-2023a.eb --try-toolchain=foss,2024a --job`. This works fine.

Note that a shared temporary directory is still needed with `--job --from-pr`. However, you can use a shared temporary directory in the submitting EasyBuild and NOT on the worker node by setting `$TMPDIR` instead of using `--tmpdir` or (equivalently) `$EASYBUILD_TMPDIR`, as `$TMPDIR` is not passed down to the job. A shared `$TMPDIR` on a worker node can cause quite a performance degradation, since even GCC's temporary `.s` (asm) files will be stored there, causing a lot of expensive networked IOPS.
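
To illustrate why `$TMPDIR` stays local to the submitter while `$EASYBUILD_TMPDIR` reaches the job, here is a minimal sketch. It assumes that only environment variables starting with `EASYBUILD` are forwarded to the job; `job_environment` is a hypothetical helper and the paths are made up for illustration.

```python
def job_environment(submit_env):
    """Hypothetical helper: select the variables forwarded to a job.

    Assumption: only variables whose name starts with EASYBUILD are passed down,
    so a plain TMPDIR set in the submitting shell never reaches the worker node.
    """
    return {name: value for name, value in submit_env.items()
            if name.startswith("EASYBUILD")}


# Example: shared tmpdir for the submitter only (made-up paths).
submit_env = {"TMPDIR": "/shared/scratch/tmp", "EASYBUILD_PREFIX": "/apps/easybuild"}
forwarded = job_environment(submit_env)
```

Under that assumption, `forwarded` contains `EASYBUILD_PREFIX` but not `TMPDIR`, so the worker node falls back to its own local temporary directory.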

@bartoldeman
Contributor Author

> @bartoldeman Seems like test was implemented, so this shouldn't be a draft PR anymore?

@boegel undrafted now

Projects
Status: Breaking changes