support for using easystack files #3468

Open
1 of 4 tasks
casparvl opened this issue Oct 12, 2020 · 6 comments
Labels: easystack, EESSI, feature request

@casparvl
Contributor

casparvl commented Oct 12, 2020

Proposed feature

We would like to be able to create a file that specifies all the software we want to build, and call EasyBuild with that file as argument. EasyBuild should then loop over the software specified in the spec file to build it. This is useful in the context of the EESSI project, but is probably more generally useful to EasyBuild users as well.

We've had a discussion with several EasyBuild and EESSI people on the structure of these files and the corresponding desired EasyBuild behaviour. Notes are here https://github.com/EESSI/software-layer/wiki/Brainstorm-meeting-(Oct-9th-2020).

Below follows a more exact specification and explanation of the intended behaviour.

Example spec file

Spec file format is proposed to be YAML and would look e.g. like this:

easybuild_version: 4.3.0
robot: True
software:
  Bioconductor:
    toolchains:
      foss-2020a:
        versions:
          3.11:
            versionsuffix: '-R-4.0.0'
            exclude-labels: arch:aarch64
  EasyBuild:
    toolchains:
      SYSTEM:
        versions: [4.3.1]
  GROMACS:
    toolchains:
      foss-2020a:
        exclude-labels: system:gpu
        versions:
          2020.1:
          2020.3:
            from_pr: 1234
      fosscuda-2020a:
        include-labels: system:gpu
        versions: [2020.1]
  OpenFOAM:
    toolchains:
      foss-2020a:
        versions: [8, v2006]
  R:
    toolchains:
      foss-2020a:
        versions: [4.0.0]

Usage

Build everything in the spec file:

eb eessi-2020.10.yml

Intended behaviour of keywords in the spec file

easybuild_version: optional keyword. If present, the EasyBuild framework should check whether the spec file was intended for the version of the framework that is running. This can be used e.g. when --from-pr is used in the spec file to pull in new features from develop, which at a later stage may already be part of a release.
software: mandatory keyword that specifies the list of software to build. The framework should throw a warning if this keyword isn't found.
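
A minimal sketch of how these two top-level checks could look, operating on the spec file after it has been parsed into a dict. The function name check_top_level is hypothetical and not part of EasyBuild, and requiring an exact version match is an assumption, since the proposal does not say whether a minimum version would be enough.

```python
def check_top_level(spec, running_version):
    """Return a list of warning strings for the top-level keywords of a parsed spec."""
    warnings = []
    # 'software' is mandatory: without it there is nothing to build.
    if "software" not in spec:
        warnings.append("no 'software' keyword found in spec file")
    # 'easybuild_version' is optional. Compare as strings, since a YAML
    # parser may hand back a number for values such as 4.3.
    wanted = spec.get("easybuild_version")
    if wanted is not None and str(wanted) != str(running_version):
        # assumption: anything other than an exact match is reported
        warnings.append(
            "spec file targets EasyBuild %s, but running %s" % (wanted, running_version)
        )
    return warnings
```

Returning the warnings instead of printing them leaves it to the caller to decide whether a missing software keyword should stop the run.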

Within software, the structure will be

software:
  software_name:
    toolchains:
      toolchain_name (incl. version):
        versions:
          software_version:
            versionsuffix:
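
As a rough illustration of walking this nesting, the sketch below turns a parsed spec into easyconfig filenames, using the toolchains and versions keys from the example spec file. The helper name easyconfig_names is hypothetical. Versions are written as strings here on purpose: a YAML parser reads an unquoted 8 as an integer and 2020.1 as a float, so a real loader would probably need to take care to keep them as text.

```python
def easyconfig_names(software):
    """Yield one easyconfig filename per (name, toolchain, version) entry."""
    for name, sw in software.items():
        for toolchain, tc in sw["toolchains"].items():
            versions = tc["versions"]
            # 'versions' may be a plain list or a mapping of version -> options
            if not isinstance(versions, dict):
                versions = {v: None for v in versions}
            for version, opts in versions.items():
                suffix = (opts or {}).get("versionsuffix", "")
                # easyconfigs for the system toolchain carry no toolchain part
                tc_part = "" if toolchain.upper() == "SYSTEM" else "-" + toolchain
                yield "%s-%s%s%s.eb" % (name, version, tc_part, suffix)


# Parsed form of part of the example spec file (versions kept as strings).
example = {
    "Bioconductor": {"toolchains": {"foss-2020a": {
        "versions": {"3.11": {"versionsuffix": "-R-4.0.0"}}}}},
    "EasyBuild": {"toolchains": {"SYSTEM": {"versions": ["4.3.1"]}}},
    "OpenFOAM": {"toolchains": {"foss-2020a": {"versions": ["8", "v2006"]}}},
}

for filename in easyconfig_names(example):
    print(filename)
```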

Additional command-line arguments to EasyBuild can be passed at any level by specifying a keyword that matches the long argument name, e.g.

robot: True
software:
  GROMACS:
    parallel: 6
    toolchains:
      foss-2020a:
        versions:
          2020.1:
          2020.3:
            from_pr: 1234

This will be equivalent to running

eb GROMACS-2020.1-foss-2020a.eb --robot --parallel=6
eb GROMACS-2020.3-foss-2020a.eb --robot --parallel=6 --from-pr=1234

The deepest nested value will always take priority, i.e.

software:
  GROMACS:
    parallel: 6
    toolchains:
      foss-2020a:
        versions:
          2020.1:
          2020.3:
            parallel: 12

will be equivalent to running

eb GROMACS-2020.1-foss-2020a.eb --parallel=6
eb GROMACS-2020.3-foss-2020a.eb --parallel=12
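
The inheritance rule could be sketched as a merge of the option dicts found at each level, with deeper levels applied last so that they win. This is only an illustration: eb_commands and STRUCTURAL_KEYS are hypothetical names, the set of keys treated as structure rather than options is an assumption, and a real implementation would presumably hand the options to EasyBuild directly instead of building command strings.

```python
# Keys that describe the spec structure or label selection rather than
# command-line options (assumed set, for illustration only).
STRUCTURAL_KEYS = {"toolchains", "versions", "versionsuffix",
                   "include-labels", "exclude-labels"}


def options_at(level):
    """Extract the option keywords set at one nesting level."""
    return {k: v for k, v in (level or {}).items() if k not in STRUCTURAL_KEYS}


def eb_commands(software, global_opts=None):
    """Return one 'eb ...' command line per software version in the spec."""
    cmds = []
    for name, sw in software.items():
        for toolchain, tc in sw["toolchains"].items():
            versions = tc["versions"]
            if not isinstance(versions, dict):
                versions = {v: None for v in versions}
            for version, ver in versions.items():
                # later dicts win, so the deepest level overrides outer ones
                opts = {**(global_opts or {}), **options_at(sw),
                        **options_at(tc), **options_at(ver)}
                args = ["eb", "%s-%s-%s.eb" % (name, version, toolchain)]
                for key, value in opts.items():
                    flag = "--" + key.replace("_", "-")
                    args.append(flag if value is True else "%s=%s" % (flag, value))
                cmds.append(" ".join(args))
    return cmds


gromacs = {"GROMACS": {"parallel": 6, "toolchains": {"foss-2020a": {
    "versions": {"2020.1": None, "2020.3": {"parallel": 12}}}}}}

for cmd in eb_commands(gromacs):
    print(cmd)
```

Because the merge keeps the position of a key's first occurrence, an overridden option such as parallel stays in its original place on the command line and only its value changes.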

Labels

Labels make it possible to build only part of the spec file: items are selected or skipped depending on whether their labels match the labels passed on the command line.

For example, the spec file could contain

    GROMACS:
      toolchains:
        foss-2020a:
          versions:
            2020.1:
              exclude-labels: foo
            2020.2:
            2020.3:
              include-labels: bar

and the command line invocation could be

eb eessi-2020.09.yml --labels='foo'

Then, EasyBuild will not build GROMACS-2020.1-foss-2020a.eb, because the command line contained the foo label, and the spec file specified that as a label to exclude (exclude-labels: foo). Furthermore, EasyBuild will also not build GROMACS-2020.3-foss-2020a.eb, because the spec file specified that it should only be built if the bar label was passed (include-labels: bar) and bar wasn't a label on the command line. GROMACS-2020.2-foss-2020a.eb carries no labels and is built as usual.

Labels are not interpreted in any way by EasyBuild: it just matches the labels passed on the command line to the include-labels and exclude-labels keywords. The meaning of those labels is entirely up to the developer of the spec file and we suggest the labels are explained through comments at the top of the YAML file.
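
The matching rule can be sketched as a small predicate over one entry's options and the labels from the command line. The name is_selected is hypothetical. The proposal does not say how several include labels combine, so treating a single match as sufficient is an assumption, as is letting an exclude match take precedence over an include match.

```python
def is_selected(entry_opts, cli_labels):
    """Decide whether an entry is built, given the labels passed on the command line."""
    def as_set(value):
        # accept a missing value, a single label, or a list of labels
        if value is None:
            return set()
        return {value} if isinstance(value, str) else set(value)

    opts = entry_opts or {}
    include = as_set(opts.get("include-labels"))
    exclude = as_set(opts.get("exclude-labels"))
    cli = set(cli_labels)
    # any excluded label on the command line rules the entry out
    if exclude & cli:
        return False
    # assumption: one matching include label is enough
    return not include or bool(include & cli)


versions = {
    "2020.1": {"exclude-labels": "foo"},
    "2020.2": None,
    "2020.3": {"include-labels": "bar"},
}

# Equivalent of passing --labels='foo' for the example above.
print([v for v, opts in versions.items() if is_selected(opts, ["foo"])])
```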

Practical usage would be

# The 'gpu' label can be used to indicate if you want GPU software to be built. If software has both CPU and GPU versions, passing the 'gpu' label will disable building the CPU version.
    GROMACS:
      toolchains:
        foss-2020a:
          exclude-labels: gpu
          versions: [2020.1]
        fosscuda-2020a:
          include-labels: gpu
          versions: [2020.1]

Then, a user who wants to build for a GPU based system can do:

eb eessi-2020.09.yml --labels='gpu'

Implementation plan

Probably, we should implement this in steps:

@casparvl casparvl self-assigned this Oct 12, 2020
@akesandgren
Contributor

I would probably find it more useful to group things by toolchain instead of by software name, i.e. (leaving out a lot of info here)

toolchain:
  foss:
    2020a:
      GROMACS:
        2019.4
      OpenFOAM:
        2.0

etc.
That is the way I think of the installation we have.

@boegel
Member

boegel commented Oct 12, 2020

@akesandgren We discussed that aspect, and eventually we agreed that the first order view on a software stack is the software itself (not the toolchains used to install that software).

Compare this (software first, toolchain second):

software:
  GROMACS:
    foss-2020a:
      versions: [2019.4]
    fosscuda-2019b:
      versions: [2019.4]
  OpenFOAM:
    foss-2020a:
      versions: [8]
    intel-2020a:
      versions: [8]

with this (toolchains first, software second):

toolchains:
  foss-2020a:
    GROMACS:
      versions: [2019.4]
    OpenFOAM:
      versions: [8]
  intel-2020a:
    OpenFOAM:
      versions: [8]
  fosscuda-2019b:
    GROMACS:
      versions: [2019.4]

The 2nd format makes it harder to see at a glance which versions/variants of a particular software package are installed, which is probably a more common question than "what is installed for toolchain X"?

That being said, maybe we don't need to choose at all...
It's possible that the implementation can be done such that both use cases are supported, without too much trouble.

We should definitely avoid making this a bike-shedding discussion. :)

@akesandgren
Contributor

akesandgren commented Oct 13, 2020

Yeah, doesn't really matter. And if we could make a tool that converts between them...
If for nothing else than listing things in different orders.

I.e. I'd like to easily list what SW I have in the file for a specific toolchain, and of course vice versa
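
Such a conversion might be little more than swapping the two outer levels of the mapping. A rough sketch, assuming the simplified layout from the comparison above (software name, then toolchain, then versions); regroup is a hypothetical name, and nothing here reflects an actual EasyBuild tool.

```python
def regroup(outer_first):
    """Swap the two outer levels of a nested mapping.

    Works in both directions: software-first -> toolchain-first and back.
    """
    swapped = {}
    for outer, inner_map in outer_first.items():
        for inner, entry in inner_map.items():
            swapped.setdefault(inner, {})[outer] = entry
    return swapped


software_first = {
    "GROMACS": {
        "foss-2020a": {"versions": ["2019.4"]},
        "fosscuda-2019b": {"versions": ["2019.4"]},
    },
    "OpenFOAM": {
        "foss-2020a": {"versions": ["8"]},
        "intel-2020a": {"versions": ["8"]},
    },
}

toolchain_first = regroup(software_first)
# list what is in the file for one specific toolchain
print(sorted(toolchain_first["foss-2020a"]))
```

Applying the same function to the result gives back the software-first view, so a single helper would cover listing in both orders.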

@boegel boegel transferred this issue from easybuilders/easybuild Oct 13, 2020
@boegel boegel added the EESSI and feature request labels Oct 13, 2020
@boegel boegel added this to the 4.x milestone Oct 13, 2020
@akesandgren
Contributor

Btw, I would like to see a tool that can generate that yaml from an existing installation.

@deniskristak
Contributor

Hello, I'll be trying to make this feature work. You can reach me on the EasyBuild Slack as Denis Kristak.
Have a good day

@boegel boegel changed the title from "Building from a spec file" to "support for using easystack files" Dec 1, 2020
@boegel boegel modified the milestones: 4.x, 4.4.0 Dec 1, 2020
@boegel boegel modified the milestones: 4.3.3, release after 4.3.3 Feb 3, 2021
@deniskristak
Contributor

deniskristak commented Mar 25, 2021

#3512
easystack labels issue

@boegel boegel added the easystack label Mar 29, 2021
@boegel boegel modified the milestones: next release (4.3.4), 4.x Apr 5, 2021