Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

let CMakeMake easyblock also set Python_EXECUTABLE option, as well as Python3_EXECUTABLE and Python2_EXECUTABLE derivatives (when appropriate) #3463

Merged
merged 12 commits into from
Oct 11, 2024

Conversation

Crivella
Copy link
Contributor

@Crivella Crivella commented Sep 26, 2024

EDIT

Following the thread discussion this has changed from setting Python3_FIND_VIRTUALENV to setting Python_EXECUTABLE and all its variants in order to force the CMakeMake easyblock to pick up the correct python when loaded as a module.

This would replace:

OLD

This is related to:

Python3_FIND_VIRTUALENV controls how CMake finds the python3 interpreter in case a virtualenv or conda environment is activated, by default prioritizing the virtual environment.
Setting it to STANDARD switch the behavior to looking for python3 in PATH which allows the correct python to be found when invoking eb from within a virtual environment

No version check has been added as the flag is supported since CMake 3.15 which now is tied to an archived toolchain (GCC 8.3.0 https://docs.easybuild.io/policies/toolchains/)

I think this should always be on by default.
Even if a build process tries to work with virtual environments (eg creating a venv, activating it and tha running cmake trying to use the python from the venv) i think that activating the virtualenv inside the build process would prepend to the PATH and still allow finding the correct python

@ocaisa
Copy link
Member

ocaisa commented Sep 26, 2024

@Flamefire You could probably give the most informed review of this. I would be inclined to accept it.

@Crivella Crivella changed the title Added to CMakeMake Added Python3_FIND_VIRTUALENV=STANDARD to CMakeMake Sep 26, 2024
@Flamefire
Copy link
Contributor

I don't see any problem with this in general besides increasing the number of (potentially superflous) parameters we pass to configure.

However:
The GROMACS change was very focused on that specific software and restricted to "faulty" versions. Here we'd not only need to pass Python3_FIND_VIRTUALENV but also Python_FIND_VIRTUALENV and Python_FIND_VIRTUALENV to cover all cases.

I'd prefer to pass the Python binary path explicitly as done in this PR This not only avoids it finding a virtualenv Python ("fixed" here) but also covers many other cases we might not found yet.
So that is more general without unnecessarily increasing verbosity for cases where we don't need it.

@Crivella What do you think about that? Can you try #3282 with your #3373 not setting the Python3_FIND_VIRTUALENV to see if that works?

@Crivella
Copy link
Contributor Author

@Flamefire Started the test it will take a couple hours to finish, but since stage 1 passed i don't think there would be a problem.
My concern was the order with which the hints where evaluated since #3283 was necessary, but Python_ROOT_DIR seems to take precedence above all

Just ran a minimal test with

    cmake_minimum_required(VERSION 3.15.0)

    project(MyProject)

    find_package(Python3)

and

#!/usr/bin/env bash

module purge
module load CMake/3.29.3-GCCcore-13.3.0
module load Python/3.12.3-GCCcore-13.3.0
module list

source $HOME/.virtualenvs/test/bin/activate
which python

rm -rf build
cmake -B build -S . \
    -DPython3_ROOT_DIR=/usr/bin \
    -DPython3_FIND_VIRTUALENV=ONLY

which results in

...
-- Found Python3: /usr/bin/python3.12 (found version "3.12.5") found components: Interpreter
...

Was there some weirder reason that did not make GROMACS work with #3282 ?

@Flamefire
Copy link
Contributor

I'll try to compile my findings on the order of paths considered.

The documentation states:

    `Python3_FIND_VIRTUALENV`  variable can be set to one of the following:

        FIRST: The virtual environment is used before any other standard paths to look-up for the interpreter. This is the default.

        ONLY: Only the virtual environment is used to look-up for the interpreter.

        STANDARD: The virtual environment is not used to look-up for the interpreter but environment variable PATH is always considered. In this case, variable Python3_FIND_REGISTRY (Windows) or CMAKE_FIND_FRAMEWORK (macOS) can be set with value LAST or NEVER to select preferably the interpreter from the virtual environment.

The relevant file is https://gitlab.kitware.com/cmake/cmake/-/blob/master/Modules/FindPython/Support.cmake which gets "called" by "FindPython*" with ${_PYTHON_PREFIX} set to e.g. "Python3"

We can ignore the "STANDARD" for now because when *_FIND_VIRTUALENV is not set to FIRST or ONLY then $CMAKE_PREFIX_PATH is used which we set in the module.

Which leads to:

And find_program considers:

  1. <PackageName>_ROOT CMake variable, where <PackageName> is the case-preserved package name.
  2. <PACKAGENAME>_ROOT CMake variable, where <PACKAGENAME> is the upper-cased package name.
  3. HINTS: ENV variable ${_PYTHON_PREFIX}_ROOT_DIR
  4. PATHS: ENV variables VIRTUAL_ENV & CONDA_PREFIX

I excluded all of the ones mentioned in the docs that are excluded by the explicit NO_* options. So we should be safe. However there is "[The first 2 steps] can be skipped by setting the CMAKE_FIND_USE_PACKAGE_ROOT_PATH variable to FALSE.

So maybe we need to also set the *_ROOT_DIR env variables to be safe to not get a virtualenv.

@Crivella
Copy link
Contributor Author

By looking at the CMake source you linked (never realized it was so messy XD) and experimenting a bit more, I am wondering if we should just cut to the source and set Python_EXECUTABLE and/or Python[2/3]_EXECUTABLE to avoid going to the HINTS part of the code.

It seems to also find the correct components when specifying COMPONENTS Development

test.sh

#!/usr/bin/env bash

module purge
module load CMake/3.29.3-GCCcore-12.3.0
module load Python/3.11.3-GCCcore-12.3.0
module load SciPy-bundle/2023.07-gfbf-2023a
module list

source $HOME/.virtualenvs/test/bin/activate
which python

_PYTHON="/home/crivella/.local/easybuild/software/Python/3.11.3-GCCcore-12.3.0/bin/python"

rm -rf build
cmake -B build -S . \
    -DPython_FIND_VIRTUALENV=ONLY \
    -DPython_ROOT=/usr/bin \
    -DPython_EXECUTABLE=${_PYTHON} \

CMakeLists.txt

cmake_minimum_required(VERSION 3.15.0)

project(MyProject)

find_package(
    Python 3.11.0...3.11.10
    COMPONENTS Interpreter Development NumPy
    )

CMakeCache.txt

...
//Compiler reason failure
_Python_Compiler_REASON_FAILURE:INTERNAL=
_Python_DEVELOPMENT_EMBED_SIGNATURE:INTERNAL=6e4a7d11dfc3083ba9e17dd9c0c9ffd5
_Python_DEVELOPMENT_MODULE_SIGNATURE:INTERNAL=7f9cffd5f241febb4fdd6ed780db4712
//Development reason failure
_Python_Development_REASON_FAILURE:INTERNAL=
_Python_EXECUTABLE:INTERNAL=/home/crivella/.local/easybuild/software/Python/3.11.3-GCCcore-12.3.0/bin/python
//Path to a file.
_Python_INCLUDE_DIR:INTERNAL=/home/crivella/.local/easybuild/software/Python/3.11.3-GCCcore-12.3.0/include/python3.11
//Python Properties
_Python_INTERPRETER_PROPERTIES:INTERNAL=Python;3;11;3;64;;cpython-311-x86_64-linux-gnu;abi3;/home/crivella/.local/easybuild/software/Python/3.11.3-GCCcore-12.3.0/lib/python3.11;/home/crivella/.local/easybuild/software/Python/3.11.3-GCCcore-12.3.0/lib/python3.11;/home/crivella/.local/easybuild/software/Python/3.11.3-GCCcore-12.3.0/lib/python3.11/site-packages;/home/crivella/.local/easybuild/software/Python/3.11.3-GCCcore-12.3.0/lib/python3.11/site-packages
_Python_INTERPRETER_SIGNATURE:INTERNAL=a6927044a8e489828e7111e599980346
//Path to a library.
_Python_LIBRARY_RELEASE:INTERNAL=/home/crivella/.local/easybuild/software/Python/3.11.3-GCCcore-12.3.0/lib/libpython3.11.so
_Python_NUMPY_SIGNATURE:INTERNAL=64759e84174ed96a40e217a0380274d0
//Path to a file.
_Python_NumPy_INCLUDE_DIR:INTERNAL=/home/crivella/.local/easybuild/software/SciPy-bundle/2023.07-gfbf-2023a/lib/python3.11/site-packages/numpy/core/include
...

The fact that _Python_Development_REASON_FAILURE is still there is due to it never getting unset in the code, but _Python_LIBRARY_RELEASE: pointing to the correct .so makes me thing stuff is working correctly

@Flamefire
Copy link
Contributor

I am wondering if we should just cut to the source and set Python_EXECUTABLE and/or Python[2/3]_EXECUTABLE

Yes I guess that would also work although I don't know if there is some (old?) CMake user code out there that relies on Python_ROOT. We'd need to check at least the history of the module in the CMake repo.

I also guess there is no trouble in using $EBROOT_PYTHON/bin/python (resolved by the easyblock) as the param value

@Crivella
Copy link
Contributor Author

I've been browsing through the CMake source code previous versions.
I've also tested up to CMake v3.12 where FindPython was introduced and Python_EXECUTABLE works for all of them.

I think the logic was properly implemented in 3.16, using _${PYTHON_PREFIX}_EXECUTABLE as the store variable for find_program but even before that, since find_program is being called directly on ${PYTHON_PREFIX}_EXECUTABLE due to the behavior of find_program

If the program is found the result is stored in the variable and the search will not be repeated unless the variable is cleared. If nothing is found, the result will be -NOTFOUND.

This is still working since we are manually setting the result to what we want and ${PYTHON_PREFIX}_EXECUTABLE is not being unset by the module.

I also tested that there is no problem in case the python version does not matches the requirements since the version is later validated even if we are forcing the find_program result.

I've made a new commit with the proposed changes (47f21b9).

I would argue that unless we want to analyze the CMakeLists.txt we have to set both Python_EXECUTABLE and PYTHON_EXECUTABLE in case some code is still using the deprecated find_package{PythonInterp ...) (unfortunately they are case sensitive)

@Crivella Crivella changed the title Added Python3_FIND_VIRTUALENV=STANDARD to CMakeMake Added Python[X/2/3]_EXECUTABLE to CMakeMake Sep 30, 2024
@Crivella
Copy link
Contributor Author

I guess there is also an argument to be made to actually set Python_EXECUTABLE to a dump location so that if people are making an EC for a code that requires python they are forced to use one through EB modules instead of randomly picking up a system one for reproducibility?

Copy link
Contributor

@Flamefire Flamefire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this works. https://gitlab.kitware.com/cmake/cmake/-/blob/d0ad8fd49c2d211081d32cdc2a7f96990947c340/Modules/FindPython/Support.cmake?page=2#L1877-1878 checks for an absolute path in that variable and always keeps it so the block starting at https://gitlab.kitware.com/cmake/cmake/-/blob/d0ad8fd49c2d211081d32cdc2a7f96990947c340/Modules/FindPython/Support.cmake?page=2#L1909 gets skipped.

However https://github.com/easybuilders/easybuild-easyblocks/pull/3282/files should work for CMake >= 3.12 while this only works for >= 3.16 which includes https://gitlab.kitware.com/cmake/cmake/-/commit/cea2010b5c7 that honors this variable

I've also tested up to CMake v3.12 where FindPython was introduced and Python_EXECUTABLE works for all of them.

What exactly have you tested? Did you test 3.12 and below or 3.12 and up? Because how I understand the change in 3.16 the *_EXECUTABLE isn't honored before.

However checking the code for 3.15 it eventuall runs

        find_program (${_PYTHON_PREFIX}_EXECUTABLE
                      NAMES ${_${_PYTHON_PREFIX}_NAMES}
                            ${_${_PYTHON_PREFIX}_IRON_PYTHON_NAMES}
                      NAMES_PER_DIR
                      PATHS ${_${_PYTHON_PREFIX}_REGISTRY_PATHS}
                      PATH_SUFFIXES bin ${_${_PYTHON_PREFIX}_IRON_PYTHON_PATH_SUFFIXES}
                      NO_DEFAULT_PATH)
        _python_validate_interpreter (${${_PYTHON_PREFIX}_FIND_VERSION})

So unless the python cmd set is invalid it should be kept because find_program does nothing if the variable is already set.

Hence I think this change is OK

easybuild/easyblocks/generic/cmakemake.py Outdated Show resolved Hide resolved
@Crivella
Copy link
Contributor Author

Crivella commented Oct 1, 2024

What exactly have you tested? Did you test 3.12 and below or 3.12 and up? Because how I understand the change in 3.16 the *_EXECUTABLE isn't honored before.

Checked 3.12 and above since before there is only FindPythonInterp which has not been changed

So unless the python cmd set is invalid it should be kept because find_program does nothing if the variable is already set.

Yes this is what i meant in my previous comment, even if it is not explicitly the behavior of FindPython due to the behavior of find_program as long as we set ${PYTHON_PREFIX}_EXECUTABLE to a correct path there should be no problem (also concerning version checks since they happens after and will give the correct error if we are passing a non supported version of python)

@Flamefire
Copy link
Contributor

Test report by @Flamefire

Overview of tested easyconfigs (in order)

Build succeeded for 12 out of 15 (7 easyconfigs in total)
i7021 - Linux Rocky Linux 8.9 (Green Obsidian), x86_64, AMD EPYC 7702 64-Core Processor (zen2), Python 3.8.17
See https://gist.github.com/Flamefire/4db38a9b8c8fdb93627119d0443b789c for a full test report.

@Flamefire
Copy link
Contributor

GROMACS 2023.x fails during the installation of the Python extension which uses CMake through pip and then runs into the same problem. We do that extension installation (in the current way) only since 2023.x and 2024 uses scikit-build which initializes the CMakeCache with some Python variables.

I'm not sure how to solve this. The extension in this EasyConfig is a "regular" PythonPackage but it supports an env variable CMAKE_ARGS to pass arguments to CMake.

We could likely also set Python3_ROOT_DIR as env variables which should also avoid it finding the virtualenv.

However: Any ideas how to pass either the arguments or the env variables to the extension? I guess the gromacs easyblock could set the env variables before installing the extensions but then it would be good to have some way to get the values from the cmake easyblock to avoid duplication

@Crivella
Copy link
Contributor Author

Crivella commented Oct 3, 2024

Have not worked much with extensions yet, but by looking at the code I think the easiest solution would be to modify the CMAKE_ARGS to define either Python3_ROOT_DIR or Python3_EXECUTABLE.
That could be done either overriding install_extensions_sequential setting the variable before calling the original method eg

    def install_extensions_sequential(self, *args, **kwargs):
        """Install the gmxapi extension"""
        python_root = get_software_root('Python')
        python_bin = os.path.join(python_root, 'bin', 'python')
        env.setvar(
            'CMAKE_ARGS', 
            "-Dgmxapi_ROOT=%s -DPython3_EXECUTABLE=%s -C %s/share/cmake/gromacs_mpi/gromacs-hints_mpi.cmake" % 
            (self.installdir, python_bin, self.installdir)
            )
        super(EB_GROMACS, self).install_extensions_sequential(*args, **kwargs)

or possibly defining an hook for the extension (https://github.com/easybuilders/easybuild-framework/blob/8ff6ba0d426b79a9b3da1248a7359d922b50596d/easybuild/framework/easyblock.py#L1916)

I am not sure if there is a way to modify the extension steps in some other way

However: Any ideas how to pass either the arguments or the env variables to the extension? I guess the gromacs easyblock could set the env variables before installing the extensions but then it would be good to have some way to get the values from the cmake easyblock to avoid duplication

I guess this would be a design choice. I can think of two way:

  • Split the creation of the options dictionary from the configure step to a separate function. This would allow a child easyblock to just call that function and then pick and use the options it needs (or atleast save the options after configure is called to allow for later reuse)
  • Add a convenience function eg get_cmakeopts_python to get all cmake options related to python

Also I would say the 2 are not mutually exclusive and both could benefit also from implementing a scoped version of the options dictionary where options are split into groups that make sense to recall together eg ('policies', 'generic/base', 'software_[python/...]', ...)

@Flamefire
Copy link
Contributor

Flamefire commented Oct 3, 2024

Have not worked much with extensions yet, but by looking at the code I think the easiest solution would be to modify the CMAKE_ARGS to define either Python3_ROOT_DIR or Python3_EXECUTABLE. That could be done either overriding install_extensions_sequential setting the variable before calling the original method eg

That specific variable is only for GROMACS: https://github.com/easybuilders/easybuild-easyconfigs/blob/ef56e11c4066828fa4af98b4709ba40728fc4cc6/easybuild/easyconfigs/g/GROMACS/GROMACS-2024.1-foss-2023b.eb#L78

Hence the idea for that specific easyblock to get required env vars from cmakemake (e.g. Python3_ROOT_DIR) to set them before installing an extension. For that it might be easier, if we use those env vars in cmakemake too, instead of *_EXECUTABLE.

* Add a convenience function eg `get_cmakeopts_python` to get all cmake options related to python

Yes that is what I had in mind: get_cmake_env_vars_python or so. Or even just get_cmake_env_vars and it would detect which ones to set dependent on loaded modules, e.g. add the python ones if Python is somewhere in the dependencies

I'm bringing this up here to have a method that can be used in all similar places. And GROMACS was the only example I know of where the virtual env caused problems before. I was thinking if you could access the configopts from the parent EC in the extension but don't think so. So we need something else, hence the brainstorming

@Crivella
Copy link
Contributor Author

Crivella commented Oct 3, 2024

I tested the snippet in the previous comment (added to the gromacs easyblock) and it works also by passing Python3_EXECUTABLE to CMAKE_ARGS. I think it is just about making the find_package here happy.

And GROMACS was the only example I know of where the virtual env caused problems before

I've also this problem in the LLVM easyblock, since I have to do a multistage build while changing CMake variables across the stages. I ended up just reimplementing part of the CMakeMake easyblock and working with a new dictionary, but also there it might help to be able to inherit all/selectively the options from the CMakeMake.configure

@Flamefire
Copy link
Contributor

I tested the snippet in the previous comment (added to the gromacs easyblock) and it works also by passing Python3_EXECUTABLE to CMAKE_ARGS. I think it is just about making the find_package here happy.

I'm not sure how that works from within the easyblock as the variable is set in the easyconfig. So as far as I see even if you set it in the easyblock the easyconfig will overwrite this. Or what exactly did you change and which EC did you test? GROMACS-2023.1-foss-2022a.eb from within a minimal virtualenv is one that fails for me. One of my tests uses export Python3_ROOT_DIR=$EBROOTPYTHON && export CMAKE_ARGS=.... This is running over the (here extended) weekend

I've also this problem in the LLVM easyblock, since I have to do a multistage build while changing CMake variables across the stages. I ended up just reimplementing part of the CMakeMake easyblock and working with a new dictionary, but also there it might help to be able to inherit all/selectively the options from the CMakeMake.configure

That might be a valuable datapoint: If it works easily for GROMACS base, the extension, and LLVM we have quite a few cases covered. But I guess for LLVM setting either the env variable or the *_EXECUTABLE argument would work as you call the cmakemake configure function

I'm just wondering if using the env variables is more flexible as we won't need to figure out how to pass flags to all the cmake invocation, such as in the pip install case with some custom env variable that might/will be different for each software if there even is any.

@Crivella
Copy link
Contributor Author

Crivella commented Oct 3, 2024

I'm not sure how that works from within the easyblock as the variable is set in the easyconfig.

Sorry forgot to mention i also removed the preinstallopts from the easyconfig

I'm just wondering if using the env variables is more flexible as we won't need to figure out how to pass flags to all the cmake invocation, such as in the pip install case with some custom env variable that might/will be different for each software if there even is any.

I am not sure, I think the env is not preserved (or reinitialized) across steps so it is possible the environment variable will have to be set in the needed step on a case by case basis?
Not entirely sure, but before i tried adding that snippet before the call to easyblock.extensions_step, and it was carried to the installation of the extension

@Flamefire
Copy link
Contributor

I am not sure, I think the env is not preserved (or reinitialized) across steps so it is possible the environment variable will have to be set in the needed step on a case by case basis? Not entirely sure, but before i tried adding that snippet before the call to easyblock.extensions_step, and it was carried to the installation of the extension

IIRC the environment is (partially?) reset when loading the fake module for the extension installation. But I think we can use either prepare_for_extensions or install_extensions to set the env variables.
And yes I guess that would be easyblock specific. But I imagine something like this for easyblocks like GROMACS:

def prepare_for_extensions(self):
    super().prepare_for_extensions()
    set_cmake_env_vars_python()

That would be slightly more generic than only getting the -D options. Although I do like to be able to directly set the python executables. Especially as that is more "focused/controlled" than an env variable. But not sure if it is always possible

@Crivella
Copy link
Contributor Author

Crivella commented Oct 4, 2024

I've added both the storing of the generated options inside the class and the suggested convenience function.
Indeed when calling CMake from within a build process there is no defined way to pass it arguments and it will dependent on the implementation.

Concerning the usage of the environment variables one thing to be careful of: i could also see the edge case where the buldsystem itself reset/starts from a new environment before making the call to CMake, but at that point I do not think there is much that can be done beside abiding to the buildsystem way of passing args to CMake or patching.

Copy link
Contributor

@Flamefire Flamefire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added both the storing of the generated options inside the class and the suggested convenience function. Indeed when calling CMake from within a build process there is no defined way to pass it arguments and it will dependent on the implementation.

Yes storing the options is a good idea.

Concerning the usage of the environment variables one thing to be careful of: i could also see the edge case where the buldsystem itself reset/starts from a new environment before making the call to CMake

Yes, Bazel is one buildsystem that does reset the environment. However that doesn't call CMake AFAIK. I'm not aware of anything that calls CMake but resets the environment.

I do not think there is much that can be done beside abiding to the buildsystem way of passing args to CMake or patching.

Fully agreed.

I'm not sure if we should use 2 different approaches here (args and env variables). We might end up using almost always one approach and never notice when the rarely used one fails or becomes outdated. Using just the environment variables, not the *_EXECUTABLES args might be enough:

Before CMake 3.12 there was only FindPythonInterp which uses a plain find_program Hence that will always pick up the one we have first in PATH which is ours.

After 3.12 it should pick up the one from the env variable according to my earlier analysis. However only when Python*_FIND_STRATEGY is set to LOCATION or the policy is set to new. If that is not the case (e.g. in the GROMACS python extension case) I think it searches for the python3.13 binary first in this loop and if it finds it in the virtualenv after not finding it in our module (e.g. Python 3.10 is the dependency) it will still use the virtualenv python.

So after writing all this I see that we can't reliably control the chosen Python binary via the environment. Hence for direct cmake invocations we should use *_EXECUTABLE and remove the policy workaround. The env variables might still be useful in some cases. E.g. IIRC GROMACS changes the search strategy to LOCATION so it will pick up the env variables first.

@@ -320,6 +336,22 @@ def configure_step(self, srcdir=None, builddir=None):

return out

def set_cmake_env_vars_python(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by this one, what's the plan with this?

As is, it can only be used in easyblocks that derive from CMakeMake?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I would make 2 free functions out of that:

  1. Return a configure string with -DPython*_EXECUTABLE=..., maybe optionally only as a dict
  2. Return these env variables

The first function is the safe way for every situation we can pass CMake variables. We can use that to enhance the GROMACS easyblock to move the CMake config options out from the EasyConfig to the easyblock and add those options.

The 2nd function can be used whenever we cannot use the first. It might still pick up a python from a virtualenv when Python*_FIND_STRATEGY isn't set to LOCATION ("older" codes default this to LOCATION if unset) but at least we did a best effort to avoid that which might work for most cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the 2 separate functions to return the options as a dict or as a string:

  • get_cmake_python_config_dict
  • get_cmake_python_config_str

and changed the name of the function to set the env vars to set_cmake_python_env_hints (9a31f79)

There might be an argument of making that function a context manager to be used with with

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are taking the convenience function route, i am wondering if we should (probably another PR) also change the detection of all other options into separate/grouped convenience functions to make them reusable by other easyblocks than need to implement their own more complex configure step (eg the new LLVM one)

@boegel boegel changed the title Added Python[X/2/3]_EXECUTABLE to CMakeMake let CMakeMake easyblock also set Python_EXECUTABLE option, as well as Python3_EXECUTABLE and Python2_EXECUTABLE derivatives (when appropriate) Oct 9, 2024
@boegel boegel added this to the 4.x milestone Oct 9, 2024
Copy link
Contributor

@Flamefire Flamefire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd do 2 more changes:

  • Make the 2 functions freestanding. They don't use anything from the class instance so other easyblocks not deriving from CMakeMake directly should be able to use them too
  • Remove the policy code at
    # make sure that newer CMAKE picks python based on location, not just the newest python
    # Avoids issues like e.g. https://github.com/EESSI/software-layer/pull/370#issuecomment-1785594932
    if LooseVersion(self.cmake_version) >= '3.15':
    options['CMAKE_POLICY_DEFAULT_CMP0094'] = 'NEW'
    which is now better handled by the new arguments: Where we'd pass the policy argument we'll also pass the EXECUTABLEs making the policy irrelevant, doesn't it?

Comment on lines 345 to 347
def get_cmake_python_config_dict(self):
"""Get a dictionary with the CMake configuration options for Python hints."""
return self._get_cmake_python_config()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a need for this function if it does nothing but forwarding, so they are the same function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented with fa30b9c

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But there are still 2 functions where get_cmake_python_config_dict == _get_cmake_python_config. I'd just use one.

Copy link
Contributor Author

@Crivella Crivella Oct 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was to have them more modularized in case in the future there is need for more separation between the dict and str functions, but i guess for now they can just be one
c88446f

Copy link
Contributor

@Flamefire Flamefire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok almost done, thanks for bearing with me so far!

If you want to keep the cmake_version in the function I'd suggest to include it also in the others: For CMake < 3.12 we only need PYTHON_EXECUTABLE

And some suggestions to improve the wording of the docstrings as I'd like to convey the following:

  • The options/argument-str should be preferred
  • That is not a hint but (pretty much) forces CMake to use that.
  • All functions only return some non-empty value when a Python module is loaded
  • Avoid suggesting that setting the env variables fully avoids CMake picking up the wrong Python. Due to the policies it can still happen although it is less likely

Besides that I wouldn't mention "FindPython" which likely isn't clear to anyone not familiar with CMake internals. If at all we could mention find_package(Python*) as that is what is visible from reading the CML. Also "Convenience function" sounds a bit to verbose for describing what the function does.

Check if you like the suggestions or if there's a better wording.

easybuild/easyblocks/generic/cmakemake.py Outdated Show resolved Hide resolved
easybuild/easyblocks/generic/cmakemake.py Outdated Show resolved Hide resolved
easybuild/easyblocks/generic/cmakemake.py Outdated Show resolved Hide resolved
easybuild/easyblocks/generic/cmakemake.py Outdated Show resolved Hide resolved
easybuild/easyblocks/generic/cmakemake.py Outdated Show resolved Hide resolved
easybuild/easyblocks/generic/cmakemake.py Outdated Show resolved Hide resolved
Copy link
Contributor

@Flamefire Flamefire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@ocaisa ocaisa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until @Flamefire goes the last mile and becomes a maintainer, I'll need to merge this!

@ocaisa
Copy link
Member

ocaisa commented Oct 11, 2024

@boegelbot please test @ generoso
EB_ARGS="GROMACS-2023.3-foss-2023a.eb GROMACS-2021.3-foss-2021a.eb HOOMD-blue-4.0.1-foss-2022a.eb vcflib-1.0.9-gfbf-2023a-R-4.3.2.eb --installpath /tmp/$USER/pr3463"
CORE_CNT=16

@boegelbot
Copy link

@ocaisa: Request for testing this PR well received on login1

PR test command 'EB_PR=3463 EB_ARGS="GROMACS-2023.3-foss-2023a.eb GROMACS-2021.3-foss-2021a.eb HOOMD-blue-4.0.1-foss-2022a.eb vcflib-1.0.9-gfbf-2023a-R-4.3.2.eb --installpath /tmp/$USER/pr3463" EB_CONTAINER= EB_REPO=easybuild-easyblocks /opt/software/slurm/bin/sbatch --job-name test_PR_3463 --ntasks="16" ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 14447

Test results coming soon (I hope)...

- notification for comment with ID 2407211069 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Copy link

Test report by @boegelbot

Overview of tested easyconfigs (in order)

  • SUCCESS GROMACS-2023.3-foss-2023a.eb
  • SUCCESS GROMACS-2021.3-foss-2021a.eb
  • SUCCESS HOOMD-blue-4.0.1-foss-2022a.eb
  • SUCCESS tabixpp-1.1.2-GCC-12.3.0.eb
  • SUCCESS intervaltree-0.1-GCCcore-12.3.0.eb
  • SUCCESS fastahack-1.0.0-GCCcore-12.3.0.eb
  • SUCCESS filevercmp-20191210-GCCcore-12.3.0.eb
  • SUCCESS fsom-20151117-GCCcore-12.3.0.eb
  • SUCCESS multichoose-1.0.3-GCCcore-12.3.0.eb
  • SUCCESS smithwaterman-20160702-GCCcore-12.3.0.eb
  • SUCCESS WFA2-2.3.4-GCCcore-12.3.0.eb
  • SUCCESS vcflib-1.0.9-gfbf-2023a-R-4.3.2.eb

Build succeeded for 12 out of 12 (4 easyconfigs in total)
cnx2 - Linux Rocky Linux 8.9, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/boegelbot/56cfbaeef38b5976a67ec8f51229611e for a full test report.

@ocaisa ocaisa merged commit 3cd59c8 into easybuilders:develop Oct 11, 2024
41 checks passed
@Crivella Crivella deleted the feature-cmake_pythonvenv branch October 11, 2024 14:46
@boegel
Copy link
Member

boegel commented Oct 12, 2024

Test report by @boegel

Overview of tested easyconfigs (in order)

  • SUCCESS ALAMODE-1.4.2-foss-2022b.eb
  • SUCCESS ANTs-2.5.0-foss-2022b.eb
  • SUCCESS AOMP-13.0-2-GCCcore-10.2.0.eb
  • SUCCESS Arcade-Learning-Environment-0.8.1-foss-2023a.eb
  • SUCCESS Arrow-16.1.0-gfbf-2023b.eb
  • SUCCESS awscli-2.11.21-GCCcore-11.3.0.eb
  • SUCCESS btllib-1.7.0-GCC-12.3.0.eb
  • SUCCESS CPPE-0.3.1-GCC-12.2.0.eb
  • SUCCESS CVXPY-1.4.2-foss-2023a.eb
  • SUCCESS CheckM2-1.0.2-foss-2022b.eb
  • SUCCESS cppyy-3.1.2-GCCcore-13.2.0.eb
  • SUCCESS DIRAC-23.0-intel-2023a.eb
  • SUCCESS Doxygen-1.9.1-GCCcore-10.3.0.eb
  • SUCCESS EGTtools-0.1.11-foss-2022a.eb
  • SUCCESS FLINT-3.1.1-gfbf-2023b.eb
  • SUCCESS FlexiBLAS-3.3.1-GCC-13.2.0.eb
  • SUCCESS flatbuffers-23.1.4-GCCcore-12.2.0.eb
  • SUCCESS fumi_tools-0.18.2-GCC-11.2.0.eb
  • SUCCESS HepMC3-3.2.6-GCC-12.3.0.eb
  • SUCCESS HiGHS-1.7.0-gfbf-2023b.eb
  • SUCCESS kineto-0.4.0-GCC-12.3.0.eb
  • SUCCESS LLVM-18.1.8-GCCcore-13.3.0.eb
  • SUCCESS libcint-6.1.2-gfbf-2024a.eb
  • SUCCESS MDI-1.4.29-gompi-2023b.eb
  • SUCCESS MEGAHIT-1.2.9-GCCcore-9.3.0.eb
  • SUCCESS MetaBAT-2.15-gompi-2021b.eb
  • SUCCESS ncbi-vdb-3.1.1-gompi-2023b.eb
  • SUCCESS ONNX-1.15.0-gfbf-2023a.eb
  • SUCCESS OpenAI-Gym-0.26.2-foss-2022a.eb
  • SUCCESS OpenCV-4.8.1-foss-2023a-contrib.eb
  • SUCCESS OpenMEEG-2.5.7-foss-2023a.eb
  • SUCCESS OpenMM-8.0.0-foss-2022a.eb
  • SUCCESS PICI-LIGGGHTS-3.8.1-foss-2022a.eb
  • SUCCESS PySCF-2.4.0-foss-2022b.eb
  • FAIL (build issue) PyTorch-bundle-2.1.2-foss-2023a.eb (partial log available at https://gist.github.com/boegel/e25d9c0bc294b8b23b3f5524822cc485)
  • SUCCESS pFUnit-4.7.3-gompi-2022a.eb
  • SUCCESS poppler-24.04.0-GCC-13.2.0.eb
  • SUCCESS pybind11-2.9.2-GCCcore-11.3.0.eb
  • SUCCESS pypmt-1.2.0-gfbf-2023a.eb
  • SUCCESS python-mujoco-3.1.4-foss-2023a.eb
  • FAIL (build issue) RDKit-2022.03.5-foss-2021b.eb (partial log available at https://gist.github.com/boegel/e67aeb26c548d6bae579b3f9a1b33475)
  • SUCCESS Rust-1.70.0-GCCcore-12.3.0.eb
  • SUCCESS SPAdes-3.15.2-GCC-10.2.0-Python-2.7.18.eb
  • SUCCESS SRA-Toolkit-3.0.5-gompi-2022b.eb
  • SUCCESS SentencePiece-0.2.0-GCC-13.2.0.eb
  • SUCCESS SeqAn-2.4.0-GCCcore-9.3.0.eb
  • SUCCESS scikit-build-core-0.5.0-GCCcore-12.3.0.eb
  • SUCCESS tmap-20220502-GCC-11.2.0.eb
  • SUCCESS UShER-0.5.0-gompi-2021b.eb
  • SUCCESS vcflib-1.0.9-gfbf-2023a-R-4.3.2.eb
  • SUCCESS xtensor-0.24.7-foss-2023a.eb
  • SUCCESS Z3-4.13.0-GCCcore-13.3.0.eb
  • SUCCESS zlib-ng-2.2.1-GCCcore-13.3.0.eb
  • SUCCESS manta-1.6.0-GCC-10.2.0-Python-2.7.18.eb
  • SUCCESS pybind11-2.7.1-GCCcore-11.2.0-Python-2.7.18.eb
  • SUCCESS MetaBAT-2.14-gompi-2019a.eb
  • SUCCESS Gymnasium-0.29.1-foss-2023a.eb
  • SUCCESS ISCE2-2.6.3-foss-2023a.eb

Build succeeded for 56 out of 58 (58 easyconfigs in total)
node4002.donphan.os - Linux RHEL 8.8, x86_64, Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz, 1 x NVIDIA NVIDIA A2, 545.23.08, Python 3.6.8
See https://gist.github.com/boegel/01bd2c5d78f5e0776a55c63ceeb283f0 for a full test report.

@boegel
Copy link
Member

boegel commented Oct 12, 2024

  • PyTorch-bundle-2.1.2-foss-2023a.eb failed due with error -9, so probably a segfault (could be due to limited available memory in the test environment, only ~27GB
  • RDKit-2022.03.5-foss-2021b.eb failed with error below, has nothing to do with the changes being made here
    Downloading https://fonts.google.com/download?family=Comic%20Neue...
    CMake Error at Code/cmake/Modules/RDKitUtils.cmake:257 (MESSAGE):
      The md5 checksum for
      /tmp/easybuild_build/RDKit/2022.03.5/foss-2021b/rdkit-Release_2022_03_5/Code/GraphMol/MolDraw2D/Comic_Neue.zip
      is incorrect; expected: 850b0df852f1cda4970887b540f8f333, found:
      4cd32046e7d03f3905d0d6c433bfae43
    

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants