Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrated docker image from miniconda to manylinux2014 #3874

Closed
wants to merge 13 commits into from

Conversation

santacodes
Copy link
Contributor

@santacodes santacodes commented Mar 7, 2024

Description

Migration from conda virtual environment to venv and using Centos 7 based manylinux2014 as base image for docker which is compliant with PEP599.

Fixes #3692 and fixes #3666

Type of change

Please add a line in the relevant section of CHANGELOG.md to document the change (include PR #) - note reverse order of PR #s. If necessary, also add to the list of breaking changes.

  • New feature (non-breaking change which adds functionality)
  • Optimization (back-end change that speeds up the code)
  • Bug fix (non-breaking change which fixes an issue)

Key checklist:

  • No style issues: $ pre-commit run (or $ nox -s pre-commit) (see CONTRIBUTING.md for how to set this up to run automatically when committing locally, in just two lines of code)
  • All tests pass: $ python run-tests.py --all (or $ nox -s tests)
  • The documentation builds: $ python run-tests.py --doctest (or $ nox -s doctests)

You can run integration tests, unit tests, and doctests together at once, using $ python run-tests.py --quick (or $ nox -s quick).

Further checks:

  • Code is commented, particularly in hard-to-understand areas
  • Tests added that prove fix is effective or that feature works

Copy link

codecov bot commented Mar 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.60%. Comparing base (8056d22) to head (abc16d0).

❗ Current head abc16d0 differs from pull request most recent head 5d8417b. Consider uploading reports for the commit 5d8417b to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3874      +/-   ##
===========================================
- Coverage    99.60%   99.60%   -0.01%     
===========================================
  Files          259      259              
  Lines        21273    21270       -3     
===========================================
- Hits         21189    21186       -3     
  Misses          84       84              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@Saransh-cpp Saransh-cpp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this up, @santacodes!

scripts/Dockerfile Outdated Show resolved Hide resolved
@agriyakhetarpal
Copy link
Member

Here's a list of what else we would need from this PR, besides these changes:

  1. Update the installation guide in the documentation
  2. Adjust the GitHub Actions workflows to push the pybamm/pybamm:latest image as the one with all solvers, i.e., tag it with latest
  3. Adjust the base image being pulled in order to retain arm64 support
  4. A CHANGELOG entry in the "Breaking changes" section, since this shall be a user-facing change and affects developer installations

Also, could we get an estimate on the compressed size of the image? The ones we currently push are ~1.6 GB, so remaining close to somewhere around that would be great.

@santacodes
Copy link
Contributor Author

Also, could we get an estimate on the compressed size of the image? The ones we currently push are ~1.6 GB, so remaining close to somewhere around that would be great.

Sure I'd try to minimize the image size as much as possible and update you on the estimate

@santacodes santacodes requested a review from Saransh-cpp March 9, 2024 05:52
Copy link
Member

@arjxn-py arjxn-py left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python base image looks much more valid as compared to manylinux, I'd request you to run test suite inside the running docker container at least once and see how many tests are getting skipped.
Plus confirm the solver availability inside the container by executing:

  • pybamm.have_idaklu()
  • pybamm.have_jax()
  • pybamm.have_scikits_odes()

@santacodes
Copy link
Contributor Author

python base image looks much more valid as compared to manylinux, I'd request you to run test suite inside the running docker container at least once and see how many tests are getting skipped. Plus confirm the solver availability inside the container by executing:

  • pybamm.have_idaklu()
  • pybamm.have_jax()
  • pybamm.have_scikits_odes()

All of the three methods return true inside the docker container running python venv. I also did run the test suite before pushing the commit, 1 test in each unit and integration tests were getting skipped, I am not exactly sure of the cause. I am awaiting on @agriyakhetarpal response on testing it on ARM.

@santacodes
Copy link
Contributor Author

Upon further researching on PEP599 the manylinux2014 platform tag which demands for the container to be compliant with GLIBC<=2.17, almost all of the linux distributions fulfilling that constraint have reached EOL, except CentOS 7 which will reach its EOL in June, 2024.

Most of the docker containers which have a constant maintenance use Debian/Ubuntu, RHEL or Alpine based distributions in their containers to maintain their integrity.

In reference to the above distributions, these are the OS releases that fulfill glibc<=2.17 -

  • Debian Wheezy which had glibc 2.13 and reached its end of life in May 2018. The packages are now archived and not constantly maintained which might cause some packages to break.
  • Ubuntu 13 Saucy which had glibc 2.17 and reached its EOL in July 2014.
  • CentOS 7 which still runs glibc 2.17 will reach EOL in June 2024 is the primary base image used by PEP599 manylinux2014 policy, but the downside of it in context to this project is multi-architecture building. Manylinux2014 provides A single build image and auditwheel profile per architecture which would mean that:
    • We would either have to use 2 different dockerfiles to build for x86_64 and aarch64 distinctly as the base images have a discrete architecture or
    • Use a multistage build to pull and push different images for each of them.
      This would also complicate things a bit as CentOS 7 would reach EOL soon.

To not do things the hacky way the official Python Bullseye image would make things a bit simple as the base image itself supports multiple architectures and is based on Debian. The size would also not be an issue as there's a slim version of the image and would have a constant support and maintenance for a while. Upgrading to a newer version or migrating would also become easier as the dependencies would be compatible. By default the Python Bullseye image provides glibc 2.31 which provides backward compatibility and wouldn't have any issues running in a container.

Reference

@agriyakhetarpal
Copy link
Member

agriyakhetarpal commented Mar 9, 2024

Thanks for looking into this @santacodes – yes, a Python bullseye image makes sense too as @arjxn-py mentions. The manylinux2014 images are indeed architecture-specific implementations of PEP 599 and therefore won't have support for multiple architectures from a single FROM command. The specification for having glibc < 2.17 is just for ABI compatibility reasons, and does not need to be enforced for a Docker image distribution – it is just for building the wheels.

@santacodes when pulling this branch I noticed that these changes have been made on the develop branch on your fork, and not a feature branch. This is generally not good practice since it causes issues with syncing the repository if you wish to contribute further, and a host of other reasons.

Could you create a new PR based on these changes, and close this one? You can check out a new branch from the current changes and that should bring your commits there (or you can cherry-pick the commits too – whichever you find easier).

@agriyakhetarpal
Copy link
Member

agriyakhetarpal commented Mar 9, 2024

Actually, testing this on ARM on macOS has revealed a bug in our current PyBaMM Docker images that we have been pushing to Docker Hub since some while:

ImportError                               Traceback (most recent call last)
Cell In[4], line 1
----> 1 idaklu = importlib.util.module_from_spec(idaklu_spec)

File <frozen importlib._bootstrap>:573, in module_from_spec(spec)

File <frozen importlib._bootstrap_external>:1233, in create_module(self, spec)

File <frozen importlib._bootstrap>:241, in _call_with_frames_removed(f, *args, **kwds)

ImportError: /home/pybamm/PyBaMM/pybamm/solvers/idaklu.cpython-311-aarch64-linux-gnu.so: undefined symbol: _ZN6casadi8Function11deserializeERKSs

this unresolved symbol _ZN6casadi8Function11deserializeERKSs comes from CasADi upstream (see casadi/casadi#2887, dockcross/dockcross#753, dockcross/dockcross#780) and is quite challenging to fix. This error is the same as what I face when testing this new bullseye image – I remember facing this on #3462 and could not find a way towards resolution.

We can keep this PR open for a while – what I remember is that this did work on aarch64 earlier, but somewhere down the line this ended up breaking and therefore the IDAKLU solver cannot be instantiated

@santacodes santacodes closed this Mar 18, 2024
@santacodes
Copy link
Contributor Author

Closed this due to messy branches on my side, will make a new draft with those previous commits for future reference.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrate from conda to venv inside Docker Images Reduce the number of Docker Images to just one image
4 participants