Policy: name native libs after SONAME #157

Closed
jankatins opened this issue Jun 1, 2016 · 12 comments

@jankatins
Contributor

jankatins commented Jun 1, 2016

[This issue collects discussion from several other places: https://github.com/conda-forge/libpng-feedstock/pull/4#issuecomment-222751729, https://github.com/conda-forge/libpng-feedstock/issues/3#issuecomment-222777245 and https://github.com//pull/155#issuecomment-222988756]

If the library is well maintained, the SONAME (which usually includes the major version, e.g. libc.so.6) changes whenever the library introduces incompatible changes, i.e. when an app built against a lower version would break if dynamically linked against the newer one (API/ABI changes, as opposed to API additions).

On Linux (at least Debian), the name of the package which contains the library has to change each time the SONAME changes:

https://www.debian.org/doc/debian-policy/ch-sharedlibs.html#s-sharedlibs-runtime

The run-time shared library must be placed in a package whose name changes whenever the SONAME of the shared library changes. This allows several versions of the shared library to be installed at the same time, allowing installation of the new version of the shared library without immediately breaking binaries that depend on the old version. [...]
Every time the shared library ABI changes in a way that may break binaries linked against older versions of the shared library, the SONAME of the library and the corresponding name for the binary package containing the runtime shared library should change. Normally, this means the SONAME should change any time an interface is removed from the shared library or the signature of an interface (the number of parameters or the types of parameters that it takes, for example) is changed. This practice is vital to allowing clean upgrades from older versions of the package and clean transitions between the old ABI and new ABI without having to upgrade every affected package simultaneously.

Currently the native packages all use generic names (e.g. libpng instead of libpng16, or zlib instead of zlib1). Following the same pattern as Debian has two main advantages:

  • You don't need < <next major version> pins; >= <min version> is enough. If the library breaks compatibility, the SONAME changes, which means a different package name is used. Currently, any package which misses the < <new major version> pin can be installed with a newer version of the library, but would break.
  • Two incompatible versions of the library can be installed at the same time. Currently only one version can be installed, which might end in a recompile nightmare when conda-forge updates to libpng 17 (which would then contain libpng17.dll or libpng17.so.17): all packages which link against the old libpng would need to be recompiled to get a higher dependency.
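
As a rough illustration of tying the package name to the SONAME, here is a minimal sketch. The helper `package_name_from_soname` is hypothetical (it is not part of conda or conda-build), and it follows the convention described in the Debian policy: append the SONAME version to the library name, with a hyphen when the name already ends in a digit. The names proposed in this issue (libpng16, zlib1) would not necessarily come out of this exact rule.

```python
import re


# Hypothetical helper, not part of conda or conda-build.
def package_name_from_soname(soname):
    """Derive a Debian-style runtime package name from an ELF SONAME."""
    match = re.fullmatch(r"(.+?)\.so\.([0-9][0-9.]*)", soname)
    if match is None:
        raise ValueError(f"not a versioned SONAME: {soname!r}")
    name, version = match.groups()
    # Use a hyphen when the library name already ends in a digit,
    # so that name and version stay distinguishable.
    separator = "-" if name[-1].isdigit() else ""
    return f"{name}{separator}{version}".lower()


for soname in ("libc.so.6", "libz.so.1", "libpng16.so.16"):
    print(soname, "->", package_name_from_soname(soname))
```

The point is only that an incompatible release (new SONAME) automatically yields a new package name, so old and new runtimes never compete for the same name.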

Library packages should also include a test for the filename so that the build breaks if the soname/filename changes.
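
A filename test of this kind could look roughly like the sketch below. `missing_libraries` is a hypothetical helper, not a conda-build API, and the `lib`/`bin` subdirectories and the file names are assumptions chosen for illustration; a real recipe would pass its own install prefix and expected file names.

```python
import os
import tempfile


# Hypothetical helper for a recipe test script; not a conda-build API.
def missing_libraries(prefix, expected_names, subdirs=("lib", "bin")):
    """Return the expected library file names that are absent under prefix."""
    found = set()
    for subdir in subdirs:
        directory = os.path.join(prefix, subdir)
        if os.path.isdir(directory):
            found.update(os.listdir(directory))
    return sorted(name for name in expected_names if name not in found)


# Demo against a throwaway directory standing in for an install prefix.
with tempfile.TemporaryDirectory() as demo_prefix:
    os.makedirs(os.path.join(demo_prefix, "lib"))
    open(os.path.join(demo_prefix, "lib", "libpng16.so.16"), "w").close()
    print(missing_libraries(demo_prefix, ["libpng16.so.16"]))
    print(missing_libraries(demo_prefix, ["libpng17.so.17"]))
```

A test script would fail the build whenever the returned list is non-empty, which is exactly the situation where upstream changed the SONAME or filename without the recipe noticing.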

@jakirkham
Member

cc @pelson @msarahan @ocefpaf

@jankatins
Contributor Author

jankatins commented Jun 1, 2016

It would also be nice if there were a way to get the minimum required version of a library from the library itself, so that one could write

requirements:
  build:
    - freetype {{libpng_min_requirement}}

instead of needing to manually update (and look up...) that each time a library updates.

Example of the breakage a simple rebuild can trigger:

  • Dependent package A builds its recipe against libpng, with a pin which is valid today
  • libpng updates and gets a new minimum required version
  • A is rebuilt for a different reason (i.e. without changing the minimum required version of libpng), and the autobuilder pulls in the new version
  • package A now has a conda-level dependency on the old version but a binary-level dependency on the new version, and will break at runtime when the old version of libpng is installed
  • example issue
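
The mismatch in the steps above can be modelled with a small sketch. Everything here is illustrative: the function names are hypothetical, the version numbers are made up, and the model assumes (as a simplification) that a binary needs at least the library version it was built against.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.6.22' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def at_least(installed, minimum):
    return parse_version(installed) >= parse_version(minimum)


def check_install(recipe_minimum, built_against, installed):
    """Compare what the solver checks with what the binary needs."""
    return {
        # The solver only sees the pin written in the recipe.
        "solver_accepts": at_least(installed, recipe_minimum),
        # Simplifying assumption: the binary needs the version it was built against.
        "binary_works": at_least(installed, built_against),
    }


# Made-up versions: the recipe still says >=1.6.17, but the rebuild used 1.6.22.
result = check_install(recipe_minimum="1.6.17", built_against="1.6.22", installed="1.6.17")
print(result)
```

The solver accepts the environment while the binary does not work, which is why deriving the pin from the library that was actually used at build time would help.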

@jankatins
Contributor Author

jankatins commented Jun 1, 2016

Re .lib files on Windows: there can be two kinds of these files: static ones, which include all object code, and import libraries, which contain some kind of "description of the DLL", similar to header files at source level. IMO the second kind of lib should be treated the same as header files, i.e. should be installed as versionless copies. The link-time dependency is then on the full SONAME/filename of the DLL, so building against png.lib would still end up as a dependency on libpng16.dll.

This currently has the problem that the "header" and the binary library are in the same package, and installing multiple versions would overwrite the header (conda does not error if it overwrites files from another package), so a build against an env which contains both versions would randomly (last one wins) link against one of the versions.

In Linux distributions, they would usually end up in different binary packages: the source package zlib produces zlib1g and zlib1g-dev (zlib seems to be the only package I found which does not follow the libxxx<version> naming :-)). zlib1g-dev depends on zlib1g (same version, including build number), so you usually only install one dev version (and they probably conflict with each other), but you can still install multiple runtime versions of the library without any file conflicts. If only the ABI changes, but not the API, then there could also be only one libx-dev and multiple libx<version> packages.

jankatins changed the title from "Policy: name nativ libs after SONAME" to "Policy: name native libs after SONAME" on Jun 1, 2016
@jankatins
Contributor Author

jankatins commented Jun 3, 2016

For a package combination which is currently uninstallable, try installing pillow and pyqt:

  • qt depends on jpeg 8d (qt and libqt aren't on conda-forge, only in defaults)
  • latest pillow depends on jpeg 9*
c:\data\external\conda-packages\staged-recipes (pr/644)
[mpltest_35] λ conda install pillow=3.2.0=py35_2
Using Anaconda Cloud api site https://api.anaconda.org
Fetching package metadata: ................
Solving package specifications: ....

The following specifications were found to be in conflict:
  - pillow 3.2.0 py35_2
  - pyqt (target=pyqt-4.11.4-py35_5.tar.bz2) -> python 2.6*|2.7*|3.3*|3.4*
  - pyqt (target=pyqt-4.11.4-py35_5.tar.bz2) -> qt >=4.8.6|>=4.8.7
  - qt (target=qt-4.8.7-vc14_7.tar.bz2) -> jpeg 8d
Use "conda info <package>" to see the dependencies for each package.

jankatins added a commit to jankatins/matplotlib that referenced this issue Jun 3, 2016
* Use versions from conda-forge (in sync with the conda package build)
* remove pyqt (which right now is installed from default) as the dependencies of that package clashes with packages in conda-forge. See here:

conda-forge/conda-forge.github.io#157 (comment)
@jankatins
Contributor Author

jankatins commented Jun 7, 2016

The proposal for splitting and renaming libpng (and probably jpeg and other native packages) is now in conda-forge/libpng-feedstock#7

@ChrisBarker-NOAA
Contributor

Yes, yes yes!!!

jpeg is driving me batty right now!

But ideally it's a policy, so we can get things clean, and not play whack-a-mole trying to catch each lib one by one...

Maybe we need an audit to see where this should be done?

@jakirkham
Member

So @mingwandroid raised a different point on this issue in a meeting recently. Figured I would copy him here so he could share his perspective.

@mingwandroid

On a case-by-case basis, for specific pain points with libraries that are small enough and that follow semantic versioning I don't object, but ..

What @JanSchulz is talking about here is referred to as 'semantic versioning'. Only a small percentage of software projects follow semantic versioning (and attempting to apply this strategy to those that do not is pointless), many of those that use this scheme do it wrong and as soon as you depend on anything that does it wrong you can't do it right yourself either (see Qt as an example of a project that tries hard to do it right; does anyone here believe that you can use Qt 5.6 on an executable that was built against Qt 5.5? IMHO there is not a chance that it will work. I'd be surprised if an application linked against Qt 5.6.0 will run against Qt 5.6.2 without a rebuild).

It requires patching a huge amount of software and that requires a good level of knowledge of the build system used in each case.

Conda packages contain more than shared libraries so attempting to allow more than one of each to be installed in an environment at the same time will lead to the last one installed writing all those other files. Headers (and on Windows, import libraries) will not match the binaries, files in /etc will not be of the right format wrt the shared library, man pages will not match up. Split packages might help here, but only to the extent that in that case, the files would not have been installed, unless conda learns about the concept of conflicts.
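
To make the clobbering concern concrete, here is a minimal sketch that finds paths shipped by more than one package. The function `find_clobbered_paths` and the file lists are hypothetical; they only stand in for whatever per-package file manifests are available.

```python
from collections import defaultdict


# Hypothetical helper; the input maps a package name to the paths it installs.
def find_clobbered_paths(packages):
    """Return {path: [package names]} for every path shipped by 2+ packages."""
    owners = defaultdict(list)
    for name, paths in packages.items():
        for path in paths:
            owners[path].append(name)
    return {path: names for path, names in owners.items() if len(names) > 1}


# Made-up file lists for two side-by-side library versions.
example = {
    "libpng16": ["lib/libpng16.so.16", "include/png.h"],
    "libpng17": ["lib/libpng17.so.17", "include/png.h"],
}
print(find_clobbered_paths(example))
```

In this made-up example the versioned shared libraries coexist, but the shared header is claimed by both packages, so whichever is installed last wins.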

I would rather either keep things as they are (as a general principle) or see something less up to chance implemented, such as forming a checksum of the interface definition of each shared library and renaming that library according to the checksum.

IMHO it's not worth the effort. That Debian attempts it doesn't convince me that it works well in general.

The compatibility problem here is that conda defaults is Fedora and conda-forge is Ubuntu (feel free to pick two different distros), and we're trying to mix software from two different distributions; it doesn't really have anything to do with SONAMEs.
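
One possible reading of the checksum idea, as a minimal sketch: hash a canonical form of the exported interface and append a short digest to the library name. The functions and symbol lists below are hypothetical, and using only symbol names is an assumption made for brevity; a real interface definition would also have to cover signatures and data layouts.

```python
import hashlib


# Hypothetical helpers; symbol lists would come from inspecting the library.
def interface_checksum(exported_symbols, length=8):
    """Short, order-independent digest of a set of exported symbol names."""
    canonical = "\n".join(sorted(set(exported_symbols))).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:length]


def checksummed_name(base, exported_symbols):
    return f"{base}-{interface_checksum(exported_symbols)}"


# Made-up symbol lists for two releases of the same library.
old_symbols = ["png_create_read_struct", "png_read_info", "png_destroy_read_struct"]
new_symbols = ["png_create_read_struct", "png_read_info"]

print(checksummed_name("libpng", old_symbols))
print(checksummed_name("libpng", new_symbols))
```

Removing a symbol changes the digest and therefore the name, without relying on upstream to bump a version number correctly.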

@ocefpaf
Member

ocefpaf commented Nov 4, 2016

@mingwandroid I agree with you on most of your points. Not that I do not believe in the SONAME approach, but I do believe it is not worth using that in conda.

We can probably work out a better consistency between defaults and conda-forge without that concept. We need to ensure that both channels have a consistent stack. (Right now conda-forge is lacking in that regard due to the absence of Qt on some platforms.)

Also, we can work out a more consistent pinning with defaults along the way. I am more careful with all my pinnings now, trying to match defaults as much as I can, and breaking that only when it makes sense. I think that jpeg is the only problematic pinning we have right now.

@ChrisBarker-NOAA
Contributor

> On a case-by-case basis, for specific pain points with libraries that are small enough and that
> follow semantic versioning I don't object, but ..
>
> What @JanSchulz is talking about here is referred to as 'semantic versioning'. Only a small
> percentage of software projects follow semantic versioning (and attempting to apply this strategy
> to those that do not is pointless), many of those that use this scheme do it wrong

Fair enough -- you're right -- this probably can't be done as an always do it policy.

However, we could do it for "specific pain points with libraries that are small enough", and that would help a lot. I think we need to be thoughtful when upgrading libs that are dependencies of a LOT of other packages and projects (like png, jpeg, ....). QT, on the other hand, is a real challenge, but not nearly as wide a variety of other packages need it -- so if we update QT, we can update the other QT packages a lot more easily. (Side note -- we really need to handle optional dependencies better -- I never use QT with matplotlib -- I should be able to install MPL without it, easily.)

> Conda packages contain more than shared libraries so attempting to allow more than one of each
> to be installed in an environment at the same time will lead to the last one installed writing all
> those other files. Headers (and on Windows, import libraries) will not match the binaries, files in
> /etc will not be of the right format wrt the shared library, man pages will not match up.

This will require some care, and needs to be done on a case-by-case basis. It seems some libs are designed with this in mind -- png comes to mind -- I see it called libpng16 a lot. (maybe not on Windows -- AARRGG!) so it may work fine in some cases without too much fudging.

> Also, we can work out a more consistent pinning with defaults along the way. I am more careful
> with all my pinnings now, trying to match defaults as much as I can, and breaking that only when it
> makes sense. I think that jpeg is the only problematic pinning we have right now.

That's the one that's biting me right now :-(
I'm curious -- was there any compelling reason to upgrade jpeg at all?

Anyway, it's not JUST defaults -- conda-forge is getting pretty big now. As soon as you upgrade one package and pin it to a newer lib, you break its compatibility with ALL the other packages that are pinned to an older version of that lib. This is a bigger deal with defaults, because we have no control over defaults, but it's still a lot of work within conda-forge. And do we have any tooling to help with that?

Maybe this problem can be addressed by being more careful -- though with an entire community of people contributing, that's going to be tough!

> That Debian attempts it doesn't convince me that it works well in general.

What Debian (and all the distros?) do is version the entire system, so when they want to make an incompatible upgrade to a lib, they can put that in the next version and upgrade everything else along with it at the same time.

and Continuum does this with Anaconda, too.

So maybe we should do it with conda-forge.

OK -- off to try to solve my $%#ing jpeg problem now.

@ocefpaf
Member

ocefpaf commented Nov 4, 2016

> OK -- off to try to solve my $%#ing jpeg problem now.

@ChrisBarker-NOAA what are your jpeg problems? We created an internal channel that re-pins stuff to be consistent with defaults but lately we made those available online again because more users needed it. (Although not because of the re-pinning but because we are building with conda-build 2).

So if you are trying to create an env, this channel order may help. It is tested on Windows, OS X, and Linux, and works with all the packages in that list.

@ChrisBarker-NOAA
Contributor

well, I've had a lot of issues trying to upgrade the GNOME project to use newer libs, and conda-forge wherever possible:

https://github.com/NOAA-ORR-ERD/PyGnome/blob/master/conda_requirements.txt

Note that I think I need to pin all those versions, rather than using >=.

But here's what I get with it as it is:

clean python2 conda environment:

$ conda config --get channels
--add channels 'defaults' # lowest priority
--add channels 'NOAA-ORR-ERD'
--add channels 'conda-forge' # highest priority

$ conda install --file conda_requirements.txt

Hey! that just worked! Maybe there's been a change since a couple days ago -- yippie!
