Julia packages #14
cc @tkelman |
Also there was a lot of discussion in this outdated diff thread that is relevant. |
@tkf has some recipes that might serve as a starting point for discussion. Here's an example: https://github.com/tkf/conda-julia/tree/master/julia-compat |
I wouldn't do this unless you can faithfully propagate all compatibility information exactly from the Julia package manager into the conda recipes, and automatically listen for changes since compatibility in the Julia registry can be amended after the fact. All of that information is currently being rewritten, and the current formats will be replaced, so any effort here will need to be completely redone. If given the same set of desired packages and version constraints, the conda resolver ever picks a different set of versions than the Julia package manager, I think that would be highly problematic and very difficult for users to debug. It should be noted that it's already possible to use Conda as a provider of binary dependencies for Julia packages, but few packages do (only packages involving python interoperability or jupyter, for the most part) - Conda is a very heavy dependency to download a binary if that's the main motivation for using it. Using a python-based package manager to manage julia packages doesn't make a lot of sense to me. |
As to the first point, this is really not any different than maintaining any other package. Though the fact that one is allowed to amend existing releases is concerning. These really should be fixed by making new patch or post releases. Something to indicate to the user the original release was broken and a new release is out to fix it. Any links to docs and/or discussions about how Julia package specs work or should work would be really helpful. Is the specification for Julia's package manager's solver documented? If so, could you please provide a link to these docs. It's incorrect to think of The strength of |
You wouldn't write a pypi-to-conda conversion if you knew everything about pypi was in the process of being rewritten, would you? Any existing documentation will be obsolete soon. The code doesn't get changed, but the compatibility constraints used for version resolution can be. It's not a new release, but modified information about compatibility. I didn't say python package manager, I said python-based package manager. It's a large dependency, implemented in python, that doesn't provide a lot of functionality Julia isn't already capable of (without needing any python installation to be present at all). |
We can really see conda as a general-purpose multi-platform package manager, and packaging software for a conda-based distribution is in that regard very similar to packaging it for e.g. Debian. The fact that conda itself is written in python is not really relevant. Many of the questions that come up when packaging for conda are the same that will come up when packaging for Debian or RPM-based distribs, and I do think that it makes sense for Julia packages to be packaged for Debian. My take on language-specific package managers (CRAN, pypi) is that in an ideal world they would not exist at all. As soon as you leave the pure-R or pure-python world, everything becomes really dirty. For example, in R, RCppArmadillo vendors all of armadillo's headers (in breach of policy for debian, which would require it to depend on armadillo instead). Similarly, the pypi package for pyzmq vendors a binary for libzmq, while in Conda, we make it depend on the zeromq package... There are similar issues with Julia, which creates a separate installation prefix for each package. So for example, EigenCpp.jl vendors Eigen instead of depending on it. |
We've in fact very strongly told debian packagers not to package Julia packages. They can't even package the language in a usable way, we have to tell people not to use apt-get to install julia because the version they get is outdated and broken. I don't think you're going to solve the multi language dependency problem as long as many conda-forge packages remain linux-only. And has the situation regarding compiler choice on Windows gotten anywhere? Using Julia packages with msvc-built libraries is absolutely not recommended and would be asking for trouble. If vendoring is the best way that language package authors have found to make things work across platforms and distributions, it's less of an evil than relying on a large python program to download equivalent binaries for use with things that don't want to touch python at all.
That's up to each package to decide and hasn't really caused any problems, but could also easily change. There's nothing stopping Julia packages from using Conda (or some other mechanism with a shared prefix) to manage any binary dependencies they have, but few have chosen to do so. Using conda to manage the Julia code doesn't strike me as something Julia users would want to do. Conda users might want to do it, but if there are problems with how it's done, it's likely to get the same "don't use that, it's not supported" treatment from upstream that distro packaging has. |
I think the distribution of Julia packages and the distribution of the binaries they depend on are orthogonal. Trying to recreate the Julia package manager in Conda seems like too much duplicate work. On the other hand, facilitating a way for packages to share binaries would be useful. It would also be good if some setting in Julia could indicate a "preferred" binary provider, so that all binaries preferably come from that provider, ensuring compatibility. |
It would probably not be a very good thing if there are deep reasons why Julia could not be packaged in a generic fashion by general-purpose package managers. It would likely contribute to making it an island rather than a glue language in a polyglot world. Most python packages are independently packaged by Debian/conda/RPM and I would generally not recommend pip whenever the package is available in the host package manager.
Aren't most conda-forge recipes also built on windows?
Quick note on MSVC/mingw: AFAIK, MSVC remains the main platform compiler with which things are preferably built on conda-forge, although packages like R and openblas are built with mingw. MSVC- and mingw-generated binaries are actually compatible at the C level. C++ mangled names tend to be different.
The faster pace of iteration of the conda world compared to e.g. Debian is a good occasion for us to figure out the best way to distribute native Julia extensions based on CxxWrap. |
No deep technical reasons. It's all down to policy reasons that often don't allow the Julia package to use the versions or configurations of its dependencies that are needed for it to work correctly. And the maintenance effort of repeating this for every single packaging system in each distribution, for very little benefit since the upstream project already distributes binaries that are widely tested and known to work. The simplest and most-recommended option would be to take the binaries exactly as built by upstream and repackage them in whatever deb, rpm, etc format you want - that will definitely work fine. If policies don't allow that, you're on your own. All those policies exist for a reason, but the end result is packaging Julia for Linux distributions takes a lot of work to do correctly, and the upstream project does not find it worth the time to undertake that effort or endorse it when done by a third party.
I suspect you'll find this viewpoint is growing less and less common over time.
Not this one. Not many of its dependencies. Discussing the creation of conda packages out of Julia packages feels premature before the language package itself is even being built cross-platform.
Not if any of the APIs involve passing around C runtime library objects. Or if you allocate and free buffers in different libraries. As a concrete counterexample, once upon a time HDF5.jl was using an MSVC-built binary on Windows, and it had a bunch of issues with segfaults and generally not working correctly until it was replaced with a cross-compiled mingw binary.
That discussion and work is ongoing on the Julia side, replacing the existing BinDeps.jl with a solution that involves distributing binaries on all common platforms. You can try to encourage consideration of conda / conda-forge as part of that effort since there is a fair amount of overlap, but I don't think it's a serious candidate given the amount of Python-dependent infrastructure here (conda-forge/conda-smithy#569 would help considerably). |
Quite the contrary, especially with the arrival of conda! Although I am mostly in the business of building extensions, where having a clean common unix-style prefix is really important.
A crazy idea would be to adopt the same package format! After all, the conda package tarballs have a simple structure, and the SAT solver is something that could also be wrapped in julia. |
One quick solution is to also have a recipe for |
I'd recommend trying to express yourself more concisely when communicating on github. I have not been part of Julia Computing or involved at a deep level in the Julia ecosystem for nearly a year. If conda-forge or anyone still involved in Julia would like to undertake the task of making the language and its packages a first-class citizen when managed by conda, go ahead. The upstream developers may have a similar opinion to what I did: please do it right, don't introduce more problems by using different compilers or versions of dependencies than the upstream project knows to work and recommends. And if being cross-platform is a supposed benefit of using a package manager not implemented in Julia, then Julia should actually be packaged across multiple platforms. It looks like Mac support is happening in a pull request here, so that's a start. Pkg3 and the replacement for BinDeps.jl now exist as more than prototypes, and while they aren't completely stable and still have their issues you're welcome to move forward and deal with translation between their representation of dependency information and conda's. It's great that infrastructure like the autotick bot now exists in conda-forge, that would help for Julia packages if support for the Pkg3 registry format were implemented there. Things like the ability to use conda's copy of say libzmq when Julia and ZMQ.jl are installed by conda will need to be taken up with the developers of BinaryProvider.jl and/or ZMQ.jl. I suspect it won't work at all on Windows as long as the conda zmq binary is only available built by MSVC. Anecdotally I work at a Python shop today and attempting to use conda (even with conda-forge) has been more trouble than it's worth, it's poorly supported on various cloud services and usually made redundant by binary wheels. But that's just been my personal experience. |
For anyone who is interested in pursuing this, note that the Julia package manager has now changed considerably since this issue was opened, see the Pkg docs or repo. In addition, we now encourage Julia package developers to use the BinaryBuilder.jl cross-compilation infrastructure for binary dependencies. This has proven very successful, and allows package maintainers to easily build dependencies across a wide variety of platforms (even those they don't have access to). |
Maybe we could just "piggyback off of Conda.jl"? Specifically, what sounds like a possible "minimum viable product" to me, would be adding to the build script of this conda package a step which, after installing Julia, installs the Conda.jl Julia package (and probably also the BinaryBuilder.jl package, as was suggested above). The script would probably have to follow the instructions for using pre-existing conda environments, found here, and since those instructions require setting/changing the Also, if we were to do that, it seems it might be necessary to either change the build Python (which is currently 2.7) to Python 3, or to take the precautions listed on the Conda.jl page regarding using Python 2 instead of Python 3 (the default for Conda.jl). Then, for further Julia packages in conda, they could essentially just list |
The discussion here might be useful: https://discourse.julialang.org/t/how-does-one-set-up-a-centralized-julia-installation/13922/3 Here the "center" of the "centralized Julia installation" would be the conda environment in which this package was installed. Also I take back what I said about piggybacking off of Conda.jl -- that package should only be a dependency for Julia packages which have non-Julia dependencies (e.g. IJulia). Pkg3 seems to make the question of how to do this especially confusing+difficult, given its "depots" and "stacked environments":
Should the Conda environment from which the Julia Conda package was installed be the "primary environment"? (And thus the first entry in Presumably any solution should have the root of all nested Julia environments be the Conda environment where the relevant Julia version was installed. Also there is apparently going to be another big change to the packaging system in Julia 1.3: https://julialang.github.io/Pkg.jl/dev/artifacts/ |
We do have a native (C++) package manager now that supports installing conda packages (micromamba) and we can easily write Julia bindings for it (https://github.com/mamba-org/mamba). This would at least remove the problem of the heavy Python dependency of conda :) I am really interested in having the latest Julia on conda-forge. @ViralBShah do you know someone from the Julia community that could help us out? It would be awesome to have you guys with us. Also BinaryBuilder.jl looks very cool. Definitely inspiring :) |
Does anyone know how much the Julia package manager is still being rewritten? It seems lack of stability of the Julia packaging ecosystem would remain a persistent blocker for interoperability of conda and Julia package management. Admittedly I've never really bothered seriously trying to use Julia because of this -- my best practices rely very heavily on Conda for "sandboxing" (i.e. environments) and cross-language interoperability, although Mamba looks interesting. TL;DR below: how Conda interacts with Node/NPM might be a more realistic role model for packaging Julia than how it interacts with Python or R. At least for now. Maybe, at least as a stopgap measure, it could make sense to just make a decent conda package for the Julia language, with the understanding that although users would install Julia in a particular Conda environment, any Julia packages would be installed using that Conda environment's Julia. Presumably the Conda recipe would set Julia's environment variables to use that environment's Python in the necessary ways, and so that the Julia packages installed by that conda environment's Julia are not accessible to any other Julia installations on the system. It seems like the current build script already does at least some of this. It's not clear to me whether this article about best practices for "Julia environments"/Julia package management implies more post-install configuration should be done. To prevent The above suggestion would still allow the conda environment's Julia to create Julia environments of course, so there could be multiple Julia environments inside of a single conda environment. That wouldn't be great, or even good, but maybe it would still be OK as long as the installed Julia is adequately sandboxed. This would resemble what Conda already does for Node/NPM. See a description of that situation here. Honestly in my experience that situation is pretty ugly too, but e.g. 
I've been able to at least make stopgap conda recipes for things with NPM dependencies (think JupyterLab extensions) using such a setup. So imo the situation between Conda and Node/NPM is definitely OK, and so maybe worthy of emulating for Julia. |
The Julia package manager has been stable and backwards compatible since Julia 1.0 (about 3 years ago). It provides sandboxed and reproducible project environments by default. |
Thank you for clarifying! I thought they were planning a major refactoring of the package management system but didn't want to delay the 1.0 release for that. But you're right I can't find that written anywhere. |
Now that the recent Julia version 1.6.1 is available (#115), could it make sense to revisit this issue? |
I am looking at two specific issues:
|
What we would want to do is alter the JULIA_DEPOT_PATH: https://docs.julialang.org/en/v1/manual/environment-variables/#JULIA_DEPOT_PATH One possibility is that we stack the conda environment specific depot on top of the default Julia depot. This would allow read only access to the common depot. https://docs.julialang.org/en/v1/base/constants/#Base.DEPOT_PATH https://docs.julialang.org/en/v1/manual/code-loading/#code-loading cc: @ngam |
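A minimal shell sketch of the stacking idea, assuming it runs after `conda activate` has set `CONDA_PREFIX`. The helper name `stacked_depot_path` and the `share/julia` location inside the environment are illustrative assumptions, not the feedstock's actual script. Julia installs new packages into the first depot entry, so the environment's depot goes first and the default `~/.julia` depot is consulted after it.

```shell
#!/bin/sh
# Build a JULIA_DEPOT_PATH value that puts an environment-specific depot
# first and the default user depot second.
# Arguments: $1 = conda environment prefix, $2 = home directory.
stacked_depot_path() {
    env_prefix=$1
    home_dir=$2
    # share/julia under the prefix is an assumed location for the env depot.
    printf '%s:%s\n' "${env_prefix}/share/julia" "${home_dir}/.julia"
}

# Only export the variable when a conda environment is active.
if [ -n "${CONDA_PREFIX:-}" ]; then
    JULIA_DEPOT_PATH=$(stacked_depot_path "$CONDA_PREFIX" "$HOME")
    export JULIA_DEPOT_PATH
fi
```

With this ordering, anything already present in `~/.julia` stays visible to the environment's Julia, while new installs land inside the conda environment.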
@isuruf any thoughts/opinions on the path forward, thanks! |
What are the requirements on the conda-forge side? Would a script that just executes the below be sufficient?
using Pkg
Pkg.add(url="https://github.com/jhardenberg/RainFARM.jl#some_git_tag")
Otherwise, I suppose we could clone the repo and put it in a tarball. We could then add the package locally either through For packages in the Julia general registry, the above is simpler. It could be just the following.
using Pkg
Pkg.add(name="RainFARM", version="1.0")
An alternative requiring much effort may involve starting a separate conda-forge registry. |
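For illustration, the registry-based variant could be wrapped in a small shell helper that a build script might call. This is only a sketch: the function names `julia_pkg_add_expr` and `install_julia_pkg` and the `JULIA` override variable are hypothetical, and whether such a script would meet conda-forge's requirements is exactly the open question above.

```shell
#!/bin/sh
# Print a Julia expression that installs one registered package at a version.
# Arguments: $1 = package name, $2 = version string.
julia_pkg_add_expr() {
    printf 'using Pkg; Pkg.add(name="%s", version="%s")\n' "$1" "$2"
}

# Run the install with the given julia executable (defaults to `julia`).
# JULIA is a hypothetical override variable used only in this sketch.
install_julia_pkg() {
    "${JULIA:-julia}" -e "$(julia_pkg_add_expr "$1" "$2")"
}
```

A build script would then call `install_julia_pkg RainFARM 1.0` with the environment's Julia on the `PATH`.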
@mkitti thanks for posting this. I am also interested in mocking the JLL packages that currently vendor a dependency, so that they depend directly on the corresponding conda-forge packages instead (e.g. zlib, zeromq, cxxwrap, etc.). |
I think the recent per-conda-environment Julia depot mechanism should help on the 1.7 branch with the recent modifications to activate.sh. One just needs to manage the Overrides.toml, which is depot specific. Alternatively, one could fork a JLL package as a subdirectory package of a feedstock. Perhaps @giordano may have some thoughts on how to best implement this. |
On the 1.7 branch, which is master, we implemented items 2 and 3 above. [edited] We have a JULIA_DEPOT_PATH per conda environment. We also have a top-level shared Julia environment whose name is derived from the conda environment path. We then modified the JULIA_LOAD_PATH so that other Julia projects/environments stack on the top-level shared Julia environment. |
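The three settings described here can be sketched in shell as follows. This is not the feedstock's actual activate.sh: the function name `set_julia_conda_env` is hypothetical, and taking the last component of the prefix as the shared environment name is an assumption. The resulting values mirror those printed in the REPL session in this thread.

```shell
#!/bin/sh
# Sketch of per-environment Julia settings, modeled on the values shown in
# this thread; this is not the feedstock's actual activate.sh.
# Argument: $1 = conda environment prefix.
set_julia_conda_env() {
    prefix=$1
    # Assumption: use the last path component as the shared environment name.
    env_name=$(basename "$prefix")
    # The trailing ':' leaves an empty entry, which Julia expands to defaults.
    JULIA_DEPOT_PATH="${prefix}/share/julia:"
    # '@name' selects a shared (named) environment stored in the depot.
    JULIA_PROJECT="@${env_name}"
    # Active project first, then the shared environment, then the stdlib.
    JULIA_LOAD_PATH="@:@${env_name}:@stdlib"
    export JULIA_DEPOT_PATH JULIA_PROJECT JULIA_LOAD_PATH
}

# Only apply the settings when a conda environment is active.
if [ -n "${CONDA_PREFIX:-}" ]; then
    set_julia_conda_env "$CONDA_PREFIX"
fi
```

Because `@` comes first in `JULIA_LOAD_PATH`, a project selected with `--project` still takes precedence, and the shared environment only fills in packages the project does not provide itself.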
Some stacking too: julia-feedstock/recipe/scripts/activate.sh Lines 13 to 14 in bd5133e
|
Let me just interpret the whole meaning of the activate script so we are clear. julia-feedstock/recipe/scripts/activate.sh Lines 6 to 14 in bd5133e
There are three environment variables:
$ cd some_project
$ julia --project=.
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.7.1 (2021-12-22)
 _/ |\__'_|_|_|\__'_|  |  A conda-forge release: https://github.com/conda-forge/julia-feedstock
|__/                   |
julia> ENV["JULIA_DEPOT_PATH"]
"/home/mkitti/anaconda3/envs/julia171_test/share/julia:"
julia> DEPOT_PATH
3-element Vector{String}:
"/home/mkitti/anaconda3/envs/julia171_test/share/julia"
"/home/mkitti/.julia"
"/home/mkitti/anaconda3/envs/julia171_test/local/share/julia"
julia> ENV["JULIA_PROJECT"]
"@julia171_test"
julia> Base.active_project() # note the command line arg overrides the above
"/home/mkitti/some_project/Project.toml"
julia> ENV["JULIA_LOAD_PATH"]
"@:@julia171_test:@stdlib"
julia> Base.load_path()
3-element Vector{String}:
"/home/mkitti/some_project/Project.toml"
"/home/mkitti/anaconda3/envs/jul" ⋯ 31 bytes ⋯ "ents/julia171_test/Project.toml"
"/home/mkitti/anaconda3/envs/julia171_test/share/julia/stdlib/v1.7"
julia> Base.load_path()[1]
"/home/mkitti/some_project/Project.toml"
julia> Base.load_path()[2]
"/home/mkitti/anaconda3/envs/julia171_test/share/julia/environments/julia171_test/Project.toml"
julia> Base.load_path()[3]
"/home/mkitti/anaconda3/envs/julia171_test/share/julia/stdlib/v1.7"
I'll revise what I said above. We actually implemented options 2 and 3 above. |
If we do not specify a project at the command line, then the active project is just the shared "julia171_test" environment.
$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.7.1 (2021-12-22)
 _/ |\__'_|_|_|\__'_|  |  A conda-forge release: https://github.com/conda-forge/julia-feedstock
|__/                   |
julia> Base.active_project()
"/home/mkitti/anaconda3/envs/julia171_test/share/julia/environments/julia171_test/Project.toml"
julia> Base.load_path()
2-element Vector{String}:
"/home/mkitti/anaconda3/envs/jul" ⋯ 31 bytes ⋯ "ents/julia171_test/Project.toml"
"/home/mkitti/anaconda3/envs/julia171_test/share/julia/stdlib/v1.7"
julia> Base.load_path()[1]
"/home/mkitti/anaconda3/envs/julia171_test/share/julia/environments/julia171_test/Project.toml"
julia> ENV["CONDA_PREFIX"]
"/home/mkitti/anaconda3/envs/julia171_test" |
This package may be of interest: https://github.com/GunnarFarneback/LocalRegistry.jl |
Here's a crazy idea. We can selectively package a Julia depot and add that to the bottom of the depot stack. See conda-forge/staged-recipes#20116 (comment) for an example. It's not a great long term strategy, especially if we package artifacts, but it might be a way to get started. The dependence on the Julia package manager is then mostly contained when building the conda-forge package. |
Following this, we have a working prototype for the curious: conda-forge/pysr-feedstock#43 (comment) Please have a look and weigh in on the progress. |
Do we have a naming convention in general, and for Julia packages in particular? What would you call the corresponding conda packages for the following?
While my first preference would be to keep the names as is, I see this may not work per conda-forge/conda-forge.github.io#18 Using a julia prefix, we could do the following:
Any other thoughts? |
Note that this issue was closed but the topic is still "open for debate". If you want to bring Julia package naming up again, I believe it is a worthy topic to discuss. I'm slightly inclined toward the namespaced option. It is more verbose but unambiguous. |
Where should we engage on this topic? Should I create a new issue on conda-forge.github.io? |
That is a start. An issue here is OK too. I'd also post the issue on the gitter main channel. I'll add the topic to our meeting agenda as well. Hopefully we can get some momentum and move things forward. Thanks for helping out on this topic BTW. |
I created an issue specific to naming here: |
Opening this issue to discuss how to best handle Julia packages with conda.