ensure extension triggers are only run by the package that satisfied them #48513
Oscar had noticed a problem where loading an extension might trigger loading another package, which might re-trigger attempting to load the same extension. And then that causes a deadlock from waiting for the extension to finish loading (it appears to be a recursive import triggered from multiple places). Instead alter the representation, to be similar to a semaphore, so that it will be loaded only exactly by the final package that satisfied all dependencies for it.
This approach could still encounter an issue if the user imports a package (C) which it does not explicitly list as a dependency for extension. But it is unclear to me that we actually want to solve that, since it weakens and delays the premise that Bext is available shortly after A and B are both loaded.
# module C; using A, B; end;; module A; end;; module B; end;; module Bext; using C; end
# using C, Bext / A / B
starts C -> requires A, B to load
loads A -> defines Bext (extends B, but tries to also require C)
loads B -> loads Bext (which waits for C -> deadlock!)
finish C -> now safe to load Bext
While using this order would have been fine.
# using A, B, Bext / C
loads A -> defines Bext (extends B, but tries to also require C)
loads B -> starts Bext
loads C
finish Bext
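The semaphore-like representation described in this PR text can be modeled roughly as follows. This is a minimal sketch, not the code in `Base`: the names `PendingExtension`, `WAITING`, `register_extension!`, and `finish_loading!` are hypothetical, packages are identified by plain strings, and it assumes that no trigger has finished loading before the extension is registered.

```julia
# Hypothetical model of a semaphore-like trigger counter (not Base's actual code).
mutable struct PendingExtension
    name::String
    remaining::Int   # number of triggers that have not finished loading yet
end

# Maps each trigger package to the extensions still waiting on it.
const WAITING = Dict{String,Vector{PendingExtension}}()

# Register an extension under every one of its triggers.
# Assumption: none of the triggers has finished loading yet.
function register_extension!(name::String, triggers::Vector{String})
    ext = PendingExtension(name, length(triggers))
    for t in triggers
        push!(get!(() -> PendingExtension[], WAITING, t), ext)
    end
    return ext
end

# Called once when `pkg` finishes loading. Removes the entry for `pkg`,
# decrements the counter of each waiting extension, and returns the
# extensions whose last outstanding trigger was `pkg`.
function finish_loading!(pkg::String)
    ready = String[]
    for ext in pop!(WAITING, pkg, PendingExtension[])
        ext.remaining -= 1
        ext.remaining == 0 && push!(ready, ext.name)
    end
    return ready
end

register_extension!("Bext", ["A", "B"])
after_a = finish_loading!("A")        # Bext is still waiting on B
after_b = finish_loading!("B")        # B satisfied the final trigger
after_a_again = finish_loading!("A")  # a repeated notification finds nothing to do
```

Because the entry for a package is removed when it is processed, a second notification for the same package cannot decrement the counter again, so in this model only the package that satisfied the final trigger is handed the extension to load.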
Oscar helped discuss offline that this deadlock case (mentioned in the comments) is actually potentially a severe flaw right now in EXT_DORMITORY. We have unintentionally created a situation where a user might easily create a loading graph that deadlocks unpredictably when it is loaded. In practice, that means, in addition to this PR being sensible, that we may want to forbid extension packages from loading any new packages (unless we can prove they do not depend on any of the triggers?). An extension should thus potentially be forbidden from accessing any package that was not explicitly listed as one of its triggers (any others it wants transitively it could access via |
Originally, I had that as the implementation and you went via The code that allows that is this: Lines 824 to 825 in c4fd8a4
To understand better, the problem in the previous code was the use of |
There are a couple of problems we observed, and this PR only really addresses one of them. It addresses the case where multiple extensions need to be loaded. Previously, each extension that loaded would trigger a new walk through the list and then load the next extension in sequence. This is not strictly a problem, but it is a bit awkward, since the stack keeps getting longer, which complicates the locking order of the loaded packages in some cases.
Secondly, we observed that this should never have been allowed. It would be very useful if it were valid. But it creates a cycle in the loading graph, which leads to unpredictable deadlocks and causes the extension to sometimes hang (v1.9) or error (master today). It turns out that we must forbid this (for now), to prevent such unreliable behavior from catching users unawares and making PkgEval unreliable. We can re-evaluate later whether we want to design a solution that allows it again. Of note, it also must be forbidden (aka strongly discouraged) from loading those extra, unexpected packages via other mechanisms (e.g. during |
FWIW, this was not a problem. I still check both of those lists, I just moved where it checked them, and implemented the TODO comment about making this more efficient than a repeated scan of the list. |
Would moving the extension into the parent package create the same cycle? |
Commonly, yes. The same extension being loaded explicitly by the parent does suffer the same problem (the parent depends on weakdep depends on external which depends on the parent), but not if the same content is moved into another context downstream of the external dependency. |
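The cycle described in this reply can be illustrated with a small depth-first search over a dependency map. This is only a sketch under stated assumptions: the function `find_cycle` and the package names `Parent`, `WeakDep`, `External`, and `Downstream` are hypothetical stand-ins, and the loader in `Base` does not detect cycles this way.

```julia
# Hypothetical helper: return one cycle in a "loads" graph, or `nothing`.
function find_cycle(deps::Dict{String,Vector{String}})
    state = Dict{String,Symbol}()   # :visiting while on the DFS stack, :done afterwards
    path = String[]                 # current DFS stack, in visiting order
    function visit(node)
        s = get(state, node, :new)
        s === :done && return nothing
        if s === :visiting
            # `node` is already on the stack, so the stack from there on is a cycle.
            i = findfirst(==(node), path)
            return vcat(path[i:end], node)
        end
        state[node] = :visiting
        push!(path, node)
        for dep in get(deps, node, String[])
            cycle = visit(dep)
            cycle === nothing || return cycle
        end
        pop!(path)
        state[node] = :done
        return nothing
    end
    for node in sort!(collect(keys(deps)))
        cycle = visit(node)
        cycle === nothing || return cycle
    end
    return nothing
end

# Extension content loaded by the parent itself:
# parent -> weakdep -> external -> parent.
in_parent = Dict("Parent" => ["WeakDep"],
                 "WeakDep" => ["External"],
                 "External" => ["Parent"])
cycle = find_cycle(in_parent)

# The same content moved downstream of the external dependency.
moved_downstream = Dict("Downstream" => ["External", "Parent"],
                        "External" => ["Parent"],
                        "Parent" => String[])
no_cycle = find_cycle(moved_downstream)
```

In the first graph the search returns a path that starts and ends at the same package, while the second graph, where the code lives downstream of the external dependency, has no cycle.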
But the previous code only loaded the extension when all the triggers were in |
Yeah, we realized later this is not really the source of the deadlock problem that we had encountered (though it was a possible cause), but I didn't go back to update the PR text. This is only a change to make it so that only exactly that package will run all of its triggers, and not any other package (including weakdeps) that happens to finish loading around the same time. |
Something seems strange here, when an extension fails to load, it tries to load it again. I introduced a syntax error in https://github.com/KristofferC/PGFPlotsX.jl/blob/master/ext/ColorsExt.jl and I get:
julia> using PGFPlotsX
julia> using Colors
[ Info: Precompiling ColorsExt [283d1826-985b-5544-82b8-7fd9aa83b823]
ERROR: LoadError: UndefVarError: `eeh` not defined
Stacktrace:
[1] top-level scope
@ ~/JuliaPkgs/PGFPlotsX.jl/ext/ColorsExt.jl:7
[2] include
@ ./Base.jl:456 [inlined]
[3] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt128}}, source::Nothing)
@ Base ./loading.jl:2031
[4] top-level scope
@ stdin:2
in expression starting at /Users/kristoffercarlsson/JuliaPkgs/PGFPlotsX.jl/ext/ColorsExt.jl:1
in expression starting at stdin:2
┌ Error: Error during loading of extension ColorsExt of PGFPlotsX
└ @ Base loading.jl:1187
[ Info: Precompiling ColorsExt [283d1826-985b-5544-82b8-7fd9aa83b823]
ERROR: LoadError: UndefVarError: `eeh` not defined
Stacktrace:
[1] top-level scope
@ ~/JuliaPkgs/PGFPlotsX.jl/ext/ColorsExt.jl:7
[2] include
@ ./Base.jl:456 [inlined]
[3] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt128}}, source::Nothing)
@ Base ./loading.jl:2031
[4] top-level scope
@ stdin:2
in expression starting at /Users/kristoffercarlsson/JuliaPkgs/PGFPlotsX.jl/ext/ColorsExt.jl:1
in expression starting at stdin:2
┌ Error: Error during loading of extension ColorsExt of PGFPlotsX
└ @ Base loading.jl:1187
with this PR (and using #48550 to make stuff a bit shorter). On master it only prints once. |
Oh, I think that is actually correct behavior, but only because of a different bug. The original PR for weakdeps usually puts 2 copies of the weakdep into the array. But the try-catch was in the wrong place, so it would initially skip the second one and then come back to all of them the next time you loaded any package to try again. I meant to file an issue, but forgot. |
I actually think #48352 was what made this happen. |
I guess that might sometimes, but I think that PR is correct. I was referring to the presence of a call to that |
(cherry picked from commit b2adb8548d7f2a38dc73ea2de1be271e688a545c)
This pull request seems to have caused a drop in coverage of 5 percentage points. |
This just happened to be the commit where we fixed the coverage computation (JuliaCI/julia-buildkite#188) |