-
Notifications
You must be signed in to change notification settings - Fork 732
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle overlapping and incomplete markers when forking #4732
Comments
konstin
added a commit
to konstin/packse
that referenced
this issue
Jul 9, 2024
We're currently (astral-sh/uv#4732) failing to install `c` from this scenario: ``` $ echo "fork-incomplete-markers-8137fed5" | uv pip compile --index-url http://127.0.0.1:3141 -p 3.9 --universal - Resolved 4 packages in 38ms # This file was autogenerated by uv via the following command: # uv pip compile -p 3.9 --universal - fork-incomplete-markers-8137fed5==0.0.0 fork-incomplete-markers-a-8137fed5==1.0.0 ; python_version < '3.10' # via fork-incomplete-markers-8137fed5 fork-incomplete-markers-a-8137fed5==2.0.0 ; python_version >= '3.11' # via fork-incomplete-markers-8137fed5 fork-incomplete-markers-b-8137fed5==1.0.0 # via fork-incomplete-markers-8137fed5 ``` This is one particular scenario that we need to test for solving astral-sh/uv#4732 I had to update the lockfile for the recent uv lockfile changes.
konstin
added a commit
to konstin/packse
that referenced
this issue
Jul 9, 2024
We're currently (astral-sh/uv#4732) failing to install `c` from this scenario: ``` $ echo "fork-incomplete-markers-8137fed5" | uv pip compile --index-url http://127.0.0.1:3141 -p 3.9 --universal - Resolved 4 packages in 38ms # This file was autogenerated by uv via the following command: # uv pip compile -p 3.9 --universal - fork-incomplete-markers-8137fed5==0.0.0 fork-incomplete-markers-a-8137fed5==1.0.0 ; python_version < '3.10' # via fork-incomplete-markers-8137fed5 fork-incomplete-markers-a-8137fed5==2.0.0 ; python_version >= '3.11' # via fork-incomplete-markers-8137fed5 fork-incomplete-markers-b-8137fed5==1.0.0 # via fork-incomplete-markers-8137fed5 ``` This is one particular scenario that we need to test for solving astral-sh/uv#4732 I had to update the lockfile for the recent uv lockfile changes.
konstin
added a commit
to astral-sh/packse
that referenced
this issue
Jul 9, 2024
We're currently (astral-sh/uv#4732) failing to install `c` from the added scenario: ``` $ echo "fork-incomplete-markers-8137fed5" | uv pip compile --index-url http://127.0.0.1:3141 -p 3.9 --universal - Resolved 4 packages in 38ms # This file was autogenerated by uv via the following command: # uv pip compile -p 3.9 --universal - fork-incomplete-markers-8137fed5==0.0.0 fork-incomplete-markers-a-8137fed5==1.0.0 ; python_version < '3.10' # via fork-incomplete-markers-8137fed5 fork-incomplete-markers-a-8137fed5==2.0.0 ; python_version >= '3.11' # via fork-incomplete-markers-8137fed5 fork-incomplete-markers-b-8137fed5==1.0.0 # via fork-incomplete-markers-8137fed5 ``` This is one particular scenario that we need to test for solving astral-sh/uv#4732 I had to update the lockfile for the recent uv lockfile changes, split out to #199 which removes it from the diff here.
This was referenced Jul 11, 2024
BurntSushi
added a commit
that referenced
this issue
Jul 25, 2024
When a fork occurs, we divide not just the dependencies that provoked a fork into distinct groups, but we also add the corresponding sibling dependencies to each fork. Previously, while we track markers on the fork itself, the individual dependencies that had markers only corresponded to markers written from the dependency specification. This meant that the sibling dependencies that got added to each fork would not themselves have markers attached to them. This in turn meant they would not have markers associated with them in the lock file. In many cases, this is actually okay, because the resolver will pick a version that is "universal" across all forks in most cases. But in some cases, this just simply isn't possible as the marker expressions in the fork can and do influence resolution. In which case, it is possible for the same package with different versions to show up in the lock file unconditionally. Which is a big no-no. So in this commit, after we determine the forks, we intersect the markers on each fork with each of its dependencies. This does seem to balloon the marker expressions in some cases. I plucked one low hanging fruit to avoid doing `x and x` in trivial cases. (And this eliminated a portion of the snapshot diffs.) But some pretty gnarly diffs remain. This commit also fixes another bug: previously, when we created a fork to capture the "remaining" universe of an incomplete set of markers, we left out dependencies that should be included in that fork. We rectify that here. Fixes #5086 Partially addresses #4732
BurntSushi
added a commit
that referenced
this issue
Jul 25, 2024
When a fork occurs, we divide not just the dependencies that provoked a fork into distinct groups, but we also add the corresponding sibling dependencies to each fork. Previously, while we track markers on the fork itself, the individual dependencies that had markers only corresponded to markers written from the dependency specification. This meant that the sibling dependencies that got added to each fork would not themselves have markers attached to them. This in turn meant they would not have markers associated with them in the lock file. In many cases, this is actually okay, because the resolver will pick a version that is "universal" across all forks in most cases. But in some cases, this just simply isn't possible as the marker expressions in the fork can and do influence resolution. In which case, it is possible for the same package with different versions to show up in the lock file unconditionally. Which is a big no-no. So in this commit, after we determine the forks, we intersect the markers on each fork with each of its dependencies. This does seem to balloon the marker expressions in some cases. I plucked one low hanging fruit to avoid doing `x and x` in trivial cases. (And this eliminated a portion of the snapshot diffs.) But some pretty gnarly diffs remain. This commit also fixes another bug: previously, when we created a fork to capture the "remaining" universe of an incomplete set of markers, we left out dependencies that should be included in that fork. We rectify that here. Fixes #5086 Partially addresses #4732
BurntSushi
added a commit
that referenced
this issue
Jul 25, 2024
When a fork occurs, we divide not just the dependencies that provoked a fork into distinct groups, but we also add the corresponding sibling dependencies to each fork. Previously, while we track markers on the fork itself, the individual dependencies that had markers only corresponded to markers written from the dependency specification. This meant that the sibling dependencies that got added to each fork would not themselves have markers attached to them. This in turn meant they would not have markers associated with them in the lock file. In many cases, this is actually okay, because the resolver will pick a version that is "universal" across all forks in most cases. But in some cases, this just simply isn't possible as the marker expressions in the fork can and do influence resolution. In which case, it is possible for the same package with different versions to show up in the lock file unconditionally. Which is a big no-no. So in this commit, after we determine the forks, we intersect the markers on each fork with each of its dependencies. This does seem to balloon the marker expressions in some cases. I plucked one low hanging fruit to avoid doing `x and x` in trivial cases. (And this eliminated a portion of the snapshot diffs.) But some pretty gnarly diffs remain. This commit also fixes another bug: previously, when we created a fork to capture the "remaining" universe of an incomplete set of markers, we left out dependencies that should be included in that fork. We rectify that here. Fixes #5086 Partially addresses #4732
BurntSushi
added a commit
that referenced
this issue
Jul 26, 2024
When a fork occurs, we divide not just the dependencies that provoked a fork into distinct groups, but we also add the corresponding sibling dependencies to each fork. Previously, while we track markers on the fork itself, the individual dependencies that had markers only corresponded to markers written from the dependency specification. This meant that the sibling dependencies that got added to each fork would not themselves have markers attached to them. This in turn meant they would not have markers associated with them in the lock file. In many cases, this is actually okay, because the resolver will pick a version that is "universal" across all forks in most cases. But in some cases, this just simply isn't possible as the marker expressions in the fork can and do influence resolution. In which case, it is possible for the same package with different versions to show up in the lock file unconditionally. Which is a big no-no. So in this commit, after we determine the forks, we intersect the markers on each fork with each of its dependencies. This does seem to balloon the marker expressions in some cases. I plucked one low hanging fruit to avoid doing `x and x` in trivial cases. (And this eliminated a portion of the snapshot diffs.) But some pretty gnarly diffs remain. This commit also fixes another bug: previously, when we created a fork to capture the "remaining" universe of an incomplete set of markers, we left out dependencies that should be included in that fork. We rectify that here. Fixes #5086 Partially addresses #4732
BurntSushi
added a commit
that referenced
this issue
Jul 26, 2024
When a fork occurs, we divide not just the dependencies that provoked a fork into distinct groups, but we also add the corresponding sibling dependencies to each fork. Previously, while we track markers on the fork itself, the individual dependencies that had markers only corresponded to markers written from the dependency specification. This meant that the sibling dependencies that got added to each fork would not themselves have markers attached to them. This in turn meant they would not have markers associated with them in the lock file. In many cases, this is actually okay, because the resolver will pick a version that is "universal" across all forks in most cases. But in some cases, this just simply isn't possible as the marker expressions in the fork can and do influence resolution. In which case, it is possible for the same package with different versions to show up in the lock file unconditionally. Which is a big no-no. So in this commit, after we determine the forks, we intersect the markers on each fork with each of its dependencies. This does seem to balloon the marker expressions in some cases. I plucked one low hanging fruit to avoid doing `x and x` in trivial cases. (And this eliminated a portion of the snapshot diffs.) But some pretty gnarly diffs remain. This commit also fixes another bug: previously, when we created a fork to capture the "remaining" universe of an incomplete set of markers, we left out dependencies that should be included in that fork. We rectify that here. Fixes #5086 Partially addresses #4732
BurntSushi
added a commit
that referenced
this issue
Jul 26, 2024
Consider the following packse scenario: ```toml [root] requires = [ "a>=1.0.0 ; python_version < '3.10'", "a>=1.1.0 ; python_version >= '3.10'", "a>=1.2.0 ; python_version >= '3.11'", ] [packages.a.versions."1.0.0"] [packages.a.versions."1.1.0"] [packages.a.versions."1.2.0"] ``` On current `main`, this produces a dependency on `a` that looks like this: ```toml dependencies = [ { name = "fork-overlapping-markers-basic-a", marker = "python_version < '3.10' or python_version >= '3.11'" }, ] ``` But the marker expression is clearly wrong here, since it implies that `a` isn't installed at all for Python 3.10. With this PR, the above dependency becomes: ```toml dependencies = [ { name = "fork-overlapping-markers-basic-a" }, ] ``` That is, it's unconditional. Which is I believe correct here since there aren't any other constraints on which version to select. The specific bug here is that when we found overlapping dependency specifications for the same package *within* a pre-existing fork, we intersected all of their marker expressions instead of unioning them. That in turn resulted in incorrect marker expressions. While this doesn't fix any known bug on the issue tracker (like #4640), it does appear to fix a couple of our snapshot tests. And fixes a basic test case I came up with while working on #4732. For the packse scenario test: astral-sh/packse#206
BurntSushi
added a commit
that referenced
this issue
Jul 26, 2024
Consider the following packse scenario: ```toml [root] requires = [ "a>=1.0.0 ; python_version < '3.10'", "a>=1.1.0 ; python_version >= '3.10'", "a>=1.2.0 ; python_version >= '3.11'", ] [packages.a.versions."1.0.0"] [packages.a.versions."1.1.0"] [packages.a.versions."1.2.0"] ``` On current `main`, this produces a dependency on `a` that looks like this: ```toml dependencies = [ { name = "fork-overlapping-markers-basic-a", marker = "python_version < '3.10' or python_version >= '3.11'" }, ] ``` But the marker expression is clearly wrong here, since it implies that `a` isn't installed at all for Python 3.10. With this PR, the above dependency becomes: ```toml dependencies = [ { name = "fork-overlapping-markers-basic-a" }, ] ``` That is, it's unconditional. Which is I believe correct here since there aren't any other constraints on which version to select. The specific bug here is that when we found overlapping dependency specifications for the same package *within* a pre-existing fork, we intersected all of their marker expressions instead of unioning them. That in turn resulted in incorrect marker expressions. While this doesn't fix any known bug on the issue tracker (like #4640), it does appear to fix a couple of our snapshot tests. And fixes a basic test case I came up with while working on #4732. For the packse scenario test: astral-sh/packse#206
BurntSushi
added a commit
that referenced
this issue
Aug 2, 2024
BurntSushi
added a commit
that referenced
this issue
Aug 7, 2024
BurntSushi
added a commit
that referenced
this issue
Aug 9, 2024
BurntSushi
added a commit
that referenced
this issue
Aug 9, 2024
BurntSushi
added a commit
that referenced
this issue
Aug 13, 2024
BurntSushi
added a commit
that referenced
this issue
Aug 13, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
In python, a package can declare conflicting requirements as long as the markers are disjoint, e.g.:
When we see requirements like that, we split the resolution into one for
python_version >= "3.11"
and one forpython_version < "3.11"
and solve each fork individually. This fails when the markers are overlapping or don't cover the entirety of the marker space. For a real-world example, opencv-python 4.10.0.84 used by transformers:Currently, we solve for the following forks (#4687):
They are disjoint, but they are also missing a good bit of the marker space for which we never solve.
Let's give working with markers some math-y foundations, a visualization as mental model and let's look at two examples.
Lemma: Inflating the requirements by artificially introducing a split. A requirement
is equivalent to
Similarly,
is equivalent to
We can extend this to an arbitrary number of arguments, so if Z¹, Z², ..., Z^n are pairwise disjoint and the union over i in 0..n for Z^i is the base set omega, then
is equivalent to
Now assume we have the following set of partially pairwise overlapping (#4640), partially pairwise incomplete (#4687) markers:
We can now inflate the requirements by artificially introducing a split:
We see that each group is disjoint, so we can create three forks. Simplifying and Removing empty markers we get:
We defined a correct way of splitting these markers.
Let us switch to another example to develop a model of markers and marker space. Ignoring extras, we can consider the current environment a point in a 10-dimensional marker space defined by PEP 508 (os_name, sys_platform, platform_machine, platform_python_implementation, platform_release, platform_system, platform_version, python_version/python_full_version, implementation_name, implementation_version). The dimensions are of different datatypes and not all points exist (os_name=="posix" and platform_system=="Windows" doesn't make sense). A marker is a slice in that space.
We can visualize a simpler version of the opencv-python 4.10.0.84 requirements:
There is two ways we can transform these requirements with the diagram:
For splitting overlapping, the options are:
python_version >= "3.10" and python_version < "3.11" and platform_system == "Darwin"
withnumpy >=1.21.4
python_version >= "3.11" and platform_system == "Darwin"
withnumpy >=1.21.2
andnumpy >=1.21.4
python_version >= "3.11" and platform_system != "Darwin"
withnumpy >=1.21.2
I favor to splitting the overlapping, it's the more general solution, and the one able to handle complex cases like the opencv-python markers (even though we pick a more recent numpy version for them anyway). If we merge overlapping, we need to special case the universal marker (no marker provided) for #4640. In either case, we need to implement marker tree negation to support incomplete markers, and proper simplification for negated marker tree to determine if the markers were actually incomplete or did cover the entire markers space, leaving only an empty set.
The text was updated successfully, but these errors were encountered: