Fix package repo uniqueness #922
20a64c7 to 45098d9
45098d9 to 44ca24f
I have now tested that this code works for at least some of the cases that prompted this change.
    for content_dict in batch:
        sha256 = content_dict.pop("sha256")
        item_query = models.Q(**content_dict) & ~models.Q(sha256=sha256)
This is most likely the critical line for performance. This change must not cause a significant performance degradation!
44ca24f to 387a181
I'm grateful that you tested the use case where a package replaces an existing package and removes the PackageReleaseComponent. Thank you for that. It sounds like the issue is that you have two packages with the same NVA but different paths. Is it not possible to solve this problem by adding relative_path to repo_key_fields?
The problem is that duplicate packages (same package, version, and architecture fields, but a different relative_path) are only allowed if they have an identical sha256, that is, if they really are the same package stored in two different pool folder locations (for "reasons"). If their sha256 differs, they are not allowed to exist in the same repo twice. The spec is pretty clear on that. So the key difference between what I built here, and the pulpcore's
The same issue exists for pulpcore's
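To make the rule described above concrete, here is a minimal sketch in plain Python. It is not the code from this PR and does not use Django or any Pulp API; the `Pkg` tuple and the `conflicting_packages` helper are hypothetical names invented for illustration. It assumes the uniqueness key is (package, version, architecture) and that a clash only exists when the sha256 differs.

```python
from collections import defaultdict
from typing import NamedTuple


class Pkg(NamedTuple):
    """Hypothetical stand-in for a package record (not a Pulp model)."""
    package: str
    version: str
    architecture: str
    relative_path: str
    sha256: str

    def key(self):
        # Fields that must be unique within a repo unless the sha256 matches.
        return (self.package, self.version, self.architecture)


def conflicting_packages(existing, incoming):
    """Return existing packages that share a key with an incoming package
    but have a different sha256, and so cannot coexist with it."""
    incoming_checksums = defaultdict(set)
    for pkg in incoming:
        incoming_checksums[pkg.key()].add(pkg.sha256)

    conflicts = []
    for old in existing:
        new_checksums = incoming_checksums.get(old.key(), set())
        # Same key and same sha256 is the same package in a second pool
        # location, which is allowed. Any differing sha256 is a clash.
        if any(checksum != old.sha256 for checksum in new_checksums):
            conflicts.append(old)
    return conflicts


existing = [
    Pkg("foo", "1.0", "amd64", "pool/a/foo_1.0_amd64.deb", "aaa"),
    Pkg("bar", "2.0", "amd64", "pool/b/bar_2.0_amd64.deb", "bbb"),
]
incoming = [
    # Identical checksum at a different path: allowed alongside the old copy.
    Pkg("foo", "1.0", "amd64", "pool/x/foo_1.0_amd64.deb", "aaa"),
    # Different checksum for the same key: the old "bar" has to go.
    Pkg("bar", "2.0", "amd64", "pool/y/bar_2.0_amd64.deb", "ccc"),
]
to_remove = conflicting_packages(existing, incoming)
```

In this sketch only the old `bar` package ends up in `to_remove`. Simply adding relative_path to the key would not express this, because both `bar` entries have different paths and would then be treated as distinct, valid content.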
@daviddavis I have been wondering whether there is a case to be made for being stricter for uploads than for sync. For example: it is legal for the pool folder to contain an identical copy of some package in two places, but I would not exactly call it good practice. For sync, we must be able to sync such legal repositories. However, we could choose to protect our upload users from shooting themselves in the foot by forbidding this for uploads. Of course, we also support mixing uploaded and synced content in a single Pulp repository, so such discrimination could quickly get complicated. For now I am leaning towards not doing anything like this, but I would be interested to hear your thoughts.
@quba42 in terms of our use case, we don't allow users to set the relative_path when they upload, so I don't think it's a problem for us. But I think being stricter for uploads makes sense, and postponing that work until it becomes an issue also makes sense.
387a181 to 2e2d11f
closes pulp#921
As a result of the behaviour fixes, we can also drop the now superfluous duplicate distribution checking. We are using validate_duplicate_content to catch incoming duplicates, and we are removing old duplicates, so it is not possible to run into this error. As a result, it is not worth the performance cost to check for it.
2e2d11f to 00c3656
[noissue]
00c3656 to 6c97a75
Backport to 3.0: 💚 backport PR created ✅ Backport PR branch: Backported as #947 🤖 @patchback
Backport to 3.1: 💚 backport PR created ✅ Backport PR branch: Backported as #948 🤖 @patchback
Fix package repo uniqueness (cherry picked from commit 7ede5be)
…d9caa5c3722ed837821eda5052106/pr-922 [PR #922/7ede5bed backport][3.0] Fix package repo uniqueness
…d9caa5c3722ed837821eda5052106/pr-922 [PR #922/7ede5bed backport][3.1] Fix package repo uniqueness