-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Continuing #833 RE: Generating Checksum Named Files for AptByHash #884
Conversation
012ca8d
to
4d728f0
Compare
Regarding the failed "rady-to-ship" CI test, I believe the CI wants Apart from other small details, I believe the ball is back in our court to provide a final review for the feature itself. I am not sure I will get to it this week, but it is now near the top of our ToDo list. |
I think the commit message overall needs to be improved. It should describe the change/feature. |
4d728f0
to
77d2849
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First off some more nitpicks (small things of minor importance). I will post my thoughts on the feature itself separately.
Final Nitpick: The latest iteration of the commit message is now:
Added functionality to generate checksum named metadata files for AptByHash that was initially started by adamsanaglo.
Addressed git commit and code formatting issues pointed out by quba42
closes #795
- The title line should be shorter and use present tense.
- The fact that something from previous review iterations was addressed is not interesting two years from now.
My suggestion is something like:
Add ability to publish package indices using AptByHash format
closes #795
Now my thoughts on the feature itself: I have verified that this works as designed and adds a It does not add the So my final question is, how useful/complete is this feature in its current form? If I read https://wiki.ubuntu.com/AptByHash it just talks about the Release file Is there a risk that advertising Since I have little experience with this feature, I don't know the answer to these questions. However the answer is essential to my final question before merging: Right now this behaviour can be enabled/disabled via the @daviddavis I am guessing some of these big picture questions are as much for you as for @acheng-01 😉 |
That's interesting and I had not noticed that in the spec. If I look at the debian repo for example, it has the by-hash folder in the directory containing the Packages but not in the Release folder. Ubuntu does have a by-hash folder in the Release folder. We should probably add these extra by-hash folders for completeness and to align with the spec but also, I tested it out using Ubuntu and it seemed to work (I confirmed it used the by-hash files) so I think this change is probably safe as is.
I would argue that we should have this feature disabled by default. We're running Pulp behind a reverse proxy that caches various files and one of the important pieces of this feature is that the by-hash files should be cached and served by default for 3 days. Serving these files directly from Pulp doesn't provide for this so I think it makes sense to disable this feature for most users. Moreover, I think we should probably provide accompanying docs with this feature in which we tell users that they are responsible for setting up a reverse proxy with cache in order to cache the by-hash files. |
77d2849
to
fe55235
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks very clean now.
Given what @daviddavis said, I made a suggestion regarding the changelog entry.
Given the feature is now disabled by default, I leave it up to you if you want me to merge the current state, or if you prefer to extend it first. I see the following possible follow up tasks (but I am fine if they do not materialize):
- Add the release file
by-hash/
folder as well. - Add documentation on how to use this feature with reverse proxy.
- Extend a publication test to ensure that indices are in fact being published in the
by-hash
location.
Regarding adding the release file I think Adams's original PR was aimed at following the specs for debian rather than for ubuntu as @quba42 had linked. In that regard, the code works and now there is a by-hash generated for each architecture's packages within the repo. In addition to the difference in spec, I believe pulp reads directly from the Release file that already contains all the hash digests for the repository's packages; it doesn't touch the by-hash folder at all. This being the case, a by-hash folder in release wouldn't add much value other than caching. @daviddavis do you have anything to add? |
I think that makes sense. When I tested this out with Ubuntu here, apt pulled the Packages files from the by-hash folder in the Packages folder. I'm not sure what value the Releases by-hash folder serves. At least for us, the problem was that clients were pulling Packages files that didn't match the Release so providing the Release by-hash folder seems pointless. It might be worth stating in the docs and changelog entry that this implementation conforms to the debian spec that you linked to (and not the Ubuntu although it does appear that apt on Ubuntu does properly handle the hash files). |
@acheng-01 If you update the release notes you can set this to "ready for review" and we can merge it. |
@quba42 Sorry for the delays here. I've been trying to extend one of the publishing tests to include verification that the by-hash folder is created when APT_BY_HASH is enabled without much success. Because that hashed_index_path is being saved to hashed_index, I feel that I might not be able to access it in the test without changing the APIs in pulp_deb-client or modifying some pre-existing model also within that plugin. (Please let me know if I overlooked something, overthough something, or if there is a way to test for this without altering stuff in pulp_deb-client...) I have tested this outside of the testing environment though, and I can confirm that the by-hash folder is generated upon publishing a repo. It seems like you would be ok with me just setting this as ready to merge though, so I'll go ahead and mark this as ready for now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to squish all your commits together for the "Deb CI / ready-to-ship (pull_request)" test to turn green.
Apart from that this looks ready to me. The additional "Feature Overview" docs look good to me.
Regarding tests: I am not sure how simple it is to retrieve the checksum of the publication artifact, though if worst came to worst, one could simply download the published/hosted Release file, retrieve a Package index from that, download it, calculate its checksum, and then check for the presence of the relevant APT by hash path. This would work completely without use of Pulp client, but would not be very elegant. I think we can postpone this and live without that test for now.
49bf40f
to
59a0a2b
Compare
59a0a2b
to
e242b8c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ongoing discussions about how to develop this feature further not withstanding, I am fine with merging the current state.
This PR is to continue the work of adamsanaglo for AptByHash. It includes code changes for mitigating the Hash Sum Mismatch error by generating additional metadata files named by checksum for AptByHash, as well as changes advised by quba in the adamsanaglo's original PR.
Closes #795