-
-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implemented local filesystem endpoint with support for hardlinks, symlinks and copy #521
Implemented local filesystem endpoint with support for hardlinks, symlinks and copy #521
Conversation
files/public.go
Outdated
return nil | ||
if storage.linkMethod == LinkMethodCopy { | ||
// if source and destination have the same size, no need to copy | ||
if srcStat.Size() == dstStat.Size() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems risky to only rely on size. In l1k@72dd4ba#diff-6e0cc192367b696b238356fae51fdc00R122 I compute the MD5 of the destination but that's obviously expensive. This should probaby be configurable, I could do that in a separate commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, the only reason I've used the size instead of a checksum was performance. Adding it optionally is a good idea, I'll try that tomorrow. Perhaps the source md5 is already in the database and we only have to compute the md5 of the destination.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sourceMD5 is already passed in to this function.
man/aptly.1.ronn.tmpl
Outdated
files from the internal pool to the published directory. | ||
If not specified, empty or wrong, this defaults to `hardlink`. | ||
|
||
In order to publish to such a local enpoint, specify the endpoint as `local:endpoint-name` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/enpoint/endpoint/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Fixed in 027e865
From a quick review this mostly LGTM. One nit I have is that "local" can be misleading as the filesystem can be mounted from a remote location, so a more generic term like "filesystem" might be more apt. Again, this is just a nit. Unfortunately I won't be able to test this before Monday. :( You're way too fast for me. ;) |
Thanks for your contribution! I'm going to push out new aptly release 1.0.0, most probably it would take today and tomorrow to finish that. I will get back to this PR as soon as I'm done with the release work. |
I've just tested both the copy and the symlink functionality. Seems to work like a charm. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 awesome work!
just few nitpicks plus I agree with @l1k, we need more robust way to verify file contents in case of "copy file" mode. We can make it configurable and we can make "checksum" mode not supported yet, but I think we need to make sure users understand implications of size-only check.
files/public.go
Outdated
@@ -21,9 +24,27 @@ var ( | |||
_ aptly.LocalPublishedStorage = (*PublishedStorage)(nil) | |||
) | |||
|
|||
// Constants | |||
const ( | |||
LinkMethodHardLink uint = iota |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick, but we can drop uint = iota
on all the lines except the first one (https://github.com/golang/go/wiki/Iota)
files/public.go
Outdated
dstSys := dstStat.Sys().(*syscall.Stat_t) | ||
|
||
// if source and destination inodes match, no need to link | ||
if srcSys.Ino == dstSys.Ino { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With hardlinks, it's only possible to create them across the the same filesystem, with symlink, it's no longer true
I don't remember off top of my head if there is good and easy way to verify it's the same filesystem exactly, but we could at lest check that .Dev
fields match
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that the symlink is followed and thus dstSys
is the stat_t
of the source file. At least that's what I observed when I tested this. Hence the code seems to be correct and is actually quite elegant. But I'm not 100% sure that I'm recalling this correctly, so it would be good if someone else could double-check.
One issue I did observe was that for hardlinks, there's no upfront testing if the published directory is on the same filesystem as the internal pool. If users don't take care of this, they'll only get an error when they're in the middle of publishing the repo. It might be worth to check this earlier. Code to check if two paths are on the filesystem is part of my pull #510. I could rework this and send a pull to @seeraven's branch or he could cherry-pick the commit onto his branch and throw out the unneeded bits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, symlink is followed, but if it's symlink it might point to file on different filesystem with same inode. So that check wouldn't make sense if filesystem for source/destination aren't same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, good point. Indeed taking the .Dev
fields also into account should address this concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 thanks!
So, finally I've got some time to work further on this. First, I'll implement the md5 checksum check for the copy method with an option to disable it explicitly. |
|
Implemented the md5 check for the copy method and added an option to switch back to a size check if the user explicitly enables it (config file option |
Another idea I had about naming (I'm really bad at naming things) was |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels really good, @seeraven any problem if we merge it the way it is?
I want to resolve conflicts with my work on new package pool hashing.
There're still some small things which might require updates, like docs or messages aptly prints after publishing.
And if it's good for you to merge this PR, could you please squash commits a bit to keep |
Sure, no problem. Glad to have it accepted. I will merge with master first and rebase afterwards. ;-) |
122b666
to
25f9c29
Compare
I've squashed it into one commit, so its ready to merge. |
Rewritten the previous pull request to support multiple local filesystem endpoints, each with a customizable root directory and a link method (hardlinks, symlinks or file copies).
Requirements
All new code should be covered with tests, documentation should be updated. CI should pass.
Description of the Change
The local filesystem endpoints are configured in the configuration file similar to the s3 or swift endpoints. For example, a snippet like the following adds three endpoints, each with a different method to "link" files from the pool to the public directory:
Backwards-compatibility is fully preserved, so without specification of the endpoint the default hardlinked public directory is used.
To specify the new endpoint, the prefix local is used, e.g.,
aptly publish snapshot wheezy-main local:test1:wheezy/daily
Fixes #120, #173
Checklist
AUTHORS