Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented local filesystem endpoint with support for hardlinks, symlinks and copy #521

Merged
merged 1 commit into from
Apr 13, 2017

Conversation

seeraven
Copy link
Contributor

@seeraven seeraven commented Mar 24, 2017

Rewritten the previous pull request to support multiple local filesystem endpoints, each with a customizable root directory and a link method (hardlinks, symlinks or file copies).

Requirements

All new code should be covered with tests, documentation should be updated. CI should pass.

Description of the Change

The local filesystem endpoints are configured in the configuration file similar to the s3 or swift endpoints. For example, a snippet like the following adds three endpoints, each with a different method to "link" files from the pool to the public directory:

"LocalPublishEndpoints": {
               "test1": {
                 "rootDir": "/opt/srv1/aptly_public",
                 "linkMethod": "symlink"
               },
               "test2": {
                 "rootDir": "/opt/srv2/aptly_public",
                 "linkMethod": "copy"
               },
               "test3": {
                 "rootDir": "/opt/srv3/aptly_public",
                 "linkMethod": "hardlink"
               }
             }

Backwards-compatibility is fully preserved, so without specification of the endpoint the default hardlinked public directory is used.

To specify the new endpoint, the prefix local is used, e.g.,
aptly publish snapshot wheezy-main local:test1:wheezy/daily

Fixes #120, #173

Checklist

  • unit-test added (if change is algorithm)
  • functional test added/updated (if change is functional)
  • man page updated (if applicable)
  • documentation updated
  • author name in AUTHORS

files/public.go Outdated
return nil
if storage.linkMethod == LinkMethodCopy {
// if source and destination have the same size, no need to copy
if srcStat.Size() == dstStat.Size() {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems risky to only rely on size. In l1k@72dd4ba#diff-6e0cc192367b696b238356fae51fdc00R122 I compute the MD5 of the destination but that's obviously expensive. This should probaby be configurable, I could do that in a separate commit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, the only reason I've used the size instead of a checksum was performance. Adding it optionally is a good idea, I'll try that tomorrow. Perhaps the source md5 is already in the database and we only have to compute the md5 of the destination.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sourceMD5 is already passed in to this function.

files from the internal pool to the published directory.
If not specified, empty or wrong, this defaults to `hardlink`.

In order to publish to such a local enpoint, specify the endpoint as `local:endpoint-name`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/enpoint/endpoint/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Fixed in 027e865

@l1k
Copy link

l1k commented Mar 24, 2017

From a quick review this mostly LGTM. One nit I have is that "local" can be misleading as the filesystem can be mounted from a remote location, so a more generic term like "filesystem" might be more apt. Again, this is just a nit.

Unfortunately I won't be able to test this before Monday. :( You're way too fast for me. ;)

@smira smira added the 1.1.0 label Mar 27, 2017
@smira
Copy link
Contributor

smira commented Mar 27, 2017

Thanks for your contribution!

I'm going to push out new aptly release 1.0.0, most probably it would take today and tomorrow to finish that. I will get back to this PR as soon as I'm done with the release work.

@l1k
Copy link

l1k commented Mar 27, 2017

I've just tested both the copy and the symlink functionality. Seems to work like a charm.

Copy link
Contributor

@smira smira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 awesome work!

just few nitpicks plus I agree with @l1k, we need more robust way to verify file contents in case of "copy file" mode. We can make it configurable and we can make "checksum" mode not supported yet, but I think we need to make sure users understand implications of size-only check.

files/public.go Outdated
@@ -21,9 +24,27 @@ var (
_ aptly.LocalPublishedStorage = (*PublishedStorage)(nil)
)

// Constants
const (
LinkMethodHardLink uint = iota
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick, but we can drop uint = iota on all the lines except the first one (https://github.com/golang/go/wiki/Iota)

files/public.go Outdated
dstSys := dstStat.Sys().(*syscall.Stat_t)

// if source and destination inodes match, no need to link
if srcSys.Ino == dstSys.Ino {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With hardlinks, it's only possible to create them across the the same filesystem, with symlink, it's no longer true

I don't remember off top of my head if there is good and easy way to verify it's the same filesystem exactly, but we could at lest check that .Dev fields match

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the symlink is followed and thus dstSys is the stat_t of the source file. At least that's what I observed when I tested this. Hence the code seems to be correct and is actually quite elegant. But I'm not 100% sure that I'm recalling this correctly, so it would be good if someone else could double-check.

One issue I did observe was that for hardlinks, there's no upfront testing if the published directory is on the same filesystem as the internal pool. If users don't take care of this, they'll only get an error when they're in the middle of publishing the repo. It might be worth to check this earlier. Code to check if two paths are on the filesystem is part of my pull #510. I could rework this and send a pull to @seeraven's branch or he could cherry-pick the commit onto his branch and throw out the unneeded bits.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, symlink is followed, but if it's symlink it might point to file on different filesystem with same inode. So that check wouldn't make sense if filesystem for source/destination aren't same.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, good point. Indeed taking the .Dev fields also into account should address this concern.

Copy link
Contributor

@smira smira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 thanks!

@seeraven
Copy link
Contributor Author

seeraven commented Apr 8, 2017

So, finally I've got some time to work further on this. First, I'll implement the md5 checksum check for the copy method with an option to disable it explicitly.
Then the question remains how to call the endpoint. As @l1k said correctly, local seems to be quite general. I've chosen the name mainly because it was shorter than filesystem or something like that. Any suggestions?

@smira
Copy link
Contributor

smira commented Apr 8, 2017

filesystem sounds better than local for me, as aptly already has local repo vs. mirror.

@seeraven
Copy link
Contributor Author

seeraven commented Apr 8, 2017

Implemented the md5 check for the copy method and added an option to switch back to a size check if the user explicitly enables it (config file option verifyMethod). I also added a test to ensure the md5 variant determines changes for files at the same size where the size variant fails.
The endpoint is now called FileSystemPublishEndpoint internally and on the command line the prefix is filesystem, e.g., filesystem:nfsshare.

@smira
Copy link
Contributor

smira commented Apr 8, 2017

Another idea I had about naming (I'm really bad at naming things) was mountpoint or something along those lines, so that you could mount different filesystem as publish roots.

Copy link
Contributor

@smira smira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels really good, @seeraven any problem if we merge it the way it is?

I want to resolve conflicts with my work on new package pool hashing.

There're still some small things which might require updates, like docs or messages aptly prints after publishing.

@smira
Copy link
Contributor

smira commented Apr 13, 2017

And if it's good for you to merge this PR, could you please squash commits a bit to keep master history clean? Thanks.

@seeraven
Copy link
Contributor Author

Sure, no problem. Glad to have it accepted. I will merge with master first and rebase afterwards. ;-)

@seeraven seeraven force-pushed the feature_localfiles_publishstorage branch from 122b666 to 25f9c29 Compare April 13, 2017 18:29
@seeraven
Copy link
Contributor Author

I've squashed it into one commit, so its ready to merge.

@smira smira merged commit 4d38e0b into aptly-dev:master Apr 13, 2017
@seeraven seeraven deleted the feature_localfiles_publishstorage branch August 15, 2017 15:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Publications to different filesystems
3 participants