Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix NFS VolumeStore FQDN not working #8244

Merged
merged 5 commits into from
Dec 17, 2018

Conversation

YanzhaoLi
Copy link
Member

@YanzhaoLi YanzhaoLi commented Aug 30, 2018

When docker using a nfs VolumeStore as a volume, the mount action will fail if the nfs-server url is a FQDN. A dotted ip url works well. When the linux syscall mount(source, target, flag, mountOptions) using for nfs mount, its mountOptions should contain 'addr=serverip' and the serverip must be an ip(ps. the url in source is useless).

Thus the fix resolve the FQDN and replace mountOptions's 'addr=fqdn' to 'addr=ip'. The mountOptions is generated as 'addr=host' at CreateMountSpec when creating the endpointVM. But this fix choose to do the resolve just before the mounting action.

Fix #8043
Fix #8357

[specific ci=1-19-Docker-Volume-Create]

@YanzhaoLi YanzhaoLi requested a review from a team as a code owner August 30, 2018 09:46
@YanzhaoLi YanzhaoLi requested review from hickeng and zjs August 30, 2018 09:47

log.Debugf("Looking up host %s", source.Hostname())
ips, err := net.LookupIP(source.Hostname())
if err != nil {
log.Errorf("mounting %s on %s failed: %s", source.String(), target, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we update this error message to specify looking up the IP for the specified hostname failed? It seems like that might be useful context to provide to aid in debugging, as that might lead to different troubleshooting steps than focusing on the mounting process itself.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've thought that the variable 'err' contains corresponding looking-up failure info

return nil
}
}
log.Errorf("mounting %s on %s failed: %s", source.String(), target, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are multiple IPs, we'll only log the last error to occur. Maybe it would be helpful to log each error (and the associated IP address) at DEBUG within the loop, in addition to this ERROR-level log message here.

//We resolve the ip address nearest the mounting action.
//An alternative place is nfs.CreateMountSpec function in /lib/portlayer/storage/nfs/vm.go
mountOptionsFQDN2IP := strings.Replace(mountOptions, source.Hostname(), ip.String(), -1)
if err = Sys.Syscall.Mount(rawSource.String(), target, nfsFileSystemType, syscall.MS_NOATIME, mountOptionsFQDN2IP); err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we extract this call to Sys.Syscall.Mount into a private method that takes rawSource, target, nfsFileSystemType, and the mountOptions*? This would let us avoid duplication with the identical call above and, more importantly, ensure that we keep the behavior between IP- and hostname-based handling consistent (e.g., if we ever replaced syscall.MS_NOATIME or made it configurable — as the comment above suggests we might).

//NOTE: the mountOptions of syscall mount only accept addr=ip; addr=FQDN doesn't work
//We resolve the ip address nearest the mounting action.
//An alternative place is nfs.CreateMountSpec function in /lib/portlayer/storage/nfs/vm.go
mountOptionsFQDN2IP := strings.Replace(mountOptions, source.Hostname(), ip.String(), -1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like rawSource.String() will also contain the hostname. Does that need to be replaced too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually hostname in rawSource.String() doesn't affect sys.syscall.mount.

@YanzhaoLi YanzhaoLi force-pushed the topic/fix-nfsmount-fqdn branch from cc4f6f6 to 17f04ff Compare September 4, 2018 06:12
@hickeng
Copy link
Member

hickeng commented Sep 7, 2018

The fix looks good. I've been trying to think of a way to add testing before approving.

for _, ip := range ips {
//NOTE: the mountOptions of syscall mount only accept addr=ip. addr=FQDN doesn't work
//We resolve the ip address nearest the mounting action.
mountOptionsIP := strings.Replace(mountOptions, source.Hostname(), ip.String(), -1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there edge cases where mountOptions could contain the source.Hostname() multiple times? Could someone provide a source.Hostname() like addr (a valid hostname, but not a valid FQDN) that actually appears as a key in addition to being a value)?

Maybe prefixing the old/new with addr= would reduce risk here?

strings.Replace(mountOptions, "addr="+source.Hostname(), "addr="+ip.String(), -1)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's really an awesome advice.
Btw, currently the mountOptions is not configurable.
It is always nfsMountOpitons + ",addr=" + host.Host, and the nfsMountOptions is a hard-coded constant.

Copy link
Member

@zjs zjs Sep 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, currently the mountOptions is not configurable.

I see! That's good context to have. That certainly limits the scope of any issues here right now, but I think this change is still useful to make to avoid potential issues in the future.

@zjs zjs added this to the Sprint 36 milestone Oct 26, 2018
@YanzhaoLi
Copy link
Member Author

The fix looks good. I've been trying to think of a way to add testing before approving.

I've created an issue #8357 aiming to expect test cases for this pr.

@YanzhaoLi YanzhaoLi force-pushed the topic/fix-nfsmount-fqdn branch 2 times, most recently from 94c5f45 to 21c1b96 Compare November 5, 2018 08:14
@YanzhaoLi YanzhaoLi force-pushed the topic/fix-nfsmount-fqdn branch 9 times, most recently from 7bc9bf6 to 2153a31 Compare November 22, 2018 08:58
When docker using a nfs VolumeStore as a volume, the mount
action will fail if the nfs-server url is a FQDN. A dotted
ip url works well. When the linux syscall
mount(source, target, flag, mountOptions) using for nfs
mount, its mountOptions should contain 'addr=serverip' and
the serverip must be an ip(ps. the url in source is useless).

Thus the fix resolve the FQDN and give its mountOptions the
'addr=ip'. The mountOptions is generated as 'addr=host' at
CreateMountSpec when creating the endpointVM. The fix choose
to do the resolve just before the mounting action.

Resolve: vmware#8043
@YanzhaoLi YanzhaoLi force-pushed the topic/fix-nfsmount-fqdn branch from 2153a31 to e912dce Compare November 26, 2018 06:50
@YanzhaoLi YanzhaoLi force-pushed the topic/fix-nfsmount-fqdn branch from e912dce to 1d54af6 Compare November 26, 2018 11:09
Copy link
Member

@hickeng hickeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed test functions as expected from CI https://ci-vic.vmware.com/vmware/vic/20465 logs.

@YanzhaoLi YanzhaoLi merged commit 44d6f81 into vmware:master Dec 17, 2018
@YanzhaoLi YanzhaoLi added the impact/doc/note Requires creation of or changes to an official release note label Jan 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-not-required impact/doc/note Requires creation of or changes to an official release note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Need to add test cases for FQDN Volume of pr 8244 NFS VolumeStore FQDN not working
6 participants