linkat fails with EACCES when the target inode has deleted original hardlink #3972

Closed
trchen1033 opened this issue Apr 13, 2019 · 24 comments

Comments

@trchen1033

Please fill out the below information:

  • Your Windows build number: Microsoft Windows [Version 10.0.18362.53]

  • What you're doing and what's happening: (Copy&paste the full set of specific command-line steps necessary to reproduce the behavior, and their output. Include screen shots if that helps demonstrate the problem.)

$ touch foo
$ ln foo bar
$ rm -f foo
$ ln bar ham
ln: failed to create hard link 'ham' => 'bar': Permission denied
  • What's wrong / what should be happening instead:
    The ln command shouldn't fail. A hardlink named ham pointing to the same inode should be created.

  • Strace of the failing command, if applicable: (If some_command is failing, then run strace -o some_command.strace -f some_command some_args, and link the contents of some_command.strace in a gist here).

https://gist.github.com/trchen1033/9f004be23919f8f7a2ffff4b1da9bf7b
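For anyone who wants to try the sequence without touching their working directory, here is a minimal sketch that runs the same four steps in a throwaway subdirectory. The function name `repro_linkat` and the `linkat-repro.XXXXXX` directory name are made up for illustration, and `stat -c` assumes GNU coreutils. The directory has to live on the filesystem under test, so the base directory is an argument (default: the current directory).

```shell
#!/bin/sh
# Run touch + ln + rm + ln in a scratch directory under $1 (default ".").
# Returns 0 if the final ln succeeds, 1 if it fails, 2 on setup problems.
repro_linkat() {
    base=${1:-.}
    dir=$(mktemp -d "$base/linkat-repro.XXXXXX") || return 2
    (
        cd "$dir" || exit 2
        # Set up: one inode, original name removed, one remaining link.
        touch foo && ln foo bar && rm -f foo || exit 2
        # The step that fails with "Permission denied" in the report above.
        if ln bar ham; then
            echo "ok: bar and ham share inode $(stat -c %i bar), link count $(stat -c %h bar)"
        else
            echo "FAIL: ln bar ham failed"
            exit 1
        fi
    ) && status=0 || status=$?
    rm -rf "$dir"
    return "$status"
}
```

Calling `repro_linkat "$HOME"` mirrors the home-directory case from the report; on a filesystem without the bug the final `ln` should simply succeed.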

@therealkenc
Collaborator

Cannot reproduce on either drvfs or lxfs. Sequence speaks for itself though, so if it is failing you are probably onto something. Best guess absent some me2s is you are missing some important CLI steps leading up to the touch foo.

@trchen1033
Author

I confirm this does not affect drvfs. My / is mounted as wslfs though. Here's what my /proc/mounts looks like:

rootfs on / type wslfs (rw,noatime)
none on /dev type tmpfs (rw,noatime,mode=755)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,noatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,noatime)
devpts on /dev/pts type devpts (rw,nosuid,noexec,noatime,gid=5,mode=620)
none on /run type tmpfs (rw,nosuid,noexec,noatime,mode=755)
none on /run/lock type tmpfs (rw,nosuid,nodev,noexec,noatime)
none on /run/shm type tmpfs (rw,nosuid,nodev,noatime)
none on /run/user type tmpfs (rw,nosuid,nodev,noexec,noatime,mode=755)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,relatime)
cgroup on /sys/fs/cgroup type tmpfs (rw,relatime,mode=755)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,relatime,devices)
C:\ on /mnt/c type drvfs (rw,noatime,uid=0,gid=0,case=off)

The above repro was done in my home directory, /home/trchen, where all permissions look normal. The same repro can be done as root too. I use a self installed Gentoo distro, though I don't think it's related to my distro or glibc. I haven't tried, but I can probably make a repro that makes syscalls directly instead.
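Since the filesystem type of / seems to be the relevant variable here, a quick way to read it out may be useful to others comparing notes. This is only a sketch: `fstype_of` is a hypothetical helper name, and it assumes the usual /proc/mounts column layout (device, mount point, type, options, dump, pass), which differs from the `mount`-style listing pasted above. Mount points containing spaces are not handled.

```shell
#!/bin/sh
# Print the filesystem type for mount point $1, reading a table in
# /proc/mounts format from $2 (default: /proc/mounts).
# If the mount point appears more than once, the last entry wins.
# Returns non-zero when the mount point is not listed.
fstype_of() {
    awk -v mp="$1" '
        $2 == mp { t = $3 }
        END { if (t != "") print t; else exit 1 }
    ' "${2:-/proc/mounts}"
}
```

`fstype_of /` would then be the one-liner to include in a report, next to the Windows build number.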

@therealkenc
Collaborator

> I use a self installed Gentoo distro

[....redacted....] Of course you do.

> I haven't tried, but I can probably make a repro that makes syscalls directly instead.

You can tee that up if you like. But before you take the trouble, variate some external variables -- I can't guess which. Your strace(1) log has a one-liner linkat() with 'bar' and 'ham' -- files which aren't breathed on before the one syscall. A test case with the one call won't elucidate much. A test case with the touch+ln+rm+ln sequence might. Hard to say. But I'd scratch head for an unspoken externality first, starting with "not Gentoo".

Below is my strace(1) log using ln(1) 8.28 and glibc 2.27-3ubuntu1 FWIW. Which is notable mostly for being identical save for the linkat() fail (on a blurry eyed glance anyway). So we're looking for something else. This on 18875 with the usual caveat.

openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=2931584, ...}) = 0
mmap(NULL, 2931584, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f0334f34000
close(3)                                = 0
stat("ham", 0x7fffc24f1ae0)             = -1 ENOENT (No such file or directory)
lstat("bar", {st_mode=S_IFREG|0666, st_size=0, ...}) = 0
linkat(AT_FDCWD, "bar", AT_FDCWD, "ham", 0) = 0

@therealkenc
Collaborator

Quick follow-up. Me (still 18875):

$ mount
rootfs on / type lxfs (rw,noatime)
[...]

That lxfs vs wslfs is different enough for me.

@0xbadfca11

0xbadfca11 commented Apr 13, 2019

I can reproduce this with build 18875's wslfs; it does not happen with lxfs.
(wslfs is currently default filesystem for VolFs for new distribution install.)

@trchen1033
Author

I confirm. Tried on my other computer with the same distro except with lxfs. Very likely it's a wslfs bug.

@bonki

bonki commented Oct 11, 2019

I can also reproduce this with wslfs, here's a complete strace.

$ rm -f foo bar ham
$ strace -o wslfs_ln_fail.strace -f sh -c 'touch foo; ln foo bar; rm -f foo; ln bar ham'
ln: failed to create hard link 'ham' => 'bar': Permission denied

@andreasstieger

This has been seen to break zypper, which also uses hardlinks, on openSUSE Leap 15.1 on WSL: https://bugzilla.opensuse.org/show_bug.cgi?id=1159195

@therealkenc
Collaborator

therealkenc commented Jan 16, 2020

This one appears to be addressed in WSL2 (which sports an ext4 filesystem).

> A test case with the touch+ln+rm+ln sequence might.

Does. From #4816

[image]

@lnussel

lnussel commented Jan 20, 2020

Under which circumstances, and since when, is wslfs used as the default file system?
If it's the default for everyone now we have a serious problem in openSUSE as the package manager would not work anymore with the images we have in the store.

@therealkenc
Collaborator

therealkenc commented Jan 20, 2020

I'll ping the devs internally. The S/N ratio is pretty low so ones like this one sometimes get buried.

As a data point, was there any change in the package manager with respect to using (or not using) hardlinks as part of the process (in the last 18 months or so)? On best evidence the lxfs/wslfs difference points fairly conclusively to a regress; but it will be helpful to know if there is more than one variable in play.

@lnussel

lnussel commented Jan 20, 2020

@lnussel

lnussel commented Jan 21, 2020

Short of a quick fix, is there a workaround? Is there a way to downgrade to lxfs, for example?

@therealkenc
Collaborator

therealkenc commented Jan 21, 2020

Not that I know of. The / mountpoint isn't under user control.

One could hypothetically work around it in the source, acknowledging that's icky. Or LD_PRELOAD a libzypp.so shim since hardlinkCopy() is public. Typing this blind into the message:

      std::string unlinkpath;
      if ( pi.isExist() )
      {
        // Was: int res = unlink( newpath );
        // Rename the existing entry aside instead of unlinking it before the link.
        unlinkpath = newpath.asString() + ".unlink";
        if ( ::rename( newpath.asString().c_str(), unlinkpath.c_str() ) != 0 )
          return logResult( errno );
      }

      // Here: no symlink, no newpath
      if ( ::link( oldpath.asString().c_str(), newpath.asString().c_str() ) == -1 )
      {
        switch ( errno )
        {
          case EPERM: // /proc/sys/fs/protected_hardlinks in proc(5)
          case EXDEV: // oldpath and newpath are not on the same mounted file system
            MIL << " => copy" << endl;
            return copy( oldpath, newpath );
            break;
        }
        return logResult( errno );
      }
      // The link exists now, so the renamed-aside entry can be dropped.
      if ( !unlinkpath.empty() )
      {
        ::unlink( unlinkpath.c_str() );
      }
      return logResult( 0 );

Which is to say, in the OP analogy, this works:

$ touch foo
$ ln foo bar
$ # rm -f foo
$ mv foo foo.unlink
$ ln bar ham
$ rm -f foo.unlink

Not advocating actually spinning a libzypp, natch; just that it's a plausible work-around.
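For scripts that hit the same thing outside zypper, the rename-aside idea can be sketched as a small shell helper. This is an illustration only: `relink` and the `.unlink` suffix are names chosen here, not anything from libzypp, and it has not been verified against wslfs beyond the manual sequence above.

```shell
#!/bin/sh
# Make $2 a hard link to $1. If $2 already exists it is renamed aside
# first (not unlinked), and only removed once the new link is in place.
relink() {
    src=$1
    dst=$2
    aside=
    if [ -e "$dst" ] || [ -L "$dst" ]; then
        aside="$dst.unlink"
        mv -f -- "$dst" "$aside" || return 1
    fi
    if ! ln -- "$src" "$dst"; then
        # Put the old entry back so a failed link leaves things as they were.
        [ -n "$aside" ] && mv -f -- "$aside" "$dst"
        return 1
    fi
    # Link created; drop the renamed-aside entry if there was one.
    [ -z "$aside" ] || rm -f -- "$aside"
}
```

The ordering is the point: the old name is never unlinked before the new link is created, matching the `mv` / `ln` / `rm` sequence shown above.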

@lnussel

lnussel commented Jan 21, 2020

Too much for documenting a quick workaround. I was hoping for some simple command to enter on Windows side to downgrade to lxfs. This is a real bug in wslfs after all that also affects other legitimate work loads. #4066 for example also looks like it.

@therealkenc
Collaborator

therealkenc commented Jan 21, 2020

#4066 is more likely #1529; can't tell because there's no repro or strace log over there. Bitbake has surfaced before in #2665. Analogous to everyone's favorite npm #14, except bitbake isn't as popular.

This one was novel because there's no handle open; but educated speculation would be that a handle is open on the win32 side of the 9p service. 9p was released in 18342, circa February 2019; uncoincidentally included a couple months later in 18362 per the OP.

@therealkenc
Collaborator

therealkenc commented Feb 3, 2020

Can someone with a working WSL 1 try running the OP repro from an elevated ("run as administrator") cmd prompt, then wsl.exe -d OpenSUSE-Leap-15-1. Probably doesn't need to be SUSE. Have a working theory what broke which is probably incorrect, but it is worth a try.

[ed] nvm, I managed to get a WSL 1 live using --export / --import. No dice running elevated.

@bonki

bonki commented Jul 26, 2020

@therealkenc Is this fix part of 2004 or do we need to wait for the next release?

@therealkenc
Collaborator

Right, Craig's fixinbound was Feb 12th which means there is no way this made 2004.

@therealkenc therealkenc reopened this Jul 26, 2020
gperciva added a commit to Tarsnap/tarsnap that referenced this issue Aug 3, 2020
On WSL, one user reported that they could not create an archive due to:

tarsnap: link(/root/tarsnap-cache/directory.tmp, /root/tarsnap-cache/directory): Permission denied

I believe [1] that this is due to a known WSL issue [2] which was marked
as fixed 2020-06-02, but that fix is currently only available for
"Insider Preview" builds and not public builds.  Also, I'm not certain
how often WSL users update their installed systems.

[1] http://mail.tarsnap.com/tarsnap-users/msg01601.html

[2] microsoft/WSL#3972
@therealkenc
Collaborator

therealkenc commented Aug 3, 2020

This one went fixinbound Feb 12th; let's call it amorphous "Stability improvements for virtio-9p (drvfs)" in 19640.

/fixed 19640
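To check a given machine against that build number, something like the following sketch could be used. It goes only by the 19640 marker in this thread and the `ver`-style string format shown in the original report; the helper names `build_of` and `has_linkat_fix` are made up here, and whether servicing updates to lower-numbered builds carry the fix is not something this thread states.

```shell
#!/bin/sh
# Extract the build number (third dotted field) from a string such as
# "Microsoft Windows [Version 10.0.18362.53]". Prints nothing if no match.
build_of() {
    printf '%s\n' "$1" |
        sed -n 's/.*\[Version [0-9]*\.[0-9]*\.\([0-9]*\).*/\1/p'
}

# Succeed if the build in $1 is at least 19640, the build this issue
# was marked fixed in. Fails for lower builds and unparseable input.
has_linkat_fix() {
    b=$(build_of "$1")
    [ -n "$b" ] && [ "$b" -ge 19640 ]
}
```

With the string from the original report, `build_of` yields 18362, so `has_linkat_fix` fails, which is consistent with the bug being present there.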

@ghost ghost closed this as completed Aug 3, 2020
@ghost

ghost commented Aug 3, 2020

This bug or feature request originally submitted has been addressed in whole or in part. Related or ongoing bug or feature gaps should be opened as a new issue submission if one does not already exist.

Thank you!

@yecril71pl

yecril71pl commented Aug 26, 2020

Is said 19640 for WSL2 only? Have you just discontinued WSL? 😲

@therealkenc
Collaborator

> Is said 19640 for WSL2 only?

No.

[image]

@yecril71pl

I am not sure how relevant it is but I am on 19041 and the problem does not occur after upgrading to WSL 2.

gperciva added a commit to Tarsnap/tarsnap that referenced this issue Jun 16, 2021
gperciva added a commit to Tarsnap/tarsnap that referenced this issue Jan 20, 2022
gperciva added a commit to Tarsnap/tarsnap that referenced this issue Jan 24, 2022
gperciva added a commit to Tarsnap/tarsnap that referenced this issue Feb 12, 2022
This issue was closed.