This repository has been archived by the owner on Nov 19, 2024. It is now read-only.

m1 related unarchive inconsistency, dropping the root dir #337

Closed
dpastoor opened this issue Jun 7, 2022 · 10 comments

Comments

@dpastoor

dpastoor commented Jun 7, 2022

What version of the package or command are you using?

github.com/mholt/archiver/v4 v4.0.0-alpha.6.0.20220421032531-8a97d87612e9

What are you trying to do?

Unarchive a tar.gz archive using this basic wrapper function:

func Unarchive(input io.Reader, dir string) error {
	// TODO: consider if should write to a more generic interface
	// like a writer, or if maybe if the function itself
	// should take the handler as an input so can be as generic
	// as you'd like in the handler

	// detect the archive format from the stream; the returned reader
	// replaces input so that the bytes read during detection are not lost
	format, input, err := archiver.Identify("", input)
	if err != nil {
		return err
	}

	// handler is called once for each entry in the archive
	handler := func(ctx context.Context, f archiver.File) error {
		newPath := filepath.Join(dir, f.NameInArchive)
		if f.IsDir() {
			return os.MkdirAll(newPath, f.Mode())
		}
		newFile, err := os.OpenFile(newPath, os.O_CREATE|os.O_WRONLY, f.Mode())
		if err != nil {
			return err
		}
		defer newFile.Close()
		// copy the entry's data from the archive into the new file
		af, err := f.Open()
		if err != nil {
			return err
		}
		defer af.Close()
		if _, err := io.Copy(newFile, af); err != nil {
			return err
		}
		return nil
	}
	// make sure the format is capable of extracting
	ex, ok := format.(archiver.Extractor)
	if !ok {
		// err is nil at this point, so returning it would hide the failure
		return fmt.Errorf("archive format does not support extraction")
	}
	// the third argument is the list of files we want out of the archive;
	// any directories will include all their contents unless we return
	// fs.SkipDir from our handler (nil walks ALL files from the archive)
	return ex.Extract(context.Background(), input, nil, handler)
}

What steps did you take?

On the mac, given a tar archive with a root dir of quarto-0.9.532 and directories

tar -tvf ~/Downloads/quarto-0.9.532-linux-amd64.tar.gz
drwxr-xr-x  0 runner docker      0 Jun  6 18:20 quarto-0.9.532/
drwxr-xr-x  0 runner docker      0 Jun  6 18:20 quarto-0.9.532/bin/

unpacking manually, can likewise see a directory structure:

.
└── quarto-0.9.542
   ├── bin
   └── share

However, when I add fmt.Println("name in archive: ", f.NameInArchive) to the handler, I see this on the M1 Mac:

name in archive:  ./
name in archive:  ./bin/
name in archive:  ./share/

On Linux, I do see the correct behavior:

name in archive:  quarto-0.9.542/
name in archive:  quarto-0.9.542/bin/
name in archive:  quarto-0.9.542/share/

What did you expect to happen, and what actually happened instead?

expect to unarchive the directory as present in the archive

How do you think this should be fixed?

normalize behavior

Please link to any related issues, pull requests, and/or discussion

likely the inverse issue of #336

Bonus: What do you use archiver for, and do you find it useful?

@mholt
Owner

mholt commented Jun 7, 2022

I'm a bit confused by your outputs here. Where is name: . and name: bin and name: share coming from? It's different on Linux.

I'm inclined to think that the code you're using is different than what is being shared here, but I will need to reproduce your setup 100% exactly if I'm going to be able to help. Otherwise, all I can really do is offer troubleshooting tips.

Did you have a chance to look into #336 more? I wasn't able to reproduce it. And now I'm wondering if you're simply running different code on the two platforms.

@dpastoor
Author

dpastoor commented Jun 7, 2022

Ah shoot, I was trying to clean up the log message. I had originally included "name: ", f.Name() in the fmt call, but it was noise. I've updated the code snippet for consistency.

I'm pretty sure it's not different code: I'm generating a binary via goreleaser and then dropping that binary onto the various platforms, specifically to make sure I didn't do such a thing :-)

Code is currently here: https://github.com/dpastoor/qvm/blob/main/internal/unarchive/unarchive.go

@dpastoor
Author

dpastoor commented Jun 7, 2022

I'm going to try to make a reproducible example in a standalone repo that you can pull from and run across a couple of platforms, so it'll be easier to diagnose.

@dpastoor
Author

dpastoor commented Jun 9, 2022

Quick update: I started with the existing (large) release and found a different bug on Windows. The root dirs don't show up at all to the handler func (at least per printing in the handler) 🤪

Definitely need a cross-platform reproducible example to unravel all this.

@mholt
Owner

mholt commented Jun 9, 2022

Thanks for the update. Definitely need to get our ducks in a row before we can make any progress I think. Will be very interested in a x-plat repro once you have it ready 👍

@dpastoor
Author

I have some "good" news - I have been unable to reproduce the differences across mac/linux for the tarballs with the simple examples.

This led to a careful rabbit hole of going back to the original releases I was pulling from github and I had some findings:

  • The difference with zip files is due to me naively assuming that the same structure would be upheld (empty entries for directories, which can be detected with IsDir() so the directories are created before unpacking files). This seems not to be the case: unzip -lv shows that those entries just don't exist in the archive, period. It's not that archiver doesn't see them.

  • The mac and linux tarballs were actually fundamentally different inside their respective archives. Potentially a different bundling process??? Regardless, that's not on archiver. The red herring was that the differences were smoothed out by the GUI utilities when unpacking, but inspecting the archives directly makes the differences clear.
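For anyone who wants to do the same kind of direct inspection without archiver or a GUI tool in the way, here is a minimal sketch using only the Go standard library (archive/tar and compress/gzip). It prints the entry names exactly as stored in the tar headers. The function names listTarGzNames and buildTarGz are hypothetical, and the in-memory archive only stands in for a real downloaded tarball.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
)

// listTarGzNames returns the entry names stored in a gzip-compressed tar
// stream, exactly as recorded in the headers (no cleaning or normalization).
func listTarGzNames(r io.Reader) ([]string, error) {
	gz, err := gzip.NewReader(r)
	if err != nil {
		return nil, err
	}
	defer gz.Close()

	var names []string
	tr := tar.NewReader(gz)
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			return names, nil // end of archive
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

// buildTarGz is a hypothetical helper that builds a small in-memory tar.gz
// containing only directory entries, so the sketch runs without a real file.
func buildTarGz(dirs ...string) (*bytes.Buffer, error) {
	var buf bytes.Buffer
	gz := gzip.NewWriter(&buf)
	tw := tar.NewWriter(gz)
	for _, d := range dirs {
		hdr := &tar.Header{Name: d, Typeflag: tar.TypeDir, Mode: 0o755}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	if err := gz.Close(); err != nil {
		return nil, err
	}
	return &buf, nil
}

func main() {
	buf, err := buildTarGz("quarto-0.9.542/", "quarto-0.9.542/bin/", "quarto-0.9.542/share/")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	names, err := listTarGzNames(buf)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, n := range names {
		fmt.Println("name in archive:", n)
	}
}
```

Pointing listTarGzNames at an os.Open'd copy of each platform's tarball should show whether the names themselves differ between the two archives before any extraction code runs.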

More details to follow, but wanted to FYI you going into the weekend that this seems more about archiver doing exactly what it sees, not buggy behavior. I still need to look into archiving inconsistencies for #336 and now at least I have some minimal examples to check against.
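Since archives may not contain explicit directory entries (the zip case above), one way to make a handler robust is to create the parent directory of every file before writing it, instead of relying on IsDir() entries arriving first. This is a rough stdlib-only sketch, not what qvm or archiver actually does; writeEntry and extractDemo are hypothetical names, and the 0o755 directory mode is an assumption.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// writeEntry copies src to dir/name, creating any missing parent directories
// first, so archives without explicit directory entries still extract.
func writeEntry(dir, name string, src io.Reader, mode os.FileMode) error {
	dst := filepath.Join(dir, filepath.FromSlash(name))
	// assumption: 0o755 is an acceptable mode for implied directories
	if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil {
		return err
	}
	f, err := os.OpenFile(dst, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, mode)
	if err != nil {
		return err
	}
	if _, err := io.Copy(f, src); err != nil {
		f.Close()
		return err
	}
	// report close errors, since they can indicate a failed write
	return f.Close()
}

// extractDemo writes one nested file into a temp dir with no prior
// directory entries and returns what was read back from disk.
func extractDemo() (string, error) {
	dir, err := os.MkdirTemp("", "unarchive-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	if err := writeEntry(dir, "quarto/bin/tool.txt", strings.NewReader("hello"), 0o644); err != nil {
		return "", err
	}
	data, err := os.ReadFile(filepath.Join(dir, "quarto", "bin", "tool.txt"))
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	got, err := extractDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("read back:", got)
}
```

In a real handler the name would come from f.NameInArchive and the reader from f.Open(). The sketch does not guard against entry names containing "..", which a production extractor would also want to check.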

@mholt
Owner

mholt commented Jun 11, 2022

Thanks for the investigation @dpastoor. That's a relief and really good to know. Keep me posted (and on #336) :)

@mholt
Owner

mholt commented Jul 7, 2022

Any more updates? If there's nothing to do here, I'll probably close the issue. Thanks for your participation!

@dpastoor
Author

dpastoor commented Jul 7, 2022

So far it's been running smoothly, and none of my (small set of) users have reported any issues on any platform! I'd say good to close.

@mholt
Owner

mholt commented Jul 7, 2022

Great! Let me know if anything more specific crops up.

@mholt mholt closed this as completed Jul 7, 2022