This repository has been archived by the owner on Nov 19, 2024. It is now read-only.

error when using #332 trailing slash functionality #336

Closed
dpastoor opened this issue Apr 22, 2022 · 7 comments

@dpastoor

What version of the package or command are you using?

go get github.com/mholt/archiver/v4@8a97d87, installed to take advantage of #332

What are you trying to do?

I want to use the new trailing-slash (/) behavior so that a directory's contents end up at the root of the archive.

What steps did you take?

A code snippet showing the intent:

package bundler

import (
	"context"
	"os"

	"github.com/mholt/archiver/v4"
)

func NewArchive(root string) error {
	files, err := archiver.FilesFromDisk(nil, map[string]string{
		root: "",
	})
	if err != nil {
		return err
	}

	// Write the files as a gzip-compressed tarball.
	format := archiver.CompressedArchive{
		Compression: archiver.Gz{},
		Archival:    archiver.Tar{},
	}
	out, err := os.Create("/tmp/test.tar.gz")
	if err != nil {
		return err
	}
	defer out.Close()
	return format.Archive(context.Background(), out, files)
}

Running bundler.NewArchive("/Users/devin/repos/rstudio/environments.rstudio.com/_site/") then produces an archive that triggers the error below. Without the trailing slash, bundler.NewArchive("/Users/devin/repos/rstudio/environments.rstudio.com/_site") works fine, but it doesn't elevate the content to the archive root.

tar tvf /tmp/test.tar.gz
tar: Archive entry has empty or unreadable filename ... skipping.
... # seems to list all files though
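
To check whether the archive really contains an entry with an empty name, one option is to walk the tar headers with Go's standard library. This is a minimal sketch, not part of archiver: countEmptyNames and writeSample are hypothetical helpers, and the in-memory sample only assumes (it is not confirmed in this thread) that the trailing-slash case writes the root directory itself with an empty name.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"log"
)

// countEmptyNames (hypothetical helper) reads a gzip-compressed tar stream and
// reports how many entries it holds and how many have an empty header name.
func countEmptyNames(r io.Reader) (total, empty int, err error) {
	gz, err := gzip.NewReader(r)
	if err != nil {
		return 0, 0, err
	}
	defer gz.Close()

	tr := tar.NewReader(gz)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return total, empty, nil
		}
		if err != nil {
			return total, empty, err
		}
		total++
		if hdr.Name == "" {
			empty++
		}
	}
}

// writeSample (hypothetical helper) builds an in-memory .tar.gz holding one
// zero-length regular file per name, so the check can run without real files.
func writeSample(names []string) (*bytes.Buffer, error) {
	var buf bytes.Buffer
	gz := gzip.NewWriter(&buf)
	tw := tar.NewWriter(gz)
	for _, name := range names {
		hdr := &tar.Header{Name: name, Typeflag: tar.TypeReg, Mode: 0644}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	if err := gz.Close(); err != nil {
		return nil, err
	}
	return &buf, nil
}

func main() {
	// Assumption: the first entry mimics a root directory with an empty name.
	buf, err := writeSample([]string{"", "index.html", "sitemap.xml"})
	if err != nil {
		log.Fatal(err)
	}
	total, empty, err := countEmptyNames(buf)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%d entries, %d with an empty name\n", total, empty)
}
```

To inspect a real file such as /tmp/test.tar.gz, open it with os.Open and pass the resulting file to countEmptyNames instead of the in-memory sample.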

trying to open it:

image

What did you expect to happen, and what actually happened instead?

I want to bundle the contents of _site. If I use the path path/to/_site, everything works, with the caveat that all the contents sit inside the _site dir of the resulting tarball. If I instead use path/to/_site/, the contents are at the root, but the above error appears.

How do you think this should be fixed?

The error should no longer appear, and the result should be a valid archive. I'm happy to help trace this down.

Please link to any related issues, pull requests, and/or discussion

I'm happy to upload sample tarballs of the working/broken archives if that would help

Bonus: What do you use archiver for, and do you find it useful?

I plan to use it as part of a CLI for publishing sites created with https://quarto.org/, so I'm hoping to use it to bundle up the generated _site directory when creating a Quarto website: https://quarto.org/docs/reference/projects/websites.html

@mholt
Owner

mholt commented Apr 23, 2022

Thanks for the report! I'll try to look at it, might not be till after the weekend though.

Do any other archive tools read it successfully?

@dpastoor
Author

So far no!

I'm hoping that there is just a logical hiccup somewhere in the code (I didn't look too hard at the original PR)

One thing I can confirm works, which I think hints that the problem is somewhere in the path adjustment logic:

Knowing that the root dir would be _site, I added a manual adjustment after poking around the archiver File struct a bit and noticing the NameInArchive field.

When I add this loop:

for i, file := range files {
	files[i].NameInArchive = strings.TrimPrefix(file.NameInArchive, "_site")
}

the archive is created without that root dir just fine! It did leave the original _site folder in the archive as an empty entry though, and there are also some other files I wanted to get rid of while at it (like .DS_Store). So after a little more experimentation, here is the current code that gets me where I need to be for now:

Note: this isn't really generalizable, as it's specific to the website bundle I need to generate. The site generator always produces a sitemap.xml, so that was a good enough way to hunt for where the root should be.

func NewArchive(root string) error {
	files, err := archiver.FilesFromDisk(nil, map[string]string{
		root: "",
	})
	if err != nil {
		return err
	}
	var filteredFiles []archiver.File
	// lets find the root dir and store it if it hasn't been set yet so
	// we can strip it out
	var rootDir string
	for _, file := range files {
		// the sitemap should always be at the site root
		if file.Name() == "sitemap.xml" {
			rootDir = strings.TrimSuffix(file.NameInArchive, "/sitemap.xml")
			break
		}
	}
	for _, file := range files {
		// TODO: consider what other files/folders to exclude
		if file.Name() == ".DS_Store" || file.NameInArchive == rootDir {
			continue
		}
		if rootDir != "" {
			file.NameInArchive = strings.TrimPrefix(file.NameInArchive, rootDir+"/")
		}
		filteredFiles = append(filteredFiles, file)
	}
	// Write the filtered files as a gzip-compressed tarball.
	format := archiver.CompressedArchive{
		Compression: archiver.Gz{},
		Archival:    archiver.Tar{},
	}
	out, err := os.Create("/tmp/test.tar.gz")
	if err != nil {
		return err
	}
	defer out.Close()
	return format.Archive(context.Background(), out, filteredFiles)
}

and can call bundler.NewArchive("/Users/devin/repos/rstudio/environments.rstudio.com/_site")

and tada we're in business:

image

no errors, happily unpacked by tar + the mac archive utility
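
The renaming and filtering above can be sketched on plain name strings, without archiver, to make the edge cases easier to see. This is only an illustration under assumptions: relocateToRoot is a hypothetical helper, names are slash-separated as in NameInArchive, and the root directory name is passed in rather than discovered via sitemap.xml.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// relocateToRoot (hypothetical helper) mirrors the workaround: it drops the
// root directory entry itself and any .DS_Store file, then strips the
// "rootDir/" prefix from the remaining slash-separated names.
func relocateToRoot(names []string, rootDir string) []string {
	rootDir = strings.Trim(rootDir, "/")
	var out []string
	for _, name := range names {
		name = strings.TrimSuffix(name, "/")
		// Skip the root entry; renaming it would yield an empty name.
		if name == rootDir || path.Base(name) == ".DS_Store" {
			continue
		}
		if rootDir != "" {
			// Include the separator so "_site_old/..." is left untouched.
			name = strings.TrimPrefix(name, rootDir+"/")
		}
		out = append(out, name)
	}
	return out
}

func main() {
	names := []string{
		"_site",
		"_site/index.html",
		"_site/.DS_Store",
		"_site/css/site.css",
	}
	for _, name := range relocateToRoot(names, "_site") {
		fmt.Println(name)
	}
}
```

Trimming rootDir+"/" instead of just rootDir avoids leaving a leading slash on each name, and skipping the root entry avoids writing an entry whose name would become empty.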

@mholt
Owner

mholt commented Apr 25, 2022

Interesting, thanks. I'm unable to reproduce this with a folder/file from my file system:

package main

import (
	"context"
	"log"
	"os"

	"github.com/mholt/archiver/v4"
)

func main() {
	files, err := archiver.FilesFromDisk(nil, map[string]string{
		"/home/matt/Downloads/Takeout/": "",
	})
	if err != nil {
		log.Fatal(err)
	}

	format := archiver.CompressedArchive{
		Archival:    archiver.Tar{},
		Compression: archiver.Gz{},
	}
	out, err := os.Create("test.tar.gz")
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	err = format.Archive(context.Background(), out, files)
	if err != nil {
		log.Fatal(err)
	}
}

And when I open the archive on Linux, it reads just fine using Nautilus + Archive Utility (Ubuntu's default file browser), tar -tzvf, and also using format.Extract() in a small Go program.

So, I'm not sure what the problem is; so far it sounds like a bug in how macOS reads archive files.

@dpastoor
Author

Thanks matt, let me give it a whirl on some different machines and I'll report back in the next day or two.

I'm on an m1 mac, but can try an x86 mac + windows to see if either of those work differently as well

@mholt
Owner

mholt commented May 9, 2022

@dpastoor Did you ever get to look at this some more?

mholt closed this as completed on May 17, 2022

@dpastoor
Author

Hey matt,

Currently on day 19 of sequential covid as it rolls through the family / daycare. Sorry I haven't gotten to look at this; I understand why you closed out the issue. As soon as I get a chance I'll update with any findings, and we can reopen or keep it closed based on those.

@mholt
Owner

mholt commented May 22, 2022

Oof, no worries. Our family is just getting over that too. Take care!
