zip: compress with deflate instead of bzip2 #5

Closed
mhamberg1 opened this issue Jul 18, 2017 · 17 comments

Comments

@mhamberg1

I tried downloading the main metadata file to look at the underlying CSVs: https://os.unil.cloud.switch.ch/fma/fma_metadata.zip

I'm getting a rejection on both mac and windows when I try to unzip this. Am I missing something?

@mdeff
Owner

mdeff commented Jul 19, 2017

Can you verify that the archive is not corrupted by checking its SHA-1 hash? It should be f0df49ffe5f2a6008d7dc83c6915b31835dfe733.
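For anyone without a `sha1sum` tool at hand, here is a minimal Python sketch of the same check. The function names `sha1_of_file` and `is_intact` are made up for illustration; only the digest comes from this thread.

```python
import hashlib

# Expected digest quoted in this thread for fma_metadata.zip.
EXPECTED_SHA1 = "f0df49ffe5f2a6008d7dc83c6915b31835dfe733"


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives are not loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected=EXPECTED_SHA1):
    """Return True if the file's SHA-1 matches the expected digest."""
    return sha1_of_file(path) == expected.lower()
```

Calling `is_intact("fma_metadata.zip")` in the download directory should return `True` for an undamaged copy of the archive discussed here.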

@mhamberg1
Author

The SHA-1 hash comes out the same so that's good.

What I see when I try to unzip this on macOS is: Unable to expand "fma_metadata.zip" into "Desktop" (Error 1 - Operation not permitted).

Is it by chance not actually a zip archive?

@ciotog

ciotog commented Jul 19, 2017

I was able to successfully download and unzip the file with linux, so I imagine it's an environment issue.

[chump 3] misc $ file fma_metadata.zip 
fma_metadata.zip: Zip archive data
[chump 4] misc $ sha1sum fma_metadata.zip 
f0df49ffe5f2a6008d7dc83c6915b31835dfe733  fma_metadata.zip
[chump 5] misc $ unzip fma_metadata.zip 
Archive:  fma_metadata.zip
 bunzipping: fma_metadata/README.txt  
 bunzipping: fma_metadata/checksums  
 bunzipping: fma_metadata/not_found.pickle  
 bunzipping: fma_metadata/raw_genres.csv  
 bunzipping: fma_metadata/raw_albums.csv  
 bunzipping: fma_metadata/raw_artists.csv  
 bunzipping: fma_metadata/raw_tracks.csv  
 bunzipping: fma_metadata/tracks.csv  
 bunzipping: fma_metadata/genres.csv  
 bunzipping: fma_metadata/raw_echonest.csv  
 bunzipping: fma_metadata/echonest.csv  
 bunzipping: fma_metadata/features.csv  
[chump 6] misc $ 

@mdeff
Owner

mdeff commented Jul 19, 2017

Yep, I think so too. It looks like a permission issue. Do you have write access to Desktop? Can you try another directory?

@mhamberg1
Author

Same thing from another folder. We also tried a colleague's Windows machine, and the default Windows unzip couldn't do it.
We were able to extract it with 7zip, but I don't think that's a good solution for most people. Are you compressing it in some unusual way? I've never seen this Mac error, and I open lots of compressed files.

Here are some additional details. When I tried "unzip -a fma_metadata.zip" on mac I get results like this:
"skipping: fma_metadata/README.txt need PK compat. v4.6 (can do v4.5)"

@mdeff
Owner

mdeff commented Jul 20, 2017

That's indeed not an acceptable solution. I wonder how many people have encountered this issue, as there have been many downloads.

The archive is created by the create_zip() function in <creation.py> using the zipfile package.

The "PK compat" message means that the zip version needed to extract the file is greater than what your utility supports. And indeed, the BZIP2 compression method I used was introduced in version 4.6 of the zip specification (from 2001). Now the question is: how common are zip utilities that do not support 4.6?
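To see which members of an archive trigger that message, the compression method and the "version needed to extract" field can be read with Python's zipfile module. This is only a sketch: `describe_members` and `METHOD_NAMES` are illustrative names, not part of the project.

```python
import zipfile

# Readable names for the compression method IDs known to Python's zipfile.
METHOD_NAMES = {
    zipfile.ZIP_STORED: "stored",
    zipfile.ZIP_DEFLATED: "deflate",
    zipfile.ZIP_BZIP2: "bzip2",
    zipfile.ZIP_LZMA: "lzma",
}


def describe_members(archive):
    """Return (name, method, version needed) for each archive member.

    `archive` is a path or a binary file-like object. The version is
    formatted the way unzip prints it, so the raw value 46 becomes "4.6".
    """
    rows = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            method = METHOD_NAMES.get(info.compress_type, str(info.compress_type))
            major, minor = divmod(info.extract_version, 10)
            rows.append((info.filename, method, f"{major}.{minor}"))
    return rows
```

Running `describe_members("fma_metadata.zip")` should list each file together with its method; members reported as bzip2 are the ones that older utilities skip.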

@mhamberg1
Author

I'm not sure, but I'm trying with Archive Utility on a brand new MacBook Pro. We also tried it with the default Windows unzip on Windows 10.
Maybe everyone is unzipping it with Linux :)

@mdeff
Owner

mdeff commented Jul 20, 2017

Hehe, maybe. I'm surprised that the latest versions of the two major OSes do not support a zip format from 2001... Anyway, that should be fixed. I'll compress the text files of the next release with deflate instead of bzip2. The increase in size should be relatively small compared to the audio. Thanks for reporting this! :)
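Until a deflate release exists, an already downloaded archive can be rewritten locally. The sketch below is not the project's create_zip() (which is not shown in this thread); `recompress_with_deflate` is a hypothetical helper that copies every member into a new archive using deflate.

```python
import shutil
import zipfile


def recompress_with_deflate(src, dst):
    """Copy all members of the zip `src` into a new zip `dst` using deflate.

    `src` and `dst` are paths or binary file-like objects.
    """
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", compression=zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            # Fresh entry with the same name, timestamp, and attributes.
            target = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            target.external_attr = info.external_attr
            # Directory entries carry no data, so store them uncompressed.
            target.compress_type = (
                zipfile.ZIP_STORED if info.is_dir() else zipfile.ZIP_DEFLATED
            )
            # Stream the data: reading decompresses, writing recompresses.
            with zin.open(info) as reader, zout.open(target, "w") as writer:
                shutil.copyfileobj(reader, writer)
```

When building a new archive from scratch with zipfile, passing `compression=zipfile.ZIP_DEFLATED` to `zipfile.ZipFile` is the relevant setting; the copy loop above is only needed to convert an existing file.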

@mdeff mdeff changed the title Issue unzipping the zip: compress with deflate instead of bzip2 Jul 20, 2017
@mhamberg1
Author

Thanks. When is the next release?

@mdeff
Owner

mdeff commented Jul 21, 2017

Some time after my current vacation. Let's say around mid-August.

@mdeff
Owner

mdeff commented Aug 8, 2017

For the record, here's another user having issues with the zip:

I have tried to download the data set of size 413M from this link https://github.com/mdeff/fma but the zip file is corrupt. Can I request you if you can provide me a copy of the data set or the link from where i can dwnload its valid copy, it would have great for me. I will highly appreciate your favor in this regards.

I have downloaded the zip file "fma_metadata.zip" twice but after completion of downloading process, when i try to extract files from zip folder it says "Windows could not complete file extraction The Destination file could not be created". please see the snapshot of error attached with email.

I'm still surprised by this "destination file could not be created" error (similar to @mhamberg1's macOS message). It's clearly misleading.

@Antobiotics

Antobiotics commented Aug 10, 2017

Hi, I had the exact same issue. I ended up using p7zip to extract the data by running:

7za x fma_metadata.zip

and it worked!

@statsmaths

I had the exact same issue on my Mac. I was able to unzip it just fine from within R, however:

unzip("fma_metadata.zip")
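A similar workaround should be possible from Python, whose zipfile module has supported bzip2 members since Python 3.3 (provided the bz2 module is available). A minimal sketch, with `extract_archive` as an illustrative name:

```python
import zipfile


def extract_archive(archive, dest="."):
    """Extract every member of `archive` into `dest` and return their names."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return zf.namelist()
```

For example, `extract_archive("fma_metadata.zip")` extracts into the current directory. The module's command-line interface does the same without any script: `python3 -m zipfile -e fma_metadata.zip .`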

@hendriks73

Yep.
Even on macOS High Sierra the zip file cannot be extracted without extra effort.
You could do a bunch of people a big favor by fixing this: use standard zip, or at least provide a secondary format (bzip2, tgz, whatever).
Especially with ISMIR 2017 on the horizon.

@mdeff
Owner

mdeff commented Oct 11, 2017

bzip2 is in standard zip since 2001. ;)

I'm currently working on getting the next release out. It will be fixed before ISMIR. I just wanted to wait a bit to get more feedback and avoid releasing many times, which is cumbersome for everybody.

@francisbrito

francisbrito commented Jan 14, 2018

Hey there,

I also ran into this issue on my macOS High Sierra machine. I was able to fix it with 7-Zip, as suggested by @Antobiotics.

Perhaps, while the new file is being uploaded, a clause could be added instructing users of MacOS to use 7-Zip instead of the built-in zip command.

It is there. I missed it on the first read. My bad.

(PS: Here's a nice explanation of the cause and solution to this issue: https://unix.stackexchange.com/a/183453)

@mdeff
Owner

mdeff commented Jun 13, 2020

Closing as

  1. the 7zip workaround is documented in the README, and
  2. it's listed as a known issue in Known issues (and next release) #41 and in the README.

It'll be fixed on the next data release.
