Batch Download #2

Open
Aznaveh opened this issue Nov 7, 2017 · 6 comments

@Aznaveh
Contributor

Aznaveh commented Nov 7, 2017

I think most people need a whole batch of matrices rather than just one. It would be great if there were a way to download them in bulk.

@ScottKolo ScottKolo changed the title from "New functionality needed" to "Batch Download" Nov 7, 2017

@lahwaacz

Or at least allow a recursive download with something like this:

wget --mirror --no-parent --force-directories --cut-dirs=1 --accept "*.gz" --delete-after "https://sparse.tamu.edu/MM/"

I'm getting a lot of 403 Forbidden responses after about 4 successful queries...

@DrTimothyAldenDavis
Collaborator

DrTimothyAldenDavis commented Jan 14, 2020 via email

@lahwaacz

Relying on Java is not an option for me... 😞

@ScottKolo
Owner

> Or at least allow a recursive download with something like this:
>
> wget --mirror --no-parent --force-directories --cut-dirs=1 --accept "*.gz" --delete-after "https://sparse.tamu.edu/MM/"
>
> I'm getting a lot of 403 Forbidden responses after about 4 successful queries...

This should actually work. I don't have a good explanation for why you would be getting 403 responses for this. I just tried and saw the same behavior.

My hunch is that this has something to do with our Apache configuration.

@lahwaacz

If you could provide FTP access like the Matrix Market has (ftp://math.nist.gov/pub/MatrixMarket2/), that would be great.

@neoblizz

If it still helps, this worked for me for downloading ALL of the Matrix Market datasets:

wget --recursive --no-parent --force-directories -l inf -X RB,mat --accept "*.tar.gz" "https://suitesparse-collection-website.herokuapp.com/"
  • --recursive: download recursively
  • --no-parent: do not ascend to the parent directory while following links
  • -l inf: recurse to an unlimited depth
  • -X RB,mat: exclude the RB and mat subdirectories, since I am only downloading the Matrix Market (MM) format; change the list to fetch one of the other formats, or drop the option entirely to download all formats
  • --accept "*.tar.gz": keep only files matching this pattern
  • --force-directories: create a hierarchy of directories, even if one would not have been created otherwise
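
For anyone who needs only a chosen batch of matrices rather than the whole collection, the same idea can be narrowed to an explicit list. The following is a rough sketch, not something tested against the server. It assumes archives are served at `<base>/MM/<group>/<name>.tar.gz` (inferred from the `/MM/` path and the `*.tar.gz` pattern used above). The helper names `matrix_url`, `urls_from_list` and `fetch_matrices` are made up for illustration. The two-second wait is a guess at polite behaviour; nothing in this thread confirms that it avoids the 403 responses.

```shell
#!/bin/sh
# Sketch: download a hand-picked list of matrices instead of mirroring the site.
# ASSUMPTION: archives live at $BASE_URL/MM/<group>/<name>.tar.gz.

BASE_URL="https://sparse.tamu.edu"   # host taken from the commands in this thread

# Print the (assumed) archive URL for one "group/name" entry.
matrix_url() {
    printf '%s/MM/%s.tar.gz\n' "$BASE_URL" "$1"
}

# Read "group/name" entries from stdin and print one URL per line.
# Blank lines and lines starting with '#' are skipped.
urls_from_list() {
    while IFS= read -r entry || [ -n "$entry" ]; do
        case "$entry" in
            ''|'#'*) continue ;;
        esac
        matrix_url "$entry"
    done
}

# Download every matrix named in the list file given as $1.
# --wait/--random-wait space out the requests (a guess, not a confirmed fix);
# --input-file=- makes wget read the URLs from stdin.
fetch_matrices() {
    urls_from_list < "$1" |
        wget --wait=2 --random-wait --force-directories --input-file=-
}
```

Usage would look like `fetch_matrices matrices.txt`, where `matrices.txt` is a hypothetical file with one entry per line, such as `HB/1138_bus`. Running `urls_from_list < matrices.txt` on its own prints the URLs without downloading anything, which is a cheap way to check the list first.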
