dvc push for http remote storage #3247

Closed
harold1505 opened this issue Jan 28, 2020 · 8 comments
Assignees
Labels
awaiting response we are waiting for your reply, please respond! :) feature request Requesting a new feature good first issue help wanted

Comments

@harold1505

I want to upload and download data files over HTTP, but I found out that DVC only supports dvc fetch and dvc pull for HTTP remotes. Why is that? How can I make dvc push work with an HTTP remote? I'm new to DVC, please help.

⚠️ HTTP remotes only support download operations

@triage-new-issues triage-new-issues bot added the triage Needs to be triaged label Jan 28, 2020
@efiop
Contributor

efiop commented Jan 28, 2020

Hi @harold1505 !

Are you using http to access AWS s3 or something else? Or do you want to use pure http remote?

@efiop efiop added the feature request Requesting a new feature label Jan 28, 2020
@triage-new-issues triage-new-issues bot removed the triage Needs to be triaged label Jan 28, 2020
@efiop efiop added the awaiting response we are waiting for your reply, please respond! :) label Jan 28, 2020
@harold1505
Author

It is neither AWS S3 nor any other popular platform. It is a platform with a REST API through which files can be retrieved and stored over HTTP. So I guess I want a pure HTTP remote.

@efiop
Contributor

efiop commented Jan 28, 2020

@harold1505 Got it 🙂 The reason it is not supported is simple: we just didn't get any requests for it until now. 🙂 To support it, we will need to implement the _upload method in dvc/remote/http.py. There might be some edge cases that we need to handle carefully, but overall the implementation should be pretty straightforward. If you feel like it, maybe even consider taking a shot and contributing a PR for it; we will be happy to help with everything we can. 🙂

@ayushais

It would be really nice if this feature were available.

@xkortex

xkortex commented Jan 30, 2020

Seconded! I think there is some nuance with HTTP POSTs larger than 2 GB, but there is the Content-Range header, which allows you to split the transfer into chunks, and I think there are a few other techniques. It would be great to be able to write a POST endpoint to roll your own DVC remote backend.
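To make the chunking idea concrete, here is a minimal sketch of how a large payload could be split into byte ranges, each labeled with a Content-Range value of the form bytes start-end/total. The helper name content_ranges is hypothetical, and whether a given server accepts Content-Range on upload requests depends entirely on that server; this is an illustration, not something DVC does.

```python
def content_ranges(total_size, chunk_size):
    """Yield (start, end, header_value) for each chunk of a payload.

    `end` is inclusive, matching the "bytes start-end/total" syntax
    of the Content-Range header.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total_size, chunk_size):
        end = min(start + chunk_size, total_size) - 1
        yield start, end, f"bytes {start}-{end}/{total_size}"


# Example: a 10-byte payload split into 4-byte chunks.
for _start, _end, header in content_ranges(10, 4):
    print(header)
# bytes 0-3/10
# bytes 4-7/10
# bytes 8-9/10
```

Each tuple could then drive one upload request that sends only bytes start through end of the file, assuming the receiving endpoint reassembles the pieces.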

@pmrowla
Contributor

pmrowla commented Feb 14, 2020

As discussed with Ivan, I'll be working on implementing this

@pmrowla
Contributor

pmrowla commented Feb 14, 2020

Just to clarify, the desired behavior for this is to have basic push support via making HTTP POST requests to http://<remote_url>/<filename>, and not push via some other HTTP based protocol like WebDAV (see also #1153)?

@harold1505 can you give some details on what exactly the platform you are using expects in terms of PUT/POST requests for file uploads?

The main reason I'm asking is that as far as I understand it, git's HTTP remote is read-only (like dvc) for "dumb" web servers. HTTP remote write/push is only supported for web servers that can talk git's "smart" protocol or WebDAV.

@efiop
Contributor

efiop commented Feb 14, 2020

> Just to clarify, the desired behavior for this is to have basic push support via making HTTP POST requests to http://<remote_url>/<filename>, and not push via some other HTTP based protocol like WebDAV (see also #1153)?

@pmrowla Great question! Correct, the http remote should support POST requests. HTTP-based protocols (like s3, btw) are handled separately in their own remotes, so if we ever need to add WebDAV support, we will likely create a separate remote class for it.

> The main reason I'm asking is that as far as I understand it, git's HTTP remote is read-only (like dvc) for "dumb" web servers. HTTP remote write/push is only supported for web servers that can talk git's "smart" protocol or WebDAV.

I suppose you are asking whether @harold1505 is trying to push to a git HTTP server. To be clear, what we need here is support for generic HTTP servers that accept uploads through standard POST requests, without any special protocol on top.
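For illustration, a plain POST upload to a generic HTTP server might look roughly like the sketch below, which streams a file with chunked transfer encoding using only Python's standard library. The names iter_chunks and post_file, the chunk size, and the timeout are assumptions made for this example; this is not DVC's actual implementation, and the target server must itself accept chunked POST bodies.

```python
import http.client
from urllib.parse import urlsplit


def iter_chunks(fobj, chunk_size=64 * 1024):
    """Yield successive blocks read from a binary file object."""
    while True:
        chunk = fobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


def post_file(url, fobj, chunk_size=64 * 1024, timeout=30):
    """POST the contents of fobj to url and return the HTTP status code.

    The body is streamed with "Transfer-Encoding: chunked", so the total
    size does not need to be known up front.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request(
            "POST",
            path,
            body=iter_chunks(fobj, chunk_size),
            headers={"Transfer-Encoding": "chunked"},
            encode_chunked=True,  # let http.client frame each chunk
        )
        response = conn.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return response.status
    finally:
        conn.close()


# Hypothetical usage (the URL is a placeholder, not a real endpoint):
# with open("data.bin", "rb") as f:
#     status = post_file("http://example.com/uploads/data.bin", f)
```

Streaming from an iterator keeps memory use bounded by the chunk size, which matters for the multi-gigabyte files mentioned earlier in the thread.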

pmrowla added a commit to pmrowla/dvc that referenced this issue Feb 20, 2020
- uploaded files are sent as chunked encoding POST data

Fixes iterative#3247
@efiop efiop closed this as completed in 68e543f Feb 21, 2020
5 participants