Google Remote API Error (max retries exceeded) #3265
Comments
@sid-marain what remote type do you use: Google Drive or GCP? Can you push at least one file, or does it fail partway through?
It's GCP. I can push at least two files, I believe, but the last fails. I am running dvc in a distributed manner from docker in a batch process. That is, I have dozens to hundreds of containers running a script in parallel. These finish asynchronously and then push results to the dvc remote. The failure is fairly infrequent: on a recent job with ~140 tasks, I saw it occur 3 times. Nothing about the affected tasks seemed indicative of failure, i.e., the failures looked random.
@sid-marain thanks! Could you also please share the last part of the stack trace? It's not clear yet what is happening. To be honest, it looks more like an environment error to me. We would most likely need to find a way (together?) to reproduce this to understand what's happening.
@sid-marain, looks like GCP rate-limits you if you are doing consecutive requests: https://cloud.google.com/storage/docs/request-rate#auto-scaling
@shcheklein: That's all I'm getting from the stack trace. It's difficult to reproduce this error because, as mentioned above, it only seems to happen once every so often. @MrOutis: that might be it. We are potentially doing quite a few requests. We could retry the dvc call with exponential backoff.
@sid-marain, as far as I remember,
@MrOutis: Yeah. Based on a review of the timing of this error in the logs, the requests were submitted and failed within a 20-second window. We don't really have a way to coordinate the call to dvc push across the batch, so I think the retry mechanism will have to do. Thanks!
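For anyone landing here with the same problem, the retry-with-exponential-backoff idea discussed above could look roughly like the sketch below. It is not part of DVC: `backoff_delays` and `run_with_backoff` are hypothetical helper names, the default attempt count and delays are arbitrary, and it assumes that a failed `dvc push` exits with a non-zero status. The random jitter is there so that many containers failing at the same moment do not all retry at the same moment.

```python
import random
import subprocess
import time


def backoff_delays(max_attempts, base_delay=1.0, max_delay=60.0, jitter=random.random):
    """Yield the wait before each retry: capped exponential delay plus jitter."""
    # max_attempts runs in total means at most max_attempts - 1 waits.
    for attempt in range(max_attempts - 1):
        yield min(max_delay, base_delay * 2 ** attempt) + jitter()


def run_with_backoff(cmd, max_attempts=5, base_delay=1.0, max_delay=60.0,
                     run=subprocess.run, sleep=time.sleep):
    """Run cmd until it exits with 0 or attempts run out; return the last exit code."""
    delays = backoff_delays(max_attempts, base_delay, max_delay)
    while True:
        result = run(cmd)
        if result.returncode == 0:
            return 0
        delay = next(delays, None)
        if delay is None:
            # No retries left: report the last failure to the caller.
            return result.returncode
        sleep(delay)


# Hypothetical usage inside each batch task, after results are written:
#     exit_code = run_with_backoff(["dvc", "push"])
```

The `run` and `sleep` parameters exist only so the loop can be exercised without calling DVC or actually waiting; in a batch script the defaults should be enough.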
due to dvc push
excerpt from dvc push -v
Settings
DVC 0.82.6
Python 3.6
Ubuntu 18.04