Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use _call within _get_file to ensure retries #384

Closed
wants to merge 7 commits into from

Conversation

oliverwm1
Copy link
Contributor

@oliverwm1 oliverwm1 commented Apr 29, 2021

Use _call within _get_file to ensure that the retry logic in that function is exercised. This PR makes a significant change in implementation, since now multiple http requests are made within single _get_file call (if the file is large) whereas previously a single request was made. Not sure about performance implications (speed, memory usage, reliability).

Resolves #383

@oliverwm1 oliverwm1 changed the title Use _call with _get_file to ensure retries Use _call within _get_file to ensure retries Apr 30, 2021
@martindurant
Copy link
Member

@isidentical , this is probably also of interest to you

@oliverwm1
Copy link
Contributor Author

@martindurant @isidentical thoughts on this? It passes the (skipped) test_get_put tests for me locally, and I've manually called fs.get on a small and large file with success. I'm not sure about performance implications and also perhaps there are complications for compressed files? (https://cloud.google.com/storage/docs/transcoding#decompressive_transcoding)

@oliverwm1 oliverwm1 marked this pull request as ready for review April 30, 2021 20:23
@martindurant
Copy link
Member

I'm pretty sure we need a way to split up the reads and stream to disc, else we risk memory issues.

@oliverwm1
Copy link
Contributor Author

I'm pretty sure we need a way to split up the reads and stream to disc, else we risk memory issues.

Okay. Other idea is to define a retry decorator that we can apply to _call and _get_file. Not totally straightforward to get an interface that will work for both functions, but I think it's doable. Trying to avoid more copy and paste.

@martindurant
Copy link
Member

Perhaps you can pass the output file into _call ?

@oliverwm1
Copy link
Contributor Author

Closing in favor of #387

@oliverwm1 oliverwm1 closed this May 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

_get_file does not use _call, and so does not have any retry logic
2 participants