Handling HTTP errors in search.items() generator #712
Comments
Thanks for taking the time to write this up. If you include your import lines and the [...]

    items = list(get_search_result(bbox, start, end).items())
    for item in items:
        # download needed assets
        # process them into product
Hi Julia, sorry for only coming back to you now. The values were

    bbox = [13.215265, 51.118933, 13.260498, 51.147263]
    start = "2021-02-15"
    end = "2021-04-30"

and the only import used in the code I posted above is

    from pystac_client import Client as stac

Feel free to check this specific case, to help you pinpoint it: The error happened when processing [...]

Thanks also for the workaround idea. I think it would minimise the risk of this happening again, as the requests are squeezed into a much shorter timespan instead of being stretched over possibly hours. But the bottom line is that if such an error occurs again, I would still be left with an aborted pipeline. To try the failed request again, I would need to restart the whole fetching, and the code would need to look something like this:

    search = get_search_result(bbox, start, end)
    try:
        items = list(search.items())  # normal try
    except Exception:
        try:
            items = list(search.items())  # try again
        except Exception:
            return  # giving up after two fails
    # if we made it here it worked
    for item in items:
        # download needed assets
        # process them into product

Which is why I'd repeat my statement/question from the initial post: I think it would be nice if pystac_client would automatically retry failed requests one or two times?

But yeah, your idea at least makes it less likely that this will be an issue again, so as a first step I'll implement it. Another question about the details of it: Meanwhile I came across some code that uses [...]
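The nested try/except in the comment above can be written as a small generic helper. This is only a sketch under assumptions: `fetch_with_retries` and its parameters are hypothetical names, not part of pystac_client, and the exception types to retry on depend on how your HTTP stack surfaces a dropped connection, so the defaults below may need to be replaced.

```python
import time
from typing import Callable, List, Tuple, Type, TypeVar

T = TypeVar("T")


def fetch_with_retries(
    fetch: Callable[[], List[T]],
    attempts: int = 2,
    delay_seconds: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> List[T]:
    """Call fetch() until it succeeds or the attempts are used up.

    Hypothetical helper: retry_on is an assumption and should list the
    exception types that your client actually raises on a dropped connection.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the last error
            time.sleep(delay_seconds * attempt)  # wait a little longer each time
    raise AssertionError("unreachable")


# Usage with the names from the post above (search comes from get_search_result):
# items = fetch_with_retries(lambda: list(search.items()), attempts=2)
```

Unlike the version with `return`, this re-raises the last error after the final attempt, so a persistent outage stays visible to the caller instead of ending the pipeline silently.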
I wrote up responses to your comments and then I did a little search to see if this conversation has come up before. It seems that it has (#532) and retries are actually already implemented 🙈. You can read about how to configure retries in the docs: https://pystac-client.readthedocs.io/en/latest/usage.html#configuring-retry-behavior

This is what I had written before:

Yeah I hear you. I was just wondering if this kind of failure is sporadic (and therefore a good candidate for retries) or a genuine timeout.

Using [...]
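For readers who land here, the retry configuration described in the linked docs section looks roughly like the fragment below. Treat it as a sketch under assumptions: it needs a pystac-client release whose `StacApiIO` accepts `max_retries`, the catalog URL is a placeholder, and the `Retry` values are illustrative rather than recommended settings.

```python
from urllib3 import Retry

from pystac_client import Client
from pystac_client.stac_api_io import StacApiIO

# Illustrative values only: up to 5 retries with growing pauses between them,
# also retrying on typical gateway errors. allowed_methods=None lets urllib3
# retry any HTTP method, including POST requests.
retry = Retry(
    total=5,
    backoff_factor=1,
    status_forcelist=[502, 503, 504],
    allowed_methods=None,
)

stac_api_io = StacApiIO(max_retries=retry)

# Placeholder URL (assumption): replace with the STAC API you are querying.
client = Client.open("https://example.com/stac/v1", stac_io=stac_api_io)
```

With this in place the retries happen inside the HTTP layer, so the loop over `search.items()` does not need its own try/except for sporadic failures.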
I'm searching a STAC catalog and then iterating over the result with the items() generator: [...]

It's quite a lengthy loop, as each iteration takes about a minute (I don't know if that is relevant).

The other day, about 20 minutes into the loop, my worker crashed with a RemoteDisconnected error: [...]

Apparently something went wrong during the communication with the server. Until today, I didn't even know that each yielding of the next item issues another HTTP request, but of course that makes sense, as all the details of that item have to be fetched.

That one time it failed -- happens.

But how to handle this? Adding a try ... except around the loop would certainly be smart and at least save my worker from a total crash. But it would still throw me out of the loop. I think it would be nice if pystac_client would automatically retry failed requests one or two times?

Something similar seems to have been discussed recently in #680. That discussion ended with "not planned", because the issue was not seen on the pystac_client side. Maybe this example gives a new perspective on the topic?