DataGranules.get raises IndexError when Granules found = zero #526
Comments
earthaccess/earthaccess/search.py Lines 521 to 525 in 0e4f392
This has a smell to me. Why are we doing this post-process step to calculate whether all granules are cloud-hosted based on the first granule, instead of having the DataGranule class set cloud_hosted for each granule at init time? It would be great to put that rationale in a comment for future us. If there's a good rationale, or if we're looking for a short-term fix, I think we can add a quick:
    if not results:
        return []
to resolve the bug.
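For illustration, the short-term guard could be sketched as follows. This is a standalone sketch, not the code in search.py: `wrap_results`, `is_cloud_hosted`, and the `"cloud"` key are hypothetical stand-ins, and the existing first-granule assumption is deliberately left in place.

```python
from __future__ import annotations

from typing import Any


def is_cloud_hosted(granule: dict[str, Any]) -> bool:
    # Hypothetical stand-in for the project's _is_cloud_hosted check.
    return bool(granule.get("cloud", False))


def wrap_results(results: list[dict[str, Any]]) -> list[dict[str, Any]]:
    # Guard first: indexing results[0] on an empty list raises IndexError.
    if not results:
        return []
    # Still derives one flag from the first granule, as the current code does.
    cloud = is_cloud_hosted(results[0])
    return [{"record": granule, "cloud_hosted": cloud} for granule in results]
```

With the guard in place, a search with zero hits yields an empty list instead of an exception.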
I would suggest simply changing the assignment to:
    cloud = len(response) > 0 and self._is_cloud_hosted(response[0])
However, I think this reveals another bug. It is erroneous to assume that the result for the first granule holds for every granule in the response. For example, it is valid to specify multiple collection concept IDs in a CMR granule search query. If one collection is cloud hosted and the other is not, then the granules will be a mix of cloud hosted and non-cloud hosted. Therefore, I believe the proper fix to address both bugs (IndexError and wrong assumption) is to completely remove that assignment and evaluate each granule individually in the return statement:
    return [DataGranule(granule, cloud_hosted=self._is_cloud_hosted(granule)) for granule in response]
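A minimal sketch of the per-granule approach, to show that it handles both the empty case and a mixed result set. The `Granule` dataclass, the `"urls"` key, and the "any s3:// link" rule are assumptions made for this example only; they are not earthaccess's actual record layout or detection logic.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class Granule:
    """Hypothetical stand-in for DataGranule."""

    record: dict[str, Any]
    cloud_hosted: bool


def is_cloud_hosted(record: dict[str, Any]) -> bool:
    # Assumed rule for illustration: any s3:// link means cloud hosted.
    return any(url.startswith("s3://") for url in record.get("urls", []))


def wrap_results(response: list[dict[str, Any]]) -> list[Granule]:
    # Each granule is checked on its own, so nothing indexes response[0]
    # and an empty response naturally produces an empty list.
    return [Granule(r, cloud_hosted=is_cloud_hosted(r)) for r in response]
```

Usage with a mixed response:

```python
mixed = [
    {"urls": ["s3://bucket/a.nc"]},
    {"urls": ["https://example.com/b.nc"]},
]
flags = [g.cloud_hosted for g in wrap_results(mixed)]
```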
💯 This is the smell I was trying to communicate, thanks for explaining it better than me! |
Since |
Agreed that @cached_property
def cloud_hosted(self) -> bool:
... I might also suggest dropping the cc: @betolink |
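To make the @cached_property idea concrete, here is a rough sketch in which the granule computes the flag lazily from its own record. The class name, the `"urls"` key, the detection rule, and the `checks` counter are all hypothetical and exist only to demonstrate the caching behaviour; this is not the library's DataGranule.

```python
from __future__ import annotations

from functools import cached_property
from typing import Any


class Granule:
    """Hypothetical stand-in for DataGranule."""

    def __init__(self, record: dict[str, Any]) -> None:
        self.record = record
        self.checks = 0  # counts how often the check runs; illustration only

    @cached_property
    def cloud_hosted(self) -> bool:
        # Assumed rule for illustration: any s3:// link means cloud hosted.
        self.checks += 1
        return any(url.startswith("s3://") for url in self.record.get("urls", []))
```

The check runs on first access only; later reads of `cloud_hosted` reuse the stored value, so the search code would not need to pass a flag to the constructor.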
The |
I'm referring to the |
💯 |
Anybody know why With my current understanding, it doesn't seem to make sense to me to explicitly specify such a parameter, particularly if we move the logic in I would think that with such a move, we would want to deprecate the |
@betolink, do you have any background knowledge on why this is implemented this way? |
Sorry for the delayed response, there are collections that exist both in the DAAC and AWS with the same |
Sure, injecting a flag would be faster (although I suspect this would be negligible, thus a questionable "optimization"), but more importantly, it's possibly simply wrong, per an earlier comment I made:
|
Observations added from duplicate #816:
|
As a workaround, I used the following:
    try:
        granules = earthaccess.search_data(concept_id=id, temporal=(start_date, end_date))
    except IndexError:
        # Raised when the search returns zero granules (this issue).
        granules = []
        print("no granules found")
What Happens
If I do a simple search for granules that returns zero hits, I get an index error.
What I expect
I would expect this to return an empty list and print
Granules found: 0
to stdout.
Proposed change
Check that the response is not empty before calling _is_cloud_hosted. Return an empty list if it is empty.