Add some caching to AggregatorSecondaryServices.service_types #84

Closed
JohanKJSchreurs opened this issue Dec 7, 2022 · 3 comments

Comments

@JohanKJSchreurs
Contributor

Add some caching to AggregatorSecondaryServices.service_types.

This is a follow-up ticket for the TODOs of #78, to plan and track the tasks separately.
See also: #78 (comment)

@soxofaan
Member

soxofaan commented Dec 7, 2022

This ticket can also cover another TODO from #78 (comment):

determining which upstream backend to pick should (also) be based on service type

@soxofaan
Member

soxofaan commented Dec 7, 2022

some more background/explanation:

  • you can work with the assumption that "there is only one upstream backend per service type" (e.g. XYZ -> SentinelHub at the moment), so deciding which upstream backend to use can be done purely based on service type. There is also no need to implement "merging" of secondary service information. This assumption will not hold in the long term (depends on the discussion in "Federation of Secondary Services" #83) but is good enough for now. Just put a "TODO" comment wherever that makes sense.
  • don't just cache the return value of AggregatorSecondaryServices.service_types. Instead, build (and cache) an internal housekeeping structure and use that to:
    • extract the AggregatorSecondaryServices.service_types response
    • determine the upstream backend from the given service_type in AggregatorSecondaryServices.create_service. So replace the following hardcoded selection with a lookup in that structure:
      # TODO: hardcoded/forced "SentinelHub only" support for now.
      # Instead, properly determine backend based on service type?
      # See https://github.com/Open-EO/openeo-aggregator/issues/78#issuecomment-1326180557
      # and https://github.com/Open-EO/openeo-aggregator/issues/83
      if "sentinelhub" in self._backends._backend_urls:
          backend_id = "sentinelhub"
      else:
          backend_id = self._processing.get_backend_for_process_graph(
              process_graph=process_graph, api_version=api_version
          )
  • for the internal housekeeping structure: just use nested dicts/lists, so that serialization in the caching layer (e.g. byte strings for the ZooKeeper backend) works without a lot of additional work.
  • for the caching: look for "_memoizer" and "memoizer_from_config" usage in backend.py for inspiration on how caching is done in other subsystems there. For example, AggregatorCollectionCatalog._get_all_metadata_cached is also a caching layer around a single internal housekeeping structure that is used to extract different derived products.
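
To make the points above concrete, here is a minimal, self-contained sketch of one way this could look. It is not the aggregator's actual code: `TtlCache`, `SecondaryServicesSketch`, `get_backend_id_for_service_type`, `fake_fetch` and the layout of the nested dict are all hypothetical names chosen for illustration, and the in-memory cache is only a stand-in for whatever `_memoizer` / `memoizer_from_config` provide in backend.py. The cache deliberately stores the structure as a JSON string to show that plain nested dicts/lists serialize without extra work.

```python
import json
import time
from typing import Callable, Dict, Iterable, Tuple


class TtlCache:
    """Hypothetical in-memory stand-in for the aggregator's memoizer."""

    def __init__(self, ttl: float = 300.0):
        self._ttl = ttl
        self._store: Dict[str, Tuple[float, str]] = {}

    def get_or_call(self, key: str, callback: Callable[[], dict]) -> dict:
        now = time.monotonic()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return json.loads(hit[1])
        value = callback()
        # Store a serialized copy: this only works because the structure
        # consists of plain nested dicts/lists.
        self._store[key] = (now, json.dumps(value))
        return value


class SecondaryServicesSketch:
    """Hypothetical sketch, not the real AggregatorSecondaryServices."""

    def __init__(
        self,
        fetch_service_types: Callable[[str], dict],
        backend_ids: Iterable[str],
        cache: TtlCache,
    ):
        # `fetch_service_types(backend_id)` is assumed to return the
        # service types listing of one upstream backend as a dict.
        self._fetch = fetch_service_types
        self._backend_ids = list(backend_ids)
        self._cache = cache

    def _build_structure(self) -> dict:
        structure: dict = {"service_types": {}}
        for backend_id in self._backend_ids:
            for name, metadata in self._fetch(backend_id).items():
                # TODO: assumes one upstream backend per service type
                # (first one wins, no merging). Revisit after #83.
                structure["service_types"].setdefault(
                    name, {"backend_id": backend_id, "service_type": metadata}
                )
        return structure

    def _get_structure(self) -> dict:
        return self._cache.get_or_call("service_types", self._build_structure)

    def service_types(self) -> dict:
        """Derive the service_types response from the cached structure."""
        entries = self._get_structure()["service_types"]
        return {name: info["service_type"] for name, info in entries.items()}

    def get_backend_id_for_service_type(self, service_type: str) -> str:
        """Pick the upstream backend purely based on service type."""
        info = self._get_structure()["service_types"].get(service_type)
        if info is None:
            raise KeyError(f"Unsupported service type: {service_type!r}")
        return info["backend_id"]


# Usage with fake upstream data (illustrative values only).
UPSTREAM = {
    "sentinelhub": {"XYZ": {"configuration": {}, "process_parameters": []}},
    "other": {},
}
calls = []


def fake_fetch(backend_id: str) -> dict:
    calls.append(backend_id)
    return UPSTREAM[backend_id]


services = SecondaryServicesSketch(fake_fetch, UPSTREAM.keys(), TtlCache(ttl=60))
print(services.service_types())
print(services.get_backend_id_for_service_type("XYZ"))
```

Both the `service_types` listing and the backend lookup for `create_service` read from the same cached structure, so the upstream backends are only queried once per cache lifetime.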

@soxofaan
Member

merged #86

soxofaan added a commit that referenced this issue Dec 12, 2022
soxofaan added a commit that referenced this issue Jan 18, 2023