
Parallelize back-end requests (e.g. for /jobs) #28

Closed
soxofaan opened this issue Jan 11, 2022 · 2 comments

Comments

@soxofaan
Member

soxofaan commented Jan 11, 2022

The aggregator has to combine results from multiple back-ends for some API endpoints, e.g. /collections, /processes, /file_formats, /udf_runtimes, ... At the moment these are not user-specific and don't change often, so it's not hard to avoid performance bottlenecks with a bit of caching (e.g. see #2).
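The kind of caching that makes these non-user-specific endpoints cheap to serve can be sketched as a small time-based cache (a hypothetical illustration, not the aggregator's actual cache implementation; `TtlCache` and the `/collections` producer are made-up names):

```python
# Hypothetical sketch of TTL-based caching for endpoints like
# /collections or /processes: the upstream call is made once and
# its result is reused until the TTL expires.
import time


class TtlCache:
    def __init__(self, ttl: float):
        self.ttl = ttl
        self._store = {}  # key -> (expiry timestamp, cached value)

    def get_or_call(self, key, producer):
        now = time.monotonic()
        hit = self._store.get(key)
        if hit and hit[0] > now:
            return hit[1]  # still fresh: serve from cache
        value = producer()  # expired or missing: call upstream
        self._store[key] = (now + self.ttl, value)
        return value


calls = []
cache = TtlCache(ttl=60)

def fetch_collections():
    # Stand-in for a real back-end request; also counts invocations.
    calls.append(1)
    return ["collection-a", "collection-b"]

cache.get_or_call("/collections", fetch_collections)
cache.get_or_call("/collections", fetch_collections)
assert len(calls) == 1  # second lookup was served from cache
```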

The /jobs endpoint, however, is user-specific and dynamic, so it offers very little opportunity for caching. At the moment the requests to the underlying back-ends are done one after the other, sometimes resulting in long response times (on the order of tens of seconds; related: #27, openEOPlatform/architecture-docs#179). By doing the back-end requests in parallel in some way (async, threads, ...), response times can be improved considerably.
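One of the thread-based options could look roughly like this: fan out the per-back-end /jobs requests through a thread pool and merge the results. This is a minimal sketch, not the aggregator's real code; `fetch_jobs`, `list_jobs_parallel`, and the back-end ids are hypothetical stand-ins for the actual connection logic:

```python
# Sketch: issue the per-back-end /jobs requests concurrently with a
# thread pool instead of sequentially, then merge the job listings.
from concurrent.futures import ThreadPoolExecutor


def fetch_jobs(backend_id: str) -> list[dict]:
    # Placeholder for a real HTTP GET on `{backend_url}/jobs`;
    # here we just return a canned response per back-end.
    canned = {
        "vito": [{"id": "vito-job-1", "status": "finished"}],
        "eodc": [{"id": "eodc-job-1", "status": "running"}],
    }
    return canned[backend_id]


def list_jobs_parallel(backend_ids: list[str]) -> list[dict]:
    # One worker per back-end: the calls overlap, so total latency is
    # roughly max(per-back-end latency) instead of the sum.
    with ThreadPoolExecutor(max_workers=len(backend_ids)) as pool:
        results = pool.map(fetch_jobs, backend_ids)
    # Flatten the per-back-end listings into one aggregated response.
    return [job for jobs in results for job in jobs]


print(list_jobs_parallel(["vito", "eodc"]))
```

`pool.map` preserves input order, so the aggregated listing stays deterministic across back-ends, which keeps the merged response stable between polls.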

Note that the web editor polls /jobs regularly (and maybe a future notebook component will too), so it's worthwhile to optimize this endpoint (at least for perceived performance).

(internal ref EP-4122)

@soxofaan
Member Author

soxofaan commented Aug 3, 2022

#33 was a duplicate:

In various places (and especially the large area processing feature), the aggregator makes synchronous back-end requests in series. While this was the easiest way to set up a proof-of-concept implementation in the existing openeo_driver framework, it is obviously bad for performance because some requests can take quite a long time (e.g. starting a batch job).
A lot can be gained by waiting for back-end responses in parallel.
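The gain from overlapping the waits can be demonstrated with simulated latencies (an assumed toy benchmark, not the aggregator's code; `slow_backend_call` stands in for a slow HTTP request such as a batch job start):

```python
# Toy demonstration: three slow calls in series take ~sum of the delays,
# while the same calls through a thread pool take ~the slowest one.
import time
from concurrent.futures import ThreadPoolExecutor


def slow_backend_call(delay: float) -> float:
    time.sleep(delay)  # stand-in for waiting on a slow back-end response
    return delay


delays = [0.2, 0.2, 0.2]

t0 = time.monotonic()
serial = [slow_backend_call(d) for d in delays]  # one after the other
serial_time = time.monotonic() - t0

t0 = time.monotonic()
with ThreadPoolExecutor(max_workers=len(delays)) as pool:
    parallel = list(pool.map(slow_backend_call, delays))  # overlapping waits
parallel_time = time.monotonic() - t0

assert serial == parallel  # same results, different wall-clock time
print(f"serial: {serial_time:.2f}s, parallel: {parallel_time:.2f}s")
```

Since these calls spend their time blocked on I/O (not holding the GIL), plain threads are enough; an asyncio-based approach would achieve the same overlap.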

@soxofaan
Member Author

soxofaan commented Oct 5, 2023

Implemented parallelized handling of /jobs requests.

I don't think there are other opportunities for this kind of perf optimization by parallelization (collection/process metadata requests are now optimized through caching).

going to close (for now)
