More robust back-end connection handling #18

Closed
soxofaan opened this issue Oct 12, 2021 · 9 comments
@soxofaan (Member)

At the moment, the aggregator creates connection objects to the back-ends at startup time and re-uses these "infinitely".
This worked fine as a proof of concept, but has some issues:

  • it would be good to refresh state from time to time by dropping old connections and starting fresh (e.g. to make sure the latest capabilities are discovered properly)
  • sometimes back-ends go down, causing failures on most aggregator endpoints. It would be good if the aggregator could continue working in a best-effort way with the back-ends that are still up (see the sketch below)
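For illustration, a minimal sketch of what such a periodically refreshed, best-effort connection cache could look like (class and parameter names are made up for this example; this is not the actual aggregator code):

```python
import time

import openeo


class BackendConnectionPool:
    """Hypothetical sketch: cache back-end connections and rebuild them
    after a time-to-live, so capability changes and outages get picked up."""

    def __init__(self, backend_urls, ttl=5 * 60):
        self._backend_urls = backend_urls
        self._ttl = ttl
        self._connections = {}
        self._last_refresh = 0

    def get_connections(self):
        # Rebuild the connection objects when the cache is older than the TTL.
        if time.time() - self._last_refresh > self._ttl:
            self._connections = {}
            for url in self._backend_urls:
                try:
                    self._connections[url] = openeo.connect(url)
                except Exception:
                    # Best effort: skip back-ends that are down right now.
                    pass
            self._last_refresh = time.time()
        return self._connections
```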
@m-mohr (Member) commented Oct 15, 2021

  • sometimes back-ends go down, causing failures on most aggregator endpoints. It would be good if the aggregator could continue working in a best-effort way with the back-ends that are still up

This is an important one as we found out the hard way during the Editor demo at the Launch event today. One back-end was offline and afterward, the Platform was mostly unusable for anything except discovery.

@soxofaan (Member Author)

FYI: this is high priority on my planning now.

@m-mohr some kind of warning system ("warning: this response is partial/incomplete/best effort") would be handy in this context. Did you see Open-EO/openeo-api#412 already?

@m-mohr (Member) commented Oct 19, 2021

No, sorry, that slipped through during my vacation, I think. I'll have a look, although this seems a bit out of scope for the core API spec and would belong more in an extension that handles federation aspects.

@m-mohr (Member) commented Oct 19, 2021

A related question is how clients should communicate this (assuming we go for the 206 status code). Except for the Web Editor, I don't really see yet how clients would communicate and handle this in a good way. Do you have any ideas yet, @soxofaan ?

@soxofaan (Member Author)

I don't really see yet how clients would communicate and handle this in a good way. Do you have any ideas yet, @soxofaan ?

In a Python context, I would just trigger a logging.warning or warnings.warn; that's a pretty common thing to do. By default it will be shown in notebooks (message with a red background) and in non-notebook runs (on standard error).
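For example, roughly along these lines (the `federation:missing` field is an assumption here, following the partial-response idea discussed in Open-EO/openeo-api#412):

```python
import logging
import warnings

_log = logging.getLogger(__name__)


def check_partial_response(response: dict) -> dict:
    # Hypothetical helper: if the response indicates that some back-ends
    # did not contribute, warn the user instead of failing the call.
    missing = response.get("federation:missing", [])
    if missing:
        warnings.warn(f"Partial response: missing back-ends {missing}")
        _log.warning("Some federated back-ends did not respond: %s", missing)
    return response
```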

@m-mohr (Member) commented Oct 19, 2021

Sounds good to me. I think that's possible in all clients; just the Web Editor would need a bit of additional code for it.

@soxofaan (Member Author)

the Web Editor would need a bit of additional code for it.

At a minimum you could just do a console.warn, I guess?

@m-mohr (Member) commented Oct 19, 2021

Sure, but 99+% of (targeted) Web Editor users would not look at the browser console. I'd rather open a toast warning or so, but the JS client right now doesn't support passing through such additional details, while for the JS client itself a warning in the console would be enough. So most of the code will likely be written in the JS client itself...

soxofaan added a commit that referenced this issue Oct 27, 2021
current implementation fails to update OIDC provider id mapping
@soxofaan (Member Author)

Merged #21 into develop:

  • instead of holding on to the same connection objects to the back-ends all the time, they are now refreshed every 5 minutes (for now), so that changes in availability can be picked up properly
  • the aggregator can now also start up when a back-end is down (before, the aggregator could only be (re)started when all back-ends were up); see the best-effort sketch below
  • various other hardening and caching tweaks

This should cover the most important resilience problems. Will close this for now.
Open a new ticket when we find new situations where resilience could be improved.
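A minimal sketch of that best-effort pattern, assuming one openeo `Connection` object per back-end (function and variable names are illustrative, not the merged implementation):

```python
import logging

_log = logging.getLogger(__name__)


def list_all_collections(connections: dict) -> tuple:
    """Hypothetical best-effort aggregation: query each back-end and keep
    going when one fails, instead of failing the whole request."""
    collections = []
    failed = []
    for backend_id, connection in connections.items():
        try:
            collections.extend(connection.list_collections())
        except Exception:
            _log.warning("Back-end %r failed, skipping it", backend_id, exc_info=True)
            failed.append(backend_id)
    # `failed` could be surfaced to the user, e.g. as a warning or
    # through a partial-response indicator on the aggregated result.
    return collections, failed
```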
