When hard refresh fails with an exit code, make SILO exit #3151

theosanderson · 2024-11-01T16:45:11Z

This is an RFC.

Currently, if a SILO hard refresh fails, the container does not exit, so the failure is not visible in, for example, the Argo CD status. This is confusing. At some point we should make a failed refresh cause the container to exit. However, this could lead to downtime of the query API, so potentially we should do it only after splitting preprocessing out into separate pods from the main SILO. But that only makes sense if we expect the refresh to fail, which I'm not sure we do?

theosanderson · 2024-11-01T17:30:52Z

~~Specifically atm I think we have an inconsistency between what happens in a hard refresh and a non hard refresh in silo preprocessing (we should check if that's true). The inconsistency seems bad.~~

corneliusroemer

I'm not 100% sure that making the container exit when preprocessing fails is without negative side effects. There might have been a reason that I didn't make it exit. But we can try.

Note that this PR only makes it exit in the first invocation, not in the second. I think we want the same treatment in both cases.

corneliusroemer · 2024-11-01T18:07:10Z

Why do you think there's an asymmetry? We never exit if there are errors afaict. It's only the logging that's different, no exit code logged in second invocation with Etag.

Preprocessing can crash, for example, if the backend is down or if Keycloak is down. I'm not sure it should exit in those cases. Maybe we can decide what to do based on exit codes. If SILO proper fails, sure, we can exit, but not if there's a network issue in curl, for example?
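To make the exit-code idea concrete, here is a minimal sketch in Python of how such a decision could look. It is not taken from this PR or from the existing import script: the step names (`"curl"`, `"silo"`), the `Action` enum, and the `decide` function are all hypothetical. The curl codes listed are the documented ones for "could not resolve host" (6), "failed to connect" (7), and "operation timed out" (28); which codes should actually count as transient is an open question.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"          # step succeeded, nothing to do
    KEEP_RUNNING = "keep_running"  # failure, but keep the container alive and retry later
    EXIT = "exit"                  # failure that should make the container exit


# Documented curl exit codes that usually indicate a network problem:
# 6 = could not resolve host, 7 = failed to connect, 28 = operation timed out.
# Treating exactly these as transient is an assumption for illustration.
TRANSIENT_CURL_CODES = frozenset({6, 7, 28})


def decide(step: str, exit_code: int) -> Action:
    """Map a failed step and its exit code to an action (hypothetical policy)."""
    if exit_code == 0:
        return Action.CONTINUE
    if step == "curl" and exit_code in TRANSIENT_CURL_CODES:
        # Backend or Keycloak unreachable: not SILO's fault, so do not exit.
        return Action.KEEP_RUNNING
    if step == "silo":
        # SILO preprocessing itself failed: surface it by exiting.
        return Action.EXIT
    # Anything else keeps today's behavior of never exiting.
    return Action.KEEP_RUNNING
```

Under this sketch, `decide("silo", 1)` gives `Action.EXIT` while `decide("curl", 7)` gives `Action.KEEP_RUNNING`. Defaulting unknown failures to `KEEP_RUNNING` preserves the current behavior, so only failures explicitly attributed to SILO would change what the container does.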

theosanderson · 2024-11-01T20:11:21Z

> Why do you think there's an asymmetry? We never exit if there are errors afaict. It's only the logging that's different, no exit code logged in second invocation with Etag.

Yes, having looked, I agree, and I will strike through the relevant part of my post above. I was writing quickly with my interpretation of the patterns we observed today, which I now suspect was incorrect.

When hard refresh fails with an exit code, make SILO exit

38702fa

theosanderson added the discussion Open questions label Nov 1, 2024

corneliusroemer reviewed Nov 1, 2024
