Health checking should allow for health data to come from a 3rd service #304

samsp-msft · 2020-07-06T23:28:31Z

If you have a large deployment, with multiple instances of YARP fronting the same services across a number of destinations, then having each instance of YARP collect health data from all destinations can become a problem. In a mesh scenario the cost becomes N^2, as each service tries to determine the health of every other.

We should account for this from the start and support having a 3rd service supply health data. So rather than having to poll each destination, you can ask a different authority for health info, and it will supply it for all destinations.
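To put rough numbers on the scaling concern, here is a minimal back-of-the-envelope sketch. The function names and the cost model for the central case (the authority probes each destination once, and each proxy asks the authority once) are assumptions for illustration, not anything YARP defines.

```python
def direct_probes(proxies: int, destinations: int) -> int:
    """Every proxy instance probes every destination itself."""
    return proxies * destinations


def mesh_probes(services: int) -> int:
    """Mesh case: every service checks every other service."""
    return services * (services - 1)


def central_probes(proxies: int, destinations: int) -> int:
    """Assumed model: one authority probes each destination once,
    and each proxy makes a single request to that authority."""
    return destinations + proxies


print(direct_probes(20, 200))   # 4000 requests per polling interval
print(mesh_probes(50))          # 2450
print(central_probes(20, 200))  # 220
```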

samsp-msft · 2020-07-09T18:31:00Z

Note: The design here needs to support this scenario. We don't need to write the code to integrate with those 3rd services, just provide the extensibility point.

3GDXC · 2020-09-17T21:43:56Z

@samsp-msft this could, IMHO, be done in a similar way to service discovery: the services register with the reverse proxy and send a heartbeat (UDP/TCP or gRPC/REST). If a heartbeat isn't received within a given period, the reverse proxy informs the others that the destination is offline, either by updating a shared state store or by broadcasting a message to the nodes in the cluster, again via UDP/TCP or gRPC/REST.
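The heartbeat idea in the comment above can be sketched roughly as follows. This is only an in-memory illustration: `HeartbeatRegistry` and its methods are hypothetical names, the transport (UDP/TCP, gRPC, REST) is left out, and notifying other nodes is reduced to returning the list of destinations that missed their deadline.

```python
import time
from typing import Callable, Dict, List


class HeartbeatRegistry:
    """Tracks the last heartbeat per destination (hypothetical helper)."""

    def __init__(self, timeout_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the timeout can be tested
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, destination: str) -> None:
        # Called whenever a destination reports in, by whatever transport.
        self._last_seen[destination] = self._clock()

    def is_healthy(self, destination: str) -> bool:
        # A destination that never registered is treated as not healthy.
        last = self._last_seen.get(destination)
        return last is not None and self._clock() - last <= self._timeout

    def offline(self) -> List[str]:
        # Destinations whose heartbeat is overdue; a real deployment would
        # publish these to a shared store or broadcast them to the cluster.
        now = self._clock()
        return sorted(d for d, seen in self._last_seen.items()
                      if now - seen > self._timeout)
```

A proxy node would call `heartbeat` from its receive loop and periodically publish the result of `offline()`.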

Tratcher · 2020-11-17T19:34:48Z

One proposal is that this could be implemented as an IActiveHealthCheckMonitor that queries the central health store rather than individual destinations.
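As a language-neutral illustration of that shape (not the actual IActiveHealthCheckMonitor contract, whose members are not shown in this thread), a monitor backed by a central store might look like the sketch below. `CentralStoreMonitor`, `check_health`, and the `fetch_all` callback are hypothetical names, and treating destinations missing from the report as unhealthy is an assumption.

```python
from typing import Callable, Dict, Iterable

# Hypothetical: returns {destination_id: is_healthy} for all destinations
# known to the central health store, in a single call.
FetchAll = Callable[[], Dict[str, bool]]


class CentralStoreMonitor:
    """Asks one authority for health instead of probing each destination."""

    def __init__(self, fetch_all: FetchAll):
        self._fetch_all = fetch_all
        self.health: Dict[str, bool] = {}

    def check_health(self, destinations: Iterable[str]) -> Dict[str, bool]:
        # One request to the store, regardless of how many destinations exist.
        report = self._fetch_all()
        # Assumption: a destination the store does not know is unhealthy.
        self.health = {d: report.get(d, False) for d in destinations}
        return self.health
```

The proxy would run `check_health` on its usual polling timer; only the source of the data changes.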

samsp-msft · 2020-11-17T21:42:37Z

This is a scenario where every deployment is probably different, and will use different mechanisms. The goal should be to provide the extensibility required so that the developer can easily integrate with whatever mechanisms have been chosen for that deployment.
Is it sufficient to define an interface like IActiveHealthCheckMonitor that can be implemented by the proxy consumer, or do we need to have a REST/gRPC type endpoint for collecting health data?

Tratcher · 2020-11-17T22:00:21Z

Fair question. We'll have to look at some health reporting systems to see if they're pull or push, what protocols they support, etc.

Does anybody have suggestions for centralized health systems we should look at interoperating with?

rwkarg · 2022-03-01T23:26:44Z

Our usage isn't yet at the scale to require this, but our plan is to have the YARP proxies also be part of an Orleans cluster. Then a Grain would represent an Active Health Check against a (destination, source_availability_zone) pair. For example, dest_1/west-1a would be one Grain, and all proxy instances in west-1a would ask that Grain for dest_1's health. Similarly for dest_1/west-1b, etc. Grain Observers could also be used to push health status transitions instead of having each proxy actively poll for status.

As the proxies scale up/down, the health checks would be distributed across the available instances and there would be only N Active checks for N services (per AZ).

This should be possible to implement very similarly to ActiveHealthCheckMonitor by swapping out the EntityActionScheduler and calling in to Grain instances instead.

It would require some extra setup if opting in to this (storage/config for Orleans clustering) but it does remove the need to manage a completely separate health checking system.
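The core of the plan above, one shared check per (destination, source_availability_zone) key, can be illustrated without Orleans. In this sketch a plain dictionary stands in for the grain directory; `SharedHealthChecks`, `probe`, and `probe_count` are hypothetical names, and real grains would add distribution, timers, and observers.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[str, str]  # (destination, source availability zone)


class SharedHealthChecks:
    """One health-check state per key, shared by all proxies in a zone."""

    def __init__(self, probe: Callable[[str, str], bool]):
        self._probe = probe            # performs the actual active check
        self._status: Dict[Key, bool] = {}
        self.probe_count = 0           # for illustration only

    def refresh(self, destination: str, zone: str) -> None:
        # In the Orleans plan this would run on the grain's own schedule.
        self.probe_count += 1
        self._status[(destination, zone)] = self._probe(destination, zone)

    def get(self, destination: str, zone: str) -> bool:
        # Proxies read the shared result; only the first request probes.
        key = (destination, zone)
        if key not in self._status:
            self.refresh(destination, zone)
        return self._status[key]


checks = SharedHealthChecks(probe=lambda dest, zone: True)
for _proxy in range(5):                # five proxies in west-1a
    checks.get("dest_1", "west-1a")
print(checks.probe_count)              # 1
```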

samsp-msft · 2022-05-17T19:55:13Z

Duplicate of #1723

samsp-msft added the "Type: Idea" label (This issue is a high-level idea for discussion.) Jul 6, 2020

samsp-msft added this to the 1.0.0 milestone Jul 9, 2020

Tratcher mentioned this issue Sep 21, 2020

Proxy should have active health checks to confirm the state of destinations #228

Closed

alnikola mentioned this issue Oct 27, 2020

Active and passive health checks #459

Merged

karelz modified the milestones: YARP 1.0.0, Backlog Mar 29, 2021

samsp-msft marked this as a duplicate of #1723 May 17, 2022

samsp-msft closed this as completed May 17, 2022
