overload manager: overload signals based on number of downstream connections and active requests #12419

Open
antoniovicente opened this issue Jul 31, 2020 · 22 comments · Fixed by #23361, #18256 or #28202

@antoniovicente
Contributor

It would be helpful to add resource monitors to track the number of sockets used for (1) downstream connections, (2) active requests, and (3) possibly upstream connections, in order to provide better protection against resource attacks against configured fd rlimits. Tracking these counts proxy-wide should be sufficient. The motivations behind this enhancement include: a desire for more consistent configuration of resource limits and of the actions we can take when approaching overload; the recently introduced parameters that limit the maximum number of client connections (globally or per listener) to protect against fd rlimit attacks; and the introduction of more graceful ways to handle increases in resource usage, such as the adaptive HTTP request timeouts added in #11427.

Possible future enhancements:

  • A resource pressure signal based on the number of upstream connections would benefit from some additional overload actions specific to upstream connections, including reductions in idle upstream connection timeouts, disabling of keep-alive, or adjustments to prefetch logic.
  • The ability to configure thresholds and apply overload manager actions to client connections associated with specific listeners, in order to prioritize connections to specific listeners. That said, I don't think our deployment scenarios have an immediate need for such a hook.
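
For illustration, signal (1) could be as simple as a proxy-wide atomic gauge that is turned into a pressure value against a configured maximum; signals (2) and (3) would look analogous. A minimal sketch, with hypothetical names rather than the actual Envoy resource monitor interface:

#include <algorithm>
#include <atomic>
#include <cstdint>

// Hypothetical proxy-wide monitor: listeners increment/decrement the gauge,
// and the overload manager samples pressure() when it evaluates actions.
class DownstreamConnectionMonitor {
public:
  explicit DownstreamConnectionMonitor(uint64_t max_connections)
      : max_connections_(max_connections) {}

  void onConnectionAccepted() { active_.fetch_add(1, std::memory_order_relaxed); }
  void onConnectionClosed() { active_.fetch_sub(1, std::memory_order_relaxed); }

  // Pressure in [0, 1]; 1.0 means the configured connection budget is exhausted.
  double pressure() const {
    const double usage = static_cast<double>(active_.load(std::memory_order_relaxed));
    return std::min(1.0, usage / static_cast<double>(max_connections_));
  }

private:
  const uint64_t max_connections_;
  std::atomic<uint64_t> active_{0};
};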
@antoniovicente
Contributor Author

cc: @nezdolik

@mattklein123 added the area/overload_manager and no stalebot (Disables stalebot from closing an issue) labels on Jul 31, 2020
@mattklein123
Member

@nezdolik once you accept the org invite we can assign the issue. Thank you for working on this!

@nezdolik
Member

nezdolik commented Aug 1, 2020

Thanks @mattklein123 @antoniovicente, I believe issue assignment should work now.

@akonradi
Contributor

akonradi commented Aug 3, 2020

Do we have any established patterns for aggregating counts of objects across multiple workers efficiently? We probably have stats for most of the metrics we'd want to use but I don't know if the flush frequency is high enough for the overload manager to make use of them.

@antoniovicente
Contributor Author

It may make sense to keep an atomic counter with the aggregate number of active client connections across all listeners and clusters.

It may make sense to think of that number as separate from stats.

@akonradi
Contributor

akonradi commented Aug 3, 2020

What about having aggregate counts in the individual worker threads via a TLS object, then having the resource monitor aggregate those using an atomic counter when the overload manager initiates a poll?
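
Purely as a sketch of that idea (hypothetical names, not Envoy's actual thread-local storage API): each worker bumps a counter that only it writes, and the resource monitor sums the per-worker values when the overload manager polls.

#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: each worker owns one slot and updates it without
// contention; the monitor sums all slots during the overload manager's poll.
class PerWorkerConnectionCounts {
public:
  explicit PerWorkerConnectionCounts(size_t num_workers) : counts_(num_workers) {}

  // Called only from worker i's thread.
  void increment(size_t i) { counts_[i].fetch_add(1, std::memory_order_relaxed); }
  void decrement(size_t i) { counts_[i].fetch_sub(1, std::memory_order_relaxed); }

  // Called from the overload manager's periodic recalculation.
  uint64_t aggregate() const {
    uint64_t total = 0;
    for (const auto& c : counts_) {
      total += c.load(std::memory_order_relaxed);
    }
    return total;
  }

private:
  std::vector<std::atomic<uint64_t>> counts_;
};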

@mattklein123
Member

Thanks @nezdolik! I will take a look next week. OOO for a bit.

@mattklein123 added the help wanted (Needs help!) label and removed the no stalebot (Disables stalebot from closing an issue) label on Mar 8, 2021
@nezdolik
Member

I think this task needs to be split into 3 parts:

  • Introduce reactive checks into overload manager
  • Plug OM into dispatcher
  • Write the actual resource monitor & migrate code from TCP Listener

@antoniovicente
Contributor Author

I think this task needs to be split into 3 parts:

  • Introduce reactive checks into overload manager
  • Plug OM into dispatcher
  • Write the actual resource monitor & migrate code from TCP Listener

Splitting sounds fine. I'm curious about the details behind each of the 3 parts you describe above.

@nezdolik
Member

  1. Introduce reactive checks into the overload manager: OM currently supports only periodic recalculation of resource usage plus flushing of updates to action callbacks. For the case of global connection limit overflow this type of check may be too slow, so we should support an "immediate/reactive" check. Continuing with the example of global downstream connections: every time the TCP listener is about to accept a new connection, it fires an on-demand check to OM for the global number of connections; OM reactively recalculates the corresponding resource usage and triggers the callbacks for the overload action (if it changes state).
  2. Plug OM into the dispatcher: If we continue with the example of global downstream connections and have reactive checks in place, then the TCP listener will need access to the OM instance in order to invoke a reactive check (with an incremented counter for the number of accepted connections). I did some prior investigation on the scope of work and it feels like a big task.
  3. Write the actual resource monitor & migrate code from the TCP listener: The last part is to move the code that tracks the number of accepted connections and compares it against the global maximum (set via runtime) from the TCP listener to OM (and write the new monitor). The TCP listener should instead invoke reactive checks in OM and register callbacks for the "stop accepting connections" overload action.

@antoniovicente
Contributor Author

  1. Introduce reactive checks into the overload manager: OM currently supports only periodic recalculation of resource usage plus flushing of updates to action callbacks. For the case of global connection limit overflow this type of check may be too slow, so we should support an "immediate/reactive" check. Continuing with the example of global downstream connections: every time the TCP listener is about to accept a new connection, it fires an on-demand check to OM for the global number of connections; OM reactively recalculates the corresponding resource usage and triggers the callbacks for the overload action (if it changes state).
  2. Plug OM into the dispatcher: If we continue with the example of global downstream connections and have reactive checks in place, then the TCP listener will need access to the OM instance in order to invoke a reactive check (with an incremented counter for the number of accepted connections). I did some prior investigation on the scope of work and it feels like a big task.

I wonder if the listener can get the thread-local overload state from thread-local storage in some way. If that approach is not possible, we may want to consider plumbing OM into dispatcher creation in a way similar to the plumbing done in #14679, adding extra arguments to the factory method that creates dispatchers on worker threads.

  3. Write the actual resource monitor & migrate code from the TCP listener: The last part is to move the code that tracks the number of accepted connections and compares it against the global maximum (set via runtime) from the TCP listener to OM (and write the new monitor). The TCP listener should instead invoke reactive checks in OM and register callbacks for the "stop accepting connections" overload action.

@nezdolik
Member

@antoniovicente the listener would get the thread-local overload state, but it should also be able to trigger a reactive check in OM (which would recalculate the overload action state on demand and post it to all interested callbacks on worker threads), so the interaction is bidirectional.
How about something like this:

overload_manager.updateResource(
      std::string resource_name, uint64_t local_increment);

The listener would register for the action in the overload manager via registerForAction(const std::string&, Event::Dispatcher&, OverloadActionCb) as usual. OM would instantiate a reactive resource monitor (if configured) with a global atomic counter for the number of connections. The listener would invoke overload_manager.updateResource(...) on demand upon receiving a socket event, reporting its number of currently accepted sockets and asking OM to update the reactive resource monitor for the global number of connections. OM would immediately (on the next main dispatcher event loop run, i.e. not via delayed periodic flushes) trigger a resource usage recalculation in the resource monitor, which would in turn invoke callbacks in OM and reevaluate the overload actions tied to that resource. OM would then post the updated overload action state (tied to that resource) to worker threads via their dispatchers and invoke the registered callbacks.
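
A rough sketch of that flow, using hypothetical types rather than the actual Envoy interfaces (in Envoy, OM would post the update through each worker's Event::Dispatcher instead of invoking the callbacks synchronously):

#include <atomic>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch of the flow described above: the listener reports an
// increment, the monitor recalculates, and the overload manager fans the new
// action state out to every registered worker callback.
enum class OverloadActionState { Inactive, Active };
using OverloadActionCb = std::function<void(OverloadActionState)>;

class ReactiveDownstreamConnectionsMonitor {
public:
  explicit ReactiveDownstreamConnectionsMonitor(uint64_t max) : max_(max) {}
  OverloadActionState update(uint64_t local_increment) {
    const uint64_t total =
        total_.fetch_add(local_increment, std::memory_order_relaxed) + local_increment;
    return total >= max_ ? OverloadActionState::Active : OverloadActionState::Inactive;
  }
private:
  const uint64_t max_;
  std::atomic<uint64_t> total_{0};
};

class OverloadManagerSketch {
public:
  explicit OverloadManagerSketch(uint64_t max) : monitor_(max) {}
  void registerForAction(const std::string& /*action*/, OverloadActionCb cb) {
    callbacks_.push_back(std::move(cb));
  }
  // Called by the listener on a socket event; in Envoy the recalculation would
  // run on the main dispatcher and the results would be posted to workers.
  void updateResource(const std::string& /*resource_name*/, uint64_t local_increment) {
    const OverloadActionState state = monitor_.update(local_increment);
    for (auto& cb : callbacks_) {
      cb(state); // e.g. "stop accepting connections" when Active
    }
  }
private:
  ReactiveDownstreamConnectionsMonitor monitor_;
  std::vector<OverloadActionCb> callbacks_;
};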

This is how I envision reactive checks in the overload manager, but maybe there are better ways.

cc @akonradi

@nezdolik
Member

nezdolik commented Mar 30, 2021

The approach proposed above requires the OM reference to be accessible in the TCP listener.

@akonradi
Contributor

which would recalculate overload action state on demand and post it to all interested callbacks on worker threads

If this is the only way that workers can learn that they shouldn't make new downstream connections (and if there's another way, I missed it), that's not going to be fast enough. If we want to absolutely make sure of anything, we need a counter that is incremented and read atomically. Otherwise there's still the possibility of a race: suppose worker A calls updateResource(), incrementing the "envoy.overload.downstream_connection_count" counter to the configured max. This posts an update event to set all workers' thread-local state for the connection count. Worker B, already in the middle of processing a callback, checks its thread-local state, sees that the count is below the limit, and calls the same updateResource.

Having a thread-local cached copy of these atomic counters sounds like a good way to get useful information in cases where we don't need perfect consistency, like if we wanted to reduce timeouts as the number of connections increased, but if there's a hard cap we want to honor, we need to pick some point along the line between "single shared atomic counter" and "allocate each worker its own quota" (with a midpoint being "workers pull from the shared counter into their local bucket then allocate from there").
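
As an illustration of the "workers pull from the shared counter into their local bucket" midpoint (hypothetical names, not an Envoy API): each worker claims a small batch of permits from the shared atomic at a time, so the hard cap is still enforced globally while most accepts touch only worker-local state.

#include <algorithm>
#include <atomic>
#include <cstdint>

// Hypothetical global budget shared by all workers, refilled into a small
// worker-local bucket in batches so most accepts avoid the shared atomic.
class SharedConnectionBudget {
public:
  explicit SharedConnectionBudget(uint64_t max) : remaining_(max) {}

  // Atomically claim up to `batch` permits; returns how many were granted.
  uint64_t claim(uint64_t batch) {
    uint64_t current = remaining_.load(std::memory_order_relaxed);
    while (current > 0) {
      const uint64_t take = std::min(batch, current);
      if (remaining_.compare_exchange_weak(current, current - take,
                                           std::memory_order_acq_rel)) {
        return take;
      }
    }
    return 0;
  }

  void release(uint64_t n) { remaining_.fetch_add(n, std::memory_order_relaxed); }

private:
  std::atomic<uint64_t> remaining_;
};

// One per worker thread; the local bucket needs no synchronization.
class WorkerConnectionQuota {
public:
  WorkerConnectionQuota(SharedConnectionBudget& shared, uint64_t batch)
      : shared_(shared), batch_(batch) {}

  // Called before accepting a downstream connection.
  bool tryAccept() {
    if (local_ == 0) {
      local_ = shared_.claim(batch_); // pull a batch from the global budget
    }
    if (local_ == 0) {
      return false; // global cap reached; stop accepting
    }
    --local_;
    return true;
  }

  // Called when a connection owned by this worker closes.
  void onClose() { shared_.release(1); }

private:
  SharedConnectionBudget& shared_;
  const uint64_t batch_;
  uint64_t local_{0};
};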

nit: updateResource should take something other than a string, or should be called on some kind of handle object. String comparison (including as part of lookup or case matching) is unnecessarily expensive and we should avoid it as much as possible on the data path, including when establishing connections. Also, updateResource() needs to take a signed integer to be able to signal when a connection is closed, right?

@antoniovicente
Contributor Author

antoniovicente commented Apr 6, 2021

I think the name "reactive" threw me off. I think the check that you're proposing is a "proactive" check, rather than a reactive mechanism like the current check, which operates on the most recently propagated overload state information that is computed periodically.

Regarding the interface, I think something like bool tryAllocateResource(), which atomically increments the number of active resource users if resource usage is below the limit, would do the trick. This method would return true if the allocation succeeded, and false otherwise. The resource tracker would also need a releaseResource() method that can be used to decrement usage.

Ideally you'd query the overload manager for the proactive resource tracker during startup, as part of the process of creating listeners, and keep a reference to it in the listener so that each update to the resource tracker does not require a map lookup by name.
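
A minimal sketch of that interface (hypothetical names; the listener would be handed a reference to the tracker once at startup, so the accept path is a single atomic operation with no lookup by resource name):

#include <atomic>
#include <cstdint>

// Hypothetical proactive resource tracker as described above: the listener
// keeps a reference to it and never looks it up by name on the hot path.
class ProactiveResourceTracker {
public:
  explicit ProactiveResourceTracker(uint64_t limit) : limit_(limit) {}

  // Returns true and increments usage if we are below the limit, else false.
  bool tryAllocateResource() {
    uint64_t current = used_.load(std::memory_order_relaxed);
    while (current < limit_) {
      if (used_.compare_exchange_weak(current, current + 1,
                                      std::memory_order_acq_rel)) {
        return true;
      }
    }
    return false;
  }

  // Decrements usage when the connection (or request) goes away.
  void releaseResource() { used_.fetch_sub(1, std::memory_order_relaxed); }

private:
  const uint64_t limit_;
  std::atomic<uint64_t> used_{0};
};

The listener would then call tryAllocateResource() right before accepting a connection and releaseResource() when that connection closes.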

@nezdolik
Member

nezdolik commented Apr 6, 2021

@akonradi I may not have provided enough detail on the suggested approach. Worker threads will not be aware of the global max and global current counters; they only maintain local counters (e.g. per listener) and thread-local state for the overload action. The atomic global counter is stored in the resource monitor in OM on the main thread. Upon trying to accept a new connection, the worker thread contacts OM with an "increment"; OM in turn propagates that increment to the resource monitor, which atomically increments the global counter. The resource monitor triggers recalculation of the overload actions tied to that resource, and OM then propagates the new state of the overload action to the worker threads.
This approach is slower than doing an inline check that returns a boolean (what @antoniovicente mentioned), as OM needs to trigger all related callbacks and update the overload action state for them. On the other hand, it aligns better with the existing OM model and the approach of OM client code registering for actions.
@antoniovicente thank you for looking at the PR; I did not want to link it until we agreed on the approach. And yes, the term "proactive" suits it better :)
From past discussions, it felt like we are optimising for speed (of checks), so going with boolean checks plus atomic counters in OM sounds like the way to go.
I will update the PR according to our discussions.

@nezdolik
Member

nezdolik commented Nov 2, 2022

@KBaichoo this issue is not done yet. We are 60-70% done with one of the suggested monitors: we still need to plug the downstream connections monitor into the TCP listener and deprecate the existing mechanism that tracks downstream connections. The remaining monitors (active requests and upstream connections) are 0% done.

@KBaichoo
Contributor

KBaichoo commented Nov 2, 2022

Whoops, sorry, this auto-closed when the PR was merged. Reopening, thanks!

@KBaichoo KBaichoo reopened this Nov 2, 2022
kyessenov pushed a commit that referenced this issue Oct 3, 2023
Deprecate the runtime key `overload.global_downstream_max_connections` and track the global active downstream connections limit in the overload manager instead. The runtime key is still usable, but using it yields a deprecation warning. If both mechanisms are configured (the overload resource monitor and the runtime key), the overload manager config will be preferred.

Commit Message:
Additional Description:
Risk Level: High (change affects request hot path)
Testing: Done
Docs Changes: Done
Release Notes: TBD
Platform Specific Features: NA
Fixes #12419
Deprecated: Added deprecation note for runtime key `overload.global_downstream_max_connections`
@nezdolik
Member

nezdolik commented Oct 3, 2023

@kyessenov could you please reopen this issue? This is an umbrella ticket for multiple tasks in overload manager.

@nezdolik
Member

nezdolik commented May 6, 2024

@KBaichoo @botengyao would it be beneficial to introduce resource monitor (3) from the original suggestion, to track and limit the number of sockets used for upstreams? It sounds useful to me.

@KBaichoo
Contributor

KBaichoo commented May 7, 2024

If you have a strong use case it could make sense, but I'm a bit more skeptical of the value of limiting the number of upstream connections globally, given that it is not entirely controlled by an attacker; e.g. we'd have some connection reuse across various streams.
