Prevent relative expiry from emitting more events than can be processed #12002

Merged
merged 5 commits into from
Apr 25, 2022

rosstimothy
Contributor

This alters relative expiry in four ways:

  1. Relative expiry is no longer exclusively run by the auth cache
  2. Relative expiry no longer emits delete events
  3. There is now a limit to the number of nodes removed per interval
  4. Relative expiry now runs more frequently

Because relative expiry no longer runs exclusively in the auth cache,
it no longer needs to emit fake delete events. Every cache now runs
relative expiry itself, including the components that don't watch
for nodes, for which it is effectively a no-op. To prevent the
individual caches from getting too far out of sync, the expiry
interval is set to a much smaller value and the number of nodes
deleted per interval is capped.
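
The per-interval limit can be sketched roughly as follows. This is a minimal illustration, not the code in lib/cache/cache.go: the `node` type, `expireBatch`, and `runExample` are hypothetical names, and how the cutoff time is derived is left outside the sketch. It assumes each cache runs the same pass on its own copy of the nodes and emits no events.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// node is a hypothetical stand-in for a cached node resource.
type node struct {
	Name    string
	Expires time.Time
}

// expireBatch removes at most limit nodes whose expiry is before cutoff,
// oldest first, and returns the names it removed. It emits no delete
// events; each cache is assumed to run this pass on its own copy.
func expireBatch(nodes map[string]node, cutoff time.Time, limit int) []string {
	var stale []node
	for _, n := range nodes {
		if n.Expires.Before(cutoff) {
			stale = append(stale, n)
		}
	}
	// Sort oldest first, breaking ties by name, so that independent
	// caches holding the same data pick the same batch.
	sort.Slice(stale, func(i, j int) bool {
		if !stale[i].Expires.Equal(stale[j].Expires) {
			return stale[i].Expires.Before(stale[j].Expires)
		}
		return stale[i].Name < stale[j].Name
	})
	if limit < 0 {
		limit = 0
	}
	if len(stale) > limit {
		stale = stale[:limit]
	}
	removed := make([]string, 0, len(stale))
	for _, n := range stale {
		delete(nodes, n.Name)
		removed = append(removed, n.Name)
	}
	return removed
}

// runExample runs one expiry interval over four nodes with a limit of two.
func runExample() (removed []string, remaining int) {
	base := time.Unix(1000, 0)
	nodes := map[string]node{
		"a": {Name: "a", Expires: base.Add(-3 * time.Minute)},
		"b": {Name: "b", Expires: base.Add(-2 * time.Minute)},
		"c": {Name: "c", Expires: base.Add(-1 * time.Minute)},
		"d": {Name: "d", Expires: base.Add(10 * time.Minute)},
	}
	removed = expireBatch(nodes, base, 2)
	return removed, len(nodes)
}

func main() {
	removed, remaining := runExample()
	fmt.Println(removed, remaining)
}
```

With a limit of two, the third stale node stays in the map until the next interval, which is why the limit is paired with a shorter interval.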

@rosstimothy rosstimothy requested a review from fspmarshall April 15, 2022 16:49
@rosstimothy rosstimothy added the backport-required and scale (Changes required to achieve 100K nodes per cluster.) labels Apr 19, 2022
@rosstimothy
Contributor Author

friendly ping @LKozlowski

@rosstimothy rosstimothy enabled auto-merge (squash) April 25, 2022 13:00
@rosstimothy rosstimothy merged commit b8394b3 into master Apr 25, 2022
@rosstimothy rosstimothy deleted the tross/relative_expiry_limits branch April 25, 2022 13:22
rosstimothy added a commit that referenced this pull request Apr 26, 2022
…ed (#12002)

* Prevent relative expiry from emitting more events than can be processed

(cherry picked from commit b8394b3)

# Conflicts:
#	lib/cache/cache.go
#	lib/cache/cache_test.go
@rosstimothy
Contributor Author

💚 All backports created successfully

Branches: branch/v7, branch/v8, branch/v9

rosstimothy added a commit that referenced this pull request Apr 26, 2022
…ed (#12002)

* Prevent relative expiry from emitting more events than can be processed

(cherry picked from commit b8394b3)
rosstimothy added a commit that referenced this pull request Apr 26, 2022
…ed (#12002)

* Prevent relative expiry from emitting more events than can be processed

(cherry picked from commit b8394b3)
rosstimothy added a commit that referenced this pull request Apr 27, 2022
…ed (#12002) (#12245)

* Prevent relative expiry from emitting more events than can be processed

(cherry picked from commit b8394b3)
rosstimothy added a commit that referenced this pull request May 2, 2022
…ed (#12002) (#12246)

* Prevent relative expiry from emitting more events than can be processed

(cherry picked from commit b8394b3)

# Conflicts:
#	lib/cache/cache.go
#	lib/cache/cache_test.go
rosstimothy added a commit that referenced this pull request May 2, 2022
…ed (#12002) (#12247)

* Prevent relative expiry from emitting more events than can be processed

(cherry picked from commit b8394b3)
@webvictim webvictim mentioned this pull request Jun 8, 2022
rosstimothy added a commit that referenced this pull request Feb 15, 2024
When relative node expiry was originally implemented, it ran only
on Auth and propagated delete events downstream. That was quickly
changed in #12002 because the event system could not keep up in
clusters with high churn, which resulted in buffer overflow errors
in downstream caches. The event system has since been overhauled
to use FanoutV2, which no longer has problems with bursts of
events. Not propagating delete events also prevents node watchers
on the local cache from ever expiring a server, since they never
receive a delete event.

This reverts some of the changes from #12002 so that relative
expiry again runs only on the auth cache and emits delete events.
Each relative expiry interval is, however, still limited to
removing at most a fixed number of nodes.

Closes #37527
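
The overflow failure described in this commit message can be illustrated with a toy bounded-buffer watcher. This is only a sketch under stated assumptions: it is not FanoutV2 or Teleport's event system, and `event`, `watcher`, `emit`, and `burstExample` are hypothetical names. It assumes the consumer does not drain the buffer during a burst, and that a watcher whose buffer fills up is closed rather than blocking the emitter.

```go
package main

import "fmt"

// event is a hypothetical delete notification for a node.
type event struct{ NodeName string }

// watcher is a hypothetical downstream consumer with a bounded buffer.
type watcher struct {
	events   chan event
	overflow bool
}

func newWatcher(buffer int) *watcher {
	return &watcher{events: make(chan event, buffer)}
}

// emit delivers ev without blocking. If the buffer is full, the watcher
// is marked as overflowed and closed, and every later event is dropped.
func (w *watcher) emit(ev event) bool {
	if w.overflow {
		return false
	}
	select {
	case w.events <- ev:
		return true
	default:
		w.overflow = true
		close(w.events)
		return false
	}
}

// burstExample emits burst delete events into a watcher whose buffer holds
// bufferSize events and reports how many were delivered and whether the
// watcher overflowed. Nothing reads from the buffer during the burst.
func burstExample(bufferSize, burst int) (delivered int, overflowed bool) {
	w := newWatcher(bufferSize)
	for i := 0; i < burst; i++ {
		if w.emit(event{NodeName: fmt.Sprintf("node-%d", i)}) {
			delivered++
		}
	}
	return delivered, w.overflow
}

func main() {
	// An uncapped burst larger than the buffer overflows the watcher.
	fmt.Println(burstExample(4, 10))
	// A burst capped below the buffer size is delivered in full.
	fmt.Println(burstExample(4, 3))
}
```

In this toy model, capping the number of nodes removed per interval bounds the burst size, which is one reason the limit is still useful when delete events are emitted again.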
github-merge-queue bot pushed a commit that referenced this pull request Feb 15, 2024
github-actions bot pushed a commit that referenced this pull request Feb 15, 2024
github-actions bot pushed a commit that referenced this pull request Feb 15, 2024
github-merge-queue bot pushed a commit that referenced this pull request Feb 17, 2024
github-merge-queue bot pushed a commit that referenced this pull request Feb 17, 2024
Labels
backport-required scale Changes required to achieve 100K nodes per cluster.
4 participants