
Stop loading the entire node set into memory per tsh ssh connection #12014

Merged
merged 10 commits into master from tross/node_watcher
May 2, 2022

Conversation

rosstimothy
Contributor

When doing a `tsh ssh node` the proxySubsys would load the entire
set of nodes into memory to determine which server to route the
request to. Under heavy load, like an automated bot that periodically
spawns numerous sessions, a proxy could easily consume all
available memory.

To prevent this, proxies now utilize a NodeWatcher that maintains
a single node set in memory. This prevents loading the nodes into
memory more than once, and also eliminates the need to unmarshal the
types.Server on each retrieval of the nodes. The NodeWatcher only
provides a GetNodes function that requires a filter function, to make
it intentionally challenging to retrieve a copy of the entire node set.
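
Conceptually, the filter-based accessor works something like the sketch below. This is only a simplified illustration with hypothetical type and field names, not the actual Teleport API: the real watcher stores types.Server and is fed by the existing resource watcher system.

```go
// Sketch only: a shared in-memory node set exposed through a
// filter-based GetNodes. Names (Node, NodeWatcher fields) are hypothetical.
package main

import (
	"fmt"
	"sync"
)

// Node stands in for types.Server in this sketch.
type Node struct {
	Name string
	Addr string
}

// NodeWatcher holds the single shared node set for the process.
type NodeWatcher struct {
	mu    sync.RWMutex
	nodes []Node
}

// GetNodes requires a filter and returns copies of only the matches,
// so callers cannot cheaply grab the whole set.
func (w *NodeWatcher) GetNodes(match func(Node) bool) []Node {
	w.mu.RLock()
	defer w.mu.RUnlock()

	var out []Node
	for _, n := range w.nodes {
		if match(n) {
			out = append(out, n) // copy only what the filter selects
		}
	}
	return out
}

func main() {
	w := &NodeWatcher{nodes: []Node{
		{Name: "db-1", Addr: "10.0.0.1:3022"},
		{Name: "web-1", Addr: "10.0.0.2:3022"},
	}}

	// A proxy routing a session would filter for the one matching host.
	fmt.Println(w.GetNodes(func(n Node) bool { return n.Name == "web-1" }))
}
```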

@rosstimothy rosstimothy force-pushed the tross/node_watcher branch 7 times, most recently from 7f09c3f to 95524ce on April 19, 2022 16:25
@rosstimothy rosstimothy marked this pull request as ready for review April 19, 2022 17:28
@github-actions github-actions bot added the tsh (tsh - Teleport's command line tool for logging into nodes running Teleport.) label Apr 19, 2022
@rosstimothy rosstimothy added the backport-required and robustness (Resistance to crashes and reliability) labels Apr 19, 2022
@rosstimothy
Contributor Author

friendly ping @smallinsky @ibeckermayer

Contributor

@smallinsky smallinsky left a comment

@rosstimothy
Sorry for the delay. I have left one question about the NodeWatcher approach vs the access point cache, where the NodeType is already watched:

func (c *node) processEvent(ctx context.Context, event types.Event) error {

lib/reversetunnel/srv.go (resolved)
lib/services/presence.go (outdated, resolved)
lib/srv/regular/sshserver.go (outdated, resolved)
@rosstimothy
Contributor Author

@rosstimothy Sorry for the delay. I have left one question about the NodeWatcher approach vs the access point cache, where the NodeType is already watched:

func (c *node) processEvent(ctx context.Context, event types.Event) error {

@smallinsky the NodeWatcher is intended to be how anything that needs to retrieve nodes, or a subset of them, does so. While a cache may already be watching for types.KindNode, and a caller is already able to retrieve nodes from the cache via GetNodes or ListNodes, doing so can be quite expensive on larger clusters. Each time the entire set of nodes is retrieved from the cache, every node has to be unmarshalled from the JSON representation in which it is stored. When multiple callers do this at once it causes a spike in memory that is proportional to the number of nodes, which has been observed to cause the proxy to be OOM killed on a cluster that had ~50,000 nodes. This is easily triggered by multiple tsh ssh sessions opened at once, each of which results in loading the entire set of nodes from the cache:

servers, err = t.srv.proxyAccessPoint.GetNodes(ctx.CancelContext(), t.namespace)

By leveraging the NodeWatcher we can be sure that the entire node set is only loaded into memory once, and by storing them as types.Server we don't have to pay the cost to unmarshal from JSON. The method by which a caller retrieves nodes from the NodeWatcher is also meant to ensure that only copies of the matched subset are made.
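
To make the difference concrete, here is a rough contrast of the two retrieval patterns. It is only a sketch with made-up types (not Teleport code): the point is that the old path repeats a full JSON decode for every caller, while the watcher path decodes once and only copies what a filter matches.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// server stands in for types.Server; the real type is much larger.
type server struct {
	Name string `json:"name"`
}

// cacheGetNodes mimics the old cache path: every call decodes the
// entire stored node set from JSON, even if only one node is needed.
func cacheGetNodes(stored [][]byte) ([]server, error) {
	out := make([]server, 0, len(stored))
	for _, raw := range stored {
		var s server
		if err := json.Unmarshal(raw, &s); err != nil {
			return nil, err
		}
		out = append(out, s)
	}
	return out, nil
}

// watcherGetNodes mimics the NodeWatcher path: the decoded servers
// already live in memory once, and callers only copy the matches.
func watcherGetNodes(decoded []server, match func(server) bool) []server {
	var out []server
	for _, s := range decoded {
		if match(s) {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	stored := [][]byte{
		[]byte(`{"name":"node-a"}`),
		[]byte(`{"name":"node-b"}`),
	}
	decoded := []server{{Name: "node-a"}, {Name: "node-b"}}

	all, _ := cacheGetNodes(stored) // every caller pays for the whole set
	fmt.Println(len(all))

	one := watcherGetNodes(decoded, func(s server) bool { return s.Name == "node-b" })
	fmt.Println(one) // only the matching node is copied
}
```

With ~50,000 nodes, the cacheGetNodes-style decode above is repeated for every concurrent tsh ssh session, which is why memory use grows with the number of simultaneous sessions.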

Contributor

@smallinsky smallinsky left a comment

By leveraging the NodeWatcher we can be sure that the entire node set is only loaded into memory once, and by storing them as types.Server we don't have to pay the cost to unmarshal from JSON. The method by which a caller retrieves nodes from the NodeWatcher is also meant to ensure that only copies of the matched subset are made.

All right, thanks for the explanation.

I was curious what the actual benefit is, so I ran a benchmark of the old cache approach vs the watcher for 40,000 nodes:

BenchmarkListNodesWatcher-8    16    68339640 ns/op    29017941 B/op      40183 allocs/op
BenchmarkListMaxNodes-8         5   229637793 ns/op    80429200 B/op    1520193 allocs/op

I have left some minor comments, otherwise it LGTM.

Contributor

@ibeckermayer ibeckermayer left a comment

@rosstimothy apologies, I've been head down and wasn't giving needed attention to my GitHub notifications.

Possibly a naive question and/or the premise may be false, but aren't we using the GetNodes you're replacing in many other parts of the codebase? Why should those not also be replaced with a call to a NodeWatcher to prevent a similar potential problem?

@rosstimothy
Contributor Author

@rosstimothy apologies, I've been head down and wasn't giving needed attention to my GitHub notifications.

Possibly a naive question and/or the premise may be false, but aren't we using the GetNodes you're replacing in many other parts of the codebase? Why should those not also be replaced with a call to a NodeWatcher to prevent a similar potential problem?

Good question. Any service other than auth which could potentially call GetNodes concurrently should definitely make use of the NodeWatcher. Doing a quick check it looks like there may be a couple spots in the web api that should probably be migrated.

@ibeckermayer
Contributor

Doing a quick check it looks like there may be a couple spots in the web api that should probably be migrated.

Sounds good, your call whether to refactor them here or just create an issue for it.

@rosstimothy
Contributor Author

Doing a quick check it looks like there may be a couple spots in the web api that should probably be migrated.

Sounds good, your call whether to refactor them here or just create an issue for it.

Addressed this in 18f9570

@rosstimothy rosstimothy merged commit fa12352 into master May 2, 2022
rosstimothy added a commit that referenced this pull request May 10, 2022
…12014)

* Prevent proxy from loading entire node set into memory more than once

When establishing a new session to a node, the proxy would load the
entire node set into memory in an attempt to find the matching host. For
smaller clusters this may not be that problematic, but on larger clusters
loading >40k nodes into memory from the cache can be quite expensive.
This problem is compounded by the fact that it happened **per** session,
which could potentially cause the proxy to consume all available memory
and be OOM killed.

A new `NodeWatcher` is introduced which maintains an in-memory list
of all nodes per process. The watcher leverages the existing resource
watcher system and stores all nodes as types.Server, to eliminate the
cost incurred by unmarshalling the nodes from the cache. The `NodeWatcher`
provides a way to retrieve a filtered list of nodes in order to limit the
copies made to only the matches.

(cherry picked from commit fa12352)

# Conflicts:
#	lib/reversetunnel/api.go
#	lib/service/service.go
#	lib/services/watcher.go
#	lib/srv/regular/proxy.go
#	lib/srv/regular/sshserver.go
#	lib/web/apiserver.go
#	lib/web/apiserver_test.go
#	lib/web/terminal.go
#	tool/tsh/proxy_test.go
#	tool/tsh/tsh.go
rosstimothy added a commit that referenced this pull request May 11, 2022
…12014)

(cherry picked from commit fa12352)

# Conflicts:
#	lib/services/watcher.go
#	lib/web/apiserver_test.go
@rosstimothy
Contributor Author

💚 All backports created successfully

Branches: branch/v7, branch/v8, branch/v9

Questions?

Please refer to the Backport tool documentation

rosstimothy added a commit that referenced this pull request May 11, 2022
…12014)

(cherry picked from commit fa12352)
rosstimothy added a commit that referenced this pull request May 11, 2022
…12014) (#12571)

(cherry picked from commit fa12352)
rosstimothy added a commit that referenced this pull request May 11, 2022
…12014)

(cherry picked from commit fa12352)
rosstimothy added a commit that referenced this pull request May 11, 2022
…12014)

(cherry picked from commit fa12352)
rosstimothy added a commit that referenced this pull request May 12, 2022
…12014)

(cherry picked from commit fa12352)
rosstimothy added a commit that referenced this pull request May 12, 2022
…12014)

(cherry picked from commit fa12352)
rosstimothy added a commit that referenced this pull request May 12, 2022
…12014) (#12562)

(cherry picked from commit fa12352)
rosstimothy added a commit that referenced this pull request May 12, 2022
…12014) (#12573)

(cherry picked from commit fa12352)
@webvictim webvictim mentioned this pull request Jun 8, 2022
@rosstimothy rosstimothy deleted the tross/node_watcher branch July 6, 2022 21:21
@rosstimothy rosstimothy changed the title Stop loading the enitre node set into memory per tsh ssh connection Stop loading the entire node set into memory per tsh ssh connection Oct 10, 2022
Labels
backport-required, robustness (Resistance to crashes and reliability), tsh (tsh - Teleport's command line tool for logging into nodes running Teleport.)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants