Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util/cgroups: method to read file-backed memory on inactive LRU list #68310

Merged
merged 1 commit into from
Aug 12, 2021

Conversation

abarganier
Copy link
Contributor

util/cgroups: method to read file-backed memory on inactive LRU list

Currently we have methods that allows us to read the cgroup memory
limit, as well as the current memory usage, for processes running in
unix containers. However, to more accurately determine the current
memory usage in the eyes of the container provider, we must subtract
the "cache usage" from the total memory usage, which is represented
by the inactive file-backed memory stat.

From the Docker documentation:

"On Linux, the Docker CLI reports memory usage by subtracting cache
usage from the total memory usage. The API does not perform such a
calculation but rather provides the total memory usage and the amount
from the cache so that clients can use the data as needed. The cache
usage is defined as the value of total_inactive_file field in the
memory.stat file on cgroup v1 hosts...On cgroup v2 hosts, the cache
usage is defined as the value of inactive_file field."

https://docs.docker.com/engine/reference/commandline/stats/#extended-description

In an effort to gain better observability into current memory usage in
the eyes of the container provider for purposes of identifying whether
or not a CRDB node is on its way to OOM, we add the ability to read
these values from the memory subsystem. This heuristic can then be
used in debug tools such as periodic query dumps and eventually,
more generalized crash dump platforms.

Informs #66901

@abarganier abarganier requested review from itsbilal, knz and a team August 1, 2021 17:03
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@abarganier
Copy link
Contributor Author

Please ignore 1st commit that adds current mem. usage when reviewing (can be reviewed separately @ #68204). Will rebase once merged.

@abarganier abarganier changed the title Cgroups cache usage util/cgroups: method to read file-backed memory on inactive LRU list Aug 1, 2021
Copy link
Collaborator

@rimadeodhar rimadeodhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @abarganier, @itsbilal, and @knz)


pkg/util/cgroups/cgroups_test.go, line 118 at r2 (raw file):

}

func TestCgroupsGetMemoryInactiveFileUsage(t *testing.T) {

Nit: Can you add a comment that elaborates on what this test checks?

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r1, 2 of 2 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @abarganier and @itsbilal)


pkg/util/cgroups/cgroups.go, line 63 at r2 (raw file):

// current process.
//
//lint:ignore U1001 TODO(abarganier): will use in querydumper once merged

nit: the idiomatic way would be to do

// TODO(...): ...
var _ = GetMemoryUsage

in the file where you plan to use this later.


pkg/util/cgroups/cgroups.go, line 89 at r2 (raw file):

	}

	// no memory controller detected

full sentences in comments.


pkg/util/cgroups/cgroups.go, line 111 at r2 (raw file):

}

// `root` is set to "/" in production code and exists only for testing.

it's customary to start function comments by a sentence that describes the function "xxx does yyy." and then follow up with details.


pkg/util/cgroups/cgroups.go, line 120 at r2 (raw file):

	}

	// no memory controller detected

full sentences in comments.


pkg/util/cgroups/cgroups.go, line 142 at r2 (raw file):

}

// `root` is set to "/" in production code and exists only for testing.

ditto


pkg/util/cgroups/cgroups.go, line 301 at r2 (raw file):

}

// Finds memory limit value for cgroup V1 via looking in

here and below, ditto. Function name first then what the function does.

This ensures that pop-ups in code editors do the right thing.


pkg/util/cgroups/cgroups.go, line 504 at r2 (raw file):

	// Check for controller specifically in cgroup v1 (it is listed in super options field),
	// as the value can't be found if it is not enforced

missing period at end of sentence.

@abarganier abarganier force-pushed the cgroups-cache-usage branch from 4577e27 to 597c8e3 Compare August 6, 2021 19:52
Copy link
Contributor Author

@abarganier abarganier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback, much appreciated!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @knz, and @rimadeodhar)


pkg/util/cgroups/cgroups.go, line 63 at r2 (raw file):

Previously, knz (kena) wrote…

nit: the idiomatic way would be to do

// TODO(...): ...
var _ = GetMemoryUsage

in the file where you plan to use this later.

Ah yes, this is much better, thanks for the tip. Since the file I want to call this in hasn't been merged yet, would calling it at the top of this same file be preferred?


pkg/util/cgroups/cgroups.go, line 89 at r2 (raw file):

Previously, knz (kena) wrote…

full sentences in comments.

Done.


pkg/util/cgroups/cgroups.go, line 111 at r2 (raw file):

Previously, knz (kena) wrote…

it's customary to start function comments by a sentence that describes the function "xxx does yyy." and then follow up with details.

Done.


pkg/util/cgroups/cgroups.go, line 120 at r2 (raw file):

Previously, knz (kena) wrote…

full sentences in comments.

Done.


pkg/util/cgroups/cgroups.go, line 142 at r2 (raw file):

Previously, knz (kena) wrote…

ditto

Done.


pkg/util/cgroups/cgroups.go, line 301 at r2 (raw file):

Previously, knz (kena) wrote…

here and below, ditto. Function name first then what the function does.

This ensures that pop-ups in code editors do the right thing.

Done.


pkg/util/cgroups/cgroups.go, line 504 at r2 (raw file):

Previously, knz (kena) wrote…

missing period at end of sentence.

Done.


pkg/util/cgroups/cgroups_test.go, line 118 at r2 (raw file):

Previously, rimadeodhar (Rima Deodhar) wrote…

Nit: Can you add a comment that elaborates on what this test checks?

Done - I assume this is to call out the publicly exported function that this is testing, since we call the private delegate function here?

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @abarganier and @itsbilal)


pkg/util/cgroups/cgroups.go, line 63 at r2 (raw file):

Previously, abarganier (Alex Barganier) wrote…

Ah yes, this is much better, thanks for the tip. Since the file I want to call this in hasn't been merged yet, would calling it at the top of this same file be preferred?

yes that would work too

Currently we have methods that allows us to read the cgroup memory
limit, as well as the current memory usage, for processes running in
unix containers. However, to more accurately determine the current
memory usage in the eyes of the container provider, we must subtract
the "cache usage" from the total memory usage, which is represented
by the inactive file-backed memory stat.

From the Docker documentation:

"On Linux, the Docker CLI reports memory usage by subtracting cache
usage from the total memory usage. The API does not perform such a
calculation but rather provides the total memory usage and the amount
from the cache so that clients can use the data as needed. The cache
usage is defined as the value of total_inactive_file field in the
memory.stat file on cgroup v1 hosts...On cgroup v2 hosts, the cache
usage is defined as the value of inactive_file field."

https://docs.docker.com/engine/reference/commandline/stats/#extended-description

In an effort to gain better observability into current memory usage in
the eyes of the container provider for purposes of identifying whether
or not a CRDB node is on its way to OOM, we add the ability to read
these values from the memory subsystem. This heuristic can then be
used in debug tools such as periodic query dumps and eventually,
more generalized crash dump platforms.

Release note: none
@abarganier abarganier force-pushed the cgroups-cache-usage branch from 597c8e3 to 6f3981f Compare August 11, 2021 14:35
Copy link
Contributor Author

@abarganier abarganier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @knz)


pkg/util/cgroups/cgroups.go, line 63 at r2 (raw file):

Previously, knz (kena) wrote…

yes that would work too

Thanks, sounds good/Done.

@abarganier
Copy link
Contributor Author

bors r+

@craig
Copy link
Contributor

craig bot commented Aug 12, 2021

Build succeeded:

@craig craig bot merged commit de520ba into cockroachdb:master Aug 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants