Use extraFiles feature to cull idle kernels #563

Merged: 6 commits, Aug 27, 2021
52 changes: 52 additions & 0 deletions docs/howto/configure/culling.md
@@ -0,0 +1,52 @@
(configure:culling)=
# Manage resource culling

To improve resource management, every user server that is not actively being used is shut down by the [jupyterhub-idle-culler](https://github.com/jupyterhub/jupyterhub-idle-culler) Hub service. Any user pod that sits idle is therefore taken down by the idle culler.

Since a server's kernel activity counts as server activity, culling also operates at the kernel level. This means that if a user leaves a notebook with a running kernel, the kernel is shut down once it has been idle for the configured timeout period (`cull_idle_timeout`).

## User server culling configuration

The server culling options are specified on a per-hub basis, in the appropriate configuration file under `config/hubs`.

Example:

```yaml
config:
  jupyterhub:
    cull:
      # Cull after 30min of inactivity
      every: 300
      timeout: 1800
      # No pods over 12h long
      maxAge: 43200
```

More culling options and information about them can be found in the [idle-culler documentation](https://github.com/jupyterhub/jupyterhub-idle-culler#readme).
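
As a rough illustration of how the values in the example are consumed, the sketch below maps the `cull` keys onto the command-line flags of `jupyterhub-idle-culler` (`--cull-every`, `--timeout`, `--max-age`). The helper name `culler_args` and the key-to-flag table are assumptions made for this example; they are not code from the Helm chart or from this repository.

```python
# Hypothetical mapping from the chart's `cull` keys to idle-culler CLI flags.
CULL_FLAGS = {
    "every": "--cull-every",
    "timeout": "--timeout",
    "maxAge": "--max-age",
}


def culler_args(cull):
    """Return idle-culler flags for the keys of `cull` that we know about."""
    return [
        f"{CULL_FLAGS[key]}={value}"
        for key, value in cull.items()
        if key in CULL_FLAGS
    ]


# Values taken from the YAML example above (all in seconds).
cull = {"every": 300, "timeout": 1800, "maxAge": 43200}
args = culler_args(cull)
print(" ".join(args))
```

All three values are in seconds, so `every: 300` checks for idle servers every 5 minutes and `timeout: 1800` culls them after 30 minutes of inactivity.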

## Kernel culling configuration

The kernel culling options are configured through the `jupyter_notebook_config.json` file, located at `/usr/local/etc/jupyter/jupyter_notebook_config.json` in the user pod. This file is injected into the pod’s container on startup by defining its location and content under the [`singleuser.extraFiles`](https://zero-to-jupyterhub.readthedocs.io/en/latest/resources/reference.html#singleuser-extrafiles) dictionary.

You can modify the current culling option values in the `hub-templates/basehub/values.yaml` file, under the `data` key of the `singleuser.extraFiles` entry for `jupyter_notebook_config.json`.

Example:

```yaml
singleuser:
  extraFiles:
    jupyter_notebook_config.json:
      mountPath: /usr/local/etc/jupyter/jupyter_notebook_config.json
      data:
        MappingKernelManager:
          # shutdown kernels after no activity
          cull_idle_timeout: 3600
          # check for idle kernels this often
          cull_interval: 300
          # a kernel with open connections but no activity still counts as idle
          cull_connected: true
```
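
Because the mount path ends in `.json`, the `data` mapping reaches the pod as a JSON document. The following is a minimal sketch of roughly what the notebook server reads from that file; it uses Python's `json` module as a stand-in, and the exact whitespace and key order produced by the chart may differ.

```python
import json

# The `data` mapping from the YAML example above, as a Python dict.
data = {
    "MappingKernelManager": {
        "cull_idle_timeout": 3600,  # seconds of inactivity before a kernel is culled
        "cull_interval": 300,       # seconds between checks for idle kernels
        "cull_connected": True,     # cull idle kernels even if a client is connected
    }
}

# Approximation of the file content mounted at
# /usr/local/etc/jupyter/jupyter_notebook_config.json
rendered = json.dumps(data, indent=2)
print(rendered)
```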

### Note
If a user leaves a notebook with a running kernel, the effective idle timeout will typically be the server's cull timeout plus the kernel's cull idle timeout, because culling the kernel registers as activity and resets the `no_activity` timer for the server as a whole.
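
To make the arithmetic concrete, here is a small sketch that uses the values from the two examples on this page. The function name `effective_idle_timeout` is made up for illustration, and the calculation deliberately ignores the check intervals (`every` and `cull_interval`), which can add a few more minutes in practice.

```python
SERVER_CULL_TIMEOUT = 1800  # `cull.timeout` from the hub example, in seconds
KERNEL_CULL_TIMEOUT = 3600  # `cull_idle_timeout` from the kernel example, in seconds


def effective_idle_timeout(server_timeout, kernel_timeout):
    """Approximate time until the server is culled when a kernel is left running.

    The kernel is culled first; that registers as activity and restarts
    the server's idle timer, so the two timeouts add up.
    """
    return kernel_timeout + server_timeout


total = effective_idle_timeout(SERVER_CULL_TIMEOUT, KERNEL_CULL_TIMEOUT)
print(f"{total} s = {total / 60:.0f} min")  # prints "5400 s = 90 min"
```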

1 change: 1 addition & 0 deletions docs/howto/configure/index.md
@@ -4,4 +4,5 @@
auth-management.md
add-hub-domains.md
update-env.md
culling.md
```
34 changes: 18 additions & 16 deletions hub-templates/basehub/values.yaml
@@ -1,19 +1,3 @@
etcJupyter:
  jupyter_notebook_config.json:
    # if a user leaves a notebook with a running kernel,
    # the effective idle timeout will typically be CULL_TIMEOUT + CULL_KERNEL_TIMEOUT
    # as culling the kernel will register activity,
    # resetting the no_activity timer for the server as a whole
    MappingKernelManager:
      # shutdown kernels after no activity
      cull_idle_timeout: 3600
      # check for idle kernels this often
      cull_interval: 300
      # a kernel with open connections but no activity still counts as idle
      # this is what allows us to shutdown servers
      # when people leave a notebook open and wander off
      cull_connected: true

nfsPVC:
  enabled: true
  shareCreator:
@@ -100,6 +84,24 @@ jupyterhub:
      letsencrypt:
        contactEmail: [email protected]
  singleuser:
    extraFiles:
      jupyter_notebook_config.json:
        mountPath: /usr/local/etc/jupyter/jupyter_notebook_config.json
        # if a user leaves a notebook with a running kernel,
        # the effective idle timeout will typically be cull idle timeout
        # of the server + the cull idle timeout of the kernel,
        # as culling the kernel will register activity,
        # resetting the no_activity timer for the server as a whole
        data:
          MappingKernelManager:
            # shutdown kernels after no activity
            cull_idle_timeout: 3600
            # check for idle kernels this often
            cull_interval: 300
            # a kernel with open connections but no activity still counts as idle
            # this is what allows us to shutdown servers
            # when people leave a notebook open and wander off
            cull_connected: true
    startTimeout: 600 # 10 mins, because sometimes we have too many new nodes coming up together
    defaultUrl: /tree
    nodeSelector: