garbage collection plan could leave job-exec checkpoint with dangling references #4201

Open
Tracked by #4428
garlick opened this issue Mar 9, 2022 · 0 comments
garlick (Member) commented Mar 9, 2022

Problem: if we garbage collect the content store by preserving only blobs referenced by the most recent kvs-primary checkpoint, as suggested in #258, then the guest namespace root references written to job-exec.kvs-namespaces (for preserving running jobs across a restart) could end up as dangling blobrefs.

Here's an example of what we are writing now:

[
  {
    "id": 1515598580285440,
    "owner": 5588,
    "kvsroot": "sha1-2747efbcf83b485c0d9c62dd3616e724241a609e"
  }
]

The problem is that sha1-2747efbcf83b485c0d9c62dd3616e724241a609e might not be referenced from any directory that would be visited when walking from the final kvs-primary rootref, in which case garbage collection would discard it.
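To make the failure mode concrete, here is a minimal sketch of a reachability walk over a toy content store. Everything in it is an assumption for illustration: the dict-based store, the `reachable` helper, the shortened blobref names, and the simplified object types (`dir`, `dirref`, `valref`, `symlink`, `val`) are stand-ins, not the actual Flux KVS data structures or API.

```python
def reachable(store, rootref):
    """Return the set of blobrefs visited when walking from rootref."""
    seen = set()
    stack = [rootref]
    while stack:
        ref = stack.pop()
        if ref in seen or ref not in store:
            continue
        seen.add(ref)
        obj = store[ref]
        if obj["type"] == "dir":
            for entry in obj["entries"].values():
                # Only references to other blobs are followed.
                # A symlink names a path or namespace, not a blob.
                if entry["type"] in ("dirref", "valref"):
                    stack.append(entry["ref"])
    return seen


# Hypothetical store: the primary root reaches the job directory, whose
# "guest" entry is a symlink to the guest namespace, not a dirref.
store = {
    "sha1-primary": {
        "type": "dir",
        "entries": {"job": {"type": "dirref", "ref": "sha1-jobdir"}},
    },
    "sha1-jobdir": {
        "type": "dir",
        "entries": {"guest": {"type": "symlink", "target": "guest-namespace"}},
    },
    # Guest namespace root, recorded only in job-exec.kvs-namespaces.
    "sha1-guestroot": {
        "type": "dir",
        "entries": {"output": {"type": "valref", "ref": "sha1-output"}},
    },
    "sha1-output": {"type": "val", "data": "hello"},
}

live = reachable(store, "sha1-primary")
# Anything outside `live` would be discarded by the proposed collection.
garbage = set(store) - live
dangling = "sha1-guestroot" in garbage
```

In this toy model the guest root and everything beneath it land in `garbage`, even though the job-exec checkpoint still names the guest root.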

Three approaches come to mind:

  1. Write the guest namespaces out as independent checkpoints. For example, in addition to kvs-primary, we write out kvs-guest-<jobid>.

  2. Create a subdirectory for each preserved namespace in job-exec, i.e. a directory entry that is a dirref referencing the root blobref for the namespace.

  3. Replace the job.<id>.guest symlink with the actual directory, exactly like we normally do for completed jobs. On restart, job-exec would need to change the directory back to a symlink when it recreates the namespace.

I think 3 would be the path of least resistance, since it's what we do already, and it doesn't complicate the checkpoint key space, which we might want to use exclusively for defensive checkpoints of the primary namespace.
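A rough sketch of how option 3 could keep the guest root reachable, under the same caveats as any toy model: the `put`, `reachable`, and `primary_root` helpers, the object layout, and the `guest-namespace` symlink target are hypothetical and are not taken from the Flux code base. Blobs are content addressed with SHA-1 here only to mirror the `sha1-` prefix in the example above.

```python
import hashlib
import json


def put(store, obj):
    """Store obj under a content-derived blobref and return the ref."""
    data = json.dumps(obj, sort_keys=True).encode()
    ref = "sha1-" + hashlib.sha1(data).hexdigest()
    store[ref] = obj
    return ref


def reachable(store, rootref):
    """Return the set of blobrefs visited when walking from rootref."""
    seen, stack = set(), [rootref]
    while stack:
        ref = stack.pop()
        if ref in seen or ref not in store:
            continue
        seen.add(ref)
        obj = store[ref]
        if obj["type"] == "dir":
            for entry in obj["entries"].values():
                if entry["type"] in ("dirref", "valref"):
                    stack.append(entry["ref"])
    return seen


store = {}
out_ref = put(store, {"type": "val", "data": "hello"})
guest_root = put(
    store,
    {"type": "dir", "entries": {"output": {"type": "valref", "ref": out_ref}}},
)


def primary_root(guest_entry):
    """Build a primary namespace whose job dir has the given guest entry."""
    job = put(store, {"type": "dir", "entries": {"guest": guest_entry}})
    return put(
        store, {"type": "dir", "entries": {"job": {"type": "dirref", "ref": job}}}
    )


symlink_entry = {"type": "symlink", "target": "guest-namespace"}

# While the job runs: guest is a symlink, so the walk misses the guest root.
before = primary_root(symlink_entry)
# Before shutdown: replace the symlink with the actual directory.
after = primary_root({"type": "dirref", "ref": guest_root})
# On restart: job-exec recreates the namespace and restores the symlink.
restored = primary_root(symlink_entry)

missed_before = guest_root not in reachable(store, before)
kept_after = guest_root in reachable(store, after)
```

Because the store is content addressed, swapping the entry produces a new job directory and a new primary root, and the final checkpoint would need to record that new root for the walk to cover the guest namespace.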
