Problem: if we garbage collect the content store by preserving only blobs referenced by the most recent kvs-primary checkpoint, as suggested in #258, then the guest namespace root references written to job-exec.kvs-namespaces (for preserving running jobs across a restart) could end up with dangling blobrefs.
The problem is that a guest namespace root blobref such as sha1-2747efbcf83b485c0d9c62dd3616e724241a609 might not be referenced from any directory that would be visited when walking from the final kvs-primary rootref.
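To make the failure mode concrete, here is a minimal sketch of a content-addressed store with a mark-and-sweep pass that walks directories from a single rootref. It is a toy model, not flux-core code: the object layout (`type`, `entries`, `data`), the helper names `put`, `reachable`, and `gc`, and the record contents are all assumptions made for illustration. The point it shows is that a rootref stored as plain data inside a value is never followed by the walk, so the blobs behind it are swept.

```python
import hashlib
import json


def put(store, obj):
    """Store a JSON-serializable object and return its content-addressed blobref."""
    data = json.dumps(obj, sort_keys=True).encode()
    ref = "sha1-" + hashlib.sha1(data).hexdigest()
    store[ref] = obj
    return ref


def reachable(store, rootref):
    """Return every blobref visited when walking directory entries from rootref."""
    seen = set()
    stack = [rootref]
    while stack:
        ref = stack.pop()
        if ref in seen:
            continue
        seen.add(ref)
        obj = store[ref]
        # Only directory entries are followed; value contents are opaque.
        if obj["type"] == "dir":
            stack.extend(obj["entries"].values())
    return seen


def gc(store, rootref):
    """Keep only the blobs reachable from rootref (hypothetical GC policy)."""
    keep = reachable(store, rootref)
    return {ref: obj for ref, obj in store.items() if ref in keep}


store = {}

# A guest namespace with one file (hypothetical contents).
guest_file = put(store, {"type": "val", "data": "job output"})
guest_root = put(store, {"type": "dir", "entries": {"output": guest_file}})

# The namespace record holds the guest rootref as plain data, not as a dirref.
ns_record = put(store, {"type": "val", "data": {"rootref": guest_root}})
job_exec = put(store, {"type": "dir", "entries": {"kvs-namespaces": ns_record}})
primary_root = put(store, {"type": "dir", "entries": {"job-exec": job_exec}})

after = gc(store, primary_root)
dangling = guest_root not in after  # the record survives, its target does not
```

In this model the record under job-exec survives garbage collection, but the rootref it names points at a blob that no longer exists.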
Three approaches come to mind:
1. Write the guest namespaces out as independent checkpoints. For example, in addition to kvs-primary, we write out kvs-guest-<jobid>.
2. Create a subdirectory for each preserved namespace in job-exec, e.g. the directory entry is a dirref that references the root blobref for the namespace.
3. Replace the job.<id>.guest symlink with the actual directory, exactly like we normally do for completed jobs. On restart, job-exec would need to change the directory back to a symlink when it recreates the namespace.
I think option 3 would be the path of least resistance, since it's what we already do for completed jobs, and it doesn't complicate the checkpoint key space, which we might want to reserve exclusively for defensive checkpoints of the primary namespace.
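As a rough illustration of why the third approach keeps the guest blobs alive, the sketch below compares two versions of the primary namespace in a toy content store: one where the job's guest entry is a symlink object, and one where it is the guest directory itself. Everything here is assumed for illustration (the object layout, the helper names `put`, `reachable`, and `build_primary`, the jobid `1234`, and the symlink fields); it does not reflect the actual KVS object format.

```python
import hashlib
import json


def put(store, obj):
    """Store a JSON-serializable object and return its content-addressed blobref."""
    data = json.dumps(obj, sort_keys=True).encode()
    ref = "sha1-" + hashlib.sha1(data).hexdigest()
    store[ref] = obj
    return ref


def reachable(store, rootref):
    """Return every blobref visited when walking directory entries from rootref."""
    seen = set()
    stack = [rootref]
    while stack:
        ref = stack.pop()
        if ref in seen:
            continue
        seen.add(ref)
        obj = store[ref]
        # Symlinks and values are leaves; only directories are descended into.
        if obj["type"] == "dir":
            stack.extend(obj["entries"].values())
    return seen


def build_primary(store, guest_entry_ref):
    """Build a primary root whose job.1234.guest entry is guest_entry_ref."""
    job_dir = put(store, {"type": "dir", "entries": {"guest": guest_entry_ref}})
    jobs = put(store, {"type": "dir", "entries": {"1234": job_dir}})
    return put(store, {"type": "dir", "entries": {"job": jobs}})


store = {}
guest_file = put(store, {"type": "val", "data": "job output"})
guest_root = put(store, {"type": "dir", "entries": {"output": guest_file}})

# While the job runs, the guest entry is a symlink into a separate namespace.
symlink = put(store, {"type": "symlink", "namespace": "guest-1234", "target": "."})
root_running = build_primary(store, symlink)

# Before the final checkpoint, the symlink is replaced by the directory itself.
root_checkpoint = build_primary(store, guest_root)

symlink_keeps_guest = guest_root in reachable(store, root_running)
dir_keeps_guest = guest_root in reachable(store, root_checkpoint)
```

Under these assumptions, only the checkpoint with the real directory entry makes the guest blobs reachable from the primary rootref. On restart, the entry would be swapped back to a symlink once the namespace is recreated.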