Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Repo write operations (e.g. pulls, commits) should respect exclusive locks even when not using transaction API #2474

Closed
jlebon opened this issue Oct 26, 2021 · 9 comments

Comments

@jlebon
Copy link
Member

jlebon commented Oct 26, 2021

Right now, there is nothing preventing e.g. ostree pull/pull-local to write things to the repo while it's actively being pruned. This introduces possible races resulting in incomplete commit objects. E.g. an object might not get pulled because it's already present, but the pruner has already decided it is unreachable and plans to delete it.

It looks like ostree commit currently always uses transactions, though rpm-ostree compose tree is susceptible to this because it will disable transactions in some situations. We should adapt it to try to get a shared lock if not using transactions.

ostree pull and ostree pull-local don't currently use transactions AFAICT, so they are susceptible as well. We could use transactions there, though that would change the semantics. Just getting a shared lock would be enough and be more lightweight.

I guess another approach would be to do this at the API level, though there's quite a few, and it'd be more invasive and prone to cause regressions.

@jlebon
Copy link
Member Author

jlebon commented Oct 26, 2021

More context for this in coreos/fedora-coreos-releng-automation#79.

jlebon added a commit to jlebon/rpm-ostree that referenced this issue Oct 26, 2021
In the case we're using transactions, that's already done for us.
Otherwise, we should do it ourselves. Otherwise we run the risk of
racing with the pruner.

For more context, see:
ostreedev/ostree#2474
coreos/fedora-coreos-releng-automation#79
jlebon added a commit to jlebon/rpm-ostree that referenced this issue Oct 26, 2021
In the case we're using transactions, that's already done for us.
Otherwise, we should do it ourselves. Otherwise we run the risk of
racing with e.g. a prune operation.

For more context, see:
ostreedev/ostree#2474
coreos/fedora-coreos-releng-automation#79
@jlebon
Copy link
Member Author

jlebon commented Oct 26, 2021

I was re-reading the pull code more carefully and actually it does prepare a transaction by default, unless inherit-transaction is passed:

pull_data->legacy_transaction_resuming = FALSE;
if (!inherit_transaction &&
!ostree_repo_prepare_transaction (pull_data->repo, &pull_data->legacy_transaction_resuming,
cancellable, error))
goto out;

So I think ostree pull/ostree pull-local are safe too. And that leaves rpm-ostree compose tree. Patch for that in coreos/rpm-ostree#3193.

@dustymabe
Copy link
Contributor

And that leaves rpm-ostree compose tree. Patch for that in coreos/rpm-ostree#3193.

Nice.. In the case of rpm-ostree compose tree most users of that are probably using transactions already, right? Are there other ways to disable transactions other than the "running on NFS" case?

@jlebon
Copy link
Member Author

jlebon commented Oct 26, 2021

So I think ostree pull/ostree pull-local are safe too. And that leaves rpm-ostree compose tree. Patch for that in coreos/rpm-ostree#3193.

Just to sanity-check, I've confirmed this with:

diff --git a/src/libostree/ostree-repo-prune.c b/src/libostree/ostree-repo-prune.c
index c4ce64ab..dfb9be88 100644
--- a/src/libostree/ostree-repo-prune.c
+++ b/src/libostree/ostree-repo-prune.c
@@ -404,6 +404,12 @@ ostree_repo_prune (OstreeRepo        *self,
   if (!lock)
     return FALSE;

+  if (g_getenv ("OSTREE_PRUNE_LOCKED") != NULL)
+    {
+      g_print ("took lock and stopping\n");
+      raise (SIGSTOP);
+    }
+
   g_autoptr(GHashTable) objects = NULL;
   gboolean refs_only = flags & OSTREE_REPO_PRUNE_FLAGS_REFS_ONLY;

@@ -490,6 +496,13 @@ ostree_repo_prune_from_reachable (OstreeRepo        *self,
   if (!lock)
     return FALSE;

+  if (g_getenv ("OSTREE_PRUNE_LOCKED") != NULL)
+    {
+      g_print ("took lock and stopping\n");
+      raise (SIGSTOP);
+    }
+
+
   g_autoptr(GHashTable) objects = NULL;

   if (!ostree_repo_list_objects (self, OSTREE_REPO_LIST_OBJECTS_ALL | OSTREE_REPO_LIST_OBJECTS_NO_PARENTS,

Then in one terminal, running:

$ sudo env OSTREE_PRUNE_LOCKED=1 ostree prune
took lock and stopping

[1]+  Stopped                 sudo env OSTREE_PRUNE_LOCKED=1 ostree prune
$

And running ostree pull in another terminal:

$ sudo ostree pull --commit-metadata-only fedora:fedora/x86_64/coreos/next

error: Locking repo shared failed: Resource temporarily unavailable

And similarly for pull-local:

$ sudo ostree pull-local ../pkgcache-repo rpmostree/pkg/libssh/0.9.5-3.fc35.x86__64

error: Locking repo shared failed: Resource temporarily unavailable

@dbnicholson
Copy link
Member

I don't think this is true. This was one of my primary motivations for writing the locking. Pruning takes an exclusive lock and a transaction takes a shared lock. Unless you're somehow avoiding the transaction in the pull, then you should avoid this. BTW, --verbose or G_MESSAGES_DEBUG=OSTree will spew a bunch of locking details.

@jlebon
Copy link
Member Author

jlebon commented Oct 26, 2021

Hey @dbnicholson, can you specify which part isn't correct? I'm not sure if you saw the follow-up in #2474 (comment) or not.

@dbnicholson
Copy link
Member

Sorry, I just didn't read the entire thread before commenting. You're correct in what you've written that pulls are safe from the lock taken when preparing the transaction. If you're composing a tree from the repo, though, then you currently would need to take the lock yourself. It might be possible to do it in ostree when constructing an mtree, but that's really the type of multi step operation where you need the composer to do the locking since it's the one that knows when the operation is done.

@jlebon
Copy link
Member Author

jlebon commented Oct 26, 2021

Sorry, I just didn't read the entire thread before commenting. You're correct in what you've written that pulls are safe from the lock taken when preparing the transaction. If you're composing a tree from the repo, though, then you currently would need to take the lock yourself. It might be possible to do it in ostree when constructing an mtree, but that's really the type of multi step operation where you need the composer to do the locking since it's the one that knows when the operation is done.

Right yup, agreed. This is the gap that coreos/rpm-ostree#3193 fixes for rpm-ostree. It makes sense to bake this into the pull API because the entrypoint is more well-defined (basically just a call to ostree_repo_pull_with_options) compared to committing.

cgwalters pushed a commit to coreos/rpm-ostree that referenced this issue Oct 26, 2021
In the case we're using transactions, that's already done for us.
Otherwise, we should do it ourselves. Otherwise we run the risk of
racing with e.g. a prune operation.

For more context, see:
ostreedev/ostree#2474
coreos/fedora-coreos-releng-automation#79
@jlebon
Copy link
Member Author

jlebon commented Oct 26, 2021

With coreos/rpm-ostree#3193 merged, I think we can close this.

@jlebon jlebon closed this as completed Oct 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants