Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

daemon: Unify pkgcache with system repo #1055

Closed
wants to merge 4 commits into from

Conversation

jlebon
Copy link
Member

@jlebon jlebon commented Oct 13, 2017

We originally needed the pkgcache to be a separate repo due to ostree's
overzealous pruning policies. The idea was to maintain multiple commits
in each pkg branch for different SELinux policies. In practice, there's
not much use in maintaining old copies and it's just easier to always
relabel on the fly. So then, the need for a separate repo completely
melts away.

This helps simplify the core a bunch and just generally makes it more
pleasant to work with. It also allows us to avoid subtle issues like
#1047.

The tricky bit is migrating the cache. In theory, we only really need to
migrate local RPM overrides/overlays, but if we're going to build
support for that, we might as well migrate it all. We do this by
checking at key points where the cache will be used whether a migration
is necessary. After migrating once, it boils down to one fstatat() on
subsequent runs to check that the repo no longer exists.

@cgwalters
Copy link
Member

Wow awesome!

@jlebon
Copy link
Member Author

jlebon commented Oct 13, 2017

Marking as WIP because I only tested this lightly so far and would like to write at least a basic test for it. I'm also thinking of moving the migration to daemon start. E.g. even with 64 pkgs, it only takes about 3-4 seconds.

@jlebon jlebon added WIP jira for syncing to jira and removed jira for syncing to jira labels Oct 13, 2017
@rh-atomic-bot
Copy link

☔ The latest upstream changes (presumably 4d1b5b4) made this pull request unmergeable. Please resolve the merge conflicts.

@cgwalters
Copy link
Member

I think my main concern here is rollback/deploy to older. If one does a rollback from an rpm-ostree with this commit to an older one, the old one will still try to use the pkgcache. Rolling forward again to the new one will lead to a situation where refs are in both repos. How do we handle that situation, particularly if the user upgrades again?

I think the "rollback requires redownloading" model is probably OK, but we need to be careful to ensure that we always migrate refs in the pkgcache or so.

@jlebon
Copy link
Member Author

jlebon commented Oct 17, 2017

I originally considered just leaving a symlink behind, which I think should be safe, since e.g. pruning only ever prunes our specific prefix. But yeah, it's hacky and could limit us if we had to support that forever. Much better to just re-import as needed.

I think the "rollback requires redownloading" model is probably OK, but we need to be careful to ensure that we always migrate refs in the pkgcache or so.

One hiccup is that the pkgcache is not a pure cache. Notably, locally installed RPMs can't be redownloaded. Though I think I'm OK with requiring users to provide the RPMs again when redeploying from an older version.

@jlebon
Copy link
Member Author

jlebon commented Nov 17, 2017

While working on rebasing this, one thing that comes up is that with the new --ex-unified-core, we currently use a separate repo as pure pkgcache. If we unify them, then there are two concerns that arise:

  1. We would need to start thinking about garbage collection, which is cumbersome and bound to be less useful than keeping them separate.
  2. It complicates the workflow if the target repo is directly synced to the production location, where we don't want pkgcache refs to show up. At least Fedora does an intermediate pull-local for the specific ref, though it's still a behaviour change to watch out for. Although, right now --ex-unified-core requires a BARE_USER repo anyway.

Hmm, maybe the cleaner solution is to just always do the final commit in the pkgcache repo (essentially being unified from the core point of view), and then do a pull-local into the target repo. This would also allow --ex-unified-core to transparently work with archive repos. We would also correctly respect --cache-dir.

@cgwalters
Copy link
Member

Right, the requirement for --ex-unified-core to target BARE_USER is specifically so we don't have to worry about the "writing directly to prod repo" case.

But can't we keep the "separate pkgcache repo" code just for the compose side, and still do the unification on the client? ostree refs is already noisy if one starts using atomic syscontainers.

@jlebon
Copy link
Member Author

jlebon commented Nov 17, 2017

But can't we keep the "separate pkgcache repo" code just for the compose side, and still do the unification on the client? ostree refs is already noisy if one starts using atomic syscontainers.

Right, definitely. Just to clarify, when I said:

Hmm, maybe the cleaner solution is to just always do the final commit in the pkgcache repo (essentially being unified from the core point of view), and then do a pull-local into the target repo.

I meant this to only apply to the --ex-unified-core case. I.e. always use a separate pkgcache repo to import and commit the assembly and then pull local into the final repo. In the --cache-dir case of course, we reuse the same pkgcache repo. On the client side though, we definitely don't do that and keep it all in the system repo. So then we can simplify the core and drop rpmostree_pull_content_only. Is that what you had in mind?

@cgwalters
Copy link
Member

I see what you're getting at, yeah we probably need the "pull content" code for the unified core path. At this point we should put that in libostree.

@jlebon
Copy link
Member Author

jlebon commented Nov 20, 2017

OK, rebased onto master and with tests! ⬆️

I ended up mostly leaving the core untouched for now and only taking care of the unification on the client side. Seeing as the --ex-unified-core and jigdo work will most likely affect the core design, it seemed safer to just let it be for now so we can rework it properly later. This does mean we don't yet reap the benefits of a simpler core, as promised on the outset. Though this PR makes us more ready for that.

@jlebon jlebon changed the title WIP: daemon: unify pkgcache with system repo daemon: Unify pkgcache with system repo Nov 20, 2017
@jlebon jlebon removed the WIP label Nov 20, 2017
@@ -727,6 +727,11 @@ rpmostreed_sysroot_populate (RpmostreedSysroot *self,
if (!ostree_sysroot_get_repo (self->ot_sysroot, &self->repo, cancellable, error))
return FALSE;

/* Migrate legacy pkgcache repo into system repo. After the first time, this boils down to
* one stat() call. */
if (!rpmostree_migrate_pkgcache_repo (self->repo, cancellable, error))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a potentially expensive operation to do while blocking the main loop on startup. Feels like we should model this as a transaction?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(In practice for me at least my main system is NVMe and I barely notice IO but still I think the separate txn is the right thing to do)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually had this be part of the deploy transaction before. I switched to occur at daemon startup since even large numbers of pkgs didn't incur a noticeable delay. Though yeah, that's a skewed perspective here too since I have fast storage. :)

Giving it its own transaction is an interesting idea. Though since it shouldn't be very long-lived, it seems like a bit overkill vs e.g. sneaking it into the deploy transaction?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, doing it on-demand in deploy seems nicer to me indeed; that way we only change things if the admin changes the current deployment after booting too, so the "update, find breakage, rollback" scenario will still be OK right?

@cgwalters
Copy link
Member

I've thought in the past of having something like a "libostree version" in the commit metadata so that we could use new formats or techniques only when all deployments have that version. If we had such a thing we could in this case migrate the refs back into the separate pkgcache repo if an older OS was deployed (also, only do the migration if all deployments support it).

@jlebon
Copy link
Member Author

jlebon commented Nov 28, 2017

Hmm, so thinking more about this, I think this would have been the first truly backwards incompatible change. I ended up just going the symlink route. This ensures that we fully retain backwards compatibility with older rpm-ostree setups, and e.g. don't lose any local RPMs.

I changed the commit message to reflect the new approach, but split the changes into a separate fixup. ⬆️

@jlebon
Copy link
Member Author

jlebon commented Nov 28, 2017

I've thought in the past of having something like a "libostree version" in the commit metadata so that we could use new formats or techniques only when all deployments have that version.

That makes sense. The tricky part is that it'd have to reflect the version of rpm-ostree that's in the compose, not the one that actually did the compose. Given that rolling back/deploying older versions is part of the model, maybe we should determine what our compatibility guarantee should be, e.g. something like "every rpm-ostree release must be compatible with releases from the past year" or some other defined window of time. And then later we can prune out those workarounds so we're not constantly accumulating them.

@cgwalters
Copy link
Member

Fixup is against the wrong commit right?

}

/* leave a symlink for compatibility with older rpm-ostree versions */
if (symlinkat ("../..", repo_dfd, RPMOSTREE_OLD_PKGCACHE_DIR) < 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah, magic.

@jlebon
Copy link
Member Author

jlebon commented Nov 29, 2017

Proper fixup! ⬆️

@rh-atomic-bot
Copy link

☔ The latest upstream changes (presumably 9c004e1) made this pull request unmergeable. Please resolve the merge conflicts.

No functional changes.
Prep for unified repo work.
We originally needed the pkgcache to be a separate repo due to ostree's
overzealous pruning policies. The idea was to maintain multiple commits
in each pkg branch for different SELinux policies. In practice, there's
not much use in maintaining old copies and it's just easier to always
relabel on the fly. So then, the need for a separate repo completely
melts away.

This helps simplify the mental model a bit and allows us to avoid subtle
issues like coreos#1047. Note however that the core is still capable of
handling split repos for the `--ex-unified-core` compose use case. Once
that and the jigdo work are a bit more settled, we can have a clearer
picture of how to simplify the core further.

The tricky bit is migrating the cache. When deploying, we check if a
pkgcache repo exists and migrate its refs if so. We then leave behind a
symlink to the system repo to remain compatible with older rpm-ostrees.
@jlebon
Copy link
Member Author

jlebon commented Nov 29, 2017

Green and rebased! ☑️

@jlebon
Copy link
Member Author

jlebon commented Dec 1, 2017

Let's get this one in?

@cgwalters
Copy link
Member

Ooops yep, sorry dropped off my radar 📡 LGTM!

@rh-atomic-bot r+ 3a31204

@rh-atomic-bot
Copy link

⌛ Testing commit 3a31204 with merge 7056e6b...

rh-atomic-bot pushed a commit that referenced this pull request Dec 1, 2017
We originally needed the pkgcache to be a separate repo due to ostree's
overzealous pruning policies. The idea was to maintain multiple commits
in each pkg branch for different SELinux policies. In practice, there's
not much use in maintaining old copies and it's just easier to always
relabel on the fly. So then, the need for a separate repo completely
melts away.

This helps simplify the mental model a bit and allows us to avoid subtle
issues like #1047. Note however that the core is still capable of
handling split repos for the `--ex-unified-core` compose use case. Once
that and the jigdo work are a bit more settled, we can have a clearer
picture of how to simplify the core further.

The tricky bit is migrating the cache. When deploying, we check if a
pkgcache repo exists and migrate its refs if so. We then leave behind a
symlink to the system repo to remain compatible with older rpm-ostrees.

Closes: #1055
Approved by: cgwalters
@rh-atomic-bot
Copy link

☀️ Test successful - status-atomicjenkins
Approved by: cgwalters
Pushing 7056e6b to master...

@jlebon jlebon deleted the pr/unify-repos branch December 6, 2017 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants