-
Notifications
You must be signed in to change notification settings - Fork 582
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
overlord/snapshotstate/backend: introducing the snapshot backend #5066
Changes from 2 commits
81ae315
d400538
75c4e91
bb821f5
e43d1af
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,256 @@ | ||
// -*- Mode: Go; indent-tabs-mode: t -*- | ||
|
||
/* | ||
* Copyright (C) 2018 Canonical Ltd | ||
* | ||
* This program is free software: you can redistribute it and/or modify | ||
* it under the terms of the GNU General Public License version 3 as | ||
* published by the Free Software Foundation. | ||
* | ||
* This program is distributed in the hope that it will be useful, | ||
* but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
* GNU General Public License for more details. | ||
* | ||
* You should have received a copy of the GNU General Public License | ||
* along with this program. If not, see <http://www.gnu.org/licenses/>. | ||
* | ||
*/ | ||
|
||
package backend | ||
|
||
import ( | ||
"archive/zip" | ||
"crypto" | ||
"encoding/json" | ||
"fmt" | ||
"io" | ||
"os" | ||
"os/exec" | ||
"path/filepath" | ||
"regexp" | ||
"sort" | ||
"time" | ||
|
||
"golang.org/x/net/context" | ||
|
||
"github.com/snapcore/snapd/client" | ||
"github.com/snapcore/snapd/dirs" | ||
"github.com/snapcore/snapd/logger" | ||
"github.com/snapcore/snapd/osutil" | ||
"github.com/snapcore/snapd/snap" | ||
"github.com/snapcore/snapd/strutil" | ||
) | ||
|
||
const ( | ||
archiveName = "archive.tgz" | ||
metadataName = "meta.json" | ||
metaHashName = "meta.sha3_384" | ||
|
||
userArchivePrefix = "user/" | ||
userArchiveSuffix = ".tgz" | ||
) | ||
|
||
var ( | ||
dirOpen = os.Open | ||
dirNames = (*os.File).Readdirnames | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This will make the call site slightly non obvious There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I was aware of the tension between that and readability. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We do have a convention where we try to write those names close to their underlying packages and method/function names, precisely so that it's more readable. The second one would be over the top with fileReaddirnames indeed and still unclear given that it's a method, so dirNames sounds good there, but the first one reads properly with osOpen. Alternatively, given that osOpen is itself already mocked, it can return a different type, possibly an interface that has all the necessary methods. Another alternative is to provide a different directory location with the desired data instead of mocking all methods. That might be better in some cases because it ends up testing the real code, including the backing logic behavior (readdirnames itself, for instance), instead of faking its behavior. |
||
open = Open | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm a bit on on the fence with whether these should have their full type written down here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, having both Open and open on the same package is indeed somewhat unclear. Perhaps backendOpen? This at least would make it slightly less tempting to use one or the other interchangeably without understanding why. |
||
) | ||
|
||
// Iter loops over all snapshots in the snapshots directory, applying the | ||
// given function to each. The snapshot will be closed after the function | ||
// returns. If the function returns error, iteration is stopped (and if the | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. an error |
||
// error isn't EOF, it's returned as the error of the iterator). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The EOF handling here feels strange and might yield actual bugs, in the sense that this is being used as a marker to stop the iteration, but its true meaning is that it reached the end of something. For example, it's easy to get an EOF because the whole snapshot file was read out, and if this error is returned it will inadvertently stop the outside iteration silently. It would be safer to have a real marker, such as |
||
func Iter(ctx context.Context, f func(*Reader) error) error { | ||
if err := ctx.Err(); err != nil { | ||
return err | ||
} | ||
|
||
dir, err := dirOpen(dirs.SnapshotsDir) | ||
if err != nil { | ||
if osutil.IsDirNotExist(err) { | ||
// no dir -> no snapshots | ||
return nil | ||
} | ||
return fmt.Errorf("cannot open snapshots directory: %v", err) | ||
} | ||
defer dir.Close() | ||
|
||
var names []string | ||
for err == nil { | ||
names, err = dirNames(dir, 100) | ||
for _, name := range names { | ||
if err = ctx.Err(); err != nil { | ||
break | ||
} | ||
|
||
filename := filepath.Join(dirs.SnapshotsDir, name) | ||
rsh, openError := open(filename) | ||
// rsh can be non-nil even when openError is not nil (in | ||
// which case rsh.Broken will have a reason). f can | ||
// check and either ignore or return an error when | ||
// finding a broken snapshot. | ||
if rsh != nil { | ||
err = f(rsh) | ||
} else { | ||
// TODO: use warnings instead | ||
logger.Noticef("cannot open snapshot %q: %v", name, openError) | ||
} | ||
if openError == nil { | ||
// if openError was nil the snapshot was opened and needs closing | ||
if closeError := rsh.Close(); err == nil { | ||
err = closeError | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's probably okay to leave it like this, since we don't really expect the close error to ever be an issue since we're not writing, but FWIW there's something asymmetric here about open errors and close errors. An open error is ignored and passed onto f via broken, but a close error is transformed into a big deal and stops the iteration. The Close error seems like the least interesting one when we're not really touching the file here. But again, this will probably never fail, so perhaps okay. |
||
} | ||
} | ||
if err != nil { | ||
break | ||
} | ||
} | ||
} | ||
|
||
if err == io.EOF { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should probably be checked inside the loop, right after dirNames. It would be strange to observe this error elsewhere still. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The wording on Readdirnames makes me think that it might at some point return a non-empty list and a non-nil error: it's documented that if names is empty, then the error is non-nil. It does not say that a non-nil error implies names is empty. The implementation confirms this suspicion. I'll add a comment explaining this. |
||
err = nil | ||
} | ||
|
||
return err | ||
} | ||
|
||
// List valid snapshots sets. | ||
func List(ctx context.Context, setID uint64, snapNames []string) ([]client.SnapshotSet, error) { | ||
setshots := map[uint64][]*client.Snapshot{} | ||
err := Iter(ctx, func(sh *Reader) error { | ||
if setID == 0 || sh.SetID == setID { | ||
if len(snapNames) == 0 || strutil.ListContains(snapNames, sh.Snap) { | ||
setshots[sh.SetID] = append(setshots[sh.SetID], &sh.Snapshot) | ||
} | ||
} | ||
return nil | ||
}) | ||
|
||
sets := make([]client.SnapshotSet, 0, len(setshots)) | ||
for id, shots := range setshots { | ||
sort.Sort(bySnap(shots)) | ||
sets = append(sets, client.SnapshotSet{ID: id, Snapshots: shots}) | ||
} | ||
|
||
sort.Sort(byID(sets)) | ||
|
||
return sets, err | ||
} | ||
|
||
// Filename of the given client.Snapshot in this backend. | ||
func Filename(sh *client.Snapshot) string { | ||
// this _needs_ the snap name and version to be valid | ||
return filepath.Join(dirs.SnapshotsDir, fmt.Sprintf("%d_%s_%s.zip", sh.SetID, sh.Snap, sh.Version)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should revision be encoded in filename too? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, it sounds reasonable given it's the only thing that defines uniqueness, and it wouldn't hurt since it's just a few chars. We do have the revision inside, but the same would be true for the other fields. |
||
} | ||
|
||
// Save a snapshot | ||
func Save(ctx context.Context, id uint64, si *snap.Info, cfg map[string]interface{}, usernames []string) (sh *client.Snapshot, e error) { | ||
if err := os.MkdirAll(dirs.SnapshotsDir, 0700); err != nil { | ||
return nil, err | ||
} | ||
|
||
sh = &client.Snapshot{ | ||
SetID: id, | ||
Snap: si.Name(), | ||
Revision: si.Revision, | ||
Version: si.Version, | ||
Time: time.Now(), | ||
SHA3_384: make(map[string]string), | ||
Size: 0, | ||
Conf: cfg, | ||
} | ||
|
||
aw, err := osutil.NewAtomicFile(Filename(sh), 0600, 0, osutil.NoChown, osutil.NoChown) | ||
if err != nil { | ||
return nil, err | ||
} | ||
// if things worked, we'll commit (and Cancel becomes a NOP) | ||
defer aw.Cancel() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The idea is and pattern is great, but why isn't this Close? It's the typical way we use it anyway, so the explanation wouldn't even be necessary. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. atomic files have both Close and Cancel (Cancel closes, but not viceversa). I'm not sure we ever use Close on its own, and you're not the first person to ask this (and in fact we had a bug where an atomic file was closed instead of being committed or canceled), so it might be worth revisiting. |
||
|
||
w := zip.NewWriter(aw) | ||
// Note we don't “defer w.Close()” because Close on zip.Writers: | ||
// 1. doesn't close the file descriptor (that's done by Cancel, above) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That'd be implied if it was named Close. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. note this long comment was to explain the use of Close without a |
||
// 2. writes out the central directory of the zip | ||
// (yes they're weird in this way: why is that called Close, even) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Because Close is the thing that does the final changes to the file description, flushing anything pending and releasing resources allocated. That's a common pattern across many types in Go, even things that are completely in memory. Besides clarity, the consistency also benefits use cases that have an interface for the Close method. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. if you can't safely do There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There's nothing unsafe about calling it as far as I can see. |
||
if err := addToZip(ctx, sh, w, archiveName, si.DataDir()); err != nil { | ||
return nil, err | ||
} | ||
|
||
users, err := usersForUsernames(usernames) | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
for _, usr := range users { | ||
if err := addToZip(ctx, sh, w, userArchiveName(usr), si.UserDataDir(usr.HomeDir)); err != nil { | ||
return nil, err | ||
} | ||
} | ||
|
||
metaWriter, err := w.Create(metadataName) | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
hasher := crypto.SHA3_384.New() | ||
enc := json.NewEncoder(io.MultiWriter(metaWriter, hasher)) | ||
if err := enc.Encode(sh); err != nil { | ||
return nil, err | ||
} | ||
|
||
hashWriter, err := w.Create(metaHashName) | ||
if err != nil { | ||
return nil, err | ||
} | ||
fmt.Fprintf(hashWriter, "%x\n", hasher.Sum(nil)) | ||
if err := w.Close(); err != nil { | ||
return nil, err | ||
} | ||
|
||
if err := ctx.Err(); err != nil { | ||
return nil, err | ||
} | ||
|
||
if err := aw.Commit(); err != nil { | ||
return nil, err | ||
} | ||
|
||
return sh, nil | ||
} | ||
|
||
func addToZip(ctx context.Context, sh *client.Snapshot, w *zip.Writer, entry, dir string) error { | ||
hasher := crypto.SHA3_384.New() | ||
if exists, isDir, err := osutil.DirExists(dir); !exists || !isDir || err != nil { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's a bit strange for this operation to not be an error here when isDir is false. I mean, the call site asked to add it, it existed, and even then it did nothing, and said nothing. The convenience of having multiple call sites working fine with this here perhaps wins of that, but it's at least worth noting. Perhaps tuning the function name so it's more indicative of that detail might help (addIfDir?). Either way, not a blocker. Just for your consideration. |
||
return err | ||
} | ||
parent, dir := filepath.Split(dir) | ||
|
||
archiveWriter, err := w.CreateHeader(&zip.FileHeader{Name: entry}) | ||
if err != nil { | ||
return err | ||
} | ||
|
||
var sz sizer | ||
|
||
cmd := exec.Command("tar", | ||
"--create", | ||
"--sparse", "--gzip", | ||
"--directory", parent, dir, "common") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What happens if parent is a symlink to someone else's data? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Bad stuff, which is what #4983 would let us avoid. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hm, having said that, I need to check whether that plays well with exec. It might not. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sigh. It doesn't. I'll have to use |
||
cmd.Env = []string{"GZIP=-9 -n"} | ||
cmd.Stdout = io.MultiWriter(archiveWriter, hasher, &sz) | ||
matchCounter := &strutil.MatchCounter{Regexp: regexp.MustCompile(".*"), N: 1} | ||
cmd.Stderr = matchCounter | ||
if err := osutil.RunWithContext(ctx, cmd); err != nil { | ||
matches, count := matchCounter.Matches() | ||
if count > 0 { | ||
return fmt.Errorf("cannot create archive: %s (and %d more)", matches[0], count-1) | ||
} | ||
return fmt.Errorf("tar failed: %v", err) | ||
} | ||
|
||
sh.SHA3_384[entry] = fmt.Sprintf("%x", hasher.Sum(nil)) | ||
sh.Size += sz.size | ||
|
||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear that this is the proper place for this. Spool is generally associated with things which are there just for a short while before going away into another system, such as email, print jobs, and similar.
From the Filesystem Hierarchy Standard:
It doesn't seem quite right for snapshots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/var/lib/snapd/snapshots
would be fine