-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent concurrent image pulls of same imageRef #159
Conversation
Hi @towe75 👋 Sorry for the delay in getting this reviewed 😅 The changes look good, but I think we can get the same behaviour in a simpler way by using the The main difference with that approach is that we can't really tell if there's another tasking downloading the image, the package will just block when From their perspective, they would just see this for all tasks: So it doesn't really matter which task actually downloaded the image. With regards to testing, yeah, I can't think of an easy way to test this 🤔 One idea was to somehow mock and wrap the |
@lgfa29 sorry, the "compare" link does not work for me? I will have a look at the |
Oh, that's weird. Here's the exact commit with the change then, see if this one works: 073987e It's also a pretty small change, so here's the full diff just in case: diff --git a/driver.go b/driver.go
index 2db8b35..6f26f1f 100644
--- a/driver.go
+++ b/driver.go
@@ -30,6 +30,7 @@ import (
"github.com/hashicorp/nomad/plugins/shared/hclspec"
pstructs "github.com/hashicorp/nomad/plugins/shared/structs"
spec "github.com/opencontainers/runtime-spec/specs-go"
+ "golang.org/x/sync/singleflight"
)
const (
@@ -104,6 +105,9 @@ type Driver struct {
systemInfo api.Info
// Queried from systemInfo: is podman running on a cgroupv2 system?
cgroupV2 bool
+
+ // singleflight group to prevent parallel image downloads
+ pullGroup singleflight.Group
}
// TaskState is the state which is encoded in the handle returned in
@@ -764,9 +768,19 @@ func (d *Driver) createImage(image string, auth *AuthConfig, forcePull bool, cfg
Username: auth.Username,
Password: auth.Password,
}
- if imageID, err = d.podman.ImagePull(d.ctx, imageName, imageAuth); err != nil {
- return imageID, fmt.Errorf("failed to start task, unable to pull image %s : %w", imageName, err)
+
+ result, err, _ := d.pullGroup.Do(imageName, func() (interface{}, error) {
+ id, err := d.podman.ImagePull(d.ctx, imageName, imageAuth)
+ if err != nil {
+ return "", fmt.Errorf("failed to start task, unable to pull image %s : %w", imageName, err)
+ }
+ return id, nil
+ })
+ if err != nil {
+ return "", err
}
+
+ imageID = result.(string)
d.logger.Debug("Pulled image ID", "imageID", imageID)
return imageID, nil
}
diff --git a/go.mod b/go.mod
index 58d0047..727e842 100644
--- a/go.mod
+++ b/go.mod
@@ -22,5 +22,6 @@ require (
github.com/onsi/ginkgo v1.13.0 // indirect
github.com/opencontainers/runtime-spec v1.0.3-0.20200929063507-e6143ca7d51d
github.com/stretchr/testify v1.7.0
+ golang.org/x/sync v0.0.0-20201207232520-09787c993a3a
)
diff --git a/go.sum b/go.sum
index f62ca83..170fdc5 100644
--- a/go.sum
+++ b/go.sum
@@ -731,8 +731,6 @@ github.com/mitchellh/copystructure v1.0.0/go.mod h1:SNtv71yrdKgLRyLFxmLdkAbkKEFW
github.com/mitchellh/go-homedir v1.0.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrkLzIz1N1q0pr0=
github.com/mitchellh/go-homedir v1.1.0 h1:lukF9ziXFxDFPkA1vsr5zpc1XuPDn/wFntq5mG+4E0Y=
github.com/mitchellh/go-homedir v1.1.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrkLzIz1N1q0pr0=
-github.com/mitchellh/go-linereader v0.0.0-20190213213312-1b945b3263eb h1:GRiLv4rgyqjqzxbhJke65IYUf4NCOOvrPOJbV/sPxkM=
-github.com/mitchellh/go-linereader v0.0.0-20190213213312-1b945b3263eb/go.mod h1:OaY7UOoTkkrX3wRwjpYRKafIkkyeD0UtweSHAWWiqQM=
github.com/mitchellh/go-ps v0.0.0-20190716172923-621e5597135b/go.mod h1:r1VsdOzOPt1ZSrGZWFoNhsAedKnEd6r9Np1+5blZCWk=
github.com/mitchellh/go-ps v1.0.0 h1:i6ampVEEF4wQFF+bkYfwYgY+F/uYJDktmvLPf7qIgjc=
github.com/mitchellh/go-ps v1.0.0/go.mod h1:J4lOc8z8yJs6vUwklHw2XEIiT4z4C40KtWVN3nvg8Pg= |
It uses now the @lgfa29 please review again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, that event was more of an implementation detail that was not relevant to users.
Just missing a CHANGELOG entry, but looks good and much simpler 🙂
Changelog
podman does not establish a lock when it pulls an image.
There is some movement towards a more efficient implementation via containers/podman#1911
Other people workaround this problem by having a lock in another layer like e.g. cri-o/cri-o#3409
This PR adds such a mechanism to nomad-driver-podman. It prevents to pull the same image on the same host concurrently. A good, manual, check is to run a job with e.g. 5x redis and to ensure that the image is locally not available. First allocation will pull the image, 4 others will wait until it is ready. Without this PR we would try to download it 5 times parallel, wasting time and bandwith.
There is no unit test because i could not figure out a good way to code it. Maybe i can get a recommandation?