Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix startup data corruption on persistent rootfs #776

Merged
merged 4 commits into from
Oct 12, 2018

Conversation

ivan4th
Copy link
Contributor

@ivan4th ivan4th commented Oct 9, 2018

There was another problem with caching when setting up the device mapper for the persistent rootfs.

  • fix e2e tests that were reusing same file for a loopback block device
  • try fixing cache issue using a "brute-force" write barrier (sync + drop kernel caches)
  • do multiple runs to confirm that "brute-force" approach works
  • implement less broad sync to fix the issue
  • do multiple test runs to confirm that the "less broad sync" approach works

This change is Reviewable

Copy link
Contributor

@jellonek jellonek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 approvals obtained (waiting on @ivan4th)


pkg/libvirttools/persistentroot_volumesource.go, line 71 at r1 (raw file):

	imagePath, imageDigest, imageSize, err := v.owner.ImageManager().GetImagePathDigestAndVirtualSize(v.config.Image)
	if err != nil {
		glog.V(4).Infof("Persistent rootfs setup on %q: image info error %v", v.dev.HostPath, err)

Imo there is no need to repeat info about persistent rootfs setup on %q as it's there above.
Also if it's an error - verbosity level should be tuned to something more visible. Maybe V(2)? or even V(1)?
So my proposal will be there (and similarly in next occurrences below): Image info error: %v as this and next log are missing :.

Or are these logs there only for WIP time, to debug the thing?


tests/e2e/common.go, line 203 at r1 (raw file):

		Expect(err).NotTo(HaveOccurred())

		_, err = framework.RunSimple(nodeExecutor, "dd", "if=/dev/zero", "of="+filename, "bs=1M", "count=1000")

Maybe we should tune this count value to something smaller, as 1000 can be sluggish.
I know that it was there already, but in the same time maybe we could tune it a bit there?

@ivan4th ivan4th force-pushed the ivan4th/fix-another-persistent-rootfs-cache-issue branch 2 times, most recently from 5308331 to d757062 Compare October 11, 2018 11:20
Copy link
Contributor Author

@ivan4th ivan4th left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 approvals obtained (waiting on @ivan4th)


pkg/libvirttools/persistentroot_volumesource.go, line 71 at r1 (raw file):

Previously, jellonek (Piotr Skamruk) wrote…

Imo there is no need to repeat info about persistent rootfs setup on %q as it's there above.
Also if it's an error - verbosity level should be tuned to something more visible. Maybe V(2)? or even V(1)?
So my proposal will be there (and similarly in next occurrences below): Image info error: %v as this and next log are missing :.

Or are these logs there only for WIP time, to debug the thing?

The error will propagate further and will be reported in any case as an error. This Info is just for the case when some further debugging is necessary. Also, it includes the name of the device on the host to facilitate grepping through the logs -- which would be harder if only just one msg included the path.


tests/e2e/common.go, line 203 at r1 (raw file):

Previously, jellonek (Piotr Skamruk) wrote…

Maybe we should tune this count value to something smaller, as 1000 can be sluggish.
I know that it was there already, but in the same time maybe we could tune it a bit there?

It may be a bit sluggish but this way it's compatible with a wider range of images. A better approach would be probably checking qcow2 size (internal) in the test but perhaps this can be done as a followup.

@ivan4th ivan4th changed the title [WiP] Fix startup data corruption on persistent rootfs Fix startup data corruption on persistent rootfs Oct 12, 2018
Copy link
Contributor

@jellonek jellonek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 4 files at r1, 2 of 2 files at r2.
Reviewable status: 0 of 2 approvals obtained (waiting on @jellonek)


pkg/libvirttools/persistentroot_volumesource.go, line 71 at r1 (raw file):

Previously, ivan4th (Ivan Shvedunov) wrote…

The error will propagate further and will be reported in any case as an error. This Info is just for the case when some further debugging is necessary. Also, it includes the name of the device on the host to facilitate grepping through the logs -- which would be harder if only just one msg included the path.

So it's ok. So last very minor thing is lack of : before %v there and in next message ;)

Copy link
Contributor

@lukaszo lukaszo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 change requests, 0 of 2 approvals obtained (waiting on @ivan4th)

@ivan4th ivan4th force-pushed the ivan4th/fix-another-persistent-rootfs-cache-issue branch from 54bea3c to f1c883f Compare October 12, 2018 13:23
Copy link
Contributor Author

@ivan4th ivan4th left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 change requests, 1 of 2 approvals obtained (waiting on @jellonek)


pkg/libvirttools/persistentroot_volumesource.go, line 71 at r1 (raw file):

Previously, jellonek (Piotr Skamruk) wrote…

So it's ok. So last very minor thing is lack of : before %v there and in next message ;)

Done.

Copy link
Contributor

@jellonek jellonek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r3.
Reviewable status: 1 change requests, 1 of 2 approvals obtained

@jellonek jellonek merged commit 4d925c7 into master Oct 12, 2018
@jellonek jellonek deleted the ivan4th/fix-another-persistent-rootfs-cache-issue branch October 12, 2018 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants