feat: optimizes file copies to and from containers #2450

codefromthecrypt · 2024-03-27T04:33:09Z

What does this PR do?

This changes the code that copies files to and from the container to use optimized functions. Doing so reduces buffering and gives the code a chance to use Go 1.22's optimized paths on Linux.
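
As a rough illustration of why this reduces buffering: `io.Copy` skips its intermediate buffer when the source implements `io.WriterTo` (or the destination implements `io.ReaderFrom`). The sketch below is not the PR's code; `countingSource` and `copyReportsFastPath` are hypothetical names used only to make the fast path observable.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// countingSource is a hypothetical wrapper that records whether io.Copy
// delegated to WriteTo instead of running its generic read/write loop.
type countingSource struct {
	r            *strings.Reader
	usedFastPath bool
}

// Read is the generic path; io.Copy only calls it when no fast path applies.
func (s *countingSource) Read(p []byte) (int, error) { return s.r.Read(p) }

// WriteTo makes countingSource an io.WriterTo, so io.Copy hands over the
// destination and lets the source write directly.
func (s *countingSource) WriteTo(w io.Writer) (int64, error) {
	s.usedFastPath = true
	return s.r.WriteTo(w)
}

// copyReportsFastPath copies payload through io.Copy and reports the number
// of bytes copied and whether the WriteTo path was taken.
func copyReportsFastPath(payload string) (int64, bool, error) {
	src := &countingSource{r: strings.NewReader(payload)}
	var dst bytes.Buffer
	n, err := io.Copy(&dst, src)
	return n, src.usedFastPath, err
}

func main() {
	n, fast, err := copyReportsFastPath("layer contents")
	if err != nil {
		panic(err)
	}
	fmt.Printf("copied %d bytes, WriterTo used: %v\n", n, fast)
}
```

With a real `*os.File` as the source, the same delegation is what lets the standard library pick a platform-specific copy where one exists.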

Why is it important?

I'm using rather large images in k3s. For example, Node.js images easily exceed 1 GB each. I found that this copy logic accounts for the majority of test fixture setup time; in our case, sometimes over a minute is spent here even when the images are available locally.

Related issues

Originally added in #347

How to test this PR

You can use k3s and its LoadImages function, which hits all the paths changed here.

Signed-off-by: Adrian Cole <[email protected]>

netlify · 2024-03-27T04:33:26Z

✅ Deploy Preview for testcontainers-go ready!

Name	Link
🔨 Latest commit	`61a8c3d`
🔍 Latest deploy log	https://app.netlify.com/sites/testcontainers-go/deploys/6603aae270d3130008b05817
😎 Deploy Preview	https://deploy-preview-2450--testcontainers-go.netlify.app

codefromthecrypt · 2024-03-30T00:50:47Z

I think the failure is a flake?

-  RyukDisabled: (bool) true,
+  RyukDisabled: (bool) false,

mdelapenya · 2024-04-03T10:48:12Z

Hey @codefromthecrypt, I've done a super quick and dirty benchmark for this improvement, and I'm sharing the results here.

I'm not doing this to argue against the PR, but to learn from the process, as I've been thinking about benchmarks more and more recently. So please take this as a learning exercise on my side; I'd love to receive feedback if possible.

Here I go!

Env

goos: darwin
goarch: arm64
GOMAXPROCS=8

Code

I created a LoadImagesOld method in the k3s module, which is a copy of the LoadImages one, but calling a copy of the old CopyFileToContainer method (before these changes). I also created a benchmark function to be run in sub-benchmarks: one for the new method, and one for the old one. Each benchmark will load the nginx image into the k3s cluster.

func BenchmarkLoadImages(b *testing.B) {
	// Give up to three minutes to run this test
	ctx, cancel := context.WithDeadline(context.Background(), time.Now().Add(3*time.Minute))
	defer cancel()

	k3sContainer, err := k3s.RunContainer(ctx,
		testcontainers.WithImage("docker.io/rancher/k3s:v1.27.1-k3s1"),
	)
	if err != nil {
		b.Fatal(err)
	}

	// Clean up the container
	defer func() {
		if err := k3sContainer.Terminate(ctx); err != nil {
			b.Fatal(err)
		}
	}()

	provider, err := testcontainers.ProviderDocker.GetProvider()
	if err != nil {
		b.Fatal(err)
	}

	// ensure nginx image is available locally
	err = provider.PullImage(ctx, "nginx")
	if err != nil {
		b.Fatal(err)
	}

	b.ResetTimer() // Reset the benchmark timer

	b.Run("Old copy method", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			// Give each iteration its own three-minute budget
			ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute)
			err := k3sContainer.LoadImagesOld(ctx, "nginx")
			cancel() // release the context now; a defer inside the loop would pile up until the function returns
			if err != nil {
				b.Fatal(err)
			}
		}
	})

	b.Run("New copy method", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			// Give each iteration its own three-minute budget
			ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute)
			err := k3sContainer.LoadImages(ctx, "nginx")
			cancel() // release the context now; a defer inside the loop would pile up until the function returns
			if err != nil {
				b.Fatal(err)
			}
		}
	})
}

Benchmark execution

Run the benchmarks 5 times, including memory statistics (bytes and allocations per operation):

go test -bench=. -benchmem -count 5 -run=^#

Benchmarks results

Benchmark	ns/op	B/op	allocs/op
BenchmarkLoadImages/Old_copy_method-8	33,176,551,708	460,569,136	3,702
BenchmarkLoadImages/Old_copy_method-8	33,715,342,958	460,480,384	2,715
BenchmarkLoadImages/Old_copy_method-8	31,288,304,583	460,465,152	2,632
BenchmarkLoadImages/Old_copy_method-8	32,554,835,500	460,480,336	2,716
BenchmarkLoadImages/Old_copy_method-8	31,915,499,583	460,471,904	2,714
BenchmarkLoadImages/New_copy_method-8	33,774,419,125	269,623,592	2,720
BenchmarkLoadImages/New_copy_method-8	31,926,620,417	269,623,272	2,707
BenchmarkLoadImages/New_copy_method-8	31,896,055,375	269,616,432	2,720
BenchmarkLoadImages/New_copy_method-8	33,230,731,916	269,640,440	2,827
BenchmarkLoadImages/New_copy_method-8	33,030,320,250	269,622,944	2,712

Results

With the above numbers, it seems clear that the bytes per operation are much lower with the new method (from about 460 MB to about 270 MB). The other two values, ns and allocations per operation, seem more or less the same: the code is not noticeably faster, nor does it produce fewer allocations, but it uses less memory.
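
To put a number on that, here is a small sketch that averages the B/op column from the table above and computes the relative reduction. The helper names (`mean`, `reductionPercent`) are made up for this example; only the data comes from the benchmark output. It works out to a reduction of roughly 41%.

```go
package main

import "fmt"

// B/op values copied from the five runs of each sub-benchmark above.
var (
	oldBytesPerOp = []float64{460569136, 460480384, 460465152, 460480336, 460471904}
	newBytesPerOp = []float64{269623592, 269623272, 269616432, 269640440, 269622944}
)

// mean returns the arithmetic mean of xs, or 0 for an empty slice.
func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// reductionPercent reports how much smaller the mean of newVals is than the
// mean of oldVals, as a percentage of the old mean.
// It assumes oldVals is non-empty with a non-zero mean.
func reductionPercent(oldVals, newVals []float64) float64 {
	return (1 - mean(newVals)/mean(oldVals)) * 100
}

func main() {
	fmt.Printf("mean old B/op: %.0f\n", mean(oldBytesPerOp))
	fmt.Printf("mean new B/op: %.0f\n", mean(newBytesPerOp))
	fmt.Printf("reduction: %.1f%%\n", reductionPercent(oldBytesPerOp, newBytesPerOp))
}
```

For a more rigorous comparison across runs, a tool such as benchstat would also report variance, which this sketch ignores.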

mdelapenya · 2024-04-03T10:50:16Z

docker.go

+	// In Go 1.22 os.File is always an io.WriterTo. However, testcontainers
+	// currently allows Go 1.21, so we need to trick the compiler a little.


We test against both versions of the language, although we always develop in the lowest one. It could be interesting to work the other way around: always develop in the latest release, and run the tests for both (current and current -1). Thoughts?

I think we should test at least two versions of Go anyway, and things like this usually don't come up too often.

In some projects I tend to develop on the latest version and test against the floor version, just because a lot of devs always use latest first.
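
For readers wondering what "trick the compiler" could look like in practice, here is a minimal sketch, not necessarily the code merged in this PR. Calling `f.WriteTo` directly on an `*os.File` would not compile on Go 1.21, so the file is first widened to an interface and then checked for `io.WriterTo` at runtime. `copyFile` and `roundTrip` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// copyFile copies the remaining contents of f into dst.
// The assertion goes through io.Reader so that the code compiles whether or
// not *os.File declares a WriteTo method in the Go version being used.
func copyFile(dst io.Writer, f *os.File) (int64, error) {
	var r io.Reader = f
	if wt, ok := r.(io.WriterTo); ok {
		// Optimized path, available when the runtime type implements WriteTo.
		return wt.WriteTo(dst)
	}
	// Fallback for Go versions where *os.File has no WriteTo method.
	return io.Copy(dst, r)
}

// roundTrip writes content to a temporary file and reads it back via copyFile.
func roundTrip(content string) (string, error) {
	f, err := os.CreateTemp("", "copy-sketch-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	if _, err := f.WriteString(content); err != nil {
		return "", err
	}
	// Rewind so the copy starts at the beginning of the file.
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		return "", err
	}

	var sb strings.Builder
	if _, err := copyFile(&sb, f); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	got, err := roundTrip("hello from a temp file")
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```

Once the minimum supported Go version is 1.22 or later, the interface indirection becomes unnecessary and the assertion could be dropped.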

mdelapenya

LGTM, thanks!

codefromthecrypt · 2024-04-03T13:00:03Z

Thanks for sharing the results @mdelapenya, and also for merging! I'm not too surprised by the benchmarks at the moment, as most of the potential is only implemented on Linux right now.

It could be neat to run the same benchmark in a Linux container to see how much difference it makes where the Go side is optimized, but that's not required on my side. Cheers!

feat: optimizes file copies to and from containers

4d36c20

Signed-off-by: Adrian Cole <[email protected]>

codefromthecrypt requested a review from a team as a code owner March 27, 2024 04:33

Adrian Cole added 2 commits March 27, 2024 13:58

drift

4649a1e

Signed-off-by: Adrian Cole <[email protected]>

go 1.21 defense

61a8c3d

Signed-off-by: Adrian Cole <[email protected]>

mdelapenya reviewed Apr 3, 2024

mdelapenya approved these changes Apr 3, 2024

mdelapenya self-assigned this Apr 3, 2024

mdelapenya added the feature New functionality or new behaviors on the existing one label Apr 3, 2024

mdelapenya merged commit 697c264 into testcontainers:main Apr 3, 2024
102 checks passed

codefromthecrypt deleted the reduce-buffering branch April 3, 2024 12:56

mdelapenya mentioned this pull request Apr 10, 2024

chore: create TLS certs in a consistent manner #2478

Merged
