Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrades Envoy 1.11 -> 1.12 and Istio 1.2 -> 1.6 #168

Merged
merged 4 commits into from
Apr 12, 2021

Conversation

codefromthecrypt
Copy link
Contributor

@codefromthecrypt codefromthecrypt commented Apr 7, 2021

This upgrades Envoy from 1.11 -> 1.12 and Istio from 1.2 -> 1.6 and fixes the MacOS build.

Since Envoy 1.12 fixes /dev/stdout on OS/x, EnableEnvoyLogCollection can now work on macOS. Upgrading further will be in another pull request.

This also fixes functionality related to lsof and debug. Before, we didn't use timeouts, which could result in hangs. Now, we wait up to 3 seconds to capture this information.

fixes #88

General notes

This also fixes race conditions by reading back the admin port in tests via the "--admin-address-path" flag as implemented in envoy.EnableAdminAddressDetection

This intentionally stops at Istio 1.6 as versions beyond this imply configuration changes to our test data.

This should also fix #88 (macOS support)

On go:embed for the istio envoy bootstrap template

Before, our unit tests would download a 30MB+ targz and copy it to verify that a json file from envoy was still the same contents. This replaces that code with go:embed so we don't need to test it (because there's no risk of code formatters or anything else interfering with it.

This removes a large amount of the unit test runtime and with it a test that isn't good to execute -race against!

$ go test ./pkg/binary/envoy/controlplane -run TestIstioBootStrapTemplateEqualsReleaseJson
ok  	github.com/tetratelabs/getenvoy/pkg/binary/envoy/controlplane	7.127s

@codefromthecrypt codefromthecrypt changed the title Upgrades archiver lib and makes bootstrap template test streaming Upgrades archiver lib and makes bootstrap template test simpler Apr 7, 2021
@codefromthecrypt codefromthecrypt changed the title Upgrades archiver lib and makes bootstrap template test simpler Upgrades archiver lib and replaces istio.tmpl.go with go:embed Apr 7, 2021
@codefromthecrypt
Copy link
Contributor Author

tests were failing because docker install failed for rust+osx, then later they failed because I kicked the build too many times (docker.io rate limit) oh well

@codefromthecrypt
Copy link
Contributor Author

added a commit to hopefully help with #149

@codefromthecrypt
Copy link
Contributor Author

I will repurpose this PR to fix all the debug/controlplane tests as the build is so flakey it is annoying to have to keep re-kicking it

@codefromthecrypt codefromthecrypt changed the title Upgrades archiver lib and replaces istio.tmpl.go with go:embed Upgrades Envoy 1.11 -> 1.12 and Istio 1.2 -> 1.6 Apr 9, 2021
run: |
go run cmd/getenvoy/main.go fetch standard:1.11.0
go run cmd/getenvoy/main.go fetch standard:1.12.7
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's update this to latest, 1.17.1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some issues in doing that related to dropped config v2 support, so want to make a step first. Happy to do that later according to #169 sg?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah let's do v3 support and upgrading to 1.17.1 in a separate PR

// EnableAdminAddressDetection sets the "--admin-address-path" flag and reads it back before attempting ready checks.
//
// Notably, this allows ephemeral admin ports via bootstrap configuration admin/port_value=0
func EnableAdminAddressDetection(r *Runtime) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I owe unit tests on this, but waiting to see if this passes CI first as there are some files that only build on linux and tests that only execute on linux. Once this PR passes, I'll backfill tests on this..


func setupMockPilot() (*bootstrap.Server, util.TearDownFunc) {
return util.EnsureTestServer(func(args *bootstrap.PilotArgs) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file was deleted from istio, so I had to re-do it from scratch...

@@ -59,4 +59,4 @@ admin:
address:
socket_address:
address: 0.0.0.0
port_value: 15000
port_value: 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was a revelation, the insight that you can set this to zero and use --admin-address-path to avoid constant port conflicts!

@codefromthecrypt
Copy link
Contributor Author

the gnostic case sensitivity issue caused by istio 1.6 is wreaking havoc on the code quality tools, sadly in a way I can't reproduce locally. I'd really like to hear we can move to at least istio 1.7 or delete this function as I don't know if anyone is using it especially as it was using templates from istio 1.2.

if r.envoyReady() {
return binary.StatusReady
case r.cmd.ProcessState == nil: // the process started, but it hasn't completed, yet
// TODO: envoyReady() can succeed when there's a port collision even when the process managed by this
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this took me a lot of head scratching to eventually understand, hence the comment now.

}

trap terminate SIGINT SIGTERM SIGKILL
# This script is used to simulate a process that never terminates unless it receives SIGINT or SIGTERM
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes to make this macOS friendly

// EnableEnvoyLogCollection is a preset option that registers collection of Envoy Access Logs
//
// This is not supported on non-Linux platforms as Envoy is not able to access /dev/stdout unless
// *exec.Cmd.Stdout == os.Stdout. This makes it impossible to add the MultiWriter and start Envoy.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fixed in envoy 1.12

f, err := os.Create(filepath.Join(r.DebugStore(), "lsof/lsof.json"))
if err != nil {
return fmt.Errorf("error in creating a file to write open file statistics to: %v", err)
ctx, cancel := context.WithTimeout(context.Background(), processTimeout)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have to use the XXXWithContext functionality here, else things can hang forever


// WaitWithContext blocks until the child process reaches the state passed or the context is canceled
// Note: It does not guarantee that it is in the specified state just that it has reached it
func (r *Runtime) WaitWithContext(ctx context.Context, state int) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the wait function was only used once EnableEnvoyLogCollection, so moved it there

@codefromthecrypt
Copy link
Contributor Author

fingers crossed this time... it has been a battlefield, but I think macOS finally works in CI now

@codefromthecrypt
Copy link
Contributor Author

removed the docker dep from unit tests as it has been getting rate-limited and was never a good idea anyway.

the flakiness in the debug and istio tests has to do with parallel runs, something we don't do in e2e and these are definitely more like e2e tests than unit tests (they literally download and run envoy). I'll take another look tomorrow to see if I can improve it.

@codefromthecrypt codefromthecrypt changed the base branch from master to fix-rate-limits April 10, 2021 23:53
@codefromthecrypt
Copy link
Contributor Author

I rebased over #171 as getting tired of waiting for docker.io rate limits to expire every day

@codefromthecrypt
Copy link
Contributor Author

@mathetake PTAL as I'm pretty done on this, but will help follow-up on istio upgrade if there's a decision in another PR

matrix:
runner:
# - os: macos-latest ## revisit once all tests are off ginkgo
- os: macos-latest
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

easy to miss this, but this is now using macos

Base automatically changed from fix-rate-limits to master April 12, 2021 01:21
@@ -1,85 +0,0 @@
// Copyright 2019 Tetrate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just curious why this file is prefixed _linux..? 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes it only compile in linux, which is a compilation bug trap I ran into! you can drift files and it will compile fine since the build skips based on this pattern

Adrian Cole and others added 4 commits April 12, 2021 09:48
This upgrades Envoy from 1.11 -> 1.12 and Istio from 1.2 -> 1.6.

This also fixes race conditions by reading back the admin port in tests
via the "--admin-address-path" flag as implemented in
`envoy.EnableAdminAddressDetection`

This intentionally stops at Istio 1.6 as versions beyond this imply
configuration changes to our test data.

Signed-off-by: Adrian Cole <[email protected]>
@codefromthecrypt
Copy link
Contributor Author

Thanks @mathetake once again, it is better after you reviewed it, and also I didn't accidentally break the website!

@codefromthecrypt codefromthecrypt merged commit 023353d into master Apr 12, 2021
@codefromthecrypt codefromthecrypt deleted the upgrade-archiver branch April 12, 2021 02:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

go test ./... fails on Mac
4 participants