-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Elastic Agent] Elastic Agent takes long to shut down #29650
Conversation
This pull request does not have a backport label. Could you fix it @AndersonQ? 🙏
NOTE: |
dev-tools/mage/crossbuild.go
Outdated
// Types is the list of package types | ||
var SelectedPackageTypes []PackageType | ||
// SelectedPackageTypes is the list of package types | ||
var SelectedPackageTypes []PackageType = []PackageType{TarGz} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
windows does not have targz. careful and test on windows vm for sure
dev-tools/mage/build.go
Outdated
LDFlags: []string{ | ||
"-s", // Strip all debug symbols from binary (does not affect Go stack traces). | ||
// "-s", // Strip all debug symbols from binary (does not affect Go stack traces). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure having debug symbols on prod binaries was intended here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove it before the merge. Actually, I think if it could be configurable it'd valuable to have it in development mode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a "DEV" flag that exists in other places, it could be used to enable or not symbols?
beats/dev-tools/mage/settings.go
Lines 123 to 127 in a7f8517
DevBuild, err = strconv.ParseBool(EnvOr("DEV", "false")) | |
if err != nil { | |
panic(errors.Wrap(err, "failed to parse DEV env value")) | |
} | |
@@ -279,8 +289,13 @@ func (a *Application) gracefulKill(proc *process.Info) { | |||
return | |||
} | |||
|
|||
alog := logp.NewLogger("anderson") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reminder to remove this before merge
can you add some results from tests. how long it took before vs now after a change |
@ph @AndersonQ could you please update this PR and make it move? |
Thanks @AndersonQ |
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a few comments, is there enough existing test around shutdown?
dev-tools/mage/build.go
Outdated
LDFlags: []string{ | ||
"-s", // Strip all debug symbols from binary (does not affect Go stack traces). | ||
// "-s", // Strip all debug symbols from binary (does not affect Go stack traces). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a "DEV" flag that exists in other places, it could be used to enable or not symbols?
beats/dev-tools/mage/settings.go
Lines 123 to 127 in a7f8517
DevBuild, err = strconv.ParseBool(EnvOr("DEV", "false")) | |
if err != nil { | |
panic(errors.Wrap(err, "failed to parse DEV env value")) | |
} | |
@@ -90,6 +91,8 @@ func run(streams *cli.IOStreams, override cfgOverrider) error { | |||
return err | |||
} | |||
|
|||
// TODO: remove before review | |||
cfg.Settings.LoggingConfig.Level = logp.DebugLevel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AndersonQ Oops, did you need to have the logging in test? while testing you can update the configuration to have it enable?
@@ -280,7 +286,9 @@ func (a *Application) gracefulKill(proc *process.Info) { | |||
} | |||
|
|||
// send stop signal to request stop | |||
proc.Stop() | |||
if err := proc.Stop(); err != nil { | |||
a.logger.Error(fmt.Errorf("failed to stop %s: %w", a.Name(), err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of calling fmt.Errorf(
you can call a.logger.Errorf(
@@ -275,6 +278,7 @@ func (a *Application) Stop() { | |||
func (a *Application) Shutdown() { | |||
a.appLock.Lock() | |||
defer a.appLock.Unlock() | |||
a.logger.Infof("Signaling service to stop because of shutdown: %s", a.id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/Signaling/signaling
Don't use capitalization when starting a logging statement.
@@ -294,6 +298,7 @@ func (a *Application) Shutdown() { | |||
// | |||
// It updates the status of the application and handles restarting the application is needed. | |||
func (a *Application) OnStatusChange(s *server.ApplicationState, status proto.StateObserved_Status, msg string, payload map[string]interface{}) { | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra space?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should have used request changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to fix the lock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets split build changes vs fixing the shutdow issue.
dev-tools/mage/build.go
Outdated
// Strip all debug symbols from binary (does not affect Go stack traces). | ||
args.LDFlags = append(args.LDFlags, "-s") | ||
} | ||
|
||
return args |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AndersonQ Could you move this into his own pull request? Since this changes is not required for solving the "takes long to shutdown".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done: #29961
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a changelog entry?
@ph, @blakerouse could you re-review it please? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AndersonQ Changes LGTM, is there enough test coverage around that area of code? Can you investigate that? That the only thing that is preventing me from approving it.
not for the shutdown function. the coverage on the package isn't that great:
Well, I could do a manual test, compare the elastic and beats logs when shutting down to check that they shutdown successfully |
@@ -120,6 +125,8 @@ func TestExecFile_StdOut(t *testing.T) { | |||
} | |||
|
|||
func TestExecFile_NoOutput(t *testing.T) { | |||
t.Skip("skipping failing tests") | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AndersonQ is this new? I haven't seen that in the original PR commits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a mistake, the're failing on master as well... let me fix it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, lock is fixed! Thanks.
Change LGTM thanks ! |
/package |
/test |
@AndersonQ @ph the only issue left in the e2e testing repo is about the k8s autodiscover. Can we merge this PR then? |
@jlind23 LGTM, this would be 8.0 8.1 8.2? |
This pull request is now in conflicts. Could you fix it? 🙏
|
I would rather keep it in 8.2 only to avoid any side effect. |
…into feature/use-with-kind-k8s-env * 'feature/use-with-kind-k8s-env' of github.com:v1v/beats: (52 commits) ci: home is declared within withBeatsEnv ci: use withKindEnv step ci: use getBranchesFromAliases and support next-patch-8 (elastic#30400) Update fields.yml (elastic#29609) Heartbeat: fix browser metrics and trace mappings (elastic#30258) Apply light edits to 8.0 changelog (elastic#30351) packetbeat/beater: make sure Npcap installation runs before interfaces are needed (elastic#30396) Add a ring-buffer reporter to libbeat (elastic#28750) Osquerybeat: Add install verification for osquerybeat (elastic#30388) update windows matrix support (elastic#30373) Refactor of metricbeat process-gathering metrics and system/process (elastic#30076) adjust next changelog wording (elastic#30371) [Metricbeat] azure: move event report into loop validDim loop (elastic#29945) fix: report GitHub Check before the cache (elastic#30372) Add support for non-unique keys in Kafka output headers (elastic#30369) ci: 6 major branch reached EOL (elastic#30357) reduce Elastic Agent shut down time by stopping processes concurrently (elastic#29650) [Filebeat] Add message to register encode/decode debug logs (elastic#30271) [libbeat] kafka message header support (elastic#29940) Heartbeat: set duration to zero for syntax errors (elastic#30227) ...
Hi Team UPDATE: Thanks! |
What does this PR do?
WIP:
Why is it important?
It's part of the effort to support the new M1 Macs.
Checklist
CHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.Author's Checklist
How to test this PR locally
Related issues
Screenshots
Logs
Before the fix:
After the fix: