[Ingest Manager] Improved verify experience #21573

michalpristas · 2020-10-06T14:20:39Z

What does this PR do?

In this PR, the agent removes the downloaded bits when the asc signature does not match (until now this happened only on a hash mismatch).
The fetch-verify step is then retried.

This mainly helps with the scenario where artifacts are built without .asc files. In that case the asc file downloaded from the snapshot repository matches a different build than the self-built binaries included in the agent package.
After the initial failure, the agent removes the self-built binaries and downloads the snapshot from the repository, including the sha512 and asc files.
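
The retry flow described above can be sketched roughly as follows. This is a minimal illustration, not the agent's actual code: `artifactStore`, `fetchVerify`, `fakeStore`, and both error values are hypothetical names introduced here, and the attempt limit is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical error values; the agent's real error types may differ.
var (
	errHashMismatch = errors.New("sha512 mismatch")
	errSigMismatch  = errors.New("asc signature mismatch")
)

// artifactStore is a hypothetical abstraction over download, verify, and cleanup.
type artifactStore interface {
	Fetch() error
	Verify() error
	Remove() error
}

// fetchVerify fetches and verifies an artifact. On a hash OR signature
// mismatch it removes the local copy and retries, up to maxAttempts.
func fetchVerify(s artifactStore, maxAttempts int) error {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err := s.Fetch(); err != nil {
			return fmt.Errorf("fetch failed: %w", err)
		}
		err := s.Verify()
		if err == nil {
			return nil
		}
		lastErr = err
		if errors.Is(err, errHashMismatch) || errors.Is(err, errSigMismatch) {
			// Drop the bad bits so the next Fetch downloads a fresh copy.
			if rmErr := s.Remove(); rmErr != nil {
				return fmt.Errorf("cleanup failed: %w", rmErr)
			}
			continue
		}
		return err
	}
	return fmt.Errorf("verification failed after %d attempts: %w", maxAttempts, lastErr)
}

// fakeStore simulates a package that ships self-built binaries whose
// signature does not match the asc file from the snapshot repository.
type fakeStore struct {
	local     bool // binaries are present on disk
	selfBuilt bool // binaries on disk are the self-built ones
	fetches   int  // number of real downloads
	removes   int  // number of cleanups
}

func (f *fakeStore) Fetch() error {
	if !f.local {
		// Nothing on disk: download the snapshot build.
		f.fetches++
		f.local = true
		f.selfBuilt = false
	}
	return nil
}

func (f *fakeStore) Verify() error {
	if f.selfBuilt {
		return errSigMismatch
	}
	return nil
}

func (f *fakeStore) Remove() error {
	f.local = false
	f.selfBuilt = false
	f.removes++
	return nil
}

func main() {
	s := &fakeStore{local: true, selfBuilt: true}
	err := fetchVerify(s, 2)
	fmt.Println("error:", err, "removes:", s.removes, "downloads:", s.fetches)
}
```

In this sketch the first verification fails on the self-built binaries, the cleanup runs once, and the second attempt downloads and verifies the snapshot build.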

But there is a bit more going on in this PR. A lock was added for the update method in the emitter: when a dynamic input called Set while Config was loaded, the configuration was processed twice concurrently. Each run tried to set up and run its own set of beats, and they fought over the path.data location and crashed.

This problem manifested only intermittently, more likely in the standalone scenario.
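
As a rough illustration of the locking idea, the sketch below serializes an update method with a `sync.Mutex` so that two triggers cannot process configuration at the same time. The `emitter` type, its fields, and `runConcurrentUpdates` are hypothetical and only loosely modeled on the description above; the real emitter is structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// emitter is a hypothetical stand-in for the component that processes
// configuration updates.
type emitter struct {
	mu      sync.Mutex
	active  int // updates currently inside the critical section
	maxSeen int // highest concurrency observed inside update
	applied int // total updates processed
}

// update processes one configuration change. The mutex guarantees that
// only one update runs at a time, whichever trigger called it.
func (e *emitter) update(source string) {
	e.mu.Lock()
	defer e.mu.Unlock()

	e.active++
	if e.active > e.maxSeen {
		e.maxSeen = e.active
	}
	// Configuration processing (setting up and running beats) would go here.
	e.applied++
	e.active--
}

// runConcurrentUpdates fires n updates from separate goroutines, alternating
// between two trigger names, and waits for all of them to finish.
func runConcurrentUpdates(n int) *emitter {
	e := &emitter{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if i%2 == 0 {
				e.update("Set")
			} else {
				e.update("Load")
			}
		}(i)
	}
	wg.Wait()
	return e
}

func main() {
	e := runConcurrentUpdates(50)
	fmt.Println("applied:", e.applied, "max concurrent:", e.maxSeen)
}
```

With the lock held for the whole update, the observed concurrency inside `update` never exceeds one, so two runs cannot compete for the same data path.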

Why is it important?

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have made corresponding change to the default configuration files
I have added tests that prove my fix is effective or that my feature works
I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

elasticmachine · 2020-10-06T14:23:51Z

Pinging @elastic/ingest-management (Team:Ingest Management)

michalpristas · 2020-10-06T14:28:54Z

/package

blakerouse · 2020-10-06T14:41:28Z

x-pack/elastic-agent/pkg/core/plugin/process/app.go

@@ -117,7 +117,7 @@ func (a *Application) Name() string {

 // Started returns true if the application is started.
 func (a *Application) Started() bool {
-	return a.state.Status != state.Stopped
+	return a.state.Status != state.Stopped && a.state.Status != state.Crashed && a.state.Status != state.Failed


Why this change?

I think Started() is trying to say if the application was started at all. If it is crashed or failed you don't want Start() to be called again, because that state is handled internally in the application abstraction.

michalpristas · 2020-10-06

Yes. When verification fails, the state is set to FAILING.
When verification passes in the second round, the agent tries to start the app, but according to this check the app is already running, so it tries to configure it right away.

I just unified this check with the one in start.go, so start.go should be unaffected, but operation_start.Check should start functioning as intended (not skipping the operation).
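
The effect of the stricter check can be sketched in isolation. The types below are simplified, hypothetical stand-ins (the real `state` package has more statuses), and `needsStart` only approximates the decision described for operation_start.Check: the start operation is skipped when the application reports itself as started.

```go
package main

import "fmt"

// Status is a simplified, hypothetical version of the application state.
type Status int

const (
	Stopped Status = iota
	Running
	Crashed
	Failed
)

// Application is a minimal stand-in holding only the status.
type Application struct {
	status Status
}

// startedOld mirrors the check before this PR: anything but Stopped
// counts as started.
func (a *Application) startedOld() bool {
	return a.status != Stopped
}

// Started mirrors the check after this PR: Crashed and Failed no longer
// count as started.
func (a *Application) Started() bool {
	return a.status != Stopped && a.status != Crashed && a.status != Failed
}

// needsStart is an assumed simplification of the start operation's check:
// the operation runs only when the application is not already started.
func needsStart(started bool) bool {
	return !started
}

func main() {
	// Verification failed in the first round, leaving the app in Failed.
	app := &Application{status: Failed}

	fmt.Println("old check runs start:", needsStart(app.startedOld()))
	fmt.Println("new check runs start:", needsStart(app.Started()))
}
```

Under these assumptions, the old check treats a failed application as already started and skips the start operation, while the new check lets the start operation run.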

blakerouse · 2020-10-06T14:42:30Z

x-pack/elastic-agent/pkg/core/plugin/process/start.go

@@ -39,7 +39,7 @@ func (a *Application) start(ctx context.Context, t app.Taggable, cfg map[string]
 	}()

 	// already started if not stopped or crashed
-	if a.state.Status != state.Stopped && a.state.Status != state.Crashed && a.state.Status != state.Failed {
+	if a.Started() {


The internal start() is called to restart on a crash. Are we sure this change, together with the change above, is not going to affect that behavior?

michalpristas · 2020-10-06

This should be OK. The check is used in only two places: here (which is unaffected, it just reads better) and operation_start.Check, which is called on config resolution.

blakerouse · 2020-10-06T14:43:40Z

I think this change will fix #21120. That issue has to do with concurrency of updates, which the added mutex will solve.

elasticmachine · 2020-10-06T14:57:51Z

💚 Build Succeeded

Build stats

Build Cause: [Pull request #21573 updated]
Start Time: 2020-10-06T16:06:57.779+0000
Duration: 42 min 4 sec

Test stats 🧪

Test	Results
Failed	0
Passed	1386
Skipped	4
Total	1390

blakerouse

Thanks for the explanation. Looks good.

verify improved

134ccf1

michalpristas added bug enhancement Team:Ingest Management Ingest Management:beta2 Group issues for ingest management beta2 labels Oct 6, 2020

michalpristas self-assigned this Oct 6, 2020

botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Oct 6, 2020

simpler check

38c4777

michalpristas marked this pull request as ready for review October 6, 2020 14:23

blakerouse reviewed Oct 6, 2020

Merge branch 'master' of github.com:elastic/beats into agent-improve-…

757e4f5

…verify

blakerouse approved these changes Oct 6, 2020

michalpristas merged commit 306332f into elastic:master Oct 6, 2020

michalpristas added a commit to michalpristas/beats that referenced this pull request Oct 6, 2020

[Ingest Manager] Improved verify experience (elastic#21573)

0fc9fe3

michalpristas mentioned this pull request Oct 6, 2020

Cherry-pick #21573 to 7.x: Improved verify experience #21588

Merged

6 tasks

michalpristas added a commit that referenced this pull request Oct 6, 2020

[Ingest Manager] Improved verify experience (#21573) (#21588)

1045e6e

ph mentioned this pull request Oct 7, 2020

[Agent] Reboot of Agent (with Endpoint installed) shows an temporary failure in Agent Activity log regarding bind port 6788 #21663

Closed
