Add agent diagnostics action #1703

Merged
merged 40 commits into from
Jan 31, 2023
Changes from 16 commits
eddcfa9
Add agent diagnostics action
michel-laterman Oct 25, 2022
b70f7c3
Merge remote-tracking branch 'origin/main' into diagnostics-action
michel-laterman Nov 16, 2022
e9b59a2
Fix PR and use control client instead of coord in handler
michel-laterman Nov 16, 2022
ba94a2b
Change AckEvent construction from fleet acker to an action method
michel-laterman Nov 17, 2022
ef7038a
Fix linter and tests
michel-laterman Nov 17, 2022
5e2792a
Fix linter
michel-laterman Nov 21, 2022
8fa1bdc
Merge branch 'main' into diagnostics-action
michel-laterman Dec 12, 2022
267cb6c
Fix merge
michel-laterman Dec 12, 2022
ee1f787
Merge remote-tracking branch 'origin/main' into diagnostics-action
michel-laterman Dec 12, 2022
3a949f3
Fix tests
michel-laterman Dec 12, 2022
b316427
remove duplication when creating an ackevent for an action
michel-laterman Dec 12, 2022
427c7ab
Merge branch 'main' into diagnostics-action
michel-laterman Dec 15, 2022
b185b91
Merge remote-tracking branch 'origin/main' into diagnostics-action
michel-laterman Jan 3, 2023
cffeca6
Retry upload for non-context errors
michel-laterman Jan 4, 2023
5c404f0
Merge remote-tracking branch 'origin/main' into diagnostics-action
michel-laterman Jan 5, 2023
9e6cdb8
Fix linter
michel-laterman Jan 5, 2023
d46ce20
Merge remote-tracking branch 'origin/main' into diagnostics-action
michel-laterman Jan 23, 2023
4318754
review feedback, fix diagnostics acks
michel-laterman Jan 23, 2023
1a1a9d9
Add JSON deserialization, fix yaml
michel-laterman Jan 24, 2023
55797c3
Add debug messages to handler
michel-laterman Jan 25, 2023
3fa3e1b
Fix uploader implementation and handler bug
michel-laterman Jan 25, 2023
abe952e
Review feedback
michel-laterman Jan 25, 2023
aea2b95
Fix linter
michel-laterman Jan 25, 2023
db52a04
Change diag ack to use upload_id add dates to diag directories
michel-laterman Jan 26, 2023
214fb22
Add rate limiter to diagnostics action handler
michel-laterman Jan 27, 2023
78ba518
update config
michel-laterman Jan 27, 2023
2693ce2
Add changelog fragment, fix linter
michel-laterman Jan 27, 2023
b29e75f
changed file_id to upload_id, updated changelog
juliaElastic Jan 27, 2023
abd31a8
updated diagnostics file name
juliaElastic Jan 27, 2023
383a576
Revert hooks changes, move log collection to ZipArchive
michel-laterman Jan 28, 2023
a2c8631
Merge remote-tracking branch 'origin/main' into diagnostics-action
michel-laterman Jan 28, 2023
4546751
Cleanup and yaml redaction fix
michel-laterman Jan 28, 2023
3776773
handler and redact fixes
michel-laterman Jan 28, 2023
3f868d2
fixed storing action_id correctly in files index
juliaElastic Jan 30, 2023
3adfdaa
commit the ack
michel-laterman Jan 30, 2023
8d96c53
Diagnostics handler will use temp file
michel-laterman Jan 31, 2023
cc1cd33
Change to async handler, add panic recover
michel-laterman Jan 31, 2023
f97d6d7
fix linter
michel-laterman Jan 31, 2023
cfa3e96
Update internal/pkg/agent/application/actions/handlers/handler_action…
michel-laterman Jan 31, 2023
09293ff
build error out of recovered item
michel-laterman Jan 31, 2023
@@ -0,0 +1,85 @@
// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

package handlers

import (
	"bytes"
	"context"
	"fmt"

	"github.com/elastic/elastic-agent/internal/pkg/agent/control/v2/client"
	"github.com/elastic/elastic-agent/internal/pkg/diagnostics"
	"github.com/elastic/elastic-agent/internal/pkg/fleetapi"
	"github.com/elastic/elastic-agent/internal/pkg/fleetapi/acker"
	"github.com/elastic/elastic-agent/pkg/core/logger"
)

// Uploader is the interface used to upload a diagnostics bundle to fleet-server.
type Uploader interface {
	UploadDiagnostics(context.Context, string, *bytes.Buffer) (string, error)
}

// Diagnostics is the handler to process Diagnostics actions.
// When a Diagnostics action is received a full diagnostics bundle is taken and uploaded to fleet-server.
type Diagnostics struct {
	log      *logger.Logger
	client   client.Client
	uploader Uploader
}

// NewDiagnostics returns a new Diagnostics handler.
func NewDiagnostics(log *logger.Logger, uploader Uploader) *Diagnostics {
	return &Diagnostics{
		log:      log,
		client:   client.New(),
		uploader: uploader,
	}
}

// Handle processes the passed Diagnostics action.
func (h *Diagnostics) Handle(ctx context.Context, a fleetapi.Action, ack acker.Acker) error {
	h.log.Debugf("handlerDiagnostics: action '%+v' received", a)
	action, ok := a.(*fleetapi.ActionDiagnostics)
	if !ok {
		return fmt.Errorf("invalid type, expected ActionDiagnostics and received %T", a)
	}
	defer ack.Ack(ctx, action) //nolint:errcheck // no path for a failed ack

	// Gather agent diagnostics
	aDiag, err := h.client.DiagnosticAgent(ctx)
	if err != nil {
		action.Err = err
		return fmt.Errorf("unable to gather agent diagnostics: %w", err)
	}
	uDiag, err := h.client.DiagnosticUnits(ctx)
	if err != nil {
		action.Err = err
		return fmt.Errorf("unable to gather unit diagnostics: %w", err)
	}
Member

[Suggestion]
I'm not sure whether this was defined in the feature, but I'd suggest a best-effort approach here. Don't fail the whole action if only part of the diagnostics can be retrieved; upload whatever was gathered and report an error for the rest.

Contributor Author

Perhaps, but the CLI currently has the same behaviour. I think we should keep it as is for now so that the two behave the same.


	var b bytes.Buffer
	// create a buffer that any redaction error messages are written into as warnings.
	// if the buffer is not empty after the bundle is assembled then the message is written to the log
	// zapio.Writer would be a better way to pass a writer to ZipArchive, but logp embeds the zap.Logger so we are unable to access it here.
	var wBuf bytes.Buffer
	defer func() {
		if str := wBuf.String(); str != "" {
			h.log.Warn(str)
		}
	}()
	err = diagnostics.ZipArchive(&wBuf, &b, aDiag, uDiag) // TODO Do we want to pass a buffer/a reader around? or write the file to a temp dir and read (to avoid memory usage)? file usage may need more thought for containerized deployments
	if err != nil {
		action.Err = err
		return fmt.Errorf("error creating diagnostics bundle: %w", err)
	}

	uploadID, err := h.uploader.UploadDiagnostics(ctx, action.ActionID, &b)
	action.Err = err
	action.UploadID = uploadID
	if err != nil {
		return fmt.Errorf("unable to upload diagnostics: %w", err)
	}
	return nil
}