From ad6223bc6e47ca329a25dbb30fe95218bdedda39 Mon Sep 17 00:00:00 2001
From: "mergify[bot]" <37929162+mergify[bot]@users.noreply.github.com>
Date: Tue, 29 Nov 2022 10:45:20 -0500
Subject: [PATCH] Capture stdout/stderr of spawned components (#1702) (#1809)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* [v2] Add v2 component specification and validation. (#502)
* Add v2 component specification and validation.
* Remove i386 and ppc64el. Update spec for osquerybeat.
* Remove windows/arm64.
* Add component spec command to validate component specifications. (#510)
* [v2] Calculate the expected runtime components from policy (#550)
* Upgrade elastic-agent-client.
* Calculate the expected running components and units from the v2 specification and the current policy.
* Update NOTICE.txt.
* Fix lint from servicable main.go.
* Update GRPC for the agent CLI control protocol. Fix name collision issue.
* Run go mod tidy.
* Fix more lint issues.
* Fix fmt.
* Update logic to always compute model, with err set on each component. Check runtime preventions at model generation time.
* Fix items from code review, and issue on windows test runner.
* Try to cleanup duplication in tests.
* Try 2 of fixing duplicate lint failure, that is not really a duplicate.
* Re-run mage fmt.
* Lint fixes for linux, why different?
* Fix nolint comment.
* Add comment.
* Initial Flat Structure (#544)
Flattening the structure and removing download/install steps for programs.
Co-authored-by: Aleksandr Maus
* Generate checksum file for components (#604)
* generating checksum?
* yaml output
* Update dev-tools/mage/common.go
Co-authored-by: Michel Laterman <82832767+michel-laterman@users.noreply.github.com>
* review
* ioutil removal from magefile
Co-authored-by: Michel Laterman <82832767+michel-laterman@users.noreply.github.com>
* V2 Runtime Component Manager (#645)
* Add runtime for command v2 components.
* Fix imports.
* Add tests for watching checkins.
* Fix lint and move checkin period to a configurable timeout. * Fix tests now that checkin timeout needs to be defined. * Fix code review and lint. * [v2] Use the v2 components runtime as the core of the Elastic Agent (#753) * Add runtime for command v2 components. * Fix imports. * Add tests for watching checkins. * Fix lint and move checkin period to a configurable timeout. * Fix tests now that checkin timeout needs to be defined. * Fix code review and lint. * Work on actually running the v2 runtime. * Work on switching to the v2 runtime. * More work on switching to v2 runtime. * Cleanup some imports. * More import cleanups. * Add TODO to FleetServerComponentModifier. * Remove outdated managed_mode_test.go. * Fixes from code review and lint. * [v2] Delete unused code from refactor (#777) * Add runtime for command v2 components. * Fix imports. * Add tests for watching checkins. * Fix lint and move checkin period to a configurable timeout. * Fix tests now that checkin timeout needs to be defined. * Fix code review and lint. * Work on actually running the v2 runtime. * Work on switching to the v2 runtime. * More work on switching to v2 runtime. * Cleanup some imports. * More import cleanups. * Add TODO to FleetServerComponentModifier. * More cleanup and removals. * Remove more. * Delete more unused code. * Clean up step_download from refactor. * Remove outdated managed_mode_test.go. * Fixes from code review and lint. * Fix lint and missing errcheck. * [v2] Delete more unused code from v2 transition (#790) * Remove more unused code that was including already deleted code. * Fix all unit tests. * Fix lint. * More lint fixes, maybe this time? * More lint.... really? * Update NOTICE.txt. 
* [v2] Merge July 27th main into v2 feature branch (#789) * [Automation] Update elastic stack version to 8.4.0-40cff009 for testing (#557) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.4.0-5e6770b1 for testing (#564) Co-authored-by: apmmachine * Fix regression and use comma separated values (#560) Fix regression from https://github.com/elastic/elastic-agent/pull/509 * Change in Jenkinsfile will trigger k8s run (#568) * [Automation] Update elastic stack version to 8.4.0-da5a1c6d for testing (#573) Co-authored-by: apmmachine * Add `@metadata.input_id` and `@metadata.stream_id` when injecting streams (#527) These 2 value are going to be used in the shipper to identify where an event came from in order to apply processors accordingly. Also, added test cases for the processor to verify the change and updated test cases with the new processor. * Add filemod times to contents of diagnostics collect command (#570) * Add filemod times to contents of diagnostics collect command Add filemod times to the files and directories in the zip archive. Log files (and sub dirs) will use the modtime returned by the fileinfo for the source. Others will use the timestamp from when the zip is created. * Fix linter * [Automation] Update elastic stack version to 8.4.0-b13123ee for testing (#581) Co-authored-by: apmmachine * Fix Agent upgrade 8.2->8.3 (#578) * Fix Agent upgrade 8.2->8.3 * Improve the upgrade encryption handling. Add .yml files cleanup. 
* Rollback ActionUpgrade to action_id, add MarkerActionUpgrade adapter struct for marker serialization compatibility * Update containerd (#577) * [Automation] Update elastic stack version to 8.4.0-4fe26f2a for testing (#591) Co-authored-by: apmmachine * Set explicit ExitTimeOut for MacOS agent launchd plist (#594) * Set explicit ExitTimeOut for MacOS agent launchd plist * [Automation] Update elastic stack version to 8.4.0-2e32a640 for testing (#599) Co-authored-by: apmmachine * ci: enable build notifications as GitHub issues (#595) * status identifies failing component, fleet gateway may report degraded, liveness endpoint added (#569) * Add liveness endpoint Add /liveness route to metrics server. This route will report the status from pkg/core/status. fleet-gateway will now report a degraded state if a checkin fails. This may not propagate to fleet-server as a failed checkin means communications between the agent and the server are not working. It may also lead to the server reporting degraded for up to 30s (fleet-server polling time) when the agent is able to successfully connect. * linter fix * add nolint directive * Linter fix * Review feedback, add doc strings * Rename noop controller file to _test file * [Automation] Update elastic stack version to 8.4.0-722a7d79 for testing (#607) Co-authored-by: apmmachine * ci: enable flaky test detector (#605) * [Automation] Update elastic stack version to 8.4.0-210dd487 for testing (#620) Co-authored-by: apmmachine * mergify: remove backport automation for non active branches (#615) * chore: use elastic-agent profile to run the E2E tests (#610) * [Automation] Update elastic stack version to 8.4.0-a6aa9f3b for testing (#631) Co-authored-by: apmmachine * add macros pointing to new agent's repo and fix old macro calls (#458) * Add mount of /etc/machine-id for managed Agent in k8s (#530) * Set hostPID=true for managed agent in k8s (#528) * Set hostPID=true for managed agent in k8s * Add comment on hostPID.
* [Automation] Update elastic stack version to 8.4.0-86cc80f3 for testing (#648) Co-authored-by: apmmachine * Update elastic-agent-libs version: includes restriction on default VerificationMode to `full` (#521) * update version * mage fmt update * update dependency * update changelog * redact sensitive information in diagnostics collect command (#566) * Support Cloudbeat regex input type (#638) * support input type with regex * Update supported.go * Changing the regex to support backward compatible * Disable flaky test download test (#641) * [Automation] Update elastic stack version to 8.4.0-3d206b5d for testing (#656) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.4.0-3ad82aa8 for testing (#661) Co-authored-by: apmmachine * jjbb: exclude allowed branches, tags and PRs (#658) cosmetic change in the description and boolean based * Update elastic-agent-project-board.yml (#649) * ci: fix labels that clashes with the Orka workers (#659) * [Automation] Update elastic stack version to 8.4.0-03bd6f3f for testing (#668) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.4.0-533f1e30 for testing (#675) Co-authored-by: apmmachine * Osquerybeat: Fix osquerybeat is not running with logstash output (#674) * [Automation] Update elastic stack version to 8.4.0-d0a4da44 for testing (#684) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.4.0-dd98ded4 for testing (#703) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.4.0-164d9a10 for testing (#705) Co-authored-by: apmmachine * Add missing license headers (#711) * [Automation] Update elastic stack version to 8.4.0-00048b66 for testing (#713) Co-authored-by: apmmachine * Allow - in eql variable names (#710) * fix to allow dashes in variable names in EQL expressions extend eql to allow the '-' char to appear in variable names, i.e., ${data.some-var} and additional test cases to eql, the transpiler, and the k8s provider to verify 
this works. Note that the bug was caused by the EQL limitation, the other test cases were added when attempting to find it. * Regenerate grammar with antlr 4.7.1, add CHANGELOG * Fix linter issue * Fix typo * Fix transpiler to allow : in dynamic variables. (#680) Fix transpiler regex to allow ':' characters in dynamic variables so that users can input "${dynamic.lookup|'fallback.here'}". Co-authored-by: Aleksandr Maus * Fix for the filebeat spec file picking up packetbeat inputs (#700) * Reproduce filebeat picking up packetbeat inputs * Filebeat: filter inputs as first input transform. Move input filtering to be the first input transformation that occurs in the filebeat spec file. Fixes https://github.com/elastic/elastic-agent/issues/427. * Update changelog. * [Automation] Update elastic stack version to 8.4.0-3cd57abb for testing (#724) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.4.0-a324b98b for testing (#727) Co-authored-by: apmmachine * ci: run on MacOS12 (#696) * [Automation] Update elastic stack version to 8.4.0-31315ca3 for testing (#732) Co-authored-by: apmmachine * fix typo on package command (#734) This commit fixes the typo in the package command on the README.md. * Allow / to be used in variable names (#718) * Allow the / character to be used in variable names. Allow / to be used in variable names from dynamic providers and eql expressions. Ensure that k8s providers can provide variables with slashes in their names. * run antlr4 * Fix tests * Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases (#701) * Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases * Migrates vault directory on linux and windows to the top directory of the agent, so it can be shared without needing the upgrade handler call, like for example with side-by-side install/upgrade from .rpm/.deb * Extended vault to allow read-only open, useful when the vault at particular location needs to be only read not created.
* Correct the typo in the log messages * Update lint flagged function comment with 'unused', was flagged with 'deadcode' on the previous run * Address code review feedback * Add missing import for linux utz * Change vault path from Top() to Config(), this a better location, next to fleet.enc based on the install/upgrade testing with .rpm/.deb installs * Fix the missing state migration for .rpm/.deb upgrade. The post install script now performs the migration and creates the symlink after that. * Fix typo in the postinstall script * Update the vault migration code, add the agent configuration match check with the agent secret * [Automation] Update elastic stack version to 8.4.0-31269fd2 for testing (#746) Co-authored-by: apmmachine * wrap errors and fix some docs typo and convention (#743) * automate the ironbank docker context generation (#679) * Update README.md Adding M1 variable to export to be able to build AMD images * fix flaky (#730) * Add filestream ID on standalone kubernetes manifest (#742) This commit add unique IDs for the filestream inputs used by the Kubernetes integration in the Elastic-Agent standalone Kubernetes configuration/manifest file. * Alter github action to run on different OSs (#769) Alter the linter action to run on different OSs instead of on linux with the $GOOS env var. 
* [Automation] Update elastic stack version to 8.4.0-d058e92f for testing (#771) Co-authored-by: apmmachine * elastic-agent manifests: add comments; add cloudnative team as a codeowner for the k8s manifests (#708) * managed elastic-agent: add comments; add cloudnative team as a codeowner for the k8s manifests Signed-off-by: Tetiana Kravchenko * add comments to the standalone elastic-agent, similar to the documentation we have https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-standalone.html Signed-off-by: Tetiana Kravchenko * Apply suggestions from code review Co-authored-by: Michael Katsoulis Co-authored-by: Andrew Gizas * remove comment for FLEET_ENROLLMENT_TOKEN; use Needed everywhere instead of Required Signed-off-by: Tetiana Kravchenko * rephrase regarding accessing kube-state-metrics when used third party tools, like kube-rbac-proxy Signed-off-by: Tetiana Kravchenko * run make check Signed-off-by: Tetiana Kravchenko * keep manifests in sync to pass ci check Signed-off-by: Tetiana Kravchenko * add info on where to find FLEET_URL and FLEET_ENROLLMENT_TOKEN Signed-off-by: Tetiana Kravchenko * add links to elastic-agent documentation Signed-off-by: Tetiana Kravchenko * update comment on FLEET_ENROLLMENT_TOKEN Signed-off-by: Tetiana Kravchenko Co-authored-by: Michael Katsoulis Co-authored-by: Andrew Gizas * [Elastic-Agent] Added source uri reloading (#686) * Update will cleanup unneeded artifacts. (#752) * Update will cleanup unneeded artifacts. The update process will cleanup unneeded artifacts. When an update starts all artifacts that do not have the current version number in it's name will be removed. If artifact retrieval fails, downloaded artifacts are removed. On a successful upgrade, all contents of the downloads dir will be removed. * Clean up linter warnings * Wrap errors * cleanup tests * Fix passed version * Use os.RemoveAll * ci: propagate e2e-testing errors (#695) * [Release] add-backport-next (#784) * Update go.sum. * Fix upgrade. 
* Fix the upgrade artifact reload. * Fix lint in coordinator. Co-authored-by: apmmachine <58790750+apmmachine@users.noreply.github.com> Co-authored-by: apmmachine Co-authored-by: Pier-Hugues Pellerin Co-authored-by: Denis Rechkunov Co-authored-by: Michel Laterman <82832767+michel-laterman@users.noreply.github.com> Co-authored-by: Aleksandr Maus Co-authored-by: Victor Martinez Co-authored-by: Manuel de la Peña Co-authored-by: Anderson Queiroz Co-authored-by: Daniel Araujo Almeida Co-authored-by: Mariana Dima Co-authored-by: ofiriro3 Co-authored-by: Julien Lind Co-authored-by: Craig MacKenzie Co-authored-by: Tiago Queiroz Co-authored-by: Pierre HILBERT Co-authored-by: Tetiana Kravchenko Co-authored-by: Michael Katsoulis Co-authored-by: Andrew Gizas Co-authored-by: Michal Pristas Co-authored-by: Elastic Machine * [v2] Fix inspect command (#805) * Write the inspect command for v2. * Fix lint. * Fix code review. Load inputs from inputs.d for inspect. * Fix lint. * Refactor to use errgroup. * Remove unused struct. * Expand check-in payload for V2 (#916) * Expand check-in payload for V2 * Make linter happy * [v2] Update protocol to use new UnitExpectedConfig. (#850) * Update v2 protocol to use new UnitExpectedConfig. * Cleanup. * Update NOTICE.txt. Lint dupl. * Fix code review. Ensure type is set to real type and not alias. * Fix action dispatching that was using ActionType instead of InputType as before (#973) * Fix bootstrapping a Fleet Server with v2. (#1010) * Fix bootstrapping a Fleet Server with v2. * Fix lint. * Fix tests. * Query just related files on build (#1045) * Update main to 8.5.0 (#793) (#1050) (cherry picked from commit 317e03116aa919d69be97242207ad11a28c826aa) Co-authored-by: Pier-Hugues Pellerin * Create archive directory if it doesn't exist. (#1058) On an M1 Mac rename seems to fail if the containing directories do not already exist. * fixed docker build (#1105) * V2 command work dir (#1061) * Fix v2 work directory for command. 
Add permission check for execution. Add determining root into runtime prevention. * Add writeable by group and other in check. * Fix restart and stopping issues in command runtime for failing binaries. * Fix issue in endpoint spec. Allow an input to not require an ID, but that ID must be unique. * Remove unused transpiler rules and steps. * Fix test. * Fix workDir for windows. * Reset to checkin period. * Fix test and code review issues. * Add extra log message in unit test. * More fixes from code review. * Fix test. * [v2] Move queue management to dispatcher (#1109) * Move queue management to dispatcher Move queue management actions to the dispatcher from the fleet-server in order to help with future work to add a retry mechanism. Add a PersistedQueue type which wraps the ActionQueue to make persisting the queue simpler for the consumer. * Refactor ActionQueue Refactor ActionQueue to only export methods that are used by consumers. The priority queue implementation has been changed to an unexported type. Persistency has been added and the persistedqueue type has been removed. * Rename persistedQueue interface to priorityQueue * Review feedback * failing to save queue will log message * Change gateway to use copy * Fix [V2]: Elastic Agent Install is broken. (#1331) * Fix agent shutdown on SIGINT (#1258) * Fix agent shutdown on SIGINT * Update runtime_comm expected check-in handling to eliminate the lock in failure cases * Remove some buffered channels that are no longer blocking shutdown after the runtime comms fix commit * Fix the recursive lock on itself in the runtime loop, refactored code to make it cleaner * Fix the comment typo * Fixed managed_mode coordination with fleet gateway. Now the gateway errors reading loop waits until gateway exits.
Otherwise the gateway shutdown out of sequence blocks on errCh * Fix linter * Fix make check-ci * Fix runner Err() possible race * Update the runner DoneWithTimeout implementation * Address code review comments * [v2] Re-enable diagnostics for Elastic Agent and all components (#1140) * Add diagnostics back to v2. * Update pkg/component/runtime/manager.go Co-authored-by: Anderson Queiroz Co-authored-by: Anderson Queiroz * Check and create downloads dir before using (#1410) * [v2] Add upgrade action retry (#1219) * Add upgrade action retry Add the ability for the agent to schedule and retry upgrade actions. The fleetapi actions now define a ScheduledAction, and RetryableAction interface to eliminate the need for stub methods on all different action types. Action queue has been changed to function on scheduled actions. Serialization tests now ensure that the retry attribute needed by retryable actions works. Decouple dispatcher from gateway, dispatcher has an errors channel that will return an error for the list of actions that's sent. Gateway has an Actions method that can be used to get the list of actions from the gateway. The managed_mode config manager will link these two components. If a handler returns an error and the action is a RetryableAction, the dispatcher will attempt to schedule a retry. The dispatcher will also ack the action to fleet-server and indicate if it will be retried or has failed (or has been received normally). For the acker, if a RetryableAction has an error and an attempt count that is greater than 0 it will be acked as retried. If it has an error and an attempt count less than 1 it will be acked as failed. Co-authored-by: Blake Rouse * V1 metrics monitoring for V2 (#1487) V1 metrics monitoring for V2 (#1487) * [v2] Merge main on Oct.
18 (#1557) * [Automation] Update elastic stack version to 8.4.0-40cff009 for testing (#557) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.4.0-5e6770b1 for testing (#564) Co-authored-by: apmmachine * Fix regression and use comma separated values (#560) Fix regression from https://github.com/elastic/elastic-agent/pull/509 * Change in Jenkinsfile will trigger k8s run (#568) * [Automation] Update elastic stack version to 8.4.0-da5a1c6d for testing (#573) Co-authored-by: apmmachine * Add `@metadata.input_id` and `@metadata.stream_id` when injecting streams (#527) These 2 value are going to be used in the shipper to identify where an event came from in order to apply processors accordingly. Also, added test cases for the processor to verify the change and updated test cases with the new processor. * Add filemod times to contents of diagnostics collect command (#570) * Add filemod times to contents of diagnostics collect command Add filemod times to the files and directories in the zip archive. Log files (and sub dirs) will use the modtime returned by the fileinfo for the source. Others will use the timestamp from when the zip is created. * Fix linter * [Automation] Update elastic stack version to 8.4.0-b13123ee for testing (#581) Co-authored-by: apmmachine * Fix Agent upgrade 8.2->8.3 (#578) * Fix Agent upgrade 8.2->8.3 * Improve the upgrade encryption handling. Add .yml files cleanup. 
* Rollback ActionUpgrade to action_id, add MarkerActionUpgrade adapter struct for marker serialization compatibility * Update containerd (#577) * [Automation] Update elastic stack version to 8.4.0-4fe26f2a for testing (#591) Co-authored-by: apmmachine * Set explicit ExitTimeOut for MacOS agent launchd plist (#594) * Set explicit ExitTimeOut for MacOS agent launchd plist * [Automation] Update elastic stack version to 8.4.0-2e32a640 for testing (#599) Co-authored-by: apmmachine * ci: enable build notifications as GitHub issues (#595) * status identifies failing component, fleet gateway may report degraded, liveness endpoint added (#569) * Add liveness endpoint Add /liveness route to metrics server. This route will report the status from pkg/core/status. fleet-gateway will now report a degraded state if a checkin fails. This may not propagate to fleet-server as a failed checkin means communications between the agent and the server are not working. It may also lead to the server reporting degraded for up to 30s (fleet-server polling time) when the agent is able to successfully connect. * linter fix * add nolint directive * Linter fix * Review feedback, add doc strings * Rename noop controller file to _test file * [Automation] Update elastic stack version to 8.4.0-722a7d79 for testing (#607) Co-authored-by: apmmachine * ci: enable flaky test detector (#605) * [Automation] Update elastic stack version to 8.4.0-210dd487 for testing (#620) Co-authored-by: apmmachine * mergify: remove backport automation for non active branches (#615) * chore: use elastic-agent profile to run the E2E tests (#610) * [Automation] Update elastic stack version to 8.4.0-a6aa9f3b for testing (#631) Co-authored-by: apmmachine * add macros pointing to new agent's repo and fix old macro calls (#458) * Add mount of /etc/machine-id for managed Agent in k8s (#530) * Set hostPID=true for managed agent in k8s (#528) * Set hostPID=true for managed agent in k8s * Add comment on hostPID.
* [Automation] Update elastic stack version to 8.4.0-86cc80f3 for testing (#648) Co-authored-by: apmmachine * Update elastic-agent-libs version: includes restriction on default VerificationMode to `full` (#521) * update version * mage fmt update * update dependency * update changelog * redact sensitive information in diagnostics collect command (#566) * Support Cloudbeat regex input type (#638) * support input type with regex * Update supported.go * Changing the regex to support backward compatible * Disable flaky test download test (#641) * [Automation] Update elastic stack version to 8.4.0-3d206b5d for testing (#656) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.4.0-3ad82aa8 for testing (#661) Co-authored-by: apmmachine * jjbb: exclude allowed branches, tags and PRs (#658) cosmetic change in the description and boolean based * Update elastic-agent-project-board.yml (#649) * ci: fix labels that clashes with the Orka workers (#659) * [Automation] Update elastic stack version to 8.4.0-03bd6f3f for testing (#668) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.4.0-533f1e30 for testing (#675) Co-authored-by: apmmachine * Osquerybeat: Fix osquerybeat is not running with logstash output (#674) * [Automation] Update elastic stack version to 8.4.0-d0a4da44 for testing (#684) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.4.0-dd98ded4 for testing (#703) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.4.0-164d9a10 for testing (#705) Co-authored-by: apmmachine * Add missing license headers (#711) * [Automation] Update elastic stack version to 8.4.0-00048b66 for testing (#713) Co-authored-by: apmmachine * Allow - in eql variable names (#710) * fix to allow dashes in variable names in EQL expressions extend eql to allow the '-' char to appear in variable names, i.e., ${data.some-var} and additional test cases to eql, the transpiler, and the k8s provider to verify 
this works. Note that the bug was caused by the EQL limitation, the other test cases were added when attempting to find it. * Regenerate grammar with antlr 4.7.1, add CHANGELOG * Fix linter issue * Fix typo * Fix transpiler to allow : in dynamic variables. (#680) Fix transpiler regex to allow ':' characters in dynamic variables so that users can input "${dynamic.lookup|'fallback.here'}". Co-authored-by: Aleksandr Maus * Fix for the filebeat spec file picking up packetbeat inputs (#700) * Reproduce filebeat picking up packetbeat inputs * Filebeat: filter inputs as first input transform. Move input filtering to be the first input transformation that occurs in the filebeat spec file. Fixes https://github.com/elastic/elastic-agent/issues/427. * Update changelog. * [Automation] Update elastic stack version to 8.4.0-3cd57abb for testing (#724) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.4.0-a324b98b for testing (#727) Co-authored-by: apmmachine * ci: run on MacOS12 (#696) * [Automation] Update elastic stack version to 8.4.0-31315ca3 for testing (#732) Co-authored-by: apmmachine * fix typo on package command (#734) This commit fixes the typo in the package command on the README.md. * Allow / to be used in variable names (#718) * Allow the / character to be used in variable names. Allow / to be used in variable names from dynamic providers and eql expressions. Ensure that k8s providers can provide variables with slashes in their names. * run antlr4 * Fix tests * Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases (#701) * Fix Elastic Agent non-fleet broken upgrade between 8.3.x releases * Migrates vault directory on linux and windows to the top directory of the agent, so it can be shared without needing the upgrade handler call, like for example with side-by-side install/upgrade from .rpm/.deb * Extended vault to allow read-only open, useful when the vault at particular location needs to be only read not created.
* Correct the typo in the log messages * Update lint flagged function comment with 'unused', was flagged with 'deadcode' on the previous run * Address code review feedback * Add missing import for linux utz * Change vault path from Top() to Config(), this a better location, next to fleet.enc based on the install/upgrade testing with .rpm/.deb installs * Fix the missing state migration for .rpm/.deb upgrade. The post install script now performs the migration and creates the symlink after that. * Fix typo in the postinstall script * Update the vault migration code, add the agent configuration match check with the agent secret * [Automation] Update elastic stack version to 8.4.0-31269fd2 for testing (#746) Co-authored-by: apmmachine * wrap errors and fix some docs typo and convention (#743) * automate the ironbank docker context generation (#679) * Update README.md Adding M1 variable to export to be able to build AMD images * fix flaky (#730) * Add filestream ID on standalone kubernetes manifest (#742) This commit add unique IDs for the filestream inputs used by the Kubernetes integration in the Elastic-Agent standalone Kubernetes configuration/manifest file. * Alter github action to run on different OSs (#769) Alter the linter action to run on different OSs instead of on linux with the $GOOS env var. 
* [Automation] Update elastic stack version to 8.4.0-d058e92f for testing (#771) Co-authored-by: apmmachine * elastic-agent manifests: add comments; add cloudnative team as a codeowner for the k8s manifests (#708) * managed elastic-agent: add comments; add cloudnative team as a codeowner for the k8s manifests Signed-off-by: Tetiana Kravchenko * add comments to the standalone elastic-agent, similar to the documentation we have https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-standalone.html Signed-off-by: Tetiana Kravchenko * Apply suggestions from code review Co-authored-by: Michael Katsoulis Co-authored-by: Andrew Gizas * remove comment for FLEET_ENROLLMENT_TOKEN; use Needed everywhere instead of Required Signed-off-by: Tetiana Kravchenko * rephrase regarding accessing kube-state-metrics when used third party tools, like kube-rbac-proxy Signed-off-by: Tetiana Kravchenko * run make check Signed-off-by: Tetiana Kravchenko * keep manifests in sync to pass ci check Signed-off-by: Tetiana Kravchenko * add info on where to find FLEET_URL and FLEET_ENROLLMENT_TOKEN Signed-off-by: Tetiana Kravchenko * add links to elastic-agent documentation Signed-off-by: Tetiana Kravchenko * update comment on FLEET_ENROLLMENT_TOKEN Signed-off-by: Tetiana Kravchenko Co-authored-by: Michael Katsoulis Co-authored-by: Andrew Gizas * [Elastic-Agent] Added source uri reloading (#686) * Update will cleanup unneeded artifacts. (#752) * Update will cleanup unneeded artifacts. The update process will cleanup unneeded artifacts. When an update starts all artifacts that do not have the current version number in it's name will be removed. If artifact retrieval fails, downloaded artifacts are removed. On a successful upgrade, all contents of the downloads dir will be removed. 
* Clean up linter warnings * Wrap errors * cleanup tests * Fix passed version * Use os.RemoveAll * ci: propagate e2e-testing errors (#695) * [Release] add-backport-next (#784) * Update main to 8.5.0 (#793) * [Automation] Update go release version to 1.17.12 (#726) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.4.0-60171339 for testing (#799) Co-authored-by: apmmachine * update dependency elastic/go-structform from v0.0.9 to v0.0.10 (#802) Signed-off-by: Florian Lehner * Fix unpacking of artifact config (#776) Fix unpacking of artifact config (#776) * [Automation] Update elastic stack version to 8.5.0-c54c3404 for testing (#826) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.5.0-7dbc10f8 for testing (#833) Co-authored-by: apmmachine * Fix RPM/DEB clean install (#816) * Fix RPM/DEB clean install * Improve the post install script * Do not try to copy the state files if the agent directory is the same, this causes the error. * Check the existence of symlink instead of the file it is pointing to for the state file migration. * Update check for symlink existence for the cases where the symlink points to non-existent file * fix path for auto generated spec file (#859) Signed-off-by: Florian Lehner * Reload downloader client on config change (#848) Reload downloader client on config change (#848) * Bundle elastic-agent.app for MacOS, needed to be able to enable the … (#714) * Bundle elastic-agent.app for MacOS, needed to be able to enable the Full Disk Access * Calm down the linter * Fix pathing for windows unit test * crossbuild: add fix to set ulimit for debian images (#856) Signed-off-by: Florian Lehner * [Heartbeat] Cleanup docker install / always add playwright deps (#764) This is the agent counterpart to elastic/beats#32122 Refactors Dockerfile handling of synthetics deps to rely on playwright install-deps rather than us manually keeping up to date with those.
This should fix issues with newer playwrights needing additional deps. This also cleans up the Dockerfile a good amount, and fixes indentation. Finally, this removes the unused Dockerfile.elastic-agent.tmpl file since agent is now its own repo. It also cleans up some other metadata that no longer does anything. No changelog is specified because no user facing changes are present. * [Automation] Update elastic stack version to 8.5.0-41aadc32 for testing (#889) Co-authored-by: apmmachine * Fix/panic with composable renderer (#823) * Fix a panic with wg passed to the composable object In the code that retrieves the variables from the configuration files, we need to pass an execution callback, which is called in a goroutine. This callback can be executed multiple times until the composable renderer is stopped. A problem in the code caused the callback to be called multiple times, which drove the waitgroup's internal counter to a negative value. This commit changes the behavior: it starts the composable renderer and gives it a callback; when the callback receives the variables, it stops the composable's Run method using the context. This ensures that the callback is called a single time and that the variables are correctly retrieved. Fixes: #806 * [Automation] Update go release version to 1.18.5 (#832) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.5.0-60a4c029 for testing (#899) Co-authored-by: apmmachine * Add control-plane toleration to Agent K8S manifests. (#864) * Add toleration to elastic-agent Kubernetes manifests. The toleration with key node-role.kubernetes.io/control-plane is set to replace the deprecated toleration with key node-role.kubernetes.io/master which will be removed by Kubernetes v1.25 * Remove outdated "master" node terminology.
* install mage with go install (#936) * Cloudnative ci automation (#837) This commit provides the relevant Jenkins CI automation to open Pull requests to kibana github repository in order to keep Cloud-Native teams manifests in sync with the manifests that are used into Fleet UI. For full information check #706 Updated .ci/Jenkins file that is triggered upon PR requests of /elastic-agent/deploy/kubernetes/* changes Updated Makefile to add functionality needed to create the extra files for the new prs to kibana remote repository * Reduce memory footprint by reordering struct elements (#804) * Reduce memory footprint by reordering struct elements * rename struct element for linter Signed-off-by: Florian Lehner Signed-off-by: Florian Lehner * [Automation] Update elastic stack version to 8.5.0-6b9f92c0 for testing (#948) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.5.0-0616acda for testing (#963) Co-authored-by: apmmachine * Clarify that this repo is not only docs (#969) * Add Filebeat lumberjack input to spec (#959) Make the lumberjack input available from Agent. Relates: https://github.com/elastic/beats/pull/32175 * [Automation] Update elastic stack version to 8.5.0-dd6f2bb0 for testing (#978) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.5.0-feb644de for testing (#988) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.5.0-7783a03c for testing (#1004) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.5.0-17b8a62d for testing (#1014) Co-authored-by: apmmachine * update ironbank image product name (#1009) This is required to automate the creation of the ironbank merge requests as the ubireleaser is using this field to compute the elastic-agent artifact url. 
For example it is now trying to retrieve https://artifacts.elastic.co/downloads/beats/elastic-agent-8.4.0-linux-x86_64.tar.gz instead of https://artifacts.elastic.co/downloads/beats/elastic-agent/elastic-agent-8.4.0-linux-x86_64.tar.gz * ci: add extended support for windows (#683) * [Automation] Update elastic stack version to 8.5.0-9aed3b11 for testing (#1030) Co-authored-by: apmmachine * Cloudnative ci utomation (#1035) * Updating Jenkinsfile and Makefile to open PR * Adding needed token-id * [Automation] Update elastic stack version to 8.5.0-fedc3e60 for testing (#1054) Co-authored-by: apmmachine * Testing PR creation for 706 (#1049) * Fix lookup issues with inputs.d fragment yml (#840) * Fix lookup issues with inputs.d fragment yml The Elastic Agent was looking next to the binary for the `inputs.d` folder instead it should look up into the `Home` folder where the Elastic Agent symlink is located. Fixes: #663 * Changelog * Fix input.d path, tie to the agent Config() directory * Update CHANGELOG to reflect that the agent configuration directory is used to locate the inputs.d directory Co-authored-by: Aleksandr Maus * [Automation] Update elastic stack version to 8.5.0-b5001a6d for testing (#1064) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.5.0-1bd77fc1 for testing (#1082) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.5.0-167dfc80 for testing (#1091) Co-authored-by: apmmachine * Adding support for v1.25.0 k8s (#1044) * Adding support for v1.25.0 k8s * [Automation] Update elastic stack version to 8.5.0-6b7dda2d for testing (#1101) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.5.0-4140365c for testing (#1114) Co-authored-by: apmmachine * Remove experimental warning log in upgrade command (#1106) * Update go.mod to Go 1.18, update notice. 
(#1120) * Remove the fleet reporter (#1130) * Remove the fleet reporter Remove the fleet-reporter so that checkins no longer deliver the event list. * add CHANGELOG fix tests * [Automation] Update elastic stack version to 8.5.0-589a4a10 for testing (#1147) Co-authored-by: apmmachine * Avoid reporting `Unhealthy` on fleet connectivity issues (#1152) Avoid reporting `Unhealthy` on fleet connectivity issues (#1152) * ci: enable MacOS M1 stages (#1123) * [Automation] Update go release version to 1.18.6 (#1143) * [Automation] Update elastic stack version to 8.5.0-37418cf3 for testing (#1165) Co-authored-by: apmmachine * Remove mage notice in favour of make notice (#1108) The current implementation of mage notice is not working because it was never finalised, the fact that it and `make notice` exist only generates confusion. This commit removes the `mage notice` and documents that `make notice` should be used instead for the time being. In the long run we want to use the implementation on `elastic-agent-libs`, however it is not working at the moment. Closes #1107 Co-authored-by: Craig MacKenzie * ci: run e2e-testing at the end (#1169) * ci: move macos to github actions (#1175) * [Automation] Update elastic stack version to 8.5.0-fcf3d4c2 for testing (#1183) Co-authored-by: apmmachine * Add support for hints' based autodiscovery in kubernetes provider (#698) * ci: increase timeout (#1190) * Fixing condition for PR creation (#1188) * Fix leftover log level (#1194) * [automation] Publish kubernetes templates for elastic-agent (#1192) Co-authored-by: apmmachine * ci: force GO_VERSION (#1204) * Fix whitespaces in vault_darwin.c (#1206) * Update kubernetes templates for elastic-agent [templates.d] (#1231) * Use at least warning level for all status logs (#1218) * Update k8s manifests to leverage hints (#1202) * Add Go 1.18 upgrade to breaking changes section. (#1216) * Add Go 1.18 upgrade to breaking changes section. * Fix the PR number in the changelog. 
* [Release] add-backport-next (#1254) * Bump version to 8.6.0. (#1259) * [Automation] Update elastic stack version to 8.5.0-7dc445a0 for testing (#1248) Co-authored-by: apmmachine * Fix: Endpoint collision between monitoring and regular beats (#1034) Fix: Endpoint collision between monitoring and regular beats (#1034) * internal/pkg/agent/cmd: don't format error message with nil errors (#1240) The failure conditions allow nil errors to result in an error being formatted, when formatting due to a non-accepted HTTP status code and a nil error, omit the error. Co-authored-by: Craig MacKenzie * [Automation] Update elastic stack version to 8.6.0-21651da3 for testing (#1290) Co-authored-by: apmmachine * Fixed: source uri reload for download/verify components (#1252) Fixed: source uri reload for download/verify components (#1252) * Expand status reporter/controller interfaces to allow local reporters (#1285) * Expand status reporter/controller interfaces to allow local reporters Add a local reporter map to the status controller. These reporters are not used when updating status with fleet-server, they are only used to gather local state information - specifically if the agent is degraded because checkin with fleet-server has failed. This bypasses the bug that was introduced with the liveness endpoint where the agent could checkin (to fleet-server) with a degraded status because a previous checkin failed. Local reporters are used to generate a separate status. This status is used in the liveness endpoint. * fix linter * Improve logging for agent upgrades. 
(#1287) * [Automation] Update elastic stack version to 8.6.0-326f84b0 for testing (#1318) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.6.0-df00693f for testing (#1334) Co-authored-by: apmmachine * Add success log message after previous checkin failures (#1327) * Fix status reporter initialization (#1341) * [Automation] Update elastic stack version to 8.6.0-a2f4f140 for testing (#1362) Co-authored-by: apmmachine * Added status message to CheckinRequest (#1369) * Added status message to CheckinRequest * added changelog * updated test * added omitempty * Fix failures when using npipe monitoring endpoints (#1371) * [Automation] Update elastic stack version to 8.6.0-158a13db for testing (#1379) Co-authored-by: apmmachine * Mount /etc directory in Kubernetes DaemonSet manifests. (#1382) Changes made to files like `/etc/passwd` using Linux tools like `useradd` are not reflected in the mounted file on the Agent, because the tool replaces the file instead of changing it in-place. Mounting the parent directory solves this problem. * [Automation] Update elastic stack version to 8.6.0-aea1c645 for testing (#1405) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.6.0-0fca2953 for testing (#1412) Co-authored-by: apmmachine * ci: 7.17 is not available for the daily run (#1417) * [Automation] Update elastic stack version to 8.6.0-e4c15f15 for testing (#1425) Co-authored-by: apmmachine * [backport main] Fix: Agent failed to upgrade from 8.4.2 to 8.5.0 BC1 for MAC 12 agent using agent binary. (#1401) [backport main] Fix: Agent failed to upgrade from 8.4.2 to 8.5.0 BC1 for MAC 12 agent using agent binary. (#1401) * Fix docker provider add_fields processors (#1420) The Docker provider was using a wrong key when defining the `add_fields` processor, which caused Filebeat not to start the input and to stay in an unhealthy state. This commit fixes it.
Fixes https://github.com/elastic/beats/issues/29030 * [Automation] Update elastic stack version to 8.6.0-d939cfde for testing (#1436) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.6.0-7c9f25a9 for testing (#1446) Co-authored-by: apmmachine * Enable integration only when datastreams are not defined (#1456) * Add not dedoted k8s pod labels in autodiscover provider to be used for templating, exactly like annotations (#1398) * [Automation] Update elastic stack version to 8.6.0-c49fac70 for testing (#1464) Co-authored-by: apmmachine * Add storageclass permissions in agent clusterrole (#1470) * Add storageclass permissions in agent clusterrole * Remote QA-labels automation (#1455) * [Automation] Update go release version to 1.18.7 (#1444) Co-authored-by: apmmachine * [Automation] Update elastic stack version to 8.6.0-5a8d757d for testing (#1480) Co-authored-by: apmmachine * Improve logging around agent checkins. (#1477) Improve logging around agent checkins. - Log transient checkin errors at Info. - Upgrade to an Error log after 2 repeated failures. - Log the wait time for the next retry. - Only update local state after repeated failures. * [Automation] Update elastic stack version to 8.6.0-40086bc7 for testing (#1496) Co-authored-by: apmmachine * Fixing makefile check (#1490) * Fixing makefile check * action: validate changelog fragment (#1488) * Allign managed with standalone role (#1500) * Fix k8s template link versioning (#1504) * Allighningmanifests (#1507) * Allign managed with standalone role * Fixing missing Label * [Automation] Update elastic stack version to 8.6.0-233dc5d4 for testing (#1515) Co-authored-by: apmmachine * Convert CHANGELOG.next to fragments (#1244) * [Automation] Update elastic stack version to 8.6.0-54a302f0 for testing (#1531) Co-authored-by: apmmachine * Update the linter configuration. (#1478) Sync the configuration with the one used in Beats, which has disabled the majority of the least useful linters already. 
* Elastic agent counterpart of https://github.com/elastic/beats/pull/33362 (#1528) Always use the stack_release label for npm i No changelog necessary since there are no user-visible changes This lets us ensure we've carefully reviewed and labeled the version of the @elastic/synthetics NPM library that's bundled in docker images * [Automation] Update elastic stack version to 8.6.0-cae815eb for testing (#1545) Co-authored-by: apmmachine * Fix admin permission check on localized windows (#1552) Fix admin permission check on localized windows (#1552) * Fixes from merge of main. * Update heartbeat specification to only support elasticsearch. * Fix bad merge in dockerfile. Signed-off-by: Florian Lehner Co-authored-by: apmmachine <58790750+apmmachine@users.noreply.github.com> Co-authored-by: apmmachine Co-authored-by: Pier-Hugues Pellerin Co-authored-by: Denis Rechkunov Co-authored-by: Michel Laterman <82832767+michel-laterman@users.noreply.github.com> Co-authored-by: Aleksandr Maus Co-authored-by: Victor Martinez Co-authored-by: Manuel de la Peña Co-authored-by: Anderson Queiroz Co-authored-by: Daniel Araujo Almeida Co-authored-by: Mariana Dima Co-authored-by: ofiriro3 Co-authored-by: Julien Lind Co-authored-by: Craig MacKenzie Co-authored-by: Tiago Queiroz Co-authored-by: Pierre HILBERT Co-authored-by: Tetiana Kravchenko Co-authored-by: Michael Katsoulis Co-authored-by: Andrew Gizas Co-authored-by: Michal Pristas Co-authored-by: Elastic Machine Co-authored-by: Florian Lehner Co-authored-by: Andrew Cholakian Co-authored-by: Yash Tewari Co-authored-by: Quentin Pradet Co-authored-by: Andrew Kroh Co-authored-by: Julien Mailleret <8582351+jmlrt@users.noreply.github.com> Co-authored-by: Josh Dover <1813008+joshdover@users.noreply.github.com> Co-authored-by: Chris Mark Co-authored-by: apmmachine Co-authored-by: Dan Kortschak <90160302+efd6@users.noreply.github.com> Co-authored-by: Julia Bardi <90178898+juliaElastic@users.noreply.github.com> Co-authored-by: Edoardo Tenani 
<526307+endorama@users.noreply.github.com> * Add input name alias for cloudbeat integrations (#1596) * add name alias for cloudbeat * add anchors for yaml fields * add EKS input * Change the stater to include a local flag. (#1308) * Change the stater to include a local flag. Change the state reporter to use a local flag that determines if local errors are included in the resulting state. Assume that configMgr errors are all local - this mainly affects the fleet_gateway. Allow the gateway to report an error if a checkin fails. When a checkin fails, the local state reported through the status command and liveness endpoint will include the error, but checkins to fleet-server will not. * Add ActionsError() method to config manager Add a new ActionsError() method to the config managers. For the non-managed instances it will return a nil channel. For the managed instances it will return the dispatcher error queue directly. Have the coordinator gather from this channel as it does for the others and treat any errors as non-local. * Fix linter * Service runtime V2 (#1529) * Service V2 runtime * Implements service runtime component for V2. * Extends endpoint spec with some additional attributes for service start/stop/status checks and creds discovery. The creds discovery logic is taken from V1, cleaned up and extracted into its own file, added utz. * Implements service uninstall * Refactors pkg/core/process/process.go adds additional options that are needed for the service_command implementation.
* Changes ComponentsModifier to access raw config, needed for the EndpointComponentModifier * Injects host.id into configuration, needed for Endpoint * Injects fleet and policy.revision configuration into the Endpoint input configuration * Bumps the version to 8.6.0 to make it consistent with current beats V2 branch * Addresses linter complains on affected files * Remove the service watcher, all the start/stopping logic * Add changelog * Fix typo * Send STOPPING only upon teardown * Wait for check-in with timeout before sending stopping on teardown * Fix the service loop routine blocking on channel after stopped * Addressed code review comments * Make linter happy * Try to fix make check-ci * Spellcheck runtime README.md * Remove .Stop timeout from the spec as it is no longer used * Addressed code review feedback * Sync components with state during container start (#1653) * Sync components with state during container start * path approach * Subprocess reader start. * Implement io.Writer to handle reading stdout/stderr for spawned components. * Don't inject logging args to beats components. Always have beats log to stderr. * Update to v0.2.15 of elastic-agent-libs. * [V2] Enable support for shippers (#1527) * Work on adding shipper support. * Fix fmt. * Fix reference to spec. Allow shipper to be null but still enabled if key exists. * Move supported shippers into its own key in the input specification. * Fix issue in merge. * Implement fake shipper and add fake shipper output to the fake component. * Add protoc to the test target. * Don't generate fake shipper protocol in test. * Commit fake GRPC into code. * Add unit test for running with shipper, with sending event between running componentn and running shipper. * Add docstring for shipper test. * Add changelog fragement. * Adjust paths for shipper to work on windows and better on unix. 
* Update changelog/fragments/1667571017-Add-support-for-running-the-elastic-agent-shipper.yaml Co-authored-by: Craig MacKenzie * Fix fake/component to connect over npipe on windows. Co-authored-by: Craig MacKenzie * More work on the logging. * More fixes. * Change back to streams. * Fix go.mod. * Fix import. * Fix issues with merge of main. * remove log helper. * Add NewWithoutConfig. * Fix the spawned filestream to ingest logs into elasticsearch for monitoring. * Add changelog entry. * Remove debug print. * Update 1669236059-Capture-stdout-stderr-of-all-spawned-components-to-simplify-logging.yaml Signed-off-by: Florian Lehner Co-authored-by: Michal Pristas Co-authored-by: Aleksandr Maus Co-authored-by: Michel Laterman <82832767+michel-laterman@users.noreply.github.com> Co-authored-by: apmmachine <58790750+apmmachine@users.noreply.github.com> Co-authored-by: apmmachine Co-authored-by: Pier-Hugues Pellerin Co-authored-by: Denis Rechkunov Co-authored-by: Victor Martinez Co-authored-by: Manuel de la Peña Co-authored-by: Anderson Queiroz Co-authored-by: Daniel Araujo Almeida Co-authored-by: Mariana Dima Co-authored-by: ofiriro3 Co-authored-by: Julien Lind Co-authored-by: Craig MacKenzie Co-authored-by: Tiago Queiroz Co-authored-by: Pierre HILBERT Co-authored-by: Tetiana Kravchenko Co-authored-by: Michael Katsoulis Co-authored-by: Andrew Gizas Co-authored-by: Elastic Machine Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> Co-authored-by: Anderson Queiroz Co-authored-by: Florian Lehner Co-authored-by: Andrew Cholakian Co-authored-by: Yash Tewari Co-authored-by: Quentin Pradet Co-authored-by: Andrew Kroh Co-authored-by: Julien Mailleret <8582351+jmlrt@users.noreply.github.com> Co-authored-by: Josh Dover <1813008+joshdover@users.noreply.github.com> Co-authored-by: Chris Mark Co-authored-by: apmmachine Co-authored-by: Dan Kortschak <90160302+efd6@users.noreply.github.com> Co-authored-by: Julia Bardi 
<90178898+juliaElastic@users.noreply.github.com> Co-authored-by: Edoardo Tenani <526307+endorama@users.noreply.github.com> Co-authored-by: Alex K <8418476+fearful-symmetry@users.noreply.github.com> (cherry picked from commit 7a748fa0fdf4ab786583c7a38169d099b58a7c02) Co-authored-by: Blake Rouse --- .gitignore | 1 - NOTICE.txt | 4 +- ...pawned-components-to-simplify-logging.yaml | 31 +++ go.mod | 2 +- go.sum | 4 +- .../application/monitoring/v1_monitor.go | 178 ++++++----------- pkg/component/runtime/command.go | 35 +++- pkg/component/runtime/log_writer.go | 171 ++++++++++++++++ pkg/component/runtime/log_writer_test.go | 186 ++++++++++++++++++ pkg/component/runtime/runtime.go | 4 +- pkg/component/spec.go | 18 ++ pkg/core/logger/logger.go | 7 + specs/apm-server.spec.yml | 48 ++--- specs/auditbeat.spec.yml | 88 +++++---- specs/cloudbeat.spec.yml | 80 ++++---- specs/filebeat.spec.yml | 4 +- specs/heartbeat.spec.yml | 4 +- specs/metricbeat.spec.yml | 4 +- specs/osquerybeat.spec.yml | 54 ++--- specs/packetbeat.spec.yml | 60 +++--- 20 files changed, 682 insertions(+), 301 deletions(-) create mode 100644 changelog/fragments/1669236059-Capture-stdout-stderr-of-all-spawned-components-to-simplify-logging.yaml create mode 100644 pkg/component/runtime/log_writer.go create mode 100644 pkg/component/runtime/log_writer_test.go diff --git a/.gitignore b/.gitignore index 57546893fb4..476cfd50764 100644 --- a/.gitignore +++ b/.gitignore @@ -45,7 +45,6 @@ fleet.enc.lock # Files generated with the bump version automations *.bck - # agent build/ elastic-agent diff --git a/NOTICE.txt b/NOTICE.txt index 7bc5103d040..cdd71e1a34f 100644 --- a/NOTICE.txt +++ b/NOTICE.txt @@ -1273,11 +1273,11 @@ SOFTWARE -------------------------------------------------------------------------------- Dependency : github.com/elastic/elastic-agent-libs -Version: v0.2.6 +Version: v0.2.15 Licence type (autodetected): Apache-2.0 -------------------------------------------------------------------------------- 
-Contents of probable licence file $GOMODCACHE/github.com/elastic/elastic-agent-libs@v0.2.6/LICENSE: +Contents of probable licence file $GOMODCACHE/github.com/elastic/elastic-agent-libs@v0.2.15/LICENSE: Apache License Version 2.0, January 2004 diff --git a/changelog/fragments/1669236059-Capture-stdout-stderr-of-all-spawned-components-to-simplify-logging.yaml b/changelog/fragments/1669236059-Capture-stdout-stderr-of-all-spawned-components-to-simplify-logging.yaml new file mode 100644 index 00000000000..8dfa6a9aa2f --- /dev/null +++ b/changelog/fragments/1669236059-Capture-stdout-stderr-of-all-spawned-components-to-simplify-logging.yaml @@ -0,0 +1,31 @@ +# Kind can be one of: +# - breaking-change: a change to previously-documented behavior +# - deprecation: functionality that is being removed in a later release +# - bug-fix: fixes a problem in a previous version +# - enhancement: extends functionality but does not break or fix existing behavior +# - feature: new functionality +# - known-issue: problems that we are aware of in a given version +# - security: impacts on the security of a product or a user’s deployment. +# - upgrade: important information for someone upgrading from a prior version +# - other: does not fit into any of the other categories +kind: feature + +# Change summary; a 80ish characters long description of the change. +summary: Capture stdout/stderr of all spawned components and adjust default log level to info for all components + +# Long description; in case the summary is not enough to describe the change +# this field accommodate a description without length limits. +#description: + +# Affected component; a word indicating the component this changeset affects. +component: + +# PR number; optional; the PR number that added the changeset. +# If not present is automatically filled by the tooling finding the PR where this changelog fragment has been added. 
+# NOTE: the tooling supports backports, so it's able to fill the original PR number instead of the backport PR number. +# Please provide it if you are adding a fragment for a different PR. +pr: 1702 + +# Issue number; optional; the GitHub issue related to this changeset (either closes or is part of). +# If not present is automatically filled by the tooling with the issue linked to the PR number. +issue: 221 diff --git a/go.mod b/go.mod index df1845dff01..4d44c91ae45 100644 --- a/go.mod +++ b/go.mod @@ -14,7 +14,7 @@ require ( github.com/elastic/e2e-testing v1.99.2-0.20220117192005-d3365c99b9c4 github.com/elastic/elastic-agent-autodiscover v0.2.1 github.com/elastic/elastic-agent-client/v7 v7.0.0-20220804181728-b0328d2fe484 - github.com/elastic/elastic-agent-libs v0.2.6 + github.com/elastic/elastic-agent-libs v0.2.15 github.com/elastic/elastic-agent-system-metrics v0.4.4 github.com/elastic/go-licenser v0.4.0 github.com/elastic/go-sysinfo v1.8.1 diff --git a/go.sum b/go.sum index 73ded2d2cf3..ac08a20814c 100644 --- a/go.sum +++ b/go.sum @@ -387,8 +387,8 @@ github.com/elastic/elastic-agent-autodiscover v0.2.1/go.mod h1:gPnzzfdYNdgznAb+i github.com/elastic/elastic-agent-client/v7 v7.0.0-20220804181728-b0328d2fe484 h1:uJIMfLgCenJvxsVmEjBjYGxt0JddCgw2IxgoNfcIXOk= github.com/elastic/elastic-agent-client/v7 v7.0.0-20220804181728-b0328d2fe484/go.mod h1:fkvyUfFwyAG5OnMF0h+FV9sC0Xn9YLITwQpSuwungQs= github.com/elastic/elastic-agent-libs v0.2.5/go.mod h1:chO3rtcLyGlKi9S0iGVZhYCzDfdDsAQYBc+ui588AFE= -github.com/elastic/elastic-agent-libs v0.2.6 h1:DpcUcCVYZ7lNtHLUlyT1u/GtGAh49wpL15DTH7+8O5o= -github.com/elastic/elastic-agent-libs v0.2.6/go.mod h1:chO3rtcLyGlKi9S0iGVZhYCzDfdDsAQYBc+ui588AFE= +github.com/elastic/elastic-agent-libs v0.2.15 h1:hdAbrZZ2mCPcQLRCE3E8xw3mHKl8HFMt36w7jan/XGo= +github.com/elastic/elastic-agent-libs v0.2.15/go.mod h1:0J9lzJh+BjttIiVjYDLncKYCEWUUHiiqnuI64y6C6ss= github.com/elastic/elastic-agent-system-metrics v0.4.4 
h1:Br3S+TlBhijrLysOvbHscFhgQ00X/trDT5VEnOau0E0= github.com/elastic/elastic-agent-system-metrics v0.4.4/go.mod h1:tF/f9Off38nfzTZHIVQ++FkXrDm9keFhFpJ+3pQ00iI= github.com/elastic/elastic-package v0.32.1/go.mod h1:l1fEnF52XRBL6a5h6uAemtdViz2bjtjUtgdQcuRhEAY= diff --git a/internal/pkg/agent/application/monitoring/v1_monitor.go b/internal/pkg/agent/application/monitoring/v1_monitor.go index 24813cac0d6..70a6e062063 100644 --- a/internal/pkg/agent/application/monitoring/v1_monitor.go +++ b/internal/pkg/agent/application/monitoring/v1_monitor.go @@ -58,7 +58,7 @@ var ( supportedBeatsComponents = []string{"filebeat", "metricbeat", "apm-server", "fleet-server", "auditbeat", "cloudbeat", "heartbeat", "osquerybeat", "packetbeat"} ) -// Beats monitor is providing V1 monitoring support. +// BeatsMonitor is providing V1 monitoring support for metrics and logs for endpoint-security only. type BeatsMonitor struct { enabled bool // feature flag disabling whole v1 monitoring story config *monitoringConfig @@ -178,21 +178,10 @@ func (b *BeatsMonitor) EnrichArgs(unit, binary string, args []string) []string { } } - loggingPath := loggingPath(unit, b.operatingSystem) - if loggingPath != "" { + if !b.config.C.LogMetrics { appendix = append(appendix, - "-E", "logging.files.path="+filepath.Dir(loggingPath), - "-E", "logging.files.name="+filepath.Base(loggingPath), - "-E", "logging.files.keepfiles=7", - "-E", "logging.files.permission=0640", - "-E", "logging.files.interval=1h", + "-E", "logging.metrics.enabled=false", ) - - if !b.config.C.LogMetrics { - appendix = append(appendix, - "-E", "logging.metrics.enabled=false", - ) - } } return append(args, appendix...) 
@@ -291,24 +280,21 @@ func (b *BeatsMonitor) injectMonitoringOutput(source, dest map[string]interface{ func (b *BeatsMonitor) injectLogsInput(cfg map[string]interface{}, componentIDToBinary map[string]string, monitoringOutput string) error { monitoringNamespace := b.monitoringNamespace() - //fixedAgentName := strings.ReplaceAll(agentName, "-", "_") logsDrop := filepath.Dir(loggingPath("unit", b.operatingSystem)) streams := []interface{}{ map[string]interface{}{ - idKey: "filestream-monitoring-agent", - // "data_stream" is not used when creating an Input on Filebeat - "data_stream": map[string]interface{}{ - "type": "filestream", - "dataset": "elastic_agent", - "namespace": monitoringNamespace, - }, + idKey: "filestream-monitoring-agent", "type": "filestream", "paths": []interface{}{ filepath.Join(logsDrop, agentName+"-*.ndjson"), filepath.Join(logsDrop, agentName+"-watcher-*.ndjson"), }, - "index": fmt.Sprintf("logs-elastic_agent-%s", monitoringNamespace), + "data_stream": map[string]interface{}{ + "type": "logs", + "dataset": "elastic_agent", + "namespace": monitoringNamespace, + }, "close": map[string]interface{}{ "on_state_change": map[string]interface{}{ "inactive": "5m", @@ -325,133 +311,86 @@ func (b *BeatsMonitor) injectLogsInput(cfg map[string]interface{}, componentIDTo }, }, "processors": []interface{}{ + // copy original dataset so we can drop the dataset field map[string]interface{}{ - "add_fields": map[string]interface{}{ - "target": "data_stream", - "fields": map[string]interface{}{ - "type": "logs", - "dataset": "elastic_agent", - "namespace": monitoringNamespace, - }, - }, - }, - map[string]interface{}{ - "add_fields": map[string]interface{}{ - "target": "event", - "fields": map[string]interface{}{ - "dataset": "elastic_agent", - }, - }, - }, - map[string]interface{}{ - "add_fields": map[string]interface{}{ - "target": "elastic_agent", - "fields": map[string]interface{}{ - "id": b.agentInfo.AgentID(), - "version": b.agentInfo.Version(), - "snapshot": 
b.agentInfo.Snapshot(), - }, - }, - }, - map[string]interface{}{ - "add_fields": map[string]interface{}{ - "target": "agent", - "fields": map[string]interface{}{ - "id": b.agentInfo.AgentID(), + "copy_fields": map[string]interface{}{ + "fields": []interface{}{ + map[string]interface{}{ + "from": "data_stream.dataset", + "to": "data_stream.dataset_original", + }, }, }, }, + // drop the dataset field so following copy_field can copy to it map[string]interface{}{ "drop_fields": map[string]interface{}{ "fields": []interface{}{ - "ecs.version", //coming from logger, already added by libbeat + "data_stream.dataset", }, - "ignore_missing": true, - }, - }}, - }, - } - for unit, binaryName := range componentIDToBinary { - if !isSupportedBinary(binaryName) { - continue - } - - fixedBinaryName := strings.ReplaceAll(binaryName, "-", "_") - name := strings.ReplaceAll(unit, "-", "_") // conform with index naming policy - logFile := loggingPath(unit, b.operatingSystem) - streams = append(streams, map[string]interface{}{ - idKey: "filestream-monitoring-" + name, - "data_stream": map[string]interface{}{ - // "data_stream" is not used when creating an Input on Filebeat - "type": "filestream", - "dataset": fmt.Sprintf("elastic_agent.%s", fixedBinaryName), - "namespace": monitoringNamespace, - }, - "type": "filestream", - "index": fmt.Sprintf("logs-elastic_agent.%s-%s", fixedBinaryName, monitoringNamespace), - "paths": []interface{}{logFile, logFile + "*"}, - "close": map[string]interface{}{ - "on_state_change": map[string]interface{}{ - "inactive": "5m", - }, - }, - "parsers": []interface{}{ - map[string]interface{}{ - "ndjson": map[string]interface{}{ - "message_key": "message", - "overwrite_keys": true, - "add_error_key": true, - "target": "", }, }, - }, - "processors": []interface{}{ + // copy component.dataset as the real dataset map[string]interface{}{ - "add_fields": map[string]interface{}{ - "target": "data_stream", - "fields": map[string]interface{}{ - "type": "logs", - 
"dataset": fmt.Sprintf("elastic_agent.%s", fixedBinaryName), - "namespace": monitoringNamespace, + "copy_fields": map[string]interface{}{ + "fields": []interface{}{ + map[string]interface{}{ + "from": "component.dataset", + "to": "data_stream.dataset", + }, }, + "fail_on_error": false, + "ignore_missing": true, }, }, + // possible it's a log message from agent itself (doesn't have component.dataset) map[string]interface{}{ - "add_fields": map[string]interface{}{ - "target": "event", - "fields": map[string]interface{}{ - "dataset": fmt.Sprintf("elastic_agent.%s", fixedBinaryName), + "copy_fields": map[string]interface{}{ + "fields": []interface{}{ + map[string]interface{}{ + "from": "data_stream.dataset_original", + "to": "data_stream.dataset", + }, }, + "fail_on_error": false, }, }, + // drop the original dataset copied and the event.dataset (as it will be updated) map[string]interface{}{ - "add_fields": map[string]interface{}{ - "target": "elastic_agent", - "fields": map[string]interface{}{ - "id": b.agentInfo.AgentID(), - "version": b.agentInfo.Version(), - "snapshot": b.agentInfo.Snapshot(), + "drop_fields": map[string]interface{}{ + "fields": []interface{}{ + "data_stream.dataset_original", + "event.dataset", }, }, }, + // update event.dataset with the now used data_stream.dataset map[string]interface{}{ - "add_fields": map[string]interface{}{ - "target": "agent", - "fields": map[string]interface{}{ - "id": b.agentInfo.AgentID(), + "copy_fields": map[string]interface{}{ + "fields": []interface{}{ + map[string]interface{}{ + "from": "data_stream.dataset", + "to": "event.dataset", + }, }, }, }, + // coming from logger, added by agent (drop) map[string]interface{}{ "drop_fields": map[string]interface{}{ "fields": []interface{}{ - "ecs.version", //coming from logger, already added by libbeat + "ecs.version", }, "ignore_missing": true, }, }, - }, - }) + // adjust destination data_stream based on the data_stream fields + map[string]interface{}{ + 
"add_formatted_index": map[string]interface{}{ + "index": "%{[data_stream.type]}-%{[data_stream.dataset]}-%{[data_stream.namespace]}", + }, + }}, + }, } inputs := []interface{}{ @@ -460,10 +399,7 @@ func (b *BeatsMonitor) injectLogsInput(cfg map[string]interface{}, componentIDTo "name": "filestream-monitoring-agent", "type": "filestream", useOutputKey: monitoringOutput, - "data_stream": map[string]interface{}{ - "namespace": monitoringNamespace, - }, - "streams": streams, + "streams": streams, }, } inputsNode, found := cfg[inputsKey] diff --git a/pkg/component/runtime/command.go b/pkg/component/runtime/command.go index 2575a35d5f1..405a4329db5 100644 --- a/pkg/component/runtime/command.go +++ b/pkg/component/runtime/command.go @@ -12,14 +12,16 @@ import ( "os/exec" "path/filepath" "runtime" + "strings" "time" - "github.com/elastic/elastic-agent/internal/pkg/agent/application/paths" - "github.com/elastic/elastic-agent/pkg/utils" - "github.com/elastic/elastic-agent-client/v7/pkg/client" + + "github.com/elastic/elastic-agent/internal/pkg/agent/application/paths" "github.com/elastic/elastic-agent/pkg/component" + "github.com/elastic/elastic-agent/pkg/core/logger" "github.com/elastic/elastic-agent/pkg/core/process" + "github.com/elastic/elastic-agent/pkg/utils" ) type actionMode int @@ -50,6 +52,7 @@ type procState struct { // CommandRuntime provides the command runtime for running a component as a subprocess. type CommandRuntime struct { + logger *logger.Logger current component.Component monitor MonitoringManager @@ -67,7 +70,7 @@ type CommandRuntime struct { } // NewCommandRuntime creates a new command runtime for the provided component. 
-func NewCommandRuntime(comp component.Component, monitor MonitoringManager) (ComponentRuntime, error) { +func NewCommandRuntime(comp component.Component, logger *logger.Logger, monitor MonitoringManager) (ComponentRuntime, error) { c := &CommandRuntime{ current: comp, monitor: monitor, @@ -82,6 +85,11 @@ func NewCommandRuntime(comp component.Component, monitor MonitoringManager) (Com if cmdSpec == nil { return nil, errors.New("must have command defined in specification") } + c.logger = logger.With("component", map[string]interface{}{ + "id": comp.ID, + "type": c.getSpecType(), + "binary": c.getSpecBinaryName(), + }) return c, nil } @@ -306,7 +314,7 @@ func (c *CommandRuntime) start(comm Communicator) error { proc, err := process.Start(path, process.WithArgs(args), process.WithEnv(env), - process.WithCmdOptions(attachOutErr, dirPath(workDir))) + process.WithCmdOptions(attachOutErr(c.current, c.getCommandSpec(), c.getSpecType(), c.getSpecBinaryName()), dirPath(workDir))) if err != nil { return err } @@ -452,10 +460,19 @@ func (c *CommandRuntime) getCommandSpec() *component.CommandSpec { return nil } -func attachOutErr(cmd *exec.Cmd) error { - cmd.Stdout = os.Stdout - cmd.Stderr = os.Stderr - return nil +func attachOutErr(comp component.Component, cmdSpec *component.CommandSpec, typeStr string, binaryName string) process.CmdOption { + return func(cmd *exec.Cmd) error { + dataset := fmt.Sprintf("elastic_agent.%s", strings.ReplaceAll(strings.ReplaceAll(comp.ID, "-", "_"), "/", "_")) + logger := logger.NewWithoutConfig("").With("component", map[string]interface{}{ + "id": comp.ID, + "type": typeStr, + "binary": binaryName, + "dataset": dataset, + }) + cmd.Stdout = newLogWriter(logger.Core(), cmdSpec.Log) + cmd.Stderr = newLogWriter(logger.Core(), cmdSpec.Log) + return nil + } } func dirPath(path string) process.CmdOption { diff --git a/pkg/component/runtime/log_writer.go b/pkg/component/runtime/log_writer.go new file mode 100644 index 00000000000..6825769f364 --- 
/dev/null +++ b/pkg/component/runtime/log_writer.go @@ -0,0 +1,171 @@ +// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one +// or more contributor license agreements. Licensed under the Elastic License; +// you may not use this file except in compliance with the Elastic License. + +package runtime + +import ( + "bytes" + "encoding/json" + "errors" + "strings" + "time" + + "go.uber.org/zap" + "go.uber.org/zap/zapcore" + + "github.com/elastic/elastic-agent/pkg/component" +) + +type zapcoreWriter interface { + Write(zapcore.Entry, []zapcore.Field) error +} + +// logWriter is an `io.Writer` that takes lines and passes them through the logger. +// +// `Write` handles parsing lines as either ndjson or plain text. +type logWriter struct { + loggerCore zapcoreWriter + logCfg component.CommandLogSpec + remainder []byte +} + +func newLogWriter(core zapcoreWriter, logCfg component.CommandLogSpec) *logWriter { + return &logWriter{ + loggerCore: core, + logCfg: logCfg, + } +} + +func (r *logWriter) Write(p []byte) (int, error) { + if len(p) == 0 { + // nothing to do + return 0, nil + } + offset := 0 + for { + idx := bytes.IndexByte(p[offset:], '\n') + if idx < 0 { + // not all used add to remainder to be used on next call + r.remainder = append(r.remainder, p[offset:]...) + return len(p), nil + } + + var line []byte + if r.remainder != nil { + line = r.remainder + r.remainder = nil + line = append(line, p[offset:offset+idx]...) + } else { + line = append(line, p[offset:offset+idx]...) 
+ } + offset += idx + 1 + // drop '\r' from line (needed for Windows) + if len(line) > 0 && line[len(line)-1] == '\r' { + line = line[0 : len(line)-1] + } + str := strings.TrimSpace(string(line)) + if len(str) == 0 { + // empty or whitespace-only line + continue + } + // try to parse line as JSON + if str[0] == '{' && r.handleJSON(str) { + // handled as JSON + continue + } + // treat as plain text since it is not JSON; log at info level + _ = r.loggerCore.Write(zapcore.Entry{ + Level: zapcore.InfoLevel, + Time: time.Now(), + Message: str, + }, nil) + } +} + +func (r *logWriter) handleJSON(line string) bool { + var evt map[string]interface{} + if err := json.Unmarshal([]byte(line), &evt); err != nil { + return false + } + lvl := getLevel(evt, r.logCfg.LevelKey) + ts := getTimestamp(evt, r.logCfg.TimeKey, r.logCfg.TimeFormat) + msg := getMessage(evt, r.logCfg.MessageKey) + fields := getFields(evt, r.logCfg.IgnoreKeys) + _ = r.loggerCore.Write(zapcore.Entry{ + Level: lvl, + Time: ts, + Message: msg, + }, fields) + return true +} + +func getLevel(evt map[string]interface{}, key string) zapcore.Level { + lvl := zapcore.InfoLevel + err := unmarshalLevel(&lvl, getStrVal(evt, key)) + if err == nil { + delete(evt, key) + } + return lvl +} + +func unmarshalLevel(lvl *zapcore.Level, val string) error { + if val == "" { + return errors.New("empty val") + } else if val == "trace" { + // zap has no trace level, so map it to debug + *lvl = zapcore.DebugLevel + return nil + } + return lvl.UnmarshalText([]byte(val)) +} + +func getMessage(evt map[string]interface{}, key string) string { + msg := getStrVal(evt, key) + if msg != "" { + delete(evt, key) + } + return msg +} + +func getTimestamp(evt map[string]interface{}, key string, format string) time.Time { + t, err := time.Parse(format, getStrVal(evt, key)) + if err == nil { + delete(evt, key) + return t + } + return time.Now() +} + +func getFields(evt map[string]interface{}, ignore []string) []zapcore.Field { + fields := 
make([]zapcore.Field, 0, len(evt)) + for k, v := range evt { + if len(ignore) > 0 && contains(ignore, k) { + // ignore field + continue + } + fields = append(fields, zap.Any(k, v)) + } + return fields +} + +func getStrVal(evt map[string]interface{}, key string) string { + raw, ok := evt[key] + if !ok { + return "" + } + str, ok := raw.(string) + if !ok { + return "" + } + return str +} + +func contains(s []string, val string) bool { + for _, v := range s { + if v == val { + return true + } + } + return false +} diff --git a/pkg/component/runtime/log_writer_test.go b/pkg/component/runtime/log_writer_test.go new file mode 100644 index 00000000000..5da512e9f77 --- /dev/null +++ b/pkg/component/runtime/log_writer_test.go @@ -0,0 +1,186 @@ +// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one +// or more contributor license agreements. Licensed under the Elastic License; +// you may not use this file except in compliance with the Elastic License. + +package runtime + +import ( + "sort" + "testing" + "time" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + "go.uber.org/zap" + "go.uber.org/zap/zapcore" + + "github.com/elastic/elastic-agent/pkg/component" +) + +type wrote struct { + entry zapcore.Entry + fields []zapcore.Field +} + +func TestLogWriter(t *testing.T) { + scenarios := []struct { + Name string + Config component.CommandLogSpec + Lines []string + Wrote []wrote + }{ + { + Name: "multi plain text line", + Lines: []string{ + "simple written line\r\n", + "another written line\n", + }, + Wrote: []wrote{ + { + entry: zapcore.Entry{ + Level: zapcore.InfoLevel, + Time: time.Time{}, + Message: "simple written line", + }, + }, + { + entry: zapcore.Entry{ + Level: zapcore.InfoLevel, + Time: time.Time{}, + Message: "another written line", + }, + }, + }, + }, + { + Name: "multi split text line", + Lines: []string{ + "simple written line\r\n", + " another line sp", + "lit on ", + "", + "multi writes\n", + "\r\n", + 
"\n", + }, + Wrote: []wrote{ + { + entry: zapcore.Entry{ + Level: zapcore.InfoLevel, + Time: time.Time{}, + Message: "simple written line", + }, + }, + { + entry: zapcore.Entry{ + Level: zapcore.InfoLevel, + Time: time.Time{}, + Message: "another line split on multi writes", + }, + }, + }, + }, + { + Name: "json log line split", + Config: component.CommandLogSpec{ + LevelKey: "log.level", + TimeKey: "@timestamp", + TimeFormat: time.RFC3339Nano, + MessageKey: "message", + IgnoreKeys: []string{"ignore"}, + }, + Lines: []string{ + `{"@timestamp": "2009-11-10T23:00:00Z", "log.level": "debug", "message": "message`, + ` field", "string": "extra", "int": 50, "ignore": "other"}`, + "\n", + }, + Wrote: []wrote{ + { + entry: zapcore.Entry{ + Level: zapcore.DebugLevel, + Time: parseTime("2009-11-10T23:00:00Z", time.RFC3339Nano), + Message: "message field", + }, + fields: []zapcore.Field{ + zap.String("string", "extra"), + zap.Float64("int", 50), + }, + }, + }, + }, + { + Name: "invalid JSON line", + Lines: []string{ + `{"broken": json`, + "\n", + }, + Wrote: []wrote{ + { + entry: zapcore.Entry{ + Level: zapcore.InfoLevel, + Time: time.Time{}, + Message: `{"broken": json`, + }, + }, + }, + }, + } + + for _, scenario := range scenarios { + t.Run(scenario.Name, func(t *testing.T) { + c := &captureCore{} + w := newLogWriter(c, scenario.Config) + for _, line := range scenario.Lines { + l := len([]byte(line)) + c, err := w.Write([]byte(line)) + require.NoError(t, err) + require.Equal(t, l, c) + } + require.Len(t, c.wrote, len(scenario.Wrote)) + for i := 0; i < len(scenario.Wrote); i++ { + e := scenario.Wrote[i] + o := c.wrote[i] + if e.entry.Time.IsZero() { + // can't ensure times match; set it to observed before ensuring its equal + e.entry.Time = o.entry.Time + } + assert.Equal(t, e.entry, o.entry) + + // ensure the fields are in the same order (doesn't really matter for logging; but test cares) + if len(e.fields) > 0 { + sortFields(e.fields) + } + if len(o.fields) > 0 { + 
sortFields(o.fields) + } + assert.EqualValues(t, e.fields, o.fields) + } + }) + } +} + +type captureCore struct { + wrote []wrote +} + +func (c *captureCore) Write(entry zapcore.Entry, fields []zapcore.Field) error { + c.wrote = append(c.wrote, wrote{ + entry: entry, + fields: fields, + }) + return nil +} + +func parseTime(t string, format string) time.Time { + v, err := time.Parse(format, t) + if err != nil { + panic(err) + } + return v +} + +func sortFields(fields []zapcore.Field) { + sort.Slice(fields, func(i, j int) bool { + return fields[i].Key < fields[j].Key + }) +} diff --git a/pkg/component/runtime/runtime.go b/pkg/component/runtime/runtime.go index 0ed1b46c26c..aa780a002e5 100644 --- a/pkg/component/runtime/runtime.go +++ b/pkg/component/runtime/runtime.go @@ -60,7 +60,7 @@ func NewComponentRuntime(comp component.Component, logger *logger.Logger, monito } if comp.InputSpec != nil { if comp.InputSpec.Spec.Command != nil { - return NewCommandRuntime(comp, monitor) + return NewCommandRuntime(comp, logger, monitor) } if comp.InputSpec.Spec.Service != nil { return NewServiceRuntime(comp, logger) @@ -69,7 +69,7 @@ func NewComponentRuntime(comp component.Component, logger *logger.Logger, monito } if comp.ShipperSpec != nil { if comp.ShipperSpec.Spec.Command != nil { - return NewCommandRuntime(comp, monitor) + return NewCommandRuntime(comp, logger, monitor) } return nil, errors.New("components for shippers can only support command runtime") } diff --git a/pkg/component/spec.go b/pkg/component/spec.go index e7ec47a5811..fd109414736 100644 --- a/pkg/component/spec.go +++ b/pkg/component/spec.go @@ -78,6 +78,7 @@ type CommandSpec struct { Args []string `config:"args,omitempty" yaml:"args,omitempty"` Env []CommandEnvSpec `config:"env,omitempty" yaml:"env,omitempty"` Timeouts CommandTimeoutSpec `config:"timeouts" yaml:"timeouts"` + Log CommandLogSpec `config:"log" yaml:"log"` } // CommandEnvSpec is the specification that defines environment variables that will be set 
to execute the subprocess. @@ -100,6 +101,23 @@ func (t *CommandTimeoutSpec) InitDefaults() { t.Stop = 30 * time.Second } +// CommandLogSpec is the log specification for subprocess. +type CommandLogSpec struct { + LevelKey string `config:"level_key" yaml:"level_key"` + TimeKey string `config:"time_key" yaml:"time_key"` + TimeFormat string `config:"time_format" yaml:"time_format"` + MessageKey string `config:"message_key" yaml:"message_key"` + IgnoreKeys []string `config:"ignore_keys" yaml:"ignore_keys"` +} + +// InitDefaults initializes the defaults for the log specification. +func (t *CommandLogSpec) InitDefaults() { + t.LevelKey = "log.level" + t.TimeKey = "@timestamp" + t.TimeFormat = "2006-01-02T15:04:05.000Z0700" + t.MessageKey = "message" +} + // ServiceTimeoutSpec is the timeout specification for subprocess. type ServiceTimeoutSpec struct { Checkin time.Duration `config:"checkin" yaml:"checkin"` diff --git a/pkg/core/logger/logger.go b/pkg/core/logger/logger.go index 049fd271038..8c1aa50e98e 100644 --- a/pkg/core/logger/logger.go +++ b/pkg/core/logger/logger.go @@ -58,6 +58,13 @@ func NewFromConfig(name string, cfg *Config, logInternal bool) (*Logger, error) return new(name, cfg, logInternal) } +// NewWithoutConfig returns a new logger without having a configuration. +// +// Use only when a clean logger is needed, and it is known that the logging configuration has already been performed. 
+func NewWithoutConfig(name string) *Logger { + return logp.NewLogger(name) +} + func new(name string, cfg *Config, logInternal bool) (*Logger, error) { commonCfg, err := toCommonConfig(cfg) if err != nil { diff --git a/specs/apm-server.spec.yml b/specs/apm-server.spec.yml index e646e9facce..0545d7ec307 100644 --- a/specs/apm-server.spec.yml +++ b/specs/apm-server.spec.yml @@ -1,23 +1,25 @@ -version: 2 -inputs: - - name: apm - description: "APM Server" - platforms: - - linux/amd64 - - linux/arm64 - - darwin/amd64 - - darwin/arm64 - - windows/amd64 - - container/amd64 - - container/arm64 - outputs: - - elasticsearch - - kafka - - logstash - - redis - command: - args: - - "-E" - - "management.enabled=true" - - "-E" - - "gc_percent=${APMSERVER_GOGC:100}" +version: 2 +inputs: + - name: apm + description: "APM Server" + platforms: + - linux/amd64 + - linux/arm64 + - darwin/amd64 + - darwin/arm64 + - windows/amd64 + - container/amd64 + - container/arm64 + outputs: + - elasticsearch + - kafka + - logstash + - redis + command: + args: + - "-E" + - "management.enabled=true" + - "-E" + - "gc_percent=${APMSERVER_GOGC:100}" + - "-E" + - "logging.to_stderr=true" diff --git a/specs/auditbeat.spec.yml b/specs/auditbeat.spec.yml index f8c46a96873..a54a47fbbe8 100644 --- a/specs/auditbeat.spec.yml +++ b/specs/auditbeat.spec.yml @@ -1,43 +1,45 @@ -version: 2 -inputs: - - name: audit/auditd - description: "Auditd" - platforms: &platforms - - linux/amd64 - - linux/arm64 - - darwin/amd64 - - darwin/arm64 - - windows/amd64 - - container/amd64 - - container/arm64 - outputs: &outputs - - elasticsearch - - kafka - - logstash - - redis - command: - args: &args - - "-E" - - "setup.ilm.enabled=false" - - "-E" - - "setup.template.enabled=false" - - "-E" - - "management.enabled=true" - - "-E" - - "logging.level=debug" - - "-E" - - "gc_percent=${AUDITBEAT_GOGC:100}" - - "-E" - - "auditbeat.config.modules.enabled=false" - - name: audit/file_integrity - description: "Audit File Integrity" - 
platforms: *platforms - outputs: *outputs - command: - args: *args - - name: audit/system - description: "Audit System" - platforms: *platforms - outputs: *outputs - command: - args: *args +version: 2 +inputs: + - name: audit/auditd + description: "Auditd" + platforms: &platforms + - linux/amd64 + - linux/arm64 + - darwin/amd64 + - darwin/arm64 + - windows/amd64 + - container/amd64 + - container/arm64 + outputs: &outputs + - elasticsearch + - kafka + - logstash + - redis + command: + args: &args + - "-E" + - "setup.ilm.enabled=false" + - "-E" + - "setup.template.enabled=false" + - "-E" + - "management.enabled=true" + - "-E" + - "logging.level=info" + - "-E" + - "logging.to_stderr=true" + - "-E" + - "gc_percent=${AUDITBEAT_GOGC:100}" + - "-E" + - "auditbeat.config.modules.enabled=false" + - name: audit/file_integrity + description: "Audit File Integrity" + platforms: *platforms + outputs: *outputs + command: + args: *args + - name: audit/system + description: "Audit System" + platforms: *platforms + outputs: *outputs + command: + args: *args diff --git a/specs/cloudbeat.spec.yml b/specs/cloudbeat.spec.yml index 1ecbe47e330..337ac250622 100644 --- a/specs/cloudbeat.spec.yml +++ b/specs/cloudbeat.spec.yml @@ -1,39 +1,41 @@ -version: 2 -inputs: - - name: cloudbeat - description: "Cloudbeat" - platforms: &platforms - - linux/amd64 - - linux/arm64 - - darwin/amd64 - - darwin/arm64 - - windows/amd64 - - container/amd64 - - container/arm64 - outputs: &outputs - - elasticsearch - - kafka - - logstash - - redis - command: - args: &args - - "-E" - - "management.enabled=true" - - "-E" - - "setup.ilm.enabled=false" - - "-E" - - "setup.template.enabled=false" - - "-E" - - "gc_percent=${CLOUDBEAT_GOGC:100}" - - name: cloudbeat/cis_k8s - description: "CIS Kubernetes monitoring" - platforms: *platforms - outputs: *outputs - command: - args: *args - - name: cloudbeat/cis_eks - description: "CIS elastic Kubernetes monitoring" - platforms: *platforms - outputs: *outputs - command: - 
args: *args \ No newline at end of file +version: 2 +inputs: + - name: cloudbeat + description: "Cloudbeat" + platforms: &platforms + - linux/amd64 + - linux/arm64 + - darwin/amd64 + - darwin/arm64 + - windows/amd64 + - container/amd64 + - container/arm64 + outputs: &outputs + - elasticsearch + - kafka + - logstash + - redis + command: + args: &args + - "-E" + - "management.enabled=true" + - "-E" + - "setup.ilm.enabled=false" + - "-E" + - "setup.template.enabled=false" + - "-E" + - "logging.to_stderr=true" + - "-E" + - "gc_percent=${CLOUDBEAT_GOGC:100}" + - name: cloudbeat/cis_k8s + description: "CIS Kubernetes monitoring" + platforms: *platforms + outputs: *outputs + command: + args: *args + - name: cloudbeat/cis_eks + description: "CIS elastic Kubernetes monitoring" + platforms: *platforms + outputs: *outputs + command: + args: *args diff --git a/specs/filebeat.spec.yml b/specs/filebeat.spec.yml index e18fcbb1e65..609fa1f5804 100644 --- a/specs/filebeat.spec.yml +++ b/specs/filebeat.spec.yml @@ -26,7 +26,9 @@ inputs: - "-E" - "management.enabled=true" - "-E" - - "logging.level=debug" + - "logging.level=info" + - "-E" + - "logging.to_stderr=true" - "-E" - "gc_percent=${FILEBEAT_GOGC:100}" - "-E" diff --git a/specs/heartbeat.spec.yml b/specs/heartbeat.spec.yml index 48c48541cb6..6cc20cdae5d 100644 --- a/specs/heartbeat.spec.yml +++ b/specs/heartbeat.spec.yml @@ -21,7 +21,9 @@ inputs: - "-E" - "management.enabled=true" - "-E" - - "logging.level=debug" + - "logging.level=info" + - "-E" + - "logging.to_stderr=true" - "-E" - "gc_percent=${HEARTBEAT_GOGC:100}" - name: synthetics/http diff --git a/specs/metricbeat.spec.yml b/specs/metricbeat.spec.yml index b7c88ad4864..e795c3b6710 100644 --- a/specs/metricbeat.spec.yml +++ b/specs/metricbeat.spec.yml @@ -26,7 +26,9 @@ inputs: - "-E" - "management.enabled=true" - "-E" - - "logging.level=debug" + - "logging.level=info" + - "-E" + - "logging.to_stderr=true" - "-E" - "gc_percent=${METRICBEAT_GOGC:100}" - "-E" diff --git 
a/specs/osquerybeat.spec.yml b/specs/osquerybeat.spec.yml index 31edb9a3edb..2bf4e53b8f8 100644 --- a/specs/osquerybeat.spec.yml +++ b/specs/osquerybeat.spec.yml @@ -1,26 +1,28 @@ -version: 2 -inputs: - - name: osquery - description: "Osquery" - platforms: - - linux/amd64 - - linux/arm64 - - darwin/amd64 - - darwin/arm64 - - windows/amd64 - - container/amd64 - - container/arm64 - outputs: - - elasticsearch - command: - args: - - "-E" - - "setup.ilm.enabled=false" - - "-E" - - "setup.template.enabled=false" - - "-E" - - "management.enabled=true" - - "-E" - - "logging.level=debug" - - "-E" - - "gc_percent=${OSQUERYBEAT_GOGC:100}" +version: 2 +inputs: + - name: osquery + description: "Osquery" + platforms: + - linux/amd64 + - linux/arm64 + - darwin/amd64 + - darwin/arm64 + - windows/amd64 + - container/amd64 + - container/arm64 + outputs: + - elasticsearch + command: + args: + - "-E" + - "setup.ilm.enabled=false" + - "-E" + - "setup.template.enabled=false" + - "-E" + - "management.enabled=true" + - "-E" + - "logging.level=info" + - "-E" + - "logging.to_stderr=true" + - "-E" + - "gc_percent=${OSQUERYBEAT_GOGC:100}" diff --git a/specs/packetbeat.spec.yml b/specs/packetbeat.spec.yml index 0519078cac8..cd788b89add 100644 --- a/specs/packetbeat.spec.yml +++ b/specs/packetbeat.spec.yml @@ -1,29 +1,31 @@ -version: 2 -inputs: - - name: packet - description: "Packet Capture" - platforms: - - linux/amd64 - - linux/arm64 - - darwin/amd64 - - darwin/arm64 - - windows/amd64 - - container/amd64 - - container/arm64 - outputs: - - elasticsearch - - kafka - - logstash - - redis - command: - args: - - "-E" - - "setup.ilm.enabled=false" - - "-E" - - "setup.template.enabled=false" - - "-E" - - "management.enabled=true" - - "-E" - - "logging.level=debug" - - "-E" - - "gc_percent=${PACKETBEAT_GOGC:100}" +version: 2 +inputs: + - name: packet + description: "Packet Capture" + platforms: + - linux/amd64 + - linux/arm64 + - darwin/amd64 + - darwin/arm64 + - windows/amd64 + - container/amd64 + 
- container/arm64 + outputs: + - elasticsearch + - kafka + - logstash + - redis + command: + args: + - "-E" + - "setup.ilm.enabled=false" + - "-E" + - "setup.template.enabled=false" + - "-E" + - "management.enabled=true" + - "-E" + - "logging.level=info" + - "-E" + - "logging.to_stderr=true" + - "-E" + - "gc_percent=${PACKETBEAT_GOGC:100}"
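Reviewer note: with `logging.to_stderr=true` now set in the specs above, each Beat's output reaches the agent as arbitrary byte chunks on a pipe, so the writer in `pkg/component/runtime/log_writer.go` has to reassemble complete lines across `Write` calls. The following is a minimal, stdlib-only sketch of that buffering idea, not the code from this patch: `lineWriter` and its `lines` field are hypothetical names, and it collects lines into a slice instead of forwarding them to a zap core. This variant trims whitespace before the emptiness check, so whitespace-only lines are skipped.

```go
package main

import (
	"bytes"
	"fmt"
)

// lineWriter is a hypothetical stand-in for the patch's logWriter.
// It collects complete lines instead of passing them to a logger.
type lineWriter struct {
	remainder []byte   // bytes received since the last '\n'
	lines     []string // completed, trimmed, non-empty lines
}

// Write splits p on '\n' and carries any trailing partial line
// over to the next call. It always reports len(p) bytes consumed.
func (w *lineWriter) Write(p []byte) (int, error) {
	offset := 0
	for {
		idx := bytes.IndexByte(p[offset:], '\n')
		if idx < 0 {
			// no newline left: keep the tail for the next Write
			w.remainder = append(w.remainder, p[offset:]...)
			return len(p), nil
		}
		// join the buffered remainder with the bytes up to the newline
		line := append(w.remainder, p[offset:offset+idx]...)
		w.remainder = nil
		offset += idx + 1
		// TrimSpace also removes the '\r' of Windows line endings
		line = bytes.TrimSpace(line)
		if len(line) == 0 {
			continue // skip blank and whitespace-only lines
		}
		w.lines = append(w.lines, string(line))
	}
}

func main() {
	w := &lineWriter{}
	chunks := []string{"first line\r\n", " second line sp", "lit across ", "writes\n", "\r\n"}
	for _, chunk := range chunks {
		if _, err := w.Write([]byte(chunk)); err != nil {
			panic(err)
		}
	}
	for _, l := range w.lines {
		fmt.Printf("%q\n", l)
	}
}
```

Returning `len(p)` even when nothing was emitted keeps the `io.Writer` contract intact for `exec.Cmd`, which treats a short write as an error.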