Encountered channel not found error on adding Windows integration to the Windows agent. #5746
Comments
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
@muskangulati-qasource Please review. |
Secondary review is done for this ticket! |
I see this is a privileged/admin agent, looking in agent-info.yaml:
agent_id: 881c5687-32af-4bf9-b62f-4b74f2f688ec
headers: {}
log_level: info
snapshot: true
unprivileged: false
version: 8.16.0
Also that this is coming from the winlog input. Tagging @nfritts and @elastic/sec-windows-platform.
input-winlog-default-winlog-windows-4ea5f67a-48fc-41ea-b586-2a29eac6423a:
  message: 'Encountered channel not found error when opening Windows Event Log: The specified channel could not be found.'
  payload:
    streams:
      winlog-windows.forwarded-4ea5f67a-48fc-41ea-b586-2a29eac6423a:
        error: ""
        status: HEALTHY
      winlog-windows.powershell-4ea5f67a-48fc-41ea-b586-2a29eac6423a:
        error: ""
        status: HEALTHY
      winlog-windows.powershell_operational-4ea5f67a-48fc-41ea-b586-2a29eac6423a:
        error: ""
        status: HEALTHY
      winlog-windows.sysmon_operational-4ea5f67a-48fc-41ea-b586-2a29eac6423a:
        error: 'Encountered channel not found error when opening Windows Event Log: The specified channel could not be found.'
        status: DEGRADED |
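For anyone triaging similar diagnostics, here is a minimal Python sketch of how the per-stream statuses in a payload like the one above could be scanned for unhealthy streams. It is not part of Elastic Agent: the `degraded_streams` helper is hypothetical, the dict is a hand-written stand-in for the parsed payload, and the stream IDs are shortened for readability.

```python
from typing import Dict, List, Tuple

# Error text copied from the payload excerpt in this thread.
NOT_FOUND = (
    "Encountered channel not found error when opening Windows Event Log: "
    "The specified channel could not be found."
)


def degraded_streams(unit: Dict) -> List[Tuple[str, str]]:
    """Return (stream_id, error) pairs for every stream that is not HEALTHY."""
    streams = unit.get("payload", {}).get("streams", {})
    return [
        (stream_id, info.get("error", ""))
        for stream_id, info in streams.items()
        if info.get("status") != "HEALTHY"
    ]


# Hand-written stand-in for the parsed unit state (assumption: the real
# diagnostics file would be loaded with a YAML parser first).
unit = {
    "message": NOT_FOUND,
    "payload": {
        "streams": {
            "winlog-windows.forwarded": {"error": "", "status": "HEALTHY"},
            "winlog-windows.powershell": {"error": "", "status": "HEALTHY"},
            "winlog-windows.powershell_operational": {"error": "", "status": "HEALTHY"},
            "winlog-windows.sysmon_operational": {"error": NOT_FOUND, "status": "DEGRADED"},
        }
    },
}

for stream_id, error in degraded_streams(unit):
    print(f"{stream_id}: {error}")
```

With the stand-in data, only the Sysmon Operational stream is reported, which matches the single DEGRADED entry in the payload.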
This issue arose following the changes in this PR (elastic/beats#40163). The default configuration for the Windows Integration has historically included the Sysmon Operational channel. Sysmon is not a core component of Windows; it's a SysInternals tool (https://learn.microsoft.com/en-us/sysinternals/downloads/sysmon) that users can download and use at their discretion. As such, the Windows Integration has historically failed to open the Sysmon Operational channel, but that failure didn't propagate a degraded status. Users can remedy the degraded status by either installing Sysmon, causing the channel to exist, or by deselecting the Sysmon Operational channel in the Windows Integration configuration... Possible solutions:
Thoughts @cmacknz @nfritts @andrewkroh ? |
This sounds like the most correct path if Sysmon is not expected to be present the majority of the time. The counterargument is that this is a breaking change. Something we did for some of the system metricsets that were in a similar situation was to keep the error message but report the status as healthy, since the input was working as well as it could with the configuration of the host system it was running on. |
In an ideal world, we'd move Sysmon out of Windows and have it as a standalone integration but that'd be very disruptive for existing users and would likely impact rules, dashboards, etc. As a quick fix, could we exclude Sysmon from our |
Can't Agent get it from |
I don't see a technical reason we couldn't just insert a check for whether we're trying to grab that particular channel near the code that changed. But that's Filebeat code and not the Windows integration code. That'd be an awkward place for the check in the long term, and it wouldn't scale well if we want to handle other things differently. I noticed the PR that changed this was addressing elastic/beats#39735, which related to wanting to see failures for channels when permission is denied, typically when Agent is installed unprivileged. I recreated that and now DO see the desired
The thing that strikes me there is we have multiple types of failures for I wonder if a good long term solution would be for the integration to feed in some type of filter data along with each channel it wants. Some enum or struct that tells it to actually degrade on access denied errors for this channel, but ignore not found errors for this channel, or warn/log (but not I'm not sure that change could be ready for 8.16.0. Perhaps we should rollback the change that caused this issue to arise and continue to tolerate the missing |
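To make the per-channel filter idea concrete, here is a rough Python sketch of what such an enum/struct might look like. Everything in it is hypothetical: `ErrorKind`, `Action`, `ChannelPolicy`, and `stream_status` are illustrative names, not existing Beats or integration APIs, and the real implementation would live in Go.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Tuple


class ErrorKind(Enum):
    ACCESS_DENIED = "access_denied"
    CHANNEL_NOT_FOUND = "channel_not_found"


class Action(Enum):
    DEGRADE = "degrade"  # report the stream as DEGRADED
    WARN = "warn"        # keep the message, stay HEALTHY
    IGNORE = "ignore"    # stay HEALTHY and drop the message


@dataclass
class ChannelPolicy:
    """Hypothetical per-channel data the integration would pass with each channel."""
    channel: str
    actions: Dict[ErrorKind, Action] = field(default_factory=dict)

    def action_for(self, kind: ErrorKind) -> Action:
        # Assumption: errors without an explicit rule still degrade.
        return self.actions.get(kind, Action.DEGRADE)


def stream_status(policy: ChannelPolicy, kind: ErrorKind, message: str) -> Tuple[str, str]:
    """Map an open error to a (status, error message) pair under the policy."""
    action = policy.action_for(kind)
    if action is Action.DEGRADE:
        return ("DEGRADED", message)
    if action is Action.WARN:
        return ("HEALTHY", message)
    return ("HEALTHY", "")


# Illustration only: tolerate a missing Sysmon channel, still degrade on access denied.
sysmon = ChannelPolicy(
    channel="Microsoft-Windows-Sysmon/Operational",
    actions={ErrorKind.CHANNEL_NOT_FOUND: Action.WARN},
)
print(stream_status(sysmon, ErrorKind.CHANNEL_NOT_FOUND, "channel could not be found"))
print(stream_status(sysmon, ErrorKind.ACCESS_DENIED, "access is denied"))
```

Defaulting unknown errors to `DEGRADE` keeps the behavior wanted for permission failures while letting individual channels opt out of specific error kinds.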
Having a per input way to turn off the "errors mark the Beat as degraded" would make sense vs just reverting the entire feature. This config could later expand into a list of specific errors to mute. For system/metrics we had similar ideas in elastic/beats#40543 but it hasn't been implemented yet. We have been more focused on fixing the specific errors, which in many cases have been actual bugs or permissions errors we were handling improperly. For system/metrics we also only get these errors when unprivileged. The OTel collector process scraper allows muting specific categories of error which is what we'd eventually want to emulate https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/hostmetricsreceiver/README.md#process |
@bjmcnic followed up separately and we agreed we should revert the winlog specific change here in 8.16 while we work on a proper fix for 8.17 considering that:
There isn't time to do a more in-depth fix, and the current state will probably lead to a flood of support cases. The revert PR is elastic/beats#41468 |
elastic/beats#41468 was merged, this is now reverted from 8.16. |
Hi Team, While testing on 8.17.0 SNAPSHOT, we have found this issue reproducible there too. Observations:
Build details: Logs: Please let us know if anything else is required from our end. Thanks! |
It's fixed in
|
Those two PRs will add the change to main and the 8.x branch |
Hi Team, Observations:
Logs: elastic-agent-diagnostics-2024-12-06T04-47-57Z-00 (1).zip Build details: Hence we are closing and marking this issue as QA:Validated. Thanks! |
Kibana Build details:
Artifact: https://snapshots.elastic.co/8.16.0-39df64b4/downloads/beats/elastic-agent/elastic-agent-8.16.0-SNAPSHOT-windows-x86_64.zip
Host: Windows Server 2022- Test Signing ON
Preconditions:
Steps to reproduce:
Encountered channel not found error
Expected Result:
No error should be displayed on adding Windows integration to the Windows agent.
Logs:
elastic-agent-diagnostics-2024-10-09T06-48-15Z-00.zip
Screenshots: