[Security Solution] Any endpoint except Windows 10 does not show up under the Administration tab even if the Endpoint Security Integration is added #99030

Closed
muskangulati-qasource opened this issue May 3, 2021 · 28 comments
Labels
  • bug (Fixes for quality problems that affect the customer experience)
  • impact:critical (This issue should be addressed immediately due to a critical level of impact on the product.)
  • Team:Defend Workflows (“EDR Workflows” sub-team of Security Solution)
  • Team: SecuritySolution (Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc.)

Comments

@muskangulati-qasource

Describe the feature
Any endpoint except Windows 10 does not show up under the Administration tab even if the Endpoint Security Integration is added to it.

Build Details:

Version: 7.13.0-BC3
Commit: ac09c73a4a7998c770b50d39177c22036ae14375
Build number: 40647
Artifact: https://staging.elastic.co/7.13.0-100107e6/summary-7.13.0.html

Preconditions

  1. Elastic 7.13.0 environment should be deployed.

Steps to Reproduce

  1. Deploy agents with Endpoint Security integration added
  2. Observe that apart from Windows 10, no other agent is showing up under the Administration tab

Test data
N/A

Impacted Test case(s)
N/A

Actual Result
Any endpoint except Windows 10 does not show up under the Administration tab even if the Endpoint Security Integration is added

Expected Result
All the endpoints should show up under the Administration tab if the Endpoint Security Integration is added.

What's Working
N/A

What's Not Working
N/A

Screenshot

  • Fleet tab:
    HealthyStatusForWindows7

  • Agent details:
    AgentInHealthyState

  • Logs tab:
    Logs

  • The Administration Tab:
    NotShowingOnAdminTab

  • No Endpoint folder is created on the Endpoint:
    NoEndpointFolder

Logs:
Agent logs:
elastic-agent-logs.zip

@muskangulati-qasource added the bug, impact:critical, Team: SecuritySolution, and Team:Defend Workflows labels on May 3, 2021
@elasticmachine
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

@elasticmachine
Contributor

Pinging @elastic/security-onboarding-and-lifecycle-mgt (Team:Onboarding and Lifecycle Mgt)

@muskangulati-qasource
Author

@manishgupta-qasource please review!!

@manishgupta-qasource

Reviewed & assigned to @kevinlog

@kevinlog
Contributor

kevinlog commented May 3, 2021

@muskangulati-qasource Is it just Windows 7 that isn't showing up? Has Mac or Linux worked? It looks like the Endpoint isn't getting installed, as you're showing that no Endpoint folder is created.

Which steps did you take in deploying the Windows 7 Endpoint?

FYI @ferullo @nfritts

@muskangulati-qasource
Author

Hi @kevinlog,

> @muskangulati-qasource Is it just Windows 7 that isn't showing up? Has Mac or Linux worked? It looks like the Endpoint isn't getting installed, as you're showing that no Endpoint folder is created.

This issue is also occurring on macOS and Linux endpoints

> Which steps did you take in deploying the Windows 7 Endpoint?

We are using the steps you shared in the Google Sheet.

We used the same steps and the same policy for Windows 10 too, and everything worked fine for the Windows 10 endpoint.

Please let us know if anything is missing from our end.

Thanks!!

@kevinlog
Contributor

kevinlog commented May 3, 2021

@muskangulati-qasource thanks for the info.

I was able to get a Windows 7 Endpoint to install on my side

image

Did you use the --insecure flag to work around the bug that is currently in the cloud builds?

@muskangulati-qasource
Author

Yes, @kevinlog,

> Did you use the --insecure flag to work around the bug that is currently in the cloud builds?

We did use the --insecure flag, but we got a TLS warning message during installation.

Please refer to the screenshot:
Insecure

Thanks!!

@ferullo
Contributor

ferullo commented May 3, 2021

@muskangulati-qasource Agent logs show it is failing to execute Endpoint's installer. Can you try to manually run it to see what happens?

In Agent's data/downloads folder you'll see endpoint-security-*.zip. Unzip that and you'll have the endpoint-security binary and the resources zip. Then run: endpoint-security.exe install --resources endpoint-security-resources.zip --log stdout --upgrade
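The manual steps above could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, and it assumes the archive unpacks to endpoint-security.exe and endpoint-security-resources.zip at the top level of the extraction folder, which may not match the real layout. The command-line arguments are the ones quoted in the comment above.

```python
import glob
import os
import subprocess
import zipfile


def find_endpoint_archive(downloads_dir):
    """Return the first endpoint-security-*.zip in Agent's downloads folder, or None."""
    pattern = os.path.join(downloads_dir, "endpoint-security-*.zip")
    matches = sorted(glob.glob(pattern))
    return matches[0] if matches else None


def build_install_command(binary, resources):
    """Build the argument list for the manual install command from the comment."""
    return [binary, "install", "--resources", resources, "--log", "stdout", "--upgrade"]


def run_manual_install(downloads_dir, work_dir):
    """Unzip the Endpoint archive and run its installer, returning the completed process."""
    archive = find_endpoint_archive(downloads_dir)
    if archive is None:
        raise FileNotFoundError("no endpoint-security-*.zip in " + downloads_dir)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(work_dir)
    # Assumption: both files sit directly in work_dir after extraction.
    cmd = build_install_command(
        os.path.join(work_dir, "endpoint-security.exe"),
        os.path.join(work_dir, "endpoint-security-resources.zip"),
    )
    # Capture stdout/stderr so the installer output can be attached to the issue.
    return subprocess.run(cmd, capture_output=True, text=True)
```

Calling run_manual_install with Agent's data/downloads path and a scratch folder returns an object whose stdout and returncode can be pasted into a comment instead of a screenshot.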

@muskangulati-qasource
Author

Hi @ferullo,

Thank you for providing the command. Please find below the output for the same:
upgrade

It seems to be a signing issue with the Endpoint.

Let us know if we are missing anything.

Thanks!

@nfritts

nfritts commented May 3, 2021

I downloaded the BC artifact from the Artifact URL in the initial comment and have no problem running it on a retail, non-test-signing Windows 7. Unsure what might be going on here.

image

@ferullo
Contributor

ferullo commented May 3, 2021

Windows
It seems like the Windows 7 machine may not be fully up to date. It's important that Windows 7 machines be fully patched to comply with Microsoft signing changes. Can you make sure Windows 7 has all updates installed before trying to install Endpoint again?

macOS and Linux
It's unclear what is happening. The logs shared over Slack show Endpoint successfully installing on both OSes. Can you share Agent logs from when it tries to install Endpoint on macOS and Linux?

@muskangulati-qasource
Author

Hi @ferullo,

We tried once again with new VMs and found that the issue is with signing.

The artifacts are not signed.

  • As per our information, for BC builds we use endpoints with test signing OFF for Windows and SIP-enabled endpoints for Mac. But both are causing issues.

When we tried with test signing ON (for Windows) and SIP disabled (for Mac), we were able to install the agent successfully.

Refer Screenshots:

  • The SIP-enabled endpoint:
    image (2)

  • The SIP-disabled endpoint:
    image (3)

  • Endpoint shows up under the Endpoints tab:
    image (5)

Let us know whether we should close this and open a new ticket for the signing issue, or whether it has already been reported.

Thanks!
cc: @kevinlog, @nfritts !!

@ferullo
Contributor

ferullo commented May 4, 2021

@nfritts how can they verify the system has all needed patches applied?

@muskangulati-qasource It's important that the system be fully patched. We've been able to run the same Endpoint on other Windows systems with test signing disabled, which suggests the problem is that your Windows 7 machine is out of date. (@nfritts is that right?)

I'm also worried about the macOS and Linux failures. Your logs show Endpoint installs fine but then isn't on the computer? Can you give Agent logs from when Agent tries to install?

@ferullo
Contributor

ferullo commented May 4, 2021

@k-g-elastic can you share whether or not your team is seeing any issues running Endpoint on Linux, macOS, or Windows?

Another thought: is the Windows 7 machine 32-bit or 64-bit? That could be a difference between @muskangulati-qasource's machine and @nfritts's machine.

@nfritts

nfritts commented May 4, 2021

That's correct. Windows 7 should be fully patched.

@pzl
Member

pzl commented May 4, 2021

I was able to spin up 7.13.0-100107e6 using Linux (CentOS) and did not see any issues (top host in the table).

2021-05-04-174037_scrot

Trace-level agent+endpoint logs (interleaved)
linux-bc3-logs.txt

@muskangulati-qasource
Author

muskangulati-qasource commented May 5, 2021

Hi @ferullo,

We tested this ticket again using the environment provided by Kevin. Please find below complete testing details.

Build Details:

Version: 7.13.0
Commit: 62a98b97048b3639b8dfe7884776d95581dc1eb5
Build number: 40746
Artifact: https://staging.elastic.co/7.13.0-100107e6/summary-7.13.0.html

Observations:

Please find the detailed investigation done for all the endpoints in the table below:

| IP address + VM link | OS, version & architecture | Signing | Agent logs | Endpoint logs | Status | Remark |
| --- | --- | --- | --- | --- | --- | --- |
| 10.0.5.152 | Windows 7_x64 | Testsigning Enabled | 10.0.5.152_elastic-agent-json.log | 10.0.5.152_endpoint.log | 🔴 Fail | The Endpoint shows the policy in a failure state. The error is for the logging configuration. |
| 10.0.5.197 | Windows 7_x64 | Testsigning Disabled | 10.0.5.197_elastic-agent-json.log | 10.0.5.197_endpoint.log | 🔴 Fail | The Endpoint shows the policy in a failure state. The error is for the logging configuration. |
| 10.0.7.194 | Windows 10_x64 | Testsigning Enabled | 10.0.7.194_elastic-agent-json.log | 10.0.7.194_endpoint.log | 🟢 Pass | Able to install the agent without any errors |
| 10.0.5.107 | Windows 10_x64 | Testsigning Disabled | 10.0.5.107_elastic-agent-json.log | 10.0.5.107_endpoint.log | 🟢 Pass | Able to install the agent without any errors |
| 10.0.7.101 | Mac OS 10_x64 | SIP Enabled | 10.0.7.101_elastic-agent-json.txt | 10.0.7.101_endpoint.txt | 🟢 Pass | Able to install the agent without any errors |
| 10.0.5.36 | Mac OS 10_x64 | SIP Disabled | 10.0.5.36_elastic-agent-json.txt | 10.0.5.36_endpoint.txt | 🟢 Pass | Able to install the agent without any errors |
| 10.0.6.237 | Debian 10_x64 | - | 10.0.6.237_elastic-agent-json.txt | 10.0.6.237_endpoint.txt | 🟢 Pass | Able to install the agent without any errors |

NOTE: This zip folder has all the log files: logs.zip

@nfritts Please provide us with steps to check whether Windows 7 is properly patched, so that we can retest this issue on Windows 7.

Please let us know if anything is missing from our end.

cc: @kevinlog, @pzl
Thanks!!

@kevinlog
Contributor

kevinlog commented May 5, 2021

@muskangulati-qasource @ferullo @nfritts I also encountered the logging issue when deploying Win 7. After restarting the host machine, the Win 7 endpoint enabled logging successfully and I got a successful policy. Is this a known issue?

Here is a Win 7 machine from me on the same cloud instance:

image

@ferullo
Contributor

ferullo commented May 5, 2021

Could you share the policy response documents for the two failed Windows 7 machines?

@ferullo
Contributor

ferullo commented May 5, 2021

From the logs it looks like you have set the log level to the empty string ("") which is causing that to fail. Are the Windows 7 machines on their own policy?

@muskangulati-qasource
Author

muskangulati-qasource commented May 5, 2021

Hi @kevinlog ,

We did report a similar issue: #97229 where we had to restart a host in order to bring policy to the success state. But it was resolved after the fleet-server changes were merged.

@ferullo,

> From the logs it looks like you have set the log level to the empty string ("") which is causing that to fail. Are the Windows 7 machines on their own policy?

We did not update anything in the policy and used exactly the same policy for all the endpoints.

> Could you share the policy response documents for the two failed Windows 7 machines?

Please find below the state.yml files for both the endpoints.

Please let us know in case anything else is required from our end.

Thanks!

@nfritts

nfritts commented May 5, 2021

@muskangulati-qasource Could you check to make sure that your Windows 7 VM has the following KB installed and see if it fixes the install issues? https://www.catalog.update.microsoft.com/Search.aspx?q=KB4474419

Thanks!
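One way to check for that KB from a script is sketched below. The function names are hypothetical, and it assumes the `wmic qfe get HotFixID` command is available on the VM to list installed updates. It only confirms that this single KB is present, not that the machine is fully patched.

```python
import re
import subprocess

REQUIRED_KB = "KB4474419"  # the update linked in the comment above


def installed_hotfixes(listing):
    """Extract every KB identifier from a text listing of installed updates."""
    return set(re.findall(r"KB\d+", listing.upper()))


def has_hotfix(listing, kb=REQUIRED_KB):
    """Return True if the given KB appears in the listing."""
    return kb.upper() in installed_hotfixes(listing)


def query_hotfixes():
    """Windows only: ask wmic for the installed hotfix IDs and return the raw text."""
    result = subprocess.run(
        ["wmic", "qfe", "get", "HotFixID"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On the Windows 7 VM, has_hotfix(query_hotfixes()) would then report whether the update is listed.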

@ferullo
Contributor

ferullo commented May 5, 2021

This appears to be an Agent bug. cc @ph

I logged into 10.0.5.197 and looked at state.yml, fleet.yml, and elastic-endpoint.yaml.

state.yml is unremarkable; there's no reason to share it.

fleet.yml contains:

agent:
  id: 381cc1b0-f772-4a0f-bd20-925610b7d783
  logging.level: info
  monitoring.http:
    enabled: false
    host: ""
    port: 6791
fleet:
  enabled: true
  access_api_key: <redacted>
  protocol: http
  host: <redacted>
  hosts:
  - https://<redacted>
  timeout: 5m0s
  ssl:
    verification_mode: none
    renegotiation: never
  reporting:
    threshold: 10000
    check_frequency_sec: 30
  agent:
    id: ""

elastic-endpoint.yaml's fleet section is below. Notice that the logging level is "". There is no reason this would be changed by Endpoint after it's received by Agent. Endpoint just accepts the config and saves the whole thing without modification.

fleet:
  access_api_key: <redacted>
  agent:
    id: 381cc1b0-f772-4a0f-bd20-925610b7d783
    logging:
      level: ""
    monitoring:
      http:
        enabled: false
        host: ""
        port: 6791
  enabled: true
  host:
    id: 5064b7d5-80c5-4eff-aeaf-06e85448a222
  hosts:
  - <redacted>
  protocol: https
  reporting:
    check_frequency_sec: 30
    threshold: 10000
  ssl:
    renegotiation: never
    verification_mode: none
  timeout: 5m0s
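The mismatch between the two files can be illustrated with a small comparison. This is only a sketch: the dictionaries are trimmed stand-ins for the two YAML documents above, the helper names are hypothetical, and it says nothing about where in Agent the value is actually lost. It flattens both configs to dotted keys so that fleet.yml's `logging.level` key and elastic-endpoint.yaml's nested `logging: level:` can be compared directly.

```python
def flatten(config, prefix=""):
    """Flatten nested mappings into dotted keys; keys that already contain dots are kept."""
    flat = {}
    for key, value in config.items():
        path = prefix + "." + str(key) if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def level_was_lost(agent_level, endpoint_level):
    """True when Agent had a non-empty log level but Endpoint received an empty or missing one."""
    return bool(agent_level) and not endpoint_level


# Trimmed stand-in for fleet.yml (note the dotted "logging.level" key).
fleet_yml = {
    "agent": {
        "id": "381cc1b0-f772-4a0f-bd20-925610b7d783",
        "logging.level": "info",
    }
}

# Trimmed stand-in for the fleet section of elastic-endpoint.yaml.
endpoint_yaml = {
    "fleet": {
        "agent": {
            "id": "381cc1b0-f772-4a0f-bd20-925610b7d783",
            "logging": {"level": ""},
        }
    }
}

agent_level = flatten(fleet_yml).get("agent.logging.level")
endpoint_level = flatten(endpoint_yaml).get("fleet.agent.logging.level")
print(repr(agent_level), repr(endpoint_level), level_was_lost(agent_level, endpoint_level))
```

With the values from the two files, Agent's side holds "info" while Endpoint's side holds the empty string, which matches the logging-configuration failure reported for the Windows 7 hosts.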

@muskangulati-qasource
Author

Hi @nfritts,

We tried again today and found that the same issue for Windows 7 persists on the 7.13.0 BC4 build.

> @muskangulati-qasource Could you check to make sure that your Windows 7 VM has the following KB installed and see if it fixes the install issues? https://www.catalog.update.microsoft.com/Search.aspx?q=KB4474419

We tried to download the file on our endpoints:
Updates Page1
Updates Page2
Updates Page3
Update

The file was already installed on the endpoint.

@ferullo Thank you for routing to the correct person.

@ph Please let us know if anything is required from our end.

Thanks!

@ferullo
Contributor

ferullo commented May 6, 2021

@muskangulati-qasource I'm confused. What is the problem?

Early in this issue's history it was stated that Endpoint could not install on any version of any OS other than Windows 10. Then that it installs on all OSes and versions but policy fails on Windows 7. Now it seems like you're again saying Endpoint cannot install on Windows 7?

@muskangulati-qasource
Author

Hi @ferullo,

Sorry for the confusion.

The issue is on Windows 7 only: Endpoint is deployed, but with policy failures.

To resolve the confusion, we can either close this issue and open a new one for the actual bug, or change the summary of this one.

Let us know what works best for you.

Thanks!

@ferullo
Contributor

ferullo commented May 6, 2021

Great, thanks for clarifying. I'm closing this issue because its history is a bit confusing and we've dug in enough to determine the issue is in Agent, not Kibana. I opened elastic/beats#25583 to track the problem. Please add any details you'd like to that issue.

@ferullo closed this as completed on May 6, 2021