[EDR Workflows][E2E] Increase the timeout of agent check in (#168438)
This pull request extends the agent fleet check timeout from 2 minutes
to 4 minutes. We've identified a number of unreliable tests that fail
during the `beforeAll` stage while executing the `createEndpointHost`
task. The following logs appear before the timeout:

```
info Enrolling Elastic Agent with Fleet
  | Installing service....... DONE
  | Starting service... DONE
  | Enrolling Elastic Agent with Fleet..........Successfully enrolled the Elastic Agent.
  | Elastic Agent has been successfully installed.
  | info Waiting for Agent to check-in with Fleet
```

The error message we encounter is `> Timed out waiting for host
[test-host-4981] to appear in Fleet.`

It appears that all the preceding steps succeed and only the final one
fails, either because the agent does not check in with Fleet within 2
minutes or because it stays unhealthy for those 2 minutes. Since I
haven't been able to reproduce this behavior locally, and there isn't a
way to inspect what's happening on the agent, I believe the best course
of action at this point is to extend the timeout and monitor the
results.
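
For illustration, waiting for a check-in with a deadline can be sketched as a polling loop. This is a minimal sketch and not the Kibana implementation of `waitForHostToEnroll`: the names `pollUntil`, `Probe`, `intervalMs`, and `label` are hypothetical, and the probe is assumed to resolve to `undefined` until the agent is visible and healthy.

```typescript
// Hypothetical helper, NOT the actual waitForHostToEnroll from Kibana.
// A probe resolves to a value once the condition holds, otherwise to undefined.
type Probe<T> = () => Promise<T | undefined>;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function pollUntil<T>(
  probe: Probe<T>,
  timeoutMs: number,
  intervalMs = 5000, // assumed polling interval, chosen for illustration
  label = 'condition'
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  do {
    // Always probe at least once, even with a very small timeout.
    const result = await probe();
    if (result !== undefined) {
      return result;
    }
    // Never sleep past the deadline.
    await sleep(Math.min(intervalMs, Math.max(0, deadline - Date.now())));
  } while (Date.now() < deadline);
  throw new Error(`Timed out waiting for ${label}`);
}
```

With a helper of this shape, raising the limit from 2 to 4 minutes only changes the `timeoutMs` argument (from `120000` to `240000`); a slow but eventually healthy agent then passes instead of failing the whole `beforeAll` stage.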

Reports of this error:
closes #168427
closes #168394
closes #168393
closes #168390
closes #168363
closes #168362
closes #168361
closes #168360
closes #168359

Affected CI runs:
https://buildkite.com/elastic/kibana-on-merge/builds/36483
https://buildkite.com/elastic/kibana-on-merge/builds/36497
https://buildkite.com/elastic/kibana-on-merge/builds/36501
https://buildkite.com/elastic/kibana-on-merge/builds/36526

A second timeout occurs occasionally when the previous 10-minute
timeout on the `createEndpointHost` task is not enough to set up the
environment. As the log below shows, it happens during agent setup:
```
  | default: Running: inline script
  | default: Reading package lists...
  | default: Building dependency tree...
  | default: Reading state information...
  | default: Suggested packages:
  | default:   zip
  | default: The following NEW packages will be installed:
  | default:   unzip
  | default: 0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
  | default: Need to get 174 kB of archives.
  | default: After this operation, 385 kB of additional disk space will be used.
  | default: Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 unzip amd64 6.0-26ubuntu3.1 [174 kB]
  | default: dpkg-preconfigure: unable to re-open stdin: No such file or directory
  | default: Fetched 174 kB in 1s (210 kB/s)
  | default: Selecting previously unselected package unzip.
  | (Reading database ... 63961 files and directories currently installed.)
  | default: Preparing to unpack .../unzip_6.0-26ubuntu3.1_amd64.deb ...
  | default: Unpacking unzip (6.0-26ubuntu3.1) ...
  | default: Setting up unzip (6.0-26ubuntu3.1) ...
  | default: Processing triggers for man-db (2.10.2-1) ...
  |  
  | CypressError: `cy.task('createEndpointHost')` timed out after waiting `600000ms`.
```
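
The task-level limit comes from the `timeout` option that Cypress accepts as the third argument to `cy.task`. The sketch below shows only how a caller-supplied value can override a default through nullish coalescing; `DEFAULT_TASK_TIMEOUT_MS` and `resolveTaskTimeout` are hypothetical names introduced for illustration and do not exist in the changed file.

```typescript
// Hypothetical names for illustration; the real code inlines `timeout ?? 900000`.
const DEFAULT_TASK_TIMEOUT_MS = 15 * 60 * 1000; // 15 minutes = 900000 ms

function resolveTaskTimeout(timeout?: number): number {
  // `??` falls back only when timeout is null or undefined,
  // so an explicit numeric value (even 0) is passed through unchanged.
  return timeout ?? DEFAULT_TASK_TIMEOUT_MS;
}
```

A test that needs more headroom can still pass its own value, while callers that pass nothing get the 15-minute default.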
szwarckonrad authored Oct 11, 2023
1 parent bf1357c commit 91cdbe2
Showing 2 changed files with 2 additions and 2 deletions.
@@ -17,6 +17,6 @@ export const createEndpointHost = (
{
agentPolicyId,
},
- { timeout: timeout ?? 600000 }
+ { timeout: timeout ?? 900000 } // 15 minutes, since setup can take 10 minutes and more. Task will time out if is not resolved within this time.
);
};
@@ -335,7 +335,7 @@ const enrollHostWithFleet = async ({
]);
}
log.info(`Waiting for Agent to check-in with Fleet`);
- const agent = await waitForHostToEnroll(kbnClient, vmName, 120000);
+ const agent = await waitForHostToEnroll(kbnClient, vmName, 240000);

return {
agentId: agent.id,