Manually building `KueryNode` for Fleet's routes #75693

kobelb · 2020-08-21T18:34:23Z

Parsing a string to create the KQL query is computationally expensive. It accounted for nearly 17% of the CPU time when enrolling 1,000 agents using fleet-loadtest over 5 minutes.

This approach manually builds the KueryNode, so we no longer have to do the string parsing, which was the most expensive part.

Impact

Micro benchmark

Parsing the KQL expression from a string: 6396334 nanoseconds (about 6.4 ms)

esKuery.fromKueryExpression(
  'not fleet-agent-actions.attributes.sent_at: * and fleet-agent-actions.attributes.agent_id:1234567'
);

Building the `KueryNode` manually: 766089 nanoseconds (about 0.77 ms)

esKuery.nodeTypes.function.buildNode('and', [
  esKuery.nodeTypes.function.buildNode(
    'not',
    esKuery.nodeTypes.function.buildNode('is', 'fleet-agent-actions.attributes.sent_at', '*')
  ),
  esKuery.nodeTypes.function.buildNode(
    'is',
    'fleet-agent-actions.attributes.agent_id',
    '1234567'
  ),
]);
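For anyone wanting to reproduce numbers like these, a minimal timing sketch follows. It is not the benchmark that was used for this PR (that one was added and then reverted); `averageNanoseconds`, `parsePath`, and `buildPath` are hypothetical names, and the two measured functions are stand-ins rather than calls into `esKuery`.

```typescript
// Average wall-clock cost of `fn` in nanoseconds over `iterations` calls.
// Hypothetical helper, not part of Kibana.
function averageNanoseconds(fn: () => unknown, iterations: number): number {
  if (iterations <= 0) throw new RangeError('iterations must be positive');
  // Warm-up so one-off JIT compilation is less likely to dominate the result.
  for (let i = 0; i < 10; i++) fn();
  const start = performance.now(); // milliseconds, sub-millisecond resolution
  for (let i = 0; i < iterations; i++) fn();
  const elapsedMs = performance.now() - start;
  return (elapsedMs * 1e6) / iterations;
}

// Stand-ins for the two code paths. In Kibana these would be
// esKuery.fromKueryExpression(...) and esKuery.nodeTypes.function.buildNode(...).
const parsePath = () =>
  JSON.parse('{"and":[{"not":{"is":["sent_at","*"]}},{"is":["agent_id","1234567"]}]}');
const buildPath = () => ({
  and: [{ not: { is: ['sent_at', '*'] } }, { is: ['agent_id', '1234567'] }],
});

const parseNs = averageNanoseconds(parsePath, 1000);
const buildNs = averageNanoseconds(buildPath, 1000);
console.log(`parse: ${parseNs.toFixed(0)} ns, build: ${buildNs.toFixed(0)} ns`);

// Ratio of the two figures reported in the micro benchmark above.
const reportedSpeedup = 6396334 / 766089;
console.log(`reported speedup: ${reportedSpeedup.toFixed(2)}x`);
```

Swapping the stand-ins for the real `esKuery` calls should give figures comparable to the ones above, though absolute numbers will vary by machine and Node version.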

Shortest test

Running https://github.com/nchaulet/fleet-load-test-scenarios before these changes

checkin_first_time_duration....: avg=2.53s    min=2.02s   med=2.03s    max=7.03s    p(90)=2.54s    p(95)=7.03s
checkin_second_time_duration...: avg=98.41ms  min=55.51ms med=93.41ms  max=138.32ms p(90)=122.72ms p(95)=130.55ms

And after

checkin_first_time_duration....: avg=2.7s     min=2.01s   med=2.03s    max=7.05s    p(90)=3.33s    p(95)=7.05s
checkin_second_time_duration...: avg=93.21ms  min=27.05ms med=92.64ms  max=177.27ms p(90)=133.85ms p(95)=155.01ms

Short test

Running fleet-loadtest with `RATE=15 AGENTS=4000 go run main.go` until 200 agents had checked in and received their policy, before these changes

2020/08/21 19:01:09 counter requests.healthcheck.concurrent_count
2020/08/21 19:01:09   count:               0
2020/08/21 19:01:09 timer requests.healthcheck.latency
2020/08/21 19:01:09   count:             315
2020/08/21 19:01:09   min:                 6.57ms
2020/08/21 19:01:09   max:              6209.91ms
2020/08/21 19:01:09   mean:              458.72ms
2020/08/21 19:01:09   stddev:           1030.04ms
2020/08/21 19:01:09   median:             24.94ms
2020/08/21 19:01:09   75%:               371.94ms
2020/08/21 19:01:09   95%:              3082.55ms
2020/08/21 19:01:09   99%:              5453.22ms
2020/08/21 19:01:09   99.9%:            6209.91ms
2020/08/21 19:01:09   1-min rate:          0.43
2020/08/21 19:01:09   5-min rate:          1.21
2020/08/21 19:01:09   15-min rate:         1.57
2020/08/21 19:01:09   mean rate:           1.04
2020/08/21 19:01:09 meter requests.healthcheck.success
2020/08/21 19:01:09   count:             314
2020/08/21 19:01:09   1-min rate:          0.43
2020/08/21 19:01:09   5-min rate:          1.21
2020/08/21 19:01:09   15-min rate:         1.57
2020/08/21 19:01:09   mean rate:           1.05
2020/08/21 19:01:09 Policy revision summary
2020/08/21 19:01:09   revision  1:    210 agents

After these changes

2020/08/21 20:01:47 timer requests.healthcheck.latency
2020/08/21 20:01:47   count:             323
2020/08/21 20:01:47   min:                 6.32ms
2020/08/21 20:01:47   max:              5004.98ms
2020/08/21 20:01:47   mean:              257.06ms
2020/08/21 20:01:47   stddev:            687.97ms
2020/08/21 20:01:47   median:             13.74ms
2020/08/21 20:01:47   75%:               120.73ms
2020/08/21 20:01:47   95%:              1875.24ms
2020/08/21 20:01:47   99%:              3985.15ms
2020/08/21 20:01:47   99.9%:            5004.98ms
2020/08/21 20:01:47   1-min rate:          0.81
2020/08/21 20:01:47   5-min rate:          1.48
2020/08/21 20:01:47   15-min rate:         1.68
2020/08/21 20:01:47   mean rate:           1.32
2020/08/21 20:01:47 counter requests.healthcheck.concurrent_count
2020/08/21 20:01:47   count:               0
2020/08/21 20:01:47 meter requests.healthcheck.success
2020/08/21 20:01:47   count:             323
2020/08/21 20:01:47   1-min rate:          0.81
2020/08/21 20:01:47   5-min rate:          1.48
2020/08/21 20:01:47   15-min rate:         1.68
2020/08/21 20:01:47   mean rate:           1.32
2020/08/21 20:01:47 Policy revision summary
2020/08/21 20:01:47   revision  1:    216 agents

This reduces the mean health check latency from 458.72ms to 257.06ms, a reduction of roughly 44%.
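The same comparison can be made for the other percentiles. Below is a small sketch that computes the relative reduction from the figures in the two short-test runs above. `percentReduction` is a hypothetical helper, and each figure comes from a single run, so the results are indicative rather than statistically robust.

```typescript
// Relative reduction between a "before" and an "after" measurement, in percent.
function percentReduction(before: number, after: number): number {
  if (before <= 0) throw new RangeError('before must be positive');
  return ((before - after) / before) * 100;
}

// Health check latency figures (ms) copied from the two short-test runs above.
const shortTest = [
  { metric: 'mean', before: 458.72, after: 257.06 },
  { metric: 'median', before: 24.94, after: 13.74 },
  { metric: '95%', before: 3082.55, after: 1875.24 },
  { metric: '99%', before: 5453.22, after: 3985.15 },
];

for (const { metric, before, after } of shortTest) {
  console.log(`${metric}: ${percentReduction(before, after).toFixed(1)}% lower`);
}
```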

Longer test

Running fleet-loadtest with `RATE=15 AGENTS=2000 go run main.go` until completion

2020/08/25 17:00:34 timer requests.healthcheck.latency
2020/08/25 17:00:34   count:            3266
2020/08/25 17:00:34   min:                 6.48ms
2020/08/25 17:00:34   max:              3143.09ms
2020/08/25 17:00:34   mean:              139.80ms
2020/08/25 17:00:34   stddev:            383.02ms
2020/08/25 17:00:34   median:             10.40ms
2020/08/25 17:00:34   75%:                30.99ms
2020/08/25 17:00:34   95%:               926.28ms
2020/08/25 17:00:34   99%:              2008.38ms
2020/08/25 17:00:34   99.9%:            3122.97ms
2020/08/25 17:00:34   1-min rate:          1.55
2020/08/25 17:00:34   5-min rate:          1.56
2020/08/25 17:00:34   15-min rate:         1.59
2020/08/25 17:00:34   mean rate:           1.59
2020/08/25 17:00:34 counter requests.healthcheck.concurrent_count
2020/08/25 17:00:34   count:               0
2020/08/25 17:00:34 meter requests.healthcheck.success
2020/08/25 17:00:34   count:            3266
2020/08/25 17:00:34   1-min rate:          1.55
2020/08/25 17:00:34   5-min rate:          1.56
2020/08/25 17:00:34   15-min rate:         1.59
2020/08/25 17:00:34   mean rate:           1.59
2020/08/25 17:00:34 Policy revision summary
2020/08/25 17:00:34   revision  1:   2000 agents

x-pack/plugins/ingest_manager/server/services/agents/actions.ts

nchaulet · 2020-08-21T20:35:37Z

Nice, the perf improvements seem significant 👍

kobelb · 2020-08-26T18:38:07Z

src/core/server/saved_objects/types.ts

@@ -37,6 +37,9 @@ import { SavedObjectUnsanitizedDoc } from './serialization';
 import { SavedObjectsMigrationLogger } from './migrations/core/migration_logger';
 import { SavedObject } from '../../types';

+// eslint-disable-next-line @kbn/eslint/no-restricted-paths
+import { KueryNode } from '../../../plugins/data/common';


Refactoring this code to remove the circular reference became quite time-consuming. Since a circular reference already exists between core and the data plugin, I'd like to postpone fixing the circular reference to another PR to minimize the likelihood of this change not making 7.10.

This reverts commit 97e19c0.

elasticmachine · 2020-08-27T15:20:31Z

Pinging @elastic/ingest-management (Team:Ingest Management)

pgayvallet

LGTM

pgayvallet · 2020-08-31T11:55:19Z

src/core/server/saved_objects/service/lib/filter_utils.ts

-  if (filter && filter.length > 0 && indexMapping) {
-    const filterKueryNode = esKuery.fromKueryExpression(filter);
+  if (filter && indexMapping) {
+    const filterKueryNode =
+      typeof filter === 'string' ? esKuery.fromKueryExpression(filter) : filter;


NIT: we are no longer ignoring empty string filters, but I guess this has no impact?

kobelb · 2020-08-31

Empty strings are still ignored; I added 195a5a1#diff-c5ab35755205b8adbf6cfc858d69ea3bR86-R88 to double-check this. JavaScript treats an empty string as falsy, so checking both `filter` and `filter.length > 0` is redundant; we only need to check `filter`.
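A self-contained sketch of the point about falsy values, assuming simplified stand-ins: `SketchKueryNode`, `parseSketch`, and `toKueryNode` are hypothetical names for illustration and are not the code in `filter_utils.ts`.

```typescript
// Hypothetical stand-in for Kibana's KueryNode; only the shape needed here.
interface SketchKueryNode {
  type: string;
  [key: string]: unknown;
}

// Hypothetical stand-in for esKuery.fromKueryExpression.
function parseSketch(expression: string): SketchKueryNode {
  return { type: 'parsed', expression };
}

// Accepts either a KQL string or an already-built node.
function toKueryNode(filter?: string | SketchKueryNode): SketchKueryNode | undefined {
  // '' and undefined are both falsy, so no separate length check is needed.
  if (!filter) return undefined;
  return typeof filter === 'string' ? parseSketch(filter) : filter;
}

console.log(toKueryNode('')); // undefined
console.log(toKueryNode(undefined)); // undefined
console.log(toKueryNode('agent_id: 1234567')); // parsed via the stand-in
console.log(toKueryNode({ type: 'function' })); // passed through unchanged
```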

pgayvallet · 2020-09-01

Wait, what? Kibana is not Java code?

kobelb · 2020-09-01

Unfortunately, no... Maybe in the future though! New-new platform??

pgayvallet · 2020-09-01

We would need to find a more suitable codename if we switch to Java. NewPlatformBijectiveAdapterFactory maybe.

kobelb · 2020-09-01T13:46:08Z

@elasticmachine merge upstream

kibanamachine · 2020-09-01T16:07:26Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: f135722

Build metrics

✅ unchanged

History

💚 Build #71275 succeeded 195a5a1
💚 Build #70899 succeeded 205dd30
💚 Build #70701 succeeded 4ebbe98
💔 Build #70646 failed 4cc972b
💔 Build #70681 failed 55051f8

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

* Simple benchmark tests for kuery
* Building manually is "better" still not free
* Building the KueryNode manually
* Removing benchmark tests
* Another query is building the KueryNode manually
* Empty strings are inherently falsy
* No longer reaching into the data plugin, import from the "root" indexes
* Using AGENT_ACTION_SAVED_OBJECT_TYPE everywhere
* Adding SavedObjectsRepository#find unit test for KueryNode
* Adding KQL KueryNode test for validateConvertFilterToKueryNode
* /s/KQL string/KQL expression
* Updating API docs
* Adding micro benchmark
* Revert "Adding micro benchmark"

  This reverts commit 97e19c0.
* Adding an empty string filters test

Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>


* master: (223 commits)
  skip flaky suite (elastic#75724)
  [Reporting] Add functional test for Reports in non-default spaces (elastic#76053)
  [Enterprise Search] Fix various icons in dark mode (elastic#76430)
  skip flaky suite (elastic#76245)
  Add `auto` interval to histogram AggConfig (elastic#76001)
  [Resolver] generator uses setup_node_env (elastic#76422)
  [Ingest Manager] Support both zip & tar archives from Registry (elastic#76197)
  [Ingest Manager] Improve agent vs kibana version checks (elastic#76238)
  Manually building `KueryNode` for Fleet's routes (elastic#75693)
  remove dupe tinymath section (elastic#76093)
  Create APM issue template (elastic#76362)
  Delete unused file. (elastic#76386)
  [SECURITY_SOLUTION][ENDPOINT] Trusted Apps Create API (elastic#76178)
  [Detections Engine] Add Alert actions to the Timeline (elastic#73228)
  [Dashboard First] Library Notification (elastic#76122)
  [Maps] Add mvt support for ES doc sources (elastic#75698)
  Add setHeaderActionMenu API to AppMountParameters (elastic#75422)
  [ML] Remove "Are you sure" from data frame analytics jobs (elastic#76214)
  [yarn] remove typings-tester, use @ts-expect-error (elastic#76341)
  [Reporting/CSV] Do not fail the job if scroll ID can not be cleared (elastic#76014)
  ...

kobelb added 6 commits August 21, 2020 10:27

Simple benchmark tests for kuery

8a490fd

Building manually is "better" still not free

0c23f97

Building the KueryNode manually

3341219

Removing benchmark tests

062aaf2

Another query is building the KueryNode manually

8bd226f

Empty strings are inherently falsy

070552c

kobelb requested a review from nchaulet August 21, 2020 20:03

kobelb commented Aug 21, 2020

x-pack/plugins/ingest_manager/server/services/agents/actions.ts

kobelb mentioned this pull request Aug 24, 2020

Switch to typescript project references and incremental builds #46773

Closed

7 tasks

No longer reaching into the data plugin, import from the "root" indexes

225bb15

kobelb commented Aug 26, 2020


kobelb added 7 commits August 26, 2020 11:55

Using AGENT_ACTION_SAVED_OBJECT_TYPE everywhere

0ee4f45

Merge remote-tracking branch 'upstream/master' into kuery-time

4cc972b

Adding SavedObjectsRepository#find unit test for KueryNode

6da0f32

Adding KQL KueryNode test for validateConvertFilterToKueryNode

7afb589

/s/KQL string/KQL expression

130d5cf

Merge remote-tracking branch 'upstream/master' into kuery-time

55051f8

Updating API docs

4ebbe98

kobelb marked this pull request as ready for review August 26, 2020 23:49

kobelb requested a review from a team August 26, 2020 23:49

kobelb requested a review from a team as a code owner August 26, 2020 23:49

kobelb added the v8.0.0, v7.10.0, and release_note:skip labels Aug 26, 2020

kobelb added 2 commits August 27, 2020 08:19

Adding micro benchmark

97e19c0

Revert "Adding micro benchmark"

aa14f7f

This reverts commit 97e19c0.

botelastic bot added the Team:Fleet label Aug 27, 2020

Merge remote-tracking branch 'upstream/master' into kuery-time

205dd30

pgayvallet approved these changes Aug 31, 2020


Adding an empty string filters test

195a5a1

nchaulet approved these changes Sep 1, 2020


Merge branch 'master' into kuery-time

f135722

kobelb merged commit b5faf41 into elastic:master Sep 1, 2020

kobelb deleted the kuery-time branch September 1, 2020 21:00

kobelb mentioned this pull request Sep 1, 2020

[7.x] Manually building KueryNode for Fleet's routes (#75693) #76442

Merged

This was referenced Sep 3, 2020

[Ingest Manager] Manually build Fleet kuery with Node arguments #76589

Merged

[Ingest Manager] Improve perfomance of fetching unacknowledged actions #75892

Closed

This was referenced Sep 4, 2020

KQL expression parsing is slow #76811

Closed

Alerting RBAC - manually build KueryNode #76960

Closed

lukasolson mentioned this pull request Sep 9, 2020

[KQL] Better programmatic API #77085

Closed


