forked from elastic/kibana
[APM] Circuit breaker and perf improvements for service map (elastic#159883)

Closes elastic#101920

This PR does three things:

- Add a `terminate_after` parameter to the search request for the scripted metric agg. This is a configurable setting (`xpack.apm.serviceMapTerminateAfter`) and defaults to 100k. This is a shard-level parameter, so it is still possible for many shards to each return 100k documents and for the coordinating node to run out of memory while collecting all these docs from the individual shards. However, I suspect that there is already some protection in the reduce phase that will terminate the request with a stack_overflow_error without OOMing. I've reached out to the ES team to confirm whether this is the case.
- Add `xpack.apm.serviceMapMaxTraces`: this sets the maximum number of traces to inspect in total, not just per search request. I.e., if `xpack.apm.serviceMapMaxTracesPerRequest` is 1, we simply split the traces into n chunks, so that setting doesn't really help with memory management. `serviceMapMaxTraces` refers to the total number of traces to inspect.
- Rewrite `getConnections` to use local mutation instead of immutability. I saw huge CPU usage (with, admittedly, a pathological scenario where there are 100s of services) in the `getConnections` function, because it uses a deduplication mechanism that is O(n²), so I rewrote it to be O(n).

Here's a before:

![image](https://github.com/elastic/kibana/assets/352732/6c24a7a2-3b48-4c95-9db2-563160a57aef)

and after:

![image](https://github.com/elastic/kibana/assets/352732/c00b8428-3026-4610-aa8b-c0046e8f0e08)

To reproduce an OOM, start ES with a much smaller amount of memory:

`$ ES_JAVA_OPTS='-Xms236m -Xmx236m' yarn es snapshot`

Then run the synthtrace Service Map OOM scenario:

`$ node scripts/synthtrace.js service_map_oom --from=now-15m --to=now --clean`

Finally, navigate to `service-100` in the UI, and click on Service Map. This should trigger an OOM.
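The `getConnections` change itself is not visible on this page, so the following is only a minimal sketch of the general idea behind moving from O(n²) to O(n) deduplication. The `ServiceNode` and `ServiceConnection` shapes and both function names are hypothetical and are not taken from the Kibana source. The quadratic version rescans the accumulated result for every candidate and copies the array each time; the linear version makes one pass and mutates a local `Map` keyed by a composite string.

```typescript
// Hypothetical shapes for illustration only; the real Kibana types may differ.
interface ServiceNode {
  id: string;
}

interface ServiceConnection {
  source: ServiceNode;
  destination: ServiceNode;
}

// O(n^2): each candidate is compared against everything accepted so far,
// and the accumulator is copied on every insertion to stay immutable.
function dedupeQuadratic(connections: ServiceConnection[]): ServiceConnection[] {
  return connections.reduce<ServiceConnection[]>((acc, conn) => {
    const alreadySeen = acc.some(
      (c) => c.source.id === conn.source.id && c.destination.id === conn.destination.id
    );
    return alreadySeen ? acc : [...acc, conn];
  }, []);
}

// O(n): a single pass that mutates a local Map. The key is a JSON-encoded
// pair so that ids containing separator characters cannot collide.
function dedupeLinear(connections: ServiceConnection[]): ServiceConnection[] {
  const byKey = new Map<string, ServiceConnection>();
  for (const conn of connections) {
    const key = JSON.stringify([conn.source.id, conn.destination.id]);
    if (!byKey.has(key)) {
      byKey.set(key, conn);
    }
  }
  // Map iteration follows insertion order, so first occurrences keep their order.
  return Array.from(byKey.values());
}
```

Because the `Map` never escapes the function, callers still see a pure function; only the internal bookkeeping is mutable.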
1 parent 85ba9e9, commit 1a9b241

Showing 9 changed files with 204 additions and 162 deletions.
64 changes: 64 additions & 0 deletions
packages/kbn-apm-synthtrace/src/scenarios/service_map_oom.ts
`@@ -0,0 +1,64 @@`

```ts
/*
 * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
 * or more contributor license agreements. Licensed under the Elastic License
 * 2.0 and the Server Side Public License, v 1; you may not use this file except
 * in compliance with, at your election, the Elastic License 2.0 or the Server
 * Side Public License, v 1.
 */

import { ApmFields, httpExitSpan } from '@kbn/apm-synthtrace-client';
import { service } from '@kbn/apm-synthtrace-client/src/lib/apm/service';
import { Transaction } from '@kbn/apm-synthtrace-client/src/lib/apm/transaction';
import { Scenario } from '../cli/scenario';
import { RunOptions } from '../cli/utils/parse_run_cli_flags';
import { getSynthtraceEnvironment } from '../lib/utils/get_synthtrace_environment';

const environment = getSynthtraceEnvironment(__filename);

const scenario: Scenario<ApmFields> = async (runOptions: RunOptions) => {
  const numServices = 500;

  const tracesPerMinute = 10;

  return {
    generate: ({ range }) => {
      const services = new Array(numServices)
        .fill(undefined)
        .map((_, idx) => {
          return service(`service-${idx}`, 'prod', environment).instance('service-instance');
        })
        .reverse();

      return range.ratePerMinute(tracesPerMinute).generator((timestamp) => {
        const rootTransaction = services.reduce((prev, currentService) => {
          const tx = currentService
            .transaction(`GET /my/function`, 'request')
            .timestamp(timestamp)
            .duration(1000)
            .children(
              ...(prev
                ? [
                    currentService
                      .span(
                        httpExitSpan({
                          spanName: `exit-span-${currentService.fields['service.name']}`,
                          destinationUrl: `http://address-to-exit-span-${currentService.fields['service.name']}`,
                        })
                      )
                      .timestamp(timestamp)
                      .duration(1000)
                      .children(prev),
                  ]
                : [])
            );

          return tx;
        }, undefined as Transaction | undefined);

        return rootTransaction!;
      });
    },
  };
};

export default scenario;
```
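The scenario above builds one deep call chain per trace: the service list is reversed, and `reduce` wraps each previously built transaction inside an exit span of the next service, so the last service processed (`service-0`) becomes the root. The sketch below reproduces only that folding pattern with a hypothetical `ChainNode` type and hypothetical `buildChain` and `chainDepth` helpers; it does not use the synthtrace client and ignores timestamps, durations and exit spans.

```typescript
// Hypothetical stand-in for a synthtrace transaction; not the real client type.
interface ChainNode {
  service: string;
  child: ChainNode | undefined;
}

// Fold the reversed service list so that each step wraps the previous result.
// The first element processed has no child and ends up deepest in the chain.
function buildChain(numServices: number): ChainNode {
  const names = Array.from({ length: numServices }, (_, idx) => `service-${idx}`).reverse();

  const root = names.reduce<ChainNode | undefined>(
    (prev, serviceName) => ({ service: serviceName, child: prev }),
    undefined
  );

  if (root === undefined) {
    throw new Error('numServices must be at least 1');
  }
  return root;
}

// Count how many services a single trace passes through.
function chainDepth(root: ChainNode): number {
  let depth = 0;
  for (let node: ChainNode | undefined = root; node !== undefined; node = node.child) {
    depth++;
  }
  return depth;
}
```

With `numServices = 500`, as in the scenario, every trace touches all 500 services in sequence, which is presumably what makes the service map query so expensive in the reproduction steps.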