This repository has been archived by the owner on Nov 14, 2024. It is now read-only.
Avoid creating thousands of get-ranges threads #5224
Merged
Our metrics show services with thousands of threads in the
`serializabletransactionmanager-get-ranges`
pool; however, the executor was not instrumented with Tritium, so it's not clear
how saturated it is, or whether it is used at all. Threads are incredibly
expensive, and it's generally a sign of failure when a service reaches
1000 total threads.
Using PTExecutors factories we get tracing and execution metrics
for free, as well as resource utilization improvements: each instance
shares a slice of an underlying cached executor, so threads are only
created as needed. The provided
`numThreads`
is still an upper limit for the ExecutorService instance, but idle threads can be
used elsewhere.
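To illustrate the idea of a bounded "view" over a shared cached executor, here is a minimal sketch in plain `java.util.concurrent`. This is not the actual PTExecutors implementation; the class name, semaphore-based cap, and queue handling are all illustrative. Tasks only borrow a thread from the shared pool while running, so capacity that one view isn't using stays available to others.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.Semaphore;

// Hypothetical sketch: a fixed-size view over a shared executor.
// At most numThreads tasks from this view run concurrently, but no
// threads are dedicated to the view while it is idle.
final class BoundedExecutorView {
    private final Executor delegate;
    private final Semaphore permits; // caps this view's concurrency
    private final Queue<Runnable> pending = new ConcurrentLinkedQueue<>();

    BoundedExecutorView(Executor delegate, int numThreads) {
        this.delegate = delegate;
        this.permits = new Semaphore(numThreads);
    }

    void execute(Runnable task) {
        pending.add(task);
        trySchedule();
    }

    private void trySchedule() {
        while (permits.tryAcquire()) {
            Runnable next = pending.poll();
            if (next == null) {
                permits.release();
                // A task may have arrived between poll() and release();
                // retry in that case, otherwise we're done.
                if (pending.isEmpty()) {
                    return;
                }
            } else {
                delegate.execute(() -> {
                    try {
                        next.run();
                    } finally {
                        permits.release();
                        trySchedule(); // drain any queued work
                    }
                });
            }
        }
    }
}
```

With this shape, ten views capped at 100 threads each can share a single cached pool that, in practice, grows only as large as the actual concurrent demand, rather than pinning 1000 threads up front.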
The existing queue-size warning logic is preserved using a counter
rather than instrumenting the queue itself, in much the same way
Tritium estimates ExecutorService queue size.
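The counter approach can be sketched as a thin delegating executor: increment an `AtomicInteger` on submission, decrement when the task leaves the queue and begins running, and warn past a threshold. The class name, getter, and threshold below are illustrative, not the PR's actual code; the point is that no access to the underlying queue is needed.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: estimate queue depth with a counter instead of
// calling getQueue().size(), which would require owning the queue.
final class QueueWarningExecutor implements Executor {
    private static final int WARN_THRESHOLD = 1000; // assumed limit
    private final Executor delegate;
    private final AtomicInteger queued = new AtomicInteger();

    QueueWarningExecutor(Executor delegate) {
        this.delegate = delegate;
    }

    @Override
    public void execute(Runnable task) {
        int size = queued.incrementAndGet();
        if (size > WARN_THRESHOLD) {
            System.err.println("Task queue length " + size
                    + " exceeds " + WARN_THRESHOLD + "; possible saturation");
        }
        delegate.execute(() -> {
            // The task has left the queue and is about to run.
            queued.decrementAndGet();
            task.run();
        });
    }

    int queuedTaskCount() {
        return queued.get();
    }
}
```

Because the counter is maintained at submit/start boundaries, it works regardless of which queue (or shared pool) backs the delegate.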
Goals (and why):
Vastly reduce memory overhead for several services.
Implementation Description (bullets):
Use the standard PTExecutors factory with a wrapper to support queue-size warnings. Ideally this would move to Hyperion instead, but that's out of scope here.
Testing (What was existing testing like? What have you done to improve it?):
No behavior change, only a reduction in resource utilization.
Concerns (what feedback would you like?):
Where should we start reviewing?:
Priority (whenever / two weeks / yesterday):