Add field data memory circuit breaker #4261
Conversation
Cool stuff @dakrone
@@ -60,6 +60,9 @@ By default, `indices` stats are returned. With options for `indices`,
Transport statistics about sent and received bytes in
cluster communication

`breaker`::
I think `breaker` is too generic a name?
How about `circuit-breaker`?
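For anyone trying the new stats section out while reviewing, here is a rough sketch of pulling only these stats through the Java client. Note that `setBreaker` is an assumed toggle name modeled on the existing per-metric flags; this diff does not show the request builder side.

import org.elasticsearch.action.admin.cluster.node.stats.NodesStatsResponse;
import org.elasticsearch.client.Client;

public class BreakerStatsSketch {

    static NodesStatsResponse breakerStats(Client client) {
        return client.admin().cluster().prepareNodesStats()
                .clear()            // drop the default metrics
                .setBreaker(true)   // assumed flag for the new `breaker` section
                .execute().actionGet();
    }
}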
I realized that the FieldDataEstimator class is no longer needed, as estimations have been moved into their respective field data loading classes, so I'll remove it.
double estimatedBytes = ((RamAccountingTermsEnum)termsEnum).getTotalBytes();
breaker.addWithoutBreaking(-(long)((estimatedBytes * breaker.getOverhead()) - actualUsed));
} else {
logger.warn("Trying to adjust circuit breaker, but TermsEnum has not been wrapped!"); |
Should we have an assertion here?
I think an assertion makes sense, I'll do that instead of the `if` statement.
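For context, a self-contained sketch (hypothetical class, not the PR's MemoryCircuitBreaker) of the estimate-then-correct pattern this thread is discussing: reserve an overhead-adjusted estimate before loading, then give back the difference once the real size is known.

public class EstimateThenCorrectSketch {

    /** Tiny stand-in for a memory circuit breaker. */
    static class SimpleBreaker {
        private final long limit;
        private final double overhead;
        private long used;

        SimpleBreaker(long limit, double overhead) {
            this.limit = limit;
            this.overhead = overhead;
        }

        /** Reserve an overhead-adjusted estimate, failing if it would exceed the limit. */
        void addEstimateAndMaybeBreak(long bytes) {
            long adjusted = (long) (bytes * overhead);
            if (used + adjusted > limit) {
                throw new IllegalStateException("breaker would exceed limit of " + limit + " bytes");
            }
            used += adjusted;
        }

        /** Adjust the accounting without ever tripping, e.g. to release an over-estimate. */
        void addWithoutBreaking(long bytes) {
            used += bytes;
        }

        double getOverhead() {
            return overhead;
        }

        long getUsed() {
            return used;
        }
    }

    public static void main(String[] args) {
        SimpleBreaker breaker = new SimpleBreaker(1024 * 1024, 1.03);

        long estimatedBytes = 4096;   // pre-load estimate for one field's terms
        breaker.addEstimateAndMaybeBreak(estimatedBytes);

        long actualUsed = 3000;       // measured once loading finishes
        // Same correction as in the diff above: hand back the difference between the
        // overhead-adjusted estimate and what was actually used.
        breaker.addWithoutBreaking(-(long) ((estimatedBytes * breaker.getOverhead()) - actualUsed));

        System.out.println("breaker now accounts for " + breaker.getUsed() + " bytes");
    }
}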
Updated code and force-pushed another squashed commit (because there were going to be merge conflicts regardless, and I'd rather rebase and deal with them now rather than after reviews). Changes:
I may have forgotten other changes that went in, so more reviews welcome :)
this.maxBytes = settings.getAsBytesSize(CIRCUIT_BREAKER_MAX_BYTES_SETTING, new ByteSizeValue(fieldDataMax)).bytes();
this.overhead = settings.getAsDouble(CIRCUIT_BREAKER_OVERHEAD_SETTING, DEFAULT_OVERHEAD_CONSTANT);

this.breaker = new MemoryCircuitBreaker(new ByteSizeValue(maxBytes), overhead, 0, logger);
It looks like breaker is initialized twice. Here and then in doStart()
I'll remove the `doStart()` one.
Pushed a new version of the circuit breaker that addresses @imotov's comments.
public class InternalCircuitBreakerService extends AbstractLifecycleComponent<InternalCircuitBreakerService> implements CircuitBreakerService {

    public static final String CIRCUIT_BREAKER_MAX_BYTES_SETTING = "indices.fielddata.cache.breaker.limit";
    public static final String CIRCUIT_BREAKER_OVERHEAD_SETTING = "indices.fielddata.cache.breaker.overhead";
The setting names do not match the package; it should be `indices.fielddata.breaker.xxx`.
okay, I'll change those.
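Purely as illustration, a minimal sketch of wiring these settings up from Java at node startup; the names follow the commit message below (whether the rename suggested here landed is not visible in this thread) and the values are placeholders.

import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;

public class BreakerSettingsSketch {

    static Settings breakerSettings() {
        return ImmutableSettings.settingsBuilder()
                .put("indices.fielddata.cache.breaker.limit", "1gb")   // placeholder limit
                .put("indices.fielddata.cache.breaker.overhead", 1.03) // default overhead constant
                .build();
    }
}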
.setSource(MapBuilder.<String, Object>newMapBuilder().put("test", "value" + id).map()).execute().actionGet();
}

// refresh
There is a `refresh()` shortcut in `ElasticsearchIntegrationTest`.
This adds the field data circuit breaker, which is used to estimate the amount of memory required to load field data before loading it. It then raises a CircuitBreakingException if the limit is exceeded. It is configured with two parameters: `indices.fielddata.cache.breaker.limit` - the maximum number of bytes of field data to be loaded before circuit breaking. Defaults to `indices.fielddata.cache.size` if set, unbounded otherwise. `indices.fielddata.cache.breaker.overhead` - a constant that all field data estimations are multiplied by before aggregation. Defaults to 1.03. Both settings can be configured dynamically using the cluster update settings API.
startObject("type"). | ||
startObject("properties"). | ||
startObject("test") | ||
.field("type", "string") |
Can we have some more types like long / double etc., as well as use a random field_data impl? We could have something like
.field("type", "string")
.startObject("fielddata")
.field("format", randomStringFieldDataFormat())
and something like this for numeric as well:
private static String randomNumericFieldDataFormat() {
    return randomFrom(Arrays.asList("array", "compressed", "doc_values"));
}

private static String randomBytesFieldDataFormat() {
    return randomFrom(Arrays.asList("paged_bytes", "fst", "doc_values"));
}
I guess we can add those to ElasticsearchIntegrationTest
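A rough, self-contained sketch of how those helpers could feed into the test mapping; the helper body is copied from the suggestion above, and the local `randomFrom` is only a stand-in for the test framework's version.

import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentFactory;

import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class RandomFieldDataMappingSketch {

    private static final Random RANDOM = new Random();

    // Stand-in for the test framework's randomFrom(...)
    static <T> T randomFrom(List<T> values) {
        return values.get(RANDOM.nextInt(values.size()));
    }

    // Same helper as proposed above
    static String randomBytesFieldDataFormat() {
        return randomFrom(Arrays.asList("paged_bytes", "fst", "doc_values"));
    }

    // Builds the test mapping with a randomized fielddata format for the string field
    static XContentBuilder testMapping() throws Exception {
        return XContentFactory.jsonBuilder()
                .startObject()
                    .startObject("type")
                        .startObject("properties")
                            .startObject("test")
                                .field("type", "string")
                                .startObject("fielddata")
                                    .field("format", randomBytesFieldDataFormat())
                                .endObject()
                            .endObject()
                        .endObject()
                    .endObject()
                .endObject();
    }
}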
LGTM please squash and push 👍
This adds the field data circuit breaker, which is used to estimate
the amount of memory required to load field data before loading it. It
then raises a CircuitBreakingException if the limit is exceeded.

It is configured with two parameters:

`indices.fielddata.cache.breaker.limit` - the maximum number of bytes of
field data to be loaded before circuit breaking. Defaults to
`indices.fielddata.cache.size` if set, unbounded otherwise.

`indices.fielddata.cache.breaker.overhead` - a constant that all field
data estimations are multiplied by before aggregation. Defaults to 1.03.

Both settings can be configured dynamically using the cluster update
settings API.
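To illustrate that last paragraph, a minimal sketch of changing the two settings at runtime through the Java client's cluster update settings API; the values shown are arbitrary examples.

import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.ImmutableSettings;

public class UpdateBreakerSettingsSketch {

    static void updateFieldDataBreaker(Client client) {
        client.admin().cluster().prepareUpdateSettings()
                .setTransientSettings(ImmutableSettings.settingsBuilder()
                        .put("indices.fielddata.cache.breaker.limit", "500mb")  // example limit
                        .put("indices.fielddata.cache.breaker.overhead", 1.05)  // example overhead
                        .build())
                .execute().actionGet();
    }
}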