Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indexing throughput drop after types removal issues deprecation messages #37411

Closed
dliappis opened this issue Jan 14, 2019 · 3 comments
Closed
Assignees
Labels
>regression :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch

Comments

@dliappis
Copy link
Contributor

#36549 introduced deprecation messages when types are used in _bulk APIs.

This has caused major throughput drops in our standard nightly benchmarks (that use a _doc type) like the one seen here https://elasticsearch-benchmarks.elastic.co/#tracks/geopoint/nightly/30d:

image

Other tracks affected are http_logs, nyc_taxis, noaa, [geonames][(https://elasticsearch-benchmarks.elastic.co/#tracks/geonames/nightly/30d) and nested.

The deprecation logger is suppressing duplicate messages hence there is only one entry in the log file:

$ cat rally-benchmark_deprecation.log
[2019-01-14T11:05:37,399][WARN ][o.e.d.a.b.BulkRequest    ] [rally-node-0] [types removal] Specifying types in bulk requests is deprecated.

IMO a deprecated feature shouldn't cause a performance regression (at least not that magnitude) until it gets removed.

@dliappis dliappis added the :Search Foundations/Mapping Index mappings, including merging and defining field types label Jan 14, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

@colings86
Copy link
Contributor

I think we need to change the code here so it only makes this call one for the bulk request and not every time the add() method is called? Alternatively we could modify the deprecation logger to not duplicate deprecation warnings form the same call site in the warning headers for the response within the same request but that might be a much harder change

@markharwood
Copy link
Contributor

I think we need to change the code here so it only makes this call one for the bulk request

Makes sense. Bulk indexing is likely the main source of repeated deprecated events.

ebadyano added a commit to ebadyano/elasticsearch that referenced this issue Feb 12, 2019
1. Setting length for formatWarning String to avoid
AbstractStringBuilder.ensureCapacityInternal calls
2. Adding extra check for parameter array length == 0 to avoid
unnecessarily creating StringBuilder in LoggerMessageFormat.format

Relates to elastic#37530
Relates to elastic#37411
ebadyano added a commit that referenced this issue Feb 22, 2019
1. Setting length for formatWarning String to avoid AbstractStringBuilder.ensureCapacityInternal calls
2. Adding extra check for parameter array length == 0 to avoid unnecessarily creating StringBuilder in LoggerMessageFormat.format

Helps to narrow the performance gap in throughout for geonames benchmark (#37411) by 3%. For more details: #37530 (comment) 

Relates to #37530
Relates to #37411
Relates to #35754
ebadyano added a commit to ebadyano/elasticsearch that referenced this issue Feb 25, 2019
1. Setting length for formatWarning String to avoid AbstractStringBuilder.ensureCapacityInternal calls
2. Adding extra check for parameter array length == 0 to avoid unnecessarily creating StringBuilder in LoggerMessageFormat.format

Helps to narrow the performance gap in throughout for geonames benchmark (elastic#37411) by 3%. For more details: elastic#37530 (comment) 

Relates to elastic#37530
Relates to elastic#37411
Relates to elastic#35754
ebadyano added a commit that referenced this issue Feb 25, 2019
1. Setting length for formatWarning String to avoid AbstractStringBuilder.ensureCapacityInternal calls
2. Adding extra check for parameter array length == 0 to avoid unnecessarily creating StringBuilder in LoggerMessageFormat.format

Helps to narrow the performance gap in throughout for geonames benchmark (#37411) by 3%. For more details: #37530 (comment) 

Relates to #37530
Relates to #37411
Relates to #35754
@javanna javanna added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Jul 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>regression :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch
Projects
None yet
Development

No branches or pull requests

5 participants