Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConcurrentModificationException in ShardFieldUsageTracker #78899

Closed
scampi opened this issue Oct 10, 2021 · 7 comments · Fixed by #79088
Closed

ConcurrentModificationException in ShardFieldUsageTracker #78899

scampi opened this issue Oct 10, 2021 · 7 comments · Fixed by #79088
Labels
:Search Relevance/Ranking Scoring, rescoring, rank evaluation. Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch

Comments

@scampi
Copy link
Contributor

scampi commented Oct 10, 2021

While testing the version 7.15, I got a ConcurrentModificationException coming from the FieldUsageTrackingDirectoryReader:

java.util.ConcurrentModificationException
	at java.base/java.util.HashMap.computeIfAbsent(HashMap.java:1134)
	at org.elasticsearch.index.search.stats.ShardFieldUsageTracker$FieldUsageStatsTrackingSession.getOrAdd(ShardFieldUsageTracker.java:160)
	at org.elasticsearch.index.search.stats.ShardFieldUsageTracker$FieldUsageStatsTrackingSession.onTermsUsed(ShardFieldUsageTracker.java:165)
	at org.elasticsearch.search.internal.FieldUsageTrackingDirectoryReader$FieldUsageTrackingLeafReader.terms(FieldUsageTrackingDirectoryReader.java:118)
	at org.elasticsearch.search.internal.ExitableDirectoryReader$ExitableLeafReader.terms(ExitableDirectoryReader.java:93)
	at org.apache.lucene.index.TermStates.loadTermsEnum(TermStates.java:121)
	at org.apache.lucene.index.TermStates.get(TermStates.java:187)
	at org.apache.lucene.search.TermQuery$TermWeight.getTermsEnum(TermQuery.java:134)
	at org.apache.lucene.search.TermQuery$TermWeight.scorer(TermQuery.java:109)
	at org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
	at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:838)
	at org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.bulkScorer(IndicesQueryCache.java:157)

From what I can tell, the ShardFieldUsageTracker uses an hashmap that is modified concurrently: I create a Scorer in several threads, one per segment. The creation of the scorer fails with the ConcurrentModificationException.

return usages.computeIfAbsent(fieldName, k -> new PerField());

Is it possible to make that usages variable a ConcurrentHashMap ? Should I try to create a PR with a reproducing unit test ?

@jimczi
Copy link
Contributor

jimczi commented Oct 11, 2021

I create a Scorer in several threads, one per segment.

This is not something that we support by default. Are you saying that you do that in a custom plugin ? I am afraid that there's more than just the FieldUsageTrackingDirectoryReader that will break.

Is it possible to make that usages variable a ConcurrentHashMap ? Should I try create a PR with a reproducing unit test ?

The problem is more general I think. We expect a single thread per search at the moment so changing this would require more architectural changes. Can you describe the use case that you're trying to solve with the concurrent search ?

@scampi
Copy link
Contributor Author

scampi commented Oct 11, 2021

Can you describe the use case that you're trying to solve with the concurrent search ?

This is done in a plugin that executes joins over indices. The goal is to execute a search concurrently against segments of a shard: a series of searches is executed on several indices, thus we seek to reduce the time spent on a search as much as possible. A segment being read-only, it is better to do it this way than to operate on the segments sequentially.

This is not something that we support by default. Are you saying that you do that in a custom plugin ? I am afraid that there's more than just the FieldUsageTrackingDirectoryReader that will break.

I am sure you are aware of this, but doesn't this go against the thread-safety comment of the DirectoryReader class (see below) ? Is that meant for a different context ?

NOTE: {@link IndexReader} instances are completely thread safe, meaning multiple threads can call any of its methods, concurrently. If your application requires external synchronization, you should not synchronize on the IndexReader instance; use your own (non-Lucene) objects instead.

https://github.com/apache/lucene/blob/ed69f6080f5943b6f547d1d431e6d34ebd7f9e36/lucene/core/src/java/org/apache/lucene/index/DirectoryReader.java#L41-L46

I am afraid that there's more than just the FieldUsageTrackingDirectoryReader that will break.

Off the top of your head, can you point to some of the readers that would be problematic ?

@jimczi
Copy link
Contributor

jimczi commented Oct 11, 2021

I am sure you are aware of this, but doesn't this go against the thread-safety comment of the DirectoryReader class (see below) ? Is that meant for a different context ?

Well it should work in practice and it's supported in Lucene. However we don't use this feature in Elasticsearch so it's not tested and some of our custom collectors could break like aggregations for instance. It's something that can change, we already discussed supporting this feature, so it seems reasonable to try to avoid breaking the concurrency when possible.
Although the fact that FieldUsageTrackingDirectoryReader don't use a synchronized map is an optimization. @ywelsch do you know the cost of switching to a concurrent hash map there ? I think that the initial implementation had the synchronized map but I don't recall if it was an issue in benchmarks or not.

@ywelsch
Copy link
Contributor

ywelsch commented Oct 11, 2021

It's something we will need to benchmark: While the initial implementation was using synchronization, it was only the simpler non-synchronized version later on that was benchmarked.

@danhermann danhermann added the :Search Relevance/Ranking Scoring, rescoring, rank evaluation. label Oct 12, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Oct 12, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@scampi
Copy link
Contributor Author

scampi commented Oct 13, 2021

Can I help running this benchmark ?

@ywelsch
Copy link
Contributor

ywelsch commented Oct 13, 2021

We discussed this internally, and will check the effects of #79088 using our nightly regression benchmarks.

ywelsch added a commit that referenced this issue Oct 14, 2021
While Lucene readers are currently only sequentially accessed, we expect future usages (and custom plugins) to access this concurrently.

Closes #78899
ywelsch added a commit to ywelsch/elasticsearch that referenced this issue Oct 14, 2021
While Lucene readers are currently only sequentially accessed, we expect future usages (and custom plugins) to access this concurrently.

Closes elastic#78899
ywelsch added a commit that referenced this issue Oct 14, 2021
While Lucene readers are currently only sequentially accessed, we expect future usages (and custom plugins) to access this concurrently.

Closes #78899
@javanna javanna added Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch and removed Team:Search Meta label for search team labels Jul 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Search Relevance/Ranking Scoring, rescoring, rank evaluation. Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants