ConcurrentModificationException in ShardFieldUsageTracker #78899

scampi · 2021-10-10T23:03:01Z

While testing version 7.15, I got a ConcurrentModificationException coming from the FieldUsageTrackingDirectoryReader:

java.util.ConcurrentModificationException
	at java.base/java.util.HashMap.computeIfAbsent(HashMap.java:1134)
	at org.elasticsearch.index.search.stats.ShardFieldUsageTracker$FieldUsageStatsTrackingSession.getOrAdd(ShardFieldUsageTracker.java:160)
	at org.elasticsearch.index.search.stats.ShardFieldUsageTracker$FieldUsageStatsTrackingSession.onTermsUsed(ShardFieldUsageTracker.java:165)
	at org.elasticsearch.search.internal.FieldUsageTrackingDirectoryReader$FieldUsageTrackingLeafReader.terms(FieldUsageTrackingDirectoryReader.java:118)
	at org.elasticsearch.search.internal.ExitableDirectoryReader$ExitableLeafReader.terms(ExitableDirectoryReader.java:93)
	at org.apache.lucene.index.TermStates.loadTermsEnum(TermStates.java:121)
	at org.apache.lucene.index.TermStates.get(TermStates.java:187)
	at org.apache.lucene.search.TermQuery$TermWeight.getTermsEnum(TermQuery.java:134)
	at org.apache.lucene.search.TermQuery$TermWeight.scorer(TermQuery.java:109)
	at org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
	at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:838)
	at org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.bulkScorer(IndicesQueryCache.java:157)

From what I can tell, the ShardFieldUsageTracker uses a HashMap that is modified concurrently: I create a Scorer in several threads, one per segment, and the creation of the scorer fails with the ConcurrentModificationException.

elasticsearch/server/src/main/java/org/elasticsearch/index/search/stats/ShardFieldUsageTracker.java

Line 160 in 79d65f6

return usages.computeIfAbsent(fieldName, k -> new PerField());

Is it possible to make that usages variable a ConcurrentHashMap? Should I try to create a PR with a reproducing unit test?
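
For readers following along, here is a minimal sketch of the check that seems to be firing at HashMap.java:1134. On Java 9 and later, HashMap.computeIfAbsent throws a ConcurrentModificationException if the map was structurally modified while the mapping function was running. To keep the demo deterministic, the interleaved write is made from inside the mapping function. In the situation reported above it would come from another thread, where detection is best-effort only. The class name and field names are hypothetical and not part of Elasticsearch.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; not part of Elasticsearch.
class ComputeIfAbsentCmeDemo {

    // Returns true if HashMap.computeIfAbsent detects a modification made
    // while its mapping function is running (Java 9+ behaviour).
    static boolean hashMapDetectsInterleavedWrite() {
        Map<String, Long> usages = new HashMap<>();
        try {
            usages.computeIfAbsent("title", k -> {
                // Stands in for an insert done by a second thread.
                usages.put("body", 0L);
                return 0L;
            });
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME detected: " + hashMapDetectsInterleavedWrite());
    }
}
```

With real threads the exception is not guaranteed: an unsynchronized HashMap can also silently lose entries, so the absence of the exception does not mean the access was safe.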


jimczi · 2021-10-11T09:47:09Z

> I create a Scorer in several threads, one per segment.

This is not something that we support by default. Are you saying that you do that in a custom plugin? I am afraid that more than just the FieldUsageTrackingDirectoryReader will break.

> Is it possible to make that usages variable a ConcurrentHashMap? Should I try to create a PR with a reproducing unit test?

The problem is more general, I think. We expect a single thread per search at the moment, so changing this would require more architectural changes. Can you describe the use case that you're trying to solve with the concurrent search?

scampi · 2021-10-11T12:06:00Z

> Can you describe the use case that you're trying to solve with the concurrent search?

This is done in a plugin that executes joins over indices. The goal is to execute a search concurrently against the segments of a shard: a join runs a series of searches on several indices, so we try to reduce the time spent on each search as much as possible. Since segments are read-only, searching them in parallel is preferable to processing them sequentially.

> This is not something that we support by default. Are you saying that you do that in a custom plugin? I am afraid that more than just the FieldUsageTrackingDirectoryReader will break.

I am sure you are aware of this, but doesn't this go against the thread-safety comment of the DirectoryReader class (see below)? Is that meant for a different context?

NOTE: {@link IndexReader} instances are completely thread safe, meaning multiple threads can call any of its methods, concurrently. If your application requires external synchronization, you should not synchronize on the IndexReader instance; use your own (non-Lucene) objects instead.

https://github.com/apache/lucene/blob/ed69f6080f5943b6f547d1d431e6d34ebd7f9e36/lucene/core/src/java/org/apache/lucene/index/DirectoryReader.java#L41-L46

> I am afraid that more than just the FieldUsageTrackingDirectoryReader will break.

Off the top of your head, can you point to some of the readers that would be problematic?

jimczi · 2021-10-11T12:41:22Z

> I am sure you are aware of this, but doesn't this go against the thread-safety comment of the DirectoryReader class (see below)? Is that meant for a different context?

Well, it should work in practice, and it's supported in Lucene. However, we don't use this feature in Elasticsearch, so it's not tested, and some of our custom collectors (aggregations, for instance) could break. That can change: we have already discussed supporting this feature, so it seems reasonable to avoid breaking concurrency where possible.
That said, the fact that FieldUsageTrackingDirectoryReader doesn't use a synchronized map is an optimization. @ywelsch do you know the cost of switching to a concurrent hash map there? I think the initial implementation had the synchronized map, but I don't recall whether it was an issue in benchmarks.
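
To make the two options concrete, here is a rough sketch of a per-field usage counter backed either by a synchronized HashMap or by a ConcurrentHashMap, exercised by one task per segment. Everything in it is an assumption made for illustration: FieldUsageSketch, its methods, and the LongAdder counters are hypothetical stand-ins, not the ShardFieldUsageTracker code and not necessarily what a fix would look like.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for a per-field usage tracker; not the Elasticsearch class.
class FieldUsageSketch {

    private final Map<String, LongAdder> usages;

    FieldUsageSketch(Map<String, LongAdder> usages) {
        this.usages = usages;
    }

    // Option 1: a plain HashMap where every call takes the same lock.
    static FieldUsageSketch synchronizedVariant() {
        return new FieldUsageSketch(Collections.synchronizedMap(new HashMap<String, LongAdder>()));
    }

    // Option 2: a ConcurrentHashMap, whose computeIfAbsent is atomic per key.
    static FieldUsageSketch concurrentVariant() {
        return new FieldUsageSketch(new ConcurrentHashMap<String, LongAdder>());
    }

    void onTermsUsed(String field) {
        usages.computeIfAbsent(field, k -> new LongAdder()).increment();
    }

    long count(String field) {
        LongAdder adder = usages.get(field);
        return adder == null ? 0L : adder.sum();
    }

    // Runs one task per (hypothetical) segment and returns the total recorded for "title".
    static long run(FieldUsageSketch tracker, int segments, int callsPerSegment) {
        ExecutorService pool = Executors.newFixedThreadPool(segments);
        for (int s = 0; s < segments; s++) {
            pool.execute(() -> {
                for (int i = 0; i < callsPerSegment; i++) {
                    tracker.onTermsUsed("title");
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return tracker.count("title");
    }

    public static void main(String[] args) {
        System.out.println(run(synchronizedVariant(), 4, 10_000));
        System.out.println(run(concurrentVariant(), 4, 10_000));
    }
}
```

Both variants record every call. The difference is cost, which this sketch does not measure: the synchronized map serializes all callers on one lock, while the ConcurrentHashMap mostly avoids contention once a field's entry exists.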

ywelsch · 2021-10-11T13:21:17Z

It's something we will need to benchmark. The initial implementation used synchronization, but only the later, simpler non-synchronized version was benchmarked.

elasticmachine · 2021-10-12T11:24:24Z

Pinging @elastic/es-search (Team:Search)

scampi · 2021-10-13T08:41:37Z

Can I help run this benchmark?

ywelsch · 2021-10-13T16:09:45Z

We discussed this internally, and will check the effects of #79088 using our nightly regression benchmarks.

While Lucene readers are currently only sequentially accessed, we expect future usages (and custom plugins) to access this concurrently. Closes #78899


danhermann added the ":Search Relevance/Ranking" label (Scoring, rescoring, rank evaluation) Oct 12, 2021

elasticmachine added the "Team:Search" label (Meta label for search team) Oct 12, 2021

ywelsch mentioned this issue Oct 13, 2021

Allow field usage tracker to be concurrently accessed #79088

Merged

ywelsch closed this as completed in #79088 Oct 14, 2021

ywelsch added a commit that referenced this issue Oct 14, 2021

Allow field usage tracker to be concurrent accessed (#79088)

d29a8d3

While Lucene readers are currently only sequentially accessed, we expect future usages (and custom plugins) to access this concurrently. Closes #78899

ywelsch added a commit that referenced this issue Oct 14, 2021

Allow field usage tracker to be concurrent accessed (#79088) (#79125)

445d995

While Lucene readers are currently only sequentially accessed, we expect future usages (and custom plugins) to access this concurrently. Closes #78899

javanna added the "Team:Search Relevance" label (Meta label for the Search Relevance team in Elasticsearch) and removed the "Team:Search" label (Meta label for search team) Jul 12, 2024
