Mapper annotated text plugin highlighter crashes during text processing #39395

Closed
SilentAntenna opened this issue Feb 26, 2019 · 6 comments
Labels: :Search Relevance/Highlighting, Team:Search Relevance, v6.6.0

SilentAntenna commented Feb 26, 2019

Elasticsearch version: 6.5.4

Plugins installed: ["analysis-kuromoji", "ik-analyzer", "mapper-annotated-text"]

JVM version:

Java(TM) SE Runtime Environment (build 1.8.0_191-b12)

Java HotSpot(TM) 64-Bit Server VM (build 25.191-b12, mixed mode)

OS version:

Windows 10 Professional Ver 1803 Build 17134.590

Description of the problem including expected versus actual behavior:

@markharwood
The annotated highlighter in the mapper-annotated-text plugin crashes randomly when processing documents.

Steps to reproduce:

  1. Create an index with an annotated_text field, then add some documents:
PUT /test_index
{
   "mappings": {
       "_doc": {
           "properties": {
               "my_field": {
                   "type": "annotated_text",
                   "analyzer": "whitespace"
               }
           }
       }
   }
}

PUT /test_index/_doc/1
{
   "my_field" : "[A](~MARK0) [B](~MARK1)"
}

PUT /test_index/_doc/2
{
   "my_field" : "[A](~MARK0) [C](~MARK2)"
}

  2. Run the following search query, which crashes the annotated highlighter intermittently. When it does, hits.hits in the response can contain fewer items than hits.total reports:

GET /test_index/_search
{
   "query": {
       "match_phrase": {
           "my_field": {
               "query": "~MARK0",
               "analyzer": "whitespace"
           }
       }
   },
   "highlight": {
       "type": "annotated",
       "fields": {
           "my_field": {}
       }
   }
}

Provide logs (if relevant):

The highlighter throws java.lang.NullPointerException at two different lines:

[XXXX-XX-XXTXX:XX:XX,XXX][DEBUG][o.e.a.s.TransportSearchAction] [XXXXXXX] [XXXXX] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [XXXXXXX][XXX.XXX.XXX.XXX:XXXX][indices:data/read/search[phase/fetch/id]]
Caused by: java.lang.NullPointerException
	at org.elasticsearch.index.mapper.annotatedtext.AnnotatedTextFieldMapper$AnnotatedHighlighterTokenStreamComponents.setReader(AnnotatedTextFieldMapper.java:392) ~[?:?]
	at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:200) ~[lucene-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:01:13]
	at org.apache.lucene.search.uhighlight.AnalysisOffsetStrategy.tokenStream(AnalysisOffsetStrategy.java:56) ~[lucene-highlighter-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:02:19]
	at org.apache.lucene.search.uhighlight.MemoryIndexOffsetStrategy.getOffsetsEnum(MemoryIndexOffsetStrategy.java:98) ~[lucene-highlighter-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:02:19]
	at org.apache.lucene.search.uhighlight.FieldHighlighter.highlightFieldForDoc(FieldHighlighter.java:76) ~[lucene-highlighter-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:02:19]
	at org.apache.lucene.search.uhighlight.UnifiedHighlighter.highlightFieldsAsObjects(UnifiedHighlighter.java:639) ~[lucene-highlighter-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:02:19]
	at org.apache.lucene.search.uhighlight.CustomUnifiedHighlighter.highlightField(CustomUnifiedHighlighter.java:107) ~[elasticsearch-6.5.4.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - jimczi - 2018-09-18 13:02:19]
	at org.elasticsearch.search.fetch.subphase.highlight.UnifiedHighlighter.highlight(UnifiedHighlighter.java:128) ~[elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.search.fetch.subphase.highlight.HighlightPhase.hitExecute(HighlightPhase.java:110) ~[elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:159) ~[elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.search.SearchService.lambda$executeFetchPhase$3(SearchService.java:556) ~[elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.search.SearchService$3.doRun(SearchService.java:361) [elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:723) [elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:41) [elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.5.4.jar:6.5.4]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]

[XXXX-XX-XXTXX:XX:XX,XXX][DEBUG][o.e.a.s.TransportSearchAction] [XXXXXXX] [XXXXX] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [XXXXXXX][XXX.XXX.XXX.XXX:XXXX][indices:data/read/search[phase/fetch/id]]
Caused by: java.lang.NullPointerException
	at org.elasticsearch.index.mapper.annotatedtext.AnnotatedTextFieldMapper$AnnotatedHighlighterAnalyzer.getPlainTextValuesForHighlighter(AnnotatedTextFieldMapper.java:337) ~[?:?]
	at org.elasticsearch.search.fetch.subphase.highlight.AnnotatedTextHighlighter.loadFieldValues(AnnotatedTextHighlighter.java:55) ~[?:?]
	at org.elasticsearch.search.fetch.subphase.highlight.UnifiedHighlighter.highlight(UnifiedHighlighter.java:74) ~[elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.search.fetch.subphase.highlight.HighlightPhase.hitExecute(HighlightPhase.java:110) ~[elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:159) ~[elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.search.SearchService.lambda$executeFetchPhase$3(SearchService.java:556) ~[elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.search.SearchService$3.doRun(SearchService.java:361) [elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:723) [elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:41) [elasticsearch-6.5.4.jar:6.5.4]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.5.4.jar:6.5.4]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
markharwood self-assigned this Feb 26, 2019
markharwood added the :Search Relevance/Highlighting label Feb 26, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-search

@markharwood
Contributor

Thanks for raising this issue.
The good news is that this appears to work OK in master and in 7.0beta1 - the bad news is that it is broken in 6.6.

I'll dig deeper into why - my guess is that it's some change in Lucene behaviour around the life cycles of Analyzers, because the object that we null-pointer on is expected to be initialised as part of calls made during the analysis phase.

@markharwood
Contributor

This is a multi-threading issue. A single AnnotatedTextHighlighter object is shared across search threads but the search threads cause changes to its annotatedHighlighterAnalyzer instance variable.
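
The failure mode described above can be illustrated with a minimal sketch. It is not the plugin's code: `SharedHighlighter`, `DocState` and their methods are hypothetical stand-ins, and the two "requests" are interleaved by hand on one thread so that the outcome is deterministic rather than dependent on thread timing. It only shows how a mutable instance field on a shared object can leave one caller looking at another caller's uninitialised state.

```java
final class SharedStateSketch {

    /** Hypothetical per-document state; stands in for the analyzer held in the instance variable. */
    static final class DocState {
        String[] tokens; // stays null until init() is called

        void init(String text) {
            tokens = text.split(" ");
        }
    }

    /** Hypothetical singleton: one instance is used by every search thread. */
    static final class SharedHighlighter {
        DocState state; // mutable field visible to all callers

        void newRequest() {            // step 1: replace the shared state
            state = new DocState();
        }

        void load(String text) {       // step 2: initialise the shared state
            state.init(text);
        }

        int countTokens() {            // step 3: read it back (fails if uninitialised)
            return state.tokens.length;
        }
    }

    /** Replays one unlucky interleaving of two requests and reports whether request A failed. */
    static boolean interleavedRequestFails() {
        SharedHighlighter shared = new SharedHighlighter();
        shared.newRequest();                      // request A, step 1
        shared.load("[A](~MARK0) [B](~MARK1)");   // request A, step 2
        shared.newRequest();                      // request B, step 1: overwrites A's state
        try {
            shared.countTokens();                 // request A, step 3: sees B's empty state
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("request A failed: " + interleavedRequestFails());
    }
}
```

With real threads the same overlap happens only some of the time, which would be consistent with the intermittent failures reported in this issue.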

@SilentAntenna
Author

Thanks, waiting for v6.7.

By the way, could you allow the user to escape the brackets in the annotated text field? Sometimes we do have non-omissible brackets appearing in our search text.

@markharwood
Contributor

> By the way, could you allow the user to escape the brackets in the annotated text field?

Probably best to open another issue to track that. I imagine that following whatever markdown syntax offers is the way forward on that.

markharwood added a commit to markharwood/elasticsearch that referenced this issue Mar 6, 2019
Added thread safety by moving custom Analyzer object to per-request context rather than singleton.
Added YAML test that reproduced the error.

Closes elastic#39395
markharwood added a commit that referenced this issue Mar 6, 2019
Bug fix for AnnotatedTextHighlighter.
Added thread safety by moving parsed state to per-request context rather than holding in AnnotatedTextHighlighter singleton.
Added YAML test that reproduced the error.
Refactored to pull formatting logic from AnnotatedHighlighterAnalyzer into AnnotatedPassageFormatter

Closes #39395
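
The approach named in the commit message, keeping parsed state in a per-request context instead of in the singleton, can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: `RequestContext`, `ParsedState`, `StatelessHighlighter` and the cache key are all hypothetical names. The point is only that the shared object has no mutable fields, so concurrent requests cannot overwrite each other's state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PerRequestStateSketch {

    /** Hypothetical per-request context: each request creates and owns exactly one. */
    static final class RequestContext {
        private final Map<String, Object> cache = new HashMap<>();

        Map<String, Object> cache() {
            return cache;
        }
    }

    /** Hypothetical parsed state for a single document. */
    static final class ParsedState {
        final String[] tokens;

        ParsedState(String text) {
            this.tokens = text.split(" ");
        }
    }

    /** Shared across threads, but holds no mutable instance fields. */
    static final class StatelessHighlighter {
        private static final String KEY = "parsed-state"; // hypothetical cache key

        void load(RequestContext ctx, String text) {
            ctx.cache().put(KEY, new ParsedState(text));
        }

        String lastToken(RequestContext ctx) {
            ParsedState state = (ParsedState) ctx.cache().get(KEY);
            return state.tokens[state.tokens.length - 1];
        }
    }

    /** Runs one request per document on its own thread, all sharing a single highlighter. */
    static List<String> runConcurrently(List<String> docs) {
        StatelessHighlighter shared = new StatelessHighlighter();
        ExecutorService pool = Executors.newFixedThreadPool(docs.size());
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String doc : docs) {
                futures.add(pool.submit(() -> {
                    RequestContext ctx = new RequestContext(); // state lives here, per request
                    shared.load(ctx, doc);
                    return shared.lastToken(ctx);
                }));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(Arrays.asList(
            "[A](~MARK0) [B](~MARK1)",
            "[A](~MARK0) [C](~MARK2)")));
    }
}
```

Each task only ever reads the context it created itself, so the result for a document does not depend on how the threads are scheduled.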
markharwood added a commit that referenced this issue Mar 6, 2019
Bug fix for AnnotatedTextHighlighter - port of 39525
Relates to #39395
markharwood added a commit that referenced this issue Mar 6, 2019
Bug fix for AnnotatedTextHighlighter - port of 39525

Relates to #39395
markharwood added a commit that referenced this issue Mar 6, 2019
Bug fix for AnnotatedTextHighlighter - port of 39525

Relates to #39395
@markharwood
Contributor

Fixed in 6.7 and up.

javanna added the Team:Search Relevance label Jul 12, 2024