Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Native image hangs #3241

Closed
gustavocoding opened this issue Feb 24, 2021 · 7 comments
Closed

Native image hangs #3241

gustavocoding opened this issue Feb 24, 2021 · 7 comments
Assignees

Comments

@gustavocoding
Copy link

Describe the issue

Executing the sample native image causes it to hang

Steps to reproduce the issue

git clone https://github.com/gustavonalle/lucene-graalvm
cd lucene-graalvm
mvn clean package
target/lucene-sample

(hangs)

Describe GraalVM and your environment:

Linux 5.8.18-100.fc31.x86_64 #1 SMP Mon Nov 2 20:32:55 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
openjdk version "11.0.10" 2021-01-19
OpenJDK Runtime Environment GraalVM CE 21.0.0.2 (build 11.0.10+8-jvmci-21.0-b06)
OpenJDK 64-Bit Server VM GraalVM CE 21.0.0.2 (build 11.0.10+8-jvmci-21.0-b06, mixed mode, sharing)

More details

Attaching gdb to the hang process:

gdp -p <PID>

gives:

Type "apropos word" to search for commands related to "word".
Attaching to process 334723
Reading symbols from /home/gfernandes/github/lucene-graalvm/target/lucene-sample...
Reading symbols from /lib64/libpthread.so.0...
Reading symbols from /usr/lib/debug/usr/lib64/libpthread-2.30.so.debug...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Reading symbols from /lib64/libdl.so.2...
Reading symbols from /usr/lib/debug/usr/lib64/libdl-2.30.so.debug...
Reading symbols from /lib64/libz.so.1...
Reading symbols from /usr/lib/debug/usr/lib64/libz.so.1.2.11-1.2.11-20.fc31.x86_64.debug...
Reading symbols from /lib64/librt.so.1...
Reading symbols from /usr/lib/debug/usr/lib64/librt-2.30.so.debug...
Reading symbols from /lib64/libc.so.6...
Reading symbols from /usr/lib/debug/usr/lib64/libc-2.30.so.debug...
Reading symbols from /lib64/ld-linux-x86-64.so.2...
Reading symbols from /usr/lib/debug/usr/lib64/ld-2.30.so.debug...
org.apache.lucene.util.RamUsageEstimator.<clinit>(void) () at com/oracle/svm/core/graal/snippets/SafepointSnippets.java:64
64	com/oracle/svm/core/graal/snippets/SafepointSnippets.java: No such file or directory.
(gdb) bt
#0  org.apache.lucene.util.RamUsageEstimator.<clinit>(void) () at com/oracle/svm/core/graal/snippets/SafepointSnippets.java:64
#1  0x0000000000416b38 in com.oracle.svm.core.classinitialization.ClassInitializationInfo.initialize(com.oracle.svm.core.classinitialization.ClassInitializationInfo, java.lang.Class)(void) ()
    at com/oracle/svm/core/classinitialization/ClassInitializationInfo.java:375
#2  0x0000000000b2f833 in org.apache.lucene.util.FrequencyTrackingRingBuffer.<clinit>(void) () at org/apache/lucene/util/FrequencyTrackingRingBuffer.java:36
#3  0x0000000000416b38 in com.oracle.svm.core.classinitialization.ClassInitializationInfo.initialize(com.oracle.svm.core.classinitialization.ClassInitializationInfo, java.lang.Class)(void) ()
    at com/oracle/svm/core/classinitialization/ClassInitializationInfo.java:375
#4  0x0000000000ad317a in org.apache.lucene.search.UsageTrackingQueryCachingPolicy.<init>(int)(void) () at org/apache/lucene/search/UsageTrackingQueryCachingPolicy.java:109
#5  0x0000000000ab0822 in org.apache.lucene.search.IndexSearcher.<clinit>(void) () at org/graalvm/compiler/replacements/AllocationSnippets.java:289
#6  0x0000000000416b38 in com.oracle.svm.core.classinitialization.ClassInitializationInfo.initialize(com.oracle.svm.core.classinitialization.ClassInitializationInfo, java.lang.Class)(void) ()
    at com/oracle/svm/core/classinitialization/ClassInitializationInfo.java:375
#7  0x0000000000a2cf80 in org.apache.lucene.index.LiveIndexWriterConfig.<init>(org.apache.lucene.analysis.Analyzer)(void) () at org/apache/lucene/index/LiveIndexWriterConfig.java:125
#8  0x0000000000be2c6c in org.infinispan.LuceneBackend.basicIndexAndSearch(java.lang.Class)(void) () at org/graalvm/compiler/replacements/AllocationSnippets.java:70
#9  0x0000000000be2a89 in org.infinispan.LuceneBackend.basicIndexAndSearch(void) () at org/infinispan/LuceneBackend.java:55
#10 0x0000000000be2952 in org.infinispan.App.main(java.lang.String[])(void) () at org/graalvm/compiler/replacements/AllocationSnippets.java:289
#11 0x00000000004225e6 in com.oracle.svm.core.code.IsolateEnterStub.JavaMainWrapper_run_5087f5482cc9a6abc971913ece43acb471d2631b(int, org.graalvm.nativeimage.c.type.CCharPointerPointer)(void) ()
    at com/oracle/svm/core/JavaMainWrapper.java:146
(gdb) 




@munishchouhan
Copy link
Contributor

@gustavonalle Thanks for reporting the issue
We will check it out and get back to you

@munishchouhan munishchouhan self-assigned this Feb 28, 2021
@laszbalo
Copy link

laszbalo commented Mar 5, 2021

I have a Lucene project, too, where calling the constructor of IndexWriterConfig makes the process hang with a 100% CPU utilisation.

@munishchouhan
Copy link
Contributor

@gustavonalle I am getting below error, when running this progeam:

/lucene-graalvm$ target/lucene-sample
Starting app
Created document Document<stored,indexed,tokenized<name:Bruce Wayne> docValuesType=NUMERIC<age:54> stored<metadata:553d26b58f840> FloatPoint <height:1.8> LatLonDocValuesField <location:54.199999971315265,0.019999975338578224>>
Exception in thread "main" java.lang.ExceptionInInitializerError
        at com.oracle.svm.core.classinitialization.ClassInitializationInfo.initialize(ClassInitializationInfo.java:315)
        at org.apache.lucene.analysis.Analyzer.attributeFactory(Analyzer.java:294)
        at org.apache.lucene.analysis.custom.CustomAnalyzer.createComponents(CustomAnalyzer.java:150)
        at org.apache.lucene.analysis.AnalyzerWrapper.createComponents(AnalyzerWrapper.java:136)
        at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:199)
        at org.apache.lucene.document.Field.tokenStream(Field.java:513)
        at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:915)
        at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:524)
        at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:488)
        at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:208)
        at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:419)
        at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1471)
        at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1757)
        at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1400)
        at org.infinispan.LuceneBackend.basicIndexAndSearch(LuceneBackend.java:73)
        at org.infinispan.LuceneBackend.basicIndexAndSearch(LuceneBackend.java:55)
        at org.infinispan.App.main(App.java:9)
Caused by: java.lang.IllegalArgumentException: Cannot lookup accessible no-arg constructor for: org.apache.lucene.analysis.tokenattributes.PackedTokenAttributeImpl
        at org.apache.lucene.util.AttributeFactory.findAttributeImplCtor(AttributeFactory.java:48)
        at org.apache.lucene.util.AttributeFactory.getStaticImplementation(AttributeFactory.java:143)
        at org.apache.lucene.analysis.TokenStream.<clinit>(TokenStream.java:86)
        at com.oracle.svm.core.classinitialization.ClassInitializationInfo.invokeClassInitializer(ClassInitializationInfo.java:375)
        at com.oracle.svm.core.classinitialization.ClassInitializationInfo.initialize(ClassInitializationInfo.java:295)
        ... 16 more
Caused by: java.lang.NoSuchMethodException: no such constructor: org.apache.lucene.analysis.tokenattributes.PackedTokenAttributeImpl.<init>()void/newInvokeSpecial
        at java.lang.invoke.MemberName.makeAccessException(MemberName.java:961)
        at java.lang.invoke.MemberName$Factory.resolveOrFail(MemberName.java:1101)
        at java.lang.invoke.MethodHandles$Lookup.resolveOrFail(MethodHandles.java:2030)
        at java.lang.invoke.MethodHandles$Lookup.findConstructor(MethodHandles.java:1264)
        at org.apache.lucene.util.AttributeFactory.findAttributeImplCtor(AttributeFactory.java:46)
        ... 20 more
Caused by: java.lang.NoSuchMethodError: org.apache.lucene.analysis.tokenattributes.PackedTokenAttributeImpl.<init>()
        at java.lang.invoke.MethodHandleNatives.resolve(MethodHandleNatives.java:230)
        at java.lang.invoke.MemberName$Factory.resolve(MemberName.java:1070)
        at java.lang.invoke.MemberName$Factory.resolveOrFail(MemberName.java:1098)
        ... 23 more

@munishchouhan
Copy link
Contributor

@gustavonalle just to clarify i got the above exception with EE and native-image is hanging with CE,
I have logged the bug with dev team

@cstancu
Copy link
Member

cstancu commented Jun 3, 2021

The underlying issue here is that Lucene makes some assumptions about Java semantics that don't always hold, more exactly it relies on identity of Long boxes in java.org.apache.lucene.util.RamUsageEstimator where it tries to determine the size of the LongCache.cache cache:

long longCacheMinValue = 0;
while (longCacheMinValue > Long.MIN_VALUE
    && Long.valueOf(longCacheMinValue - 1) == Long.valueOf(longCacheMinValue - 1)) {
    longCacheMinValue -= 1;
}

On SVM (and in JIT mode) the second part of the condition is optimized by the Graal escape analysis as always true, and the loop condition is transformed into longCacheMinValue > Long.MIN_VALUE, so the loop is not really stuck but it ends up running Long.MIN_VALUE times. In the HotSpot interpreter the loop runs 127 times but the Graal compiler assumes all Long values are equal which is equivalent with having a maximum length value cache, which is a legal assumption according to the Long.valueOf() JavaDoc.

To fix this there are two options:

  • first, Lucene could use reflection to get the actual size of LongCache.cache if it really needs it, although I am a bit suspicious about this method of estimating used RAM.
  • second, Graal could detect this pattern and not optimize it, iff it doesn't lead to any performance regression. @davleopo is looking further into that.

@christianwimmer
Copy link

Looks like there is already a Lucene bug from more than a year ago: https://issues.apache.org/jira/browse/LUCENE-9117
Since jaotc uses Graal, it is not surprising that it has the same problem. They just removed the flawed long cache detection.

Now I don't understand the Lucene release model enough, but I think it will be fixed in the not-yet-released version 9 of Lucene. And code like that is probably in many other projects too. So if we can detect and properly handle the pattern, it would certainly be good for compatibility.

@dougxc
Copy link
Member

dougxc commented Jun 7, 2021

Should be resolved by 21d5002 - please re-open if not.

@dougxc dougxc closed this as completed Jun 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants