Thread interrupts permanently break the HostClassLoader #3273

Closed
kustosz opened this issue Mar 11, 2021 · 7 comments

@kustosz
Contributor

kustosz commented Mar 11, 2021

Describe GraalVM and your environment:

  • GraalVM version or commit id if built from source: 21.0.0.2
  • CE or EE: CE
  • JDK version: JDK11
  • OS and OS Version: macOS Catalina 11.1
  • Architecture: amd64
  • The output of java -Xinternalversion:
OpenJDK 64-Bit Server VM (11.0.10+8-jvmci-21.0-b06) for bsd-amd64 JRE (11.0.10+8-jvmci-21.0-b06), built on Jan 15 2021 00:28:57 by "graal1" with gcc 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.5)

Have you verified this issue still happens when using the latest snapshot?
You can find snapshot builds here: https://github.com/graalvm/graalvm-ce-dev-builds/releases
No

Describe the issue
Using host interop in the presence of thread interrupts may break host interop forever.
The HostClassLoader seems to enter an unrecoverably broken state when a thread interrupt happens.
This is very concerning: it suggests that thread interrupts are unsafe to use with Truffle, since we have no guarantee
as to when the class loader is running.

Code snippet or code repository that reproduces the issue
A minimal (albeit impractical) example. It uses icu4j, but I assume other JARs would be equally problematic:

import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.PolyglotException;

public class Main {
  public static void main(String[] args) {
    var ctx = Context.newBuilder().allowAllAccess(true).build();
    ctx.eval("js", "Java.addToClasspath('icu4j-67.1.jar')");
    var cls = ctx.eval("js", "Java.type('com.ibm.icu.text.BreakIterator')");
    var fn = ctx.eval("js", "(cls) => { cls.getCharacterInstance(); }");
    Thread.currentThread().interrupt();
    try {
      fn.execute(cls);
    } catch (PolyglotException e) {
      System.out.println(e);
      e.printStackTrace();
    }
    Thread.interrupted(); // make sure interrupt is cleared
    try {
      fn.execute(cls);
    } catch (PolyglotException e) {
      System.out.println(e);
      e.printStackTrace();
    }

  }
}

Steps to reproduce the issue
Run the code above (possibly changing the JAR added to host classpath).

Expected behavior
The second call to fn.execute(cls) should succeed.

Additional context
Notice that the first call to fn.execute(cls) fails with NoClassDefFoundError, with the root cause being

Caused by: java.nio.channels.ClosedByInterruptException
        at java.base/java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:199)
        at java.base/sun.nio.ch.FileChannelImpl.endBlocking(FileChannelImpl.java:162)
        at java.base/sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:366)
        at java.base/sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:52)
        at org.graalvm.truffle/com.oracle.truffle.api.TruffleFile$ByteChannelDecorator.position(TruffleFile.java:2219)
        at org.graalvm.truffle/com.oracle.truffle.polyglot.HostClassLoader$JarLoader$ZipUtils.getInputStream(HostClassLoader.java:537)
        at org.graalvm.truffle/com.oracle.truffle.polyglot.HostClassLoader$JarLoader$1.getInputStream(HostClassLoader.java:388)
        at org.graalvm.truffle/com.oracle.truffle.polyglot.HostClassLoader$Resource.getContent(HostClassLoader.java:247)
        at org.graalvm.truffle/com.oracle.truffle.polyglot.HostClassLoader.findClass(HostClassLoader.java:157)
        ... 58 more

That seems OK – the thread was interrupted, failures are to be expected.

However, the second (and any further) call to fn.execute(cls) fails with a NoClassDefFoundError, with the root cause:

Caused by: java.nio.channels.ClosedChannelException
        at java.base/sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:150)
        at java.base/sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:349)
        at java.base/sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:52)
        at org.graalvm.truffle/com.oracle.truffle.api.TruffleFile$ByteChannelDecorator.position(TruffleFile.java:2219)
        at org.graalvm.truffle/com.oracle.truffle.polyglot.HostClassLoader$JarLoader$ZipUtils.getInputStream(HostClassLoader.java:537)
        at org.graalvm.truffle/com.oracle.truffle.polyglot.HostClassLoader$JarLoader$1.getInputStream(HostClassLoader.java:388)
        at org.graalvm.truffle/com.oracle.truffle.polyglot.HostClassLoader$Resource.getContent(HostClassLoader.java:247)
        at org.graalvm.truffle/com.oracle.truffle.polyglot.HostClassLoader.findClass(HostClassLoader.java:157)
        ... 53 more

This suggests that the class loader has entered a broken state.
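
The two stack traces are consistent with documented java.nio behavior: a FileChannel is an interruptible channel, so a blocking operation on an interrupted thread closes the channel and throws ClosedByInterruptException, and every later operation on that same channel object throws ClosedChannelException. Below is a minimal, GraalVM-free sketch of that mechanism. It is not the HostClassLoader code; the class and method names are made up for illustration, and reopening the channel at the end is only one assumed recovery strategy, not necessarily what any fix does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of GraalVM or Truffle.
class InterruptClosesChannel {

  // Attempts one read and reports either the byte count or the exception type.
  static String tryRead(FileChannel ch) {
    try {
      return "read " + ch.read(ByteBuffer.allocate(4)) + " bytes";
    } catch (IOException e) {
      return e.getClass().getSimpleName();
    }
  }

  // Returns the outcome of three reads: interrupted, after clearing, after reopening.
  static List<String> demo() {
    List<String> outcomes = new ArrayList<>();
    Path tmp = null;
    try {
      tmp = Files.createTempFile("interrupt-demo", ".bin");
      Files.write(tmp, new byte[] {1, 2, 3, 4});

      try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
        // Read with the interrupt status set: the channel gets closed.
        Thread.currentThread().interrupt();
        outcomes.add(tryRead(ch));

        // Clearing the interrupt status does not reopen the channel.
        Thread.interrupted();
        outcomes.add(tryRead(ch));
      }

      // Assumed recovery: open a fresh channel instead of reusing the closed one.
      try (FileChannel fresh = FileChannel.open(tmp, StandardOpenOption.READ)) {
        outcomes.add(tryRead(fresh));
      }
      return outcomes;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      Thread.interrupted(); // never leak the interrupt status to the caller
      if (tmp != null) {
        tmp.toFile().delete();
      }
    }
  }

  public static void main(String[] args) {
    demo().forEach(System.out::println);
  }
}
```

Running main should report ClosedByInterruptException for the first read, ClosedChannelException for the second, and a successful 4-byte read from the reopened channel, which mirrors the pattern in the two traces above.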

@oubidar-Abderrahim
Member

Tracked internally at GR-30019

@chumer
Member

chumer commented Mar 23, 2021

Thanks for the report.

We want to move away from using Thread.interrupt for interruption in general in 21.2. Until then, we can perhaps handle interrupts more gracefully.
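
For readers hitting this on an affected version, one caller-side reading of "more gracefully" is to defer an already-pending interrupt around the host interop call. The sketch below is purely illustrative: InterruptDeferral and withPendingInterruptDeferred are hypothetical names, not GraalVM API, and the helper only covers an interrupt that is pending when the call starts. An interrupt that arrives while the action is running can still close the underlying channel.

```java
import java.util.function.Supplier;

// Hypothetical helper; not part of GraalVM or Truffle.
class InterruptDeferral {

  // Clears a pending interrupt, runs the action, then restores the interrupt status.
  static <T> T withPendingInterruptDeferred(Supplier<T> action) {
    boolean wasInterrupted = Thread.interrupted(); // reads and clears the status
    try {
      return action.get();
    } finally {
      if (wasInterrupted) {
        // Re-assert the interrupt so the caller can still observe it.
        Thread.currentThread().interrupt();
      }
    }
  }
}
```

With the reproducer from this issue, the call would look like InterruptDeferral.withPendingInterruptDeferred(() -> fn.execute(cls)).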

@chumer
Member

chumer commented Mar 23, 2021

We think we have already fixed this in the latest build.
@kustosz Can you double-check whether this still happens on latest?
You can find the latest snapshot builds at: https://github.com/graalvm/graalvm-ce-dev-builds/releases/
We will add a test for this in the meantime.

@iamrecursion
Contributor

Initial news is positive. The above reproducer does indeed appear to be fixed and behave as expected on 21.1.0-dev-20210324_0142. Before calling this truly fixed, however, I need to see if the problem as we encountered it in Enso is fixed. Expect updates in a day or so.

@radeusgd
Contributor

We think we have already fixed this in latest.
@kustosz Can you double check whether this still happens on latest?
You can find a latest snapshot build in: graalvm/graalvm-ce-dev-builds/releases
We will add a test for this in the meantime.

To build our project, we also need the variant of truffle-api.jar that has open modules, which I think is normally available at https://mvnrepository.com/artifact/org.graalvm.truffle/truffle-api/21.0.0.2. Its module-info contains exports com.oracle.truffle.api; rather than exports com.oracle.truffle.api to ...; as in the one shipped with the distribution.

Is it possible to somehow get this JAR variant? Or maybe it's possible to just somehow patch the module-info?

Unfortunately, sbt does not allow us to build the project without this as it fails the module exports checks and we cannot easily override this.

@iamrecursion
Contributor

iamrecursion commented Mar 25, 2021

Alrighty, a bunch of --add-exports flags fixed our build issue for the purposes of testing, and it looks like this has indeed fixed the issue we were seeing! Both the repro and the Enso interpreter now behave as expected. Thanks so much!

@tzezula
Member

tzezula commented Mar 26, 2021

Fixed by 9e83f61 in graalvm-21.1.
