Java sporadic crashes occur in Tomcat when Native.loadLibrary("c") function gets called from war file #678

Open
alelsg opened this issue Jul 8, 2016 · 2 comments


alelsg commented Jul 8, 2016

Hi All,

We have implemented a war file that provides a web API for customers. It is a middleware application between client applications and our main C++ library that processes queries (hereafter libonetick.dll). The .war file is loaded by Tomcat. Once we move the war into Tomcat's webapps directory, our customers can query libonetick.dll using curl, a browser, etc.

The Java API classes and functions called from the war's source code are generated by SWIG (the war redirects calls from Java to C++ using JNI).

You can picture the sequence of library loads as follows.

First we load jomd.dll (which contains the SWIG-generated C++ code) with a System.loadLibrary("jomd"); call in a static block of the jomd_20160320121237JNI class; jomd.dll then loads libonetick.dll as a dependency.

So to load the correct libonetick.dll, PATH and LD_LIBRARY_PATH must be set correctly before any libonetick objects are initialized in Java. We also have to provide a mechanism to upgrade the war file without restarting Tomcat.

The war file has a configuration file that points to the bin directory of the new distribution, the directory from which jomd.dll and the new libonetick.dll should be loaded. Each time we redeploy the war, we change this config file to point to the new distribution's bin directory.

To load libonetick.dll from the correct location during an upgrade, we call native C functions programmatically to update the PATH and LD_LIBRARY_PATH environment variables each time we redeploy the web API.
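For comparison, here is a minimal sketch of resolving a library from a configured bin directory and loading it by absolute path with the standard System.load call, instead of relying on the search path. The class name ConfiguredLibraryLoader and the directory /opt/onetick/bin are hypothetical and not taken from the report. The sketch only covers the library named in the call: a dependency such as libonetick.dll is still located by the operating system's loader, so this is an illustration of the lookup step, not a full replacement for the environment setup described above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class ConfiguredLibraryLoader {

    // Builds the absolute path of a native library inside the configured directory.
    // System.mapLibraryName adds the platform prefix/suffix (e.g. "lib" and ".so" on Linux).
    static Path resolve(String binDir, String baseName) {
        return Paths.get(binDir).resolve(System.mapLibraryName(baseName)).toAbsolutePath();
    }

    // Loads the library by absolute path; throws UnsatisfiedLinkError if it cannot be loaded.
    static void load(String binDir, String baseName) {
        System.load(resolve(binDir, baseName).toString());
    }

    public static void main(String[] args) {
        // Hypothetical directory; in the setup described above it would come from the war's config file.
        System.out.println(resolve("/opt/onetick/bin", "jomd"));
    }
}
```

Only resolve is exercised in main, since load needs a real library file at that location.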

We use JNA to bind libc on Linux and msvcrt on Windows, and use them to set the new environment before loading jomd.dll and libonetick.dll.

Here is the code we use to bind the C functions that set PATH and LD_LIBRARY_PATH.

interface CLibraryInstance extends Library {

    // Bind the C runtime: msvcrt on Windows, libc elsewhere.
    CLibraryInstance INSTANCE = (CLibraryInstance) Native.loadLibrary(
        (Platform.isWindows() ? "msvcrt" : "c"),
        CLibraryInstance.class);

    // Used on Linux. Note: POSIX declares setenv(name, value, overwrite) with a third int argument.
    int setenv(String envvar, String value);

    // Used on Windows.
    int _putenv_s(String envvar, String value);
}

All the crashes we observed occur during dlsym calls. Here is the stack:

Thread 1 (process 3054):
#0 0x0000003335630155 in raise () from /lib64/libc.so.6
#1 0x0000003335631bf0 in abort () from /lib64/libc.so.6
#2 0x00002b09f919aac5 in os::abort () from /vol2/omdshare/Linux_RHEL5_x86_64/tools/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so
#3 0x00002b09f92fa137 in VMError::report_and_die () from /vol2/omdshare/Linux_RHEL5_x86_64/tools/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so
#4 0x00002b09f919e5e0 in JVM_handle_linux_signal () from /vol2/omdshare/Linux_RHEL5_x86_64/tools/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so
#5 <signal handler called>
#6 0x0000003334608db5 in do_lookup_x () from /lib64/ld-linux-x86-64.so.2
#7 0x0000003334609252 in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2
#8 0x0000003335706734 in do_sym () from /lib64/libc.so.6
#9 0x0000003335e01104 in dlsym_doit () from /lib64/libdl.so.2
#10 0x000000333460ce56 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#11 0x0000003335e0150d in _dlerror_run () from /lib64/libdl.so.2
#12 0x0000003335e010ba in dlsym () from /lib64/libdl.so.2
#13 0x00002b09f9196f6d in os::dll_lookup () from /vol2/omdshare/Linux_RHEL5_x86_64/tools/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so
#14 0x00002aaaaacf4a6e in Java_java_lang_ClassLoader_00024NativeLibrary_find () from /vol2/omdshare/Linux_RHEL5_x86_64/tools/jdk1.7.0_25/jre/lib/amd64/libjava.so
#15 0x00002aaaab378d8e in ?? ()
#16 0x0000000000000000 in ?? ()
(gdb)

The Java frames look like this:

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)

j java.lang.ClassLoader$NativeLibrary.find(Ljava/lang/String;)J+0
j java.lang.ClassLoader.findNative(Ljava/lang/ClassLoader;Ljava/lang/String;)J+49
v ~StubRoutines::call_stub
j com.sun.jna.Native.close(J)V+0
j com.sun.jna.NativeLibrary.dispose()V+85
j com.sun.jna.NativeLibrary.finalize()V+1
v ~StubRoutines::call_stub
J java.lang.ref.Finalizer.invokeFinalizeMethod(Ljava/lang/Object;)V
J java.lang.ref.Finalizer.access$100(Ljava/lang/ref/Finalizer;)V
J java.lang.ref.Finalizer$FinalizerThread.run()V
v ~StubRoutines::call_stub

I compiled wrappers for the dlsym and dlopen functions into a separate shared library that does some extra logging.

From the dlsym wrapper logs I found the problematic call:
0x40a9db4000000000 Call 479 to Java_com_sun_jna_Native_close function with handle=0x2aaab8cc4f50

The code tries to look up the Java_com_sun_jna_Native_close function on handle 0x2aaab8cc4f50, which had already been invalidated.

From the dlopen wrapper logs I found the name of the problematic dll (the one to which handle 0x2aaab8cc4f50 belongs):
...
0x40194c4100000000 Call 51: /home/build/aleksg/dev/testruns/20160618231559/webapi_test.small/apache-tomcat-7.0.52/temp/jna-94094958/jna7683664748922841283.tmp=0x2aaab8cc4f50
...

By eliminating code little by little, I found that this tmp dll is loaded into Tomcat after the Native.loadLibrary call shown above.

In successful runs, if Java_com_sun_jna_Native_close is called, it is called before JNI_OnUnload is called for the tmp dll. Here is debug output from a successful run (last few lines only):

Call:483 to Java_com_sun_jna_Native_close function
Call:484 to JNI_OnUnload function
Call:485 to Java_com_omd_jomd_jomd_120160519120533JNI_delete_1StreamingCallbackWrapperBase function
Call:486 to Java_com_omd_jomd_jomd_120160519120533JNI_delete_1NameValueMap function

In crashed runs the order of JNI_OnUnload and Java_com_sun_jna_Native_close was reversed: in every crash, Java_com_sun_jna_Native_close was called after JNI_OnUnload had already been called for the tmp dll.

Call:486 to JNI_OnUnload function
Call:487 to Java_com_sun_jna_Native_close function Finished Call:487 to Java_com_sun_jna_Native_close function
РЇ;ёЄ*: undefined symbol: Java_com_sun_jna_Native_close handle was invalid
Call:488 to Java_com_sun_jna_Native_close__J function
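
The ordering check described above can be automated over the wrapper logs. The following is a minimal sketch that assumes log lines of the form "Call:N to SYMBOL function", as in the excerpts above, and that each excerpt contains a single relevant JNI_OnUnload. The class name UnloadOrderCheck and its method are hypothetical helpers, not part of JNA or of the original wrapper.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UnloadOrderCheck {

    // Matches wrapper log lines such as "Call:487 to Java_com_sun_jna_Native_close function".
    private static final Pattern CALL = Pattern.compile("Call:(\\d+) to (\\S+) function");

    // Returns the call number of the first Native_close lookup logged after JNI_OnUnload,
    // or -1 if no such lookup appears in the given lines.
    static int firstCloseAfterUnload(List<String> lines) {
        boolean unloaded = false;
        for (String line : lines) {
            Matcher m = CALL.matcher(line);
            if (!m.find()) {
                continue; // not a call record (for example, a dlerror message)
            }
            String symbol = m.group(2);
            if (symbol.equals("JNI_OnUnload")) {
                unloaded = true;
            } else if (unloaded && symbol.startsWith("Java_com_sun_jna_Native_close")) {
                return Integer.parseInt(m.group(1));
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Lines copied from the crashed-run excerpt in this report.
        List<String> crashed = Arrays.asList(
            "Call:486 to JNI_OnUnload function",
            "Call:487 to Java_com_sun_jna_Native_close function",
            "Call:488 to Java_com_sun_jna_Native_close__J function");
        System.out.println(firstCloseAfterUnload(crashed));
    }
}
```

A result other than -1 flags a run in which the close lookup happened after the unload, which is the pattern seen in the crashes.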

To summarize

1. The problem is sporadic.
2. It occurs when the finalizer thread tries to close the handle of the tmp dll created in Tomcat's temp folder after the Native.loadLibrary((Platform.isWindows() ? "msvcrt" : "c"), ...) call, when JNI_OnUnload has already been called for that tmp library.
3. We observed the crash only on Linux, not on Windows.

JDK version jdk1.7.0_25
Operating System CentOS release 5.2 (Final)
Tomcat version apache-tomcat-7.0.52
Kernel version 2.6.18-92.el5xen

alelsg changed the title from "Crash in dlsym function" to "Java sporadic crashes occur in Tomcat when Native.loadLibrary("c") function gets called from war file" on Jul 8, 2016
@oleg-nenashev

Seems we have the same issue in Jenkins: JENKINS-39388. No symbols from the requester to say for sure, but the pattern is very similar: https://issues.jenkins-ci.org/secure/attachment/34653/hs_err.txt

Contributor

twall commented Dec 24, 2016

The auto-unpack/load feature of JNA was intended to make self-contained distributions easier. In situations where you have an installation that is in regular use (like this one), it's generally better and more efficient to install the shared library in a known location and add it to the system library load path.
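
One way to follow this advice, sketched under assumptions: JNA documents the system properties jna.boot.library.path (where to look for its jnidispatch library) and jna.nounpack (do not unpack the library from jna.jar into a temp directory). The snippet below sets them programmatically; the directory /opt/jna/lib and the class name JnaBootConfig are hypothetical, and the properties only take effect if they are set before any com.sun.jna class is initialized. In a Tomcat installation they would more commonly be passed as -D JVM options.

```java
final class JnaBootConfig {

    // Points JNA at a pre-installed jnidispatch library and disables auto-unpacking.
    // Must be called before the first use of any com.sun.jna class.
    static void useInstalledJnidispatch(String libraryDir) {
        System.setProperty("jna.boot.library.path", libraryDir);
        System.setProperty("jna.nounpack", "true");
    }

    public static void main(String[] args) {
        // Hypothetical install location; adjust to wherever jnidispatch is deployed.
        useInstalledJnidispatch("/opt/jna/lib");
        System.out.println(System.getProperty("jna.boot.library.path"));
    }
}
```

With such a setup no jna*.tmp file would be created in Tomcat's temp folder. Whether this also avoids the finalizer ordering problem reported above is not confirmed in this thread.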

mstyura pushed a commit to mstyura/jna that referenced this issue Sep 9, 2024
Motivation:
Ability to get the RTT as requested in java-native-access#678

Modification:
Add ability to get the path stats using quiche_conn_path_stats

Result:
The change adds the API to get the path stats.
---------

Co-authored-by: Norman Maurer <[email protected]>