Consistent crash attempting to embed Julia in Java via dynamically loading libjulia.so #36092
Comments
Hi @cnuernber, my sense is that the signal handling of Java and Julia is getting crossed. We may need to set |
I agree, this looks very promising. It looks like I can call I also recently realized that potentially the original pathway in javacall.jl of hosting the jvm as opposed to the jvm hosting julia may work better as that may disable some of the JVM's less community oriented |
https://github.com/JuliaLang/julia/blob/master/src/julia.h#L1941-L1942 |
Signal chaining facilities may also be important when embedding in JVM: |
Success! That indeed stopped the memory error:
julia-clj.core> (disable-julia-signals!)
Nov 23, 2020 11:31:19 PM clojure.tools.logging$eval153$fn__156 invoke
INFO: Library julia found at [:system "julia"]
nil
julia-clj.core> (println (.toString (JLOptions. (find-julia-symbol "jl_options"))))
JLOptions(native@0x7fab48a9ede0) (184 bytes) {
byte quiet@0x0=0x0
byte banner@0x1=0xFF
Pointer julia_bindir@0x8=null
Pointer julia_bin@0x10=null
Pointer cmds@0x18=null
Pointer image_file@0x20=null
Pointer cpu_target@0x28=null
int nthreads@0x30=0x0000
int nprocs@0x34=0x0000
Pointer machine_file@0x38=null
Pointer project@0x40=null
byte isinteractive@0x48=0x0
byte color@0x49=0x0
byte historyfile@0x4A=0x1
byte startupfile@0x4B=0x0
byte compile_enabled@0x4C=0x1
byte code_coverage@0x4D=0x0
byte malloc_log@0x4E=0x0
byte opt_level@0x4F=0x2
byte debug_level@0x50=0x1
byte check_bounds@0x51=0x0
byte depwarn@0x52=0x0
byte warn_overwrite@0x53=0x0
byte can_inline@0x54=0x1
byte polly@0x55=0x1
Pointer trace_compile@0x58=null
byte fast_math@0x60=0x0
byte worker@0x61=0x0
Pointer cookie@0x68=null
byte handle_signals@0x70=0x0
byte use_sysimage_native_code@0x71=0x1
byte use_compiled_modules@0x72=0x1
Pointer bindto@0x78=null
Pointer outputbc@0x80=null
Pointer outputunoptbc@0x88=null
Pointer outputo@0x90=null
Pointer outputasm@0x98=null
Pointer outputji@0xA0=null
Pointer output_code_coverage@0xA8=null
byte incremental@0xB0=0x0
byte image_file_specified@0xB1=0x0
byte warn_scope@0xB2=0x1
}
nil
julia-clj.core> (jl_init__threading)
nil
julia-clj.core> (System/gc)
nil
julia-clj.core> (System/gc)
nil
julia-clj.core> (System/gc)
nil
julia-clj.core> (System/gc)
nil
julia-clj.core> (System/gc)
nil
julia-clj.core> (System/gc)
nil
julia-clj.core> (System/gc)
nil
julia-clj.core> (System/gc)
nil
julia-clj.core> |
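The fix in the transcript above amounts to clearing one byte, handle_signals, in the global jl_options struct before jl_init__threading runs. The following is a minimal sketch of that idea using a plain ByteBuffer as a stand-in for the native struct, so no Julia library is loaded. The struct size (184 bytes) and the offset (0x70) are copied from the JLOptions dump above and may differ in other Julia versions; the class name, the method names, and the ON/OFF values (assumed to mirror JL_OPTIONS_HANDLE_SIGNALS_ON and JL_OPTIONS_HANDLE_SIGNALS_OFF in julia.h) are illustrative assumptions, not the actual julia-clj implementation.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Stand-in for the native jl_options struct; nothing here touches libjulia.
class JlOptionsSketch {
    // Size and offset taken from the JLOptions dump in this thread.
    // Both are version-specific and should be re-checked for other Julia releases.
    static final int STRUCT_SIZE = 184;
    static final int HANDLE_SIGNALS_OFFSET = 0x70;

    // Assumed to mirror JL_OPTIONS_HANDLE_SIGNALS_ON / _OFF in julia.h.
    static final byte HANDLE_SIGNALS_ON = 1;
    static final byte HANDLE_SIGNALS_OFF = 0;

    // Builds a zeroed struct image with signal handling switched on,
    // which is assumed here to be Julia's default.
    static ByteBuffer defaultOptions() {
        ByteBuffer opts = ByteBuffer.allocate(STRUCT_SIZE).order(ByteOrder.nativeOrder());
        opts.put(HANDLE_SIGNALS_OFFSET, HANDLE_SIGNALS_ON);
        return opts;
    }

    // Clears the handle_signals byte; in a real embedding this write would
    // have to happen before the runtime is initialized.
    static void disableSignals(ByteBuffer opts) {
        opts.put(HANDLE_SIGNALS_OFFSET, HANDLE_SIGNALS_OFF);
    }

    static boolean signalsEnabled(ByteBuffer opts) {
        return opts.get(HANDLE_SIGNALS_OFFSET) != 0;
    }
}
```

In a real embedding the same one-byte write would target the memory behind the jl_options symbol rather than a heap buffer.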
OK, on to next crash:
Also
libjulia-clj.impl.base> (-> (jl_eval_string "sqrt(2.0)")
(jl_unbox_float64))
1.4142135623730951
It seems like something is still not initialized. |
I think I figured it out. Symbols returned from JNA's getGlobalVariableAddress are pointers to pointers: basemod is located at the first pointer stored at the global address. |
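The pointer-to-pointer point can be illustrated with a toy address space: the symbol lookup returns the address of the global variable's slot, and the module itself lives at the pointer stored in that slot. The sketch below uses a HashMap as fake memory so it runs without JNA or Julia; the class name, method names, and addresses are made up for illustration. With JNA, the extra dereference would presumably be a single pointer read at offset 0 from the address that getGlobalVariableAddress returns, but that mapping is an assumption rather than something shown in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Toy address space: maps an address to the pointer-sized value stored there.
class GlobalSymbolSketch {
    private final Map<Long, Long> memory = new HashMap<>();

    // Writes a pointer-sized value at the given address.
    void store(long address, long value) {
        memory.put(address, value);
    }

    // One dereference: reads the pointer-sized value stored at the address.
    long load(long address) {
        Long value = memory.get(address);
        if (value == null) {
            throw new IllegalArgumentException("unmapped address 0x" + Long.toHexString(address));
        }
        return value;
    }

    // A symbol lookup yields the address of the global variable's slot.
    // The object the variable refers to is found by reading that slot once.
    long resolveGlobal(long symbolAddress) {
        return load(symbolAddress);
    }
}
```

Passing the slot address itself to a function that expects the module pointer is the mistake described above; resolveGlobal performs the one missing dereference.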
Closing this issue for now. Will open a new one if I get truly stuck again but it looks like things are working according to the documentation. |
@mkitti - Thanks for your help :-). Will be in touch if I get something interesting. |
This is awesome. I'm very interested in seeing how this works out going forward. Are the Java and Clojure parts currently integrated or is it possible to have a more generic JNA/JNR Java embedding of Julia and a separate Clojure part? Some other notes:
|
Thank you, I appreciate all of this. I got past all the blocking issues, and now I can scan a Julia module and call its functions as if they were native Clojure functions:
user> (require '[libjulia-clj.impl.base :as base])
nil
user> (base/initialize!)
Nov 25, 2020 12:52:59 PM clojure.tools.logging$eval7895$fn__7898 invoke
INFO: Library /home/chrisn/dev/cnuernber/libjulia-clj/julia-1.5.3/lib/libjulia.so found at [:system "/home/chrisn/dev/cnuernber/libjulia-clj/julia-1.5.3/lib/libjulia.so"]
:ok
Nov 25, 2020 12:53:00 PM clojure.tools.logging$eval7895$fn__7898 invoke
INFO: Reference thread starting
user> (require '[libjulia-clj.modules.Base :as Base])
nil
user> (Base/sqrt 2.0)
1.4142135623730951
user>
I think next up will be seeing how hard zero-copy of ND data structures is. If you are interested, the same underlying numerics system (https://github.com/cnuernber/dtype-next) allows: And has an extremely performant dataframe abstraction. This seems to start to parallel your imglib pathways.
Note that NIO buffers are limited to 2 billion entries; I switched to straight sun.misc.Unsafe pathways and now have things like the dataset library working with buffers larger than 2GB (large text). Julia is like the pinnacle of everything I mentioned immediately above :-), at least in terms of a language to write new numeric code in.
Getting any amount of the above working in pure Java would be hideous from a time/LOC perspective, but what I would recommend is that I get as far as I can and then we hire someone to start moving pieces into pure Java. Honestly, I think it is just easier to put a Java wrapper over the Clojure than anything else. Then at least you aren't typing a godawful amount :-), and the base Clojure libraries are small and stable. Certainly nothing like the giant mess that is Scala. |
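The 2-billion-entry limit mentioned above comes from java.nio buffers being indexed with an int, which caps a single buffer at Integer.MAX_VALUE elements. The comment describes switching to sun.misc.Unsafe; the sketch below deliberately shows a different, pure-stdlib alternative: a long-indexed byte store stitched together from several int-indexed ByteBuffers. The class name and the chunking scheme are illustrative assumptions and are not how dtype-next works.

```java
import java.nio.ByteBuffer;

// Long-indexed byte storage built from several int-indexed ByteBuffers.
class ChunkedBytes {
    private final ByteBuffer[] chunks;
    private final int chunkSize;
    private final long size;

    ChunkedBytes(long size, int chunkSize) {
        if (size < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("size must be >= 0 and chunkSize > 0");
        }
        this.size = size;
        this.chunkSize = chunkSize;
        // Number of chunks, rounding up so the last chunk holds the remainder.
        int count = (int) ((size + chunkSize - 1) / chunkSize);
        chunks = new ByteBuffer[count];
        for (int i = 0; i < count; i++) {
            long remaining = size - (long) i * chunkSize;
            chunks[i] = ByteBuffer.allocate((int) Math.min(chunkSize, remaining));
        }
    }

    long size() {
        return size;
    }

    // Splits the long index into (chunk number, offset within chunk).
    byte get(long index) {
        check(index);
        return chunks[(int) (index / chunkSize)].get((int) (index % chunkSize));
    }

    void put(long index, byte value) {
        check(index);
        chunks[(int) (index / chunkSize)].put((int) (index % chunkSize), value);
    }

    private void check(long index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + " out of range for size " + size);
        }
    }
}
```

A chunked store keeps everything in supported APIs, but the data is not contiguous, so it cannot be handed to native code as a single pointer. That is one plausible reason to prefer raw addresses for zero-copy sharing with Julia.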
I had noticed you had written JLOptions in Java, but I see most of the JNA interface is in Clojure. There was a prior extension to ImgLib2 (better known as part of the ImageJ2 / FIJI distribution) using My work around for large arrays has been using com.sun.jna.Memory and obtaining This is all a very exciting development. I'm aware only of JNI based efforts to do this so far but that requires that one has a compiler on hand: Cheers! |
I am also excited! You have been extremely helpful and found the missing piece! There could be some more in Java for sure if I want to hand-code the Java file. DirectMapped JNA is much faster than the generic find-symbol interface I use, so it will make sense at some point to encode a few (or all) of the specific JNA methods into a Java class file. That just involves a compilation step and is thus a lot slower to iterate on than when I can define functions ad hoc with no meaningful recompilation step. The Clojure REPL is quite a powerful tool for exploring codebases and maps really well to the JNA project :-).
Unsafe works on Java 8 and it works well on GraalVM; that author's issues aren't ones I encounter. I personally consider that blog post fairly silly, as a very large portion of the Java ecosystem (like Spark and Hadoop) sits on Netty, and the Apache Arrow project has an unsafe backend that works well on JDK 8 and Graal. In any case, changing the low-level layer that essentially says 'write/read byte at this location' isn't exactly rocket science when the time comes; we aren't using it to define classes. It just needs to read and write the bytes... ;-). Maybe I should write a blog post: I tried that swig project and it failed to compile for me.
I would like people who like Julia to enjoy using Clojure and vice versa, and for (great) personal reasons I have limited time. But I think the potential is certainly there. Ideally, if you have JULIA_HOME set, the library just works across multiple Julia versions with no changes. That is one thing I would like, and it has really helped libpython-clj not be a complete nightmare when working with things like Conda. There is no compilation step at all, nor any project version changes, in order to support different versions of Python (3.5-3.9), so that maps well to some set of desktop users.
As an aside, we use Docker and k8s for production/orchestration. I will need other people to test/stabilize other operating systems. I run Ubuntu, we use Docker for production, and I don't game, which means I only ever use or develop with one operating system. |
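The "set JULIA_HOME and it just works" goal above could be approached by resolving the shared library path from that environment variable at load time. Below is a minimal sketch under several assumptions: that JULIA_HOME points at the root of a Julia installation, that the library sits at lib/libjulia.so on Linux (consistent with the path in the log output earlier in this thread), and that macOS and Windows use lib/libjulia.dylib and bin/libjulia.dll respectively. The class and method names are hypothetical and are not taken from libjulia-clj.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Resolves the expected libjulia location from an installation root.
class JuliaLibraryLocator {
    // juliaHome: assumed to be the installation root (e.g. a julia-1.5.3 directory).
    // osName: a value in the style of the "os.name" system property.
    static Path libjuliaPath(String juliaHome, String osName) {
        if (juliaHome == null || juliaHome.isEmpty()) {
            throw new IllegalStateException("JULIA_HOME is not set");
        }
        String os = osName.toLowerCase(Locale.ROOT);
        // Check macOS first so that a name such as "Darwin" is not mistaken for Windows.
        if (os.contains("mac") || os.contains("darwin")) {
            return Paths.get(juliaHome, "lib", "libjulia.dylib");
        }
        if (os.contains("win")) {
            return Paths.get(juliaHome, "bin", "libjulia.dll");
        }
        return Paths.get(juliaHome, "lib", "libjulia.so");
    }

    // Convenience wrapper reading the real environment and platform name.
    static Path fromEnvironment() {
        return libjuliaPath(System.getenv("JULIA_HOME"), System.getProperty("os.name"));
    }
}
```

Because the path is computed at run time, switching Julia versions would only require pointing JULIA_HOME at a different installation, with no rebuild of the bindings.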
Initial outline working with API documents pulled from the online Julia documentation: |
I am playing with JNA bindings and consistently getting a crash on 1.4.2 if jl_init__threading is called at all and then followed by 1 or 2 System.gc() calls.
The crash I am seeing is:
We are setting up the system with a julia stack related variable set:
https://github.com/cnuernber/julia-clj/blob/master/dockerfiles/Dockerfile#L32
Potentially we should be building julia from source in the docker container with some different variables set related to stack management.
Again, this is simply calling jl_init__threading and then System.gc() a few times. Oddly enough, the crash is consistent when connecting via a remote REPL but not when using a REPL from the command line. If I start up a command-line REPL and initialize Julia, the crash happens at the moment a remote REPL connects; potentially this is spawning another thread or something along those lines.
Related issues are (at least) #32700 and #31104
Given we are running from a docker container in the first place all options are on the table; rebuilding julia, rebuilding openjdk, etc. I really appreciate any help with this and have some interest in accessing the excellent solvers (and GPU programming) in Julia via this pathway.