Re-consider finalizers usage in Z3 Java API: now with some hard data #442

cheshire · 2016-02-08T18:25:45Z

Hi,

I have tried to open a ticket about this before with little success, I'll try again now, with some more evidence.
Z3 Java API relies on finalizers to clean up memory.
There exist numerous books and articles, starting with "Effective Java", stating that finalizers are not a good replacement for C++ destructors, and should generally be avoided.
The main problem is that the object can be "resurrected" in the finalizer, and the GC needs to do a second run ensuring that that hasn't happened.
Just adding an empty finalize method to an often used object can drastically decrease the performance of an application.

Conceptually, there is no reason to use finalizers in place of PhantomReferences: everything which can be done with finalizers, can be done with PhantomReferences, but faster.
Moreover, with PhantomReferences it is very easy to switch tracking on and off on demand, for example with a command line flag.

In our JavaSMT library we support both our own Java API, generated using JNI, and the one provided by Z3. The code in two APIs is very similar, the main difference lies in the memory management.
We would really like to switch from our home-grown Z3 JNI API to the Z3-supplied one, but we can't due to the heavy performance penalty associated with finalizers usage.

In our API, we use PhantomReferences to call dec_ref on Z3 ASTs and Models, and we use the try-with-resources syntax for opening and closing contexts (as normally context objects have a single owner).

Finally, I evaluate the performance of my verification tool in 3 different modes:

Not calling dec_refs at all, simply leaking memory (Our JNI API)
Using PhantomReference, cleaning up the queue after every allocation (Our JNI API)
Using finalizers (Z3 Java API)

This is the evaluation table: http://metaworld.me/finalizers_results.html
The table is generated with the benchexec tool, which takes resource usage very seriously (cf. https://github.com/sosy-lab/benchexec for description and the paper).

As can be seen, ironically, the best configuration by far is not tracking the memory at all, and just leaking it.
Of course that is also due to the fact that the programs I'm analyzing here are quite small, and the results could be different for a different dataset.

However, the table clearly shows that switching from phantom references to finalizers gives a huge performance hit (CPU time nearly doubles for correctly solved instances) and it more then doubles the memory usage!

Based on this evidence I think it is pretty clear that finalizers are not an optimal solution for memory management.

The text was updated successfully, but these errors were encountered:

wintersteiger · 2016-02-09T13:11:18Z

Our Java API was never intended to be fast. It was intended to provide a nice, convenient abstraction layer; everything "expensive" is assumed to be done inside of Z3, i.e., the bulk of the work is assumed to be done during checking, not during con/de-struction of ASTs. If the speed of AST handling is the main concern, we should use the C-API, or the "flat" layer provided in Native.java and everybody's free to use PhantomReferences if they chose to do so.

Further, the Java API is essentially an automatic translation of the .NET API, i.e., this is close to a zero-effort solution. At this point, I am essentially the maintainer of the Java API and I don't have the bandwidth to implement, test, and support new features like this, for an API that I don't use myself. I'm very happy to hand this over to a community maintainer though. Ideally we should try to find someone who knows Java, who has the time for maintenance (and their employers are happy with the spending time on this), and who actively uses Z3 in other projects.

cheshire · 2016-02-12T16:07:20Z

everything "expensive" is assumed to be done inside of Z3; If the speed of AST handling is the main concern

It's not that the speed of AST handling is the main concern; my analysis relies on maximization modulo SMT and performs relatively little expression manipulation.
I am trying to say that the finalizers are so bad that the performance penalty outweighs the query time.

I'm very happy to hand this over to a community maintainer though

I am the maintainer of the JavaSMT project, where we do have efficient Java bindings for Z3.
Our binding generation code is even more hacky then yours --- we have a python script which "parses" the Z3 header file, and generates the JNI bindings.
Maintaining that python script requires investing effort each time Z3 API is re-arranged, which is suboptimal.
I would prefer to use Z3 Java API, but currently the performance degradation due to finalizers usage makes it impossible.

If I implement more efficient memory handling for Z3 Java bindings in a pull request, would it get a chance to get merged?
I understand that testing is an issue, but we do have a large integration testing suite for various code analyses relying on Z3, and an extensive testing can be performed.

wintersteiger · 2016-02-12T17:51:40Z

We also have a Python script that parses our header files, which contain special macros that we use to annotate functions, so that's pretty much the same level of hackiness.

A PR would get merged, but only if you will test and maintain it, I just don't have the time for that. If you have and maintain existing bindings of the same level of hackiness, perhaps we can just declare those the "official" Z3 bindings?

cheshire · 2016-02-15T17:09:22Z

perhaps we can just declare those the "official" Z3 bindings?

We do have long term maintenance plans, so that is feasible.
A few issues would need to be considered:

Distribution --- unlike a single JAR for Z3 bindings, we do have dependencies (e.g. Guava, a library with a Rational class to represent the rationals returned in a model, etc.). Users who rely on ANT can automatically fetch all [transitive] dependencies from our repository automatically, users who don't, well, can't.
Currently I'm evaluating distribution to maven central vs. distributing a self-contained ZIP file which would contain everything-which-might-be-needed. Moreover, multiple ZIPs would be needed to cater to different audiences (all-solvers-included-zip, z3-zip, etc).
That would fix Bundling Dynamic Libraries Inside JAR for Ease of Redistribution #182 though.
Features --- we do not expose every single feature of Z3. Probes and proofs are missing, but Z3 Java API does not seem to expose proofs either. This would need double-checking.

cheshire · 2016-06-11T19:44:36Z

@wintersteiger I'm actually giving a shot at doing this, but it looks like it requires changing a very large chunk of Java API.
I fail to understand the dispose code associated to AST (and other) classes, do you think you could give some comment on this?

    /**
     * Disposes of the underlying native Z3 object.
     **/
    @Override
    public void dispose()
    {
        if (m_n_obj != 0)
        {
            decRef(m_n_obj);
            m_n_obj = 0;
        }

        if (m_ctx != null)
        {            
            if (m_ctx.m_refCount.decrementAndGet() == 0)
                m_ctx.dispose();
            m_ctx = null;
        }
    }

Why would you want to dispose of the context, even if it has no objects left? I think it would be much better if Context had an explicit close() method together with implements AutoCloseable.

NikolajBjorner · 2016-06-11T21:59:45Z

You definitely want to make sure that the memory associated with the context is released when nothing depends on the context. The ast objects rely on the context so you can only delete the context (it is not reference counted) when nothing depends on it.

cheshire · 2016-06-12T11:58:16Z

@NikolajBjorner sorry I don't agree at all with your comment.
Consider a freshly created context: nothing depends on it. Do you want to delete it straight away then?
Now consider creating a context, creating an AST on it, and then disposing of the AST.
Should we silently dispose of the context in that case as well? What about the possible future operations on that context?

NikolajBjorner · 2016-06-12T17:46:00Z

Granted. I had in mind an intent that doesn't match the implementation. So from what I understand you are right.

The intent I had in mind was that Z3Objects maintain a pointer to a structure that owns a reference count and an unmanaged pointer to the context. The Context should do that too. Then the unmanaged context goes out of scope when the last reference to this structure goes out of scope.

As it is now, the context is an attribute on Z3Object so it never goes out of scope until the last reference to it is gone, and the Context class is not tracking its own reference counts. The scenarios you describe say what is problematic about this. I am somewhat puzzled why this doesn't get exposed.

Given that AST and other objects inherit from Z3Object and rely on other attributes of Context I am not sure if a simple change suffices (maybe "close" is the answer, but I don't have experience with this) or we should examine the reference counting discipline on the Context object, which currently does not own a reference count, or consider intermediary objects that own the unmanaged context pointer and reference counting queues (because inc-ref and dec-ref on Z3 pointers are not thread safe and GC happens in separate threads).

Another note: the use of reference counts is different in Context.cs and Context.java. It is used in the dispose method in Context.java and the finalize method in Context.cs.

Would be fantastic with your brain and finger-on-the-keyboard cycles for finding solid solutions.

cheshire · 2016-06-12T18:23:56Z

I think making Context implement AutoCloseable is the way to go.
I don't know why the current solution works at all TBH. Maybe Java bindings aren't heavily used, and finalizers take a while to run, so the context never stays childfree for too long?

cheshire · 2016-06-12T19:09:32Z

I've created a pull request, it might be better to describe the details there instead.

wintersteiger · 2016-06-14T14:39:09Z

I think this bug was never triggered because there are no applications that clean up everything and then restart by using an old context, instead they'll usually create a new one. Anyways, looking at the code, the intention was to dispose of the context without waiting until the finalizer has time, but that will have to change one way or the other.

cheshire · 2016-06-27T11:22:42Z

With 3e96a79 this can be closed.

wintersteiger added the enhancement label Feb 9, 2016

arifogel mentioned this issue May 10, 2016

Addressing Context memory leakage in Java #245

Closed

cheshire mentioned this issue Jun 12, 2016

Replace finalizers with PhantomReferences in Java API #648

Merged

cheshire closed this as completed Jun 27, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Re-consider finalizers usage in Z3 Java API: now with some hard data #442

Re-consider finalizers usage in Z3 Java API: now with some hard data #442

cheshire commented Feb 8, 2016

wintersteiger commented Feb 9, 2016

cheshire commented Feb 12, 2016

wintersteiger commented Feb 12, 2016

cheshire commented Feb 15, 2016

cheshire commented Jun 11, 2016

NikolajBjorner commented Jun 11, 2016

cheshire commented Jun 12, 2016

NikolajBjorner commented Jun 12, 2016

cheshire commented Jun 12, 2016

cheshire commented Jun 12, 2016

wintersteiger commented Jun 14, 2016

cheshire commented Jun 27, 2016

Re-consider finalizers usage in Z3 Java API: now with some hard data #442

Re-consider finalizers usage in Z3 Java API: now with some hard data #442

Comments

cheshire commented Feb 8, 2016

wintersteiger commented Feb 9, 2016

cheshire commented Feb 12, 2016

wintersteiger commented Feb 12, 2016

cheshire commented Feb 15, 2016

cheshire commented Jun 11, 2016

NikolajBjorner commented Jun 11, 2016

cheshire commented Jun 12, 2016

NikolajBjorner commented Jun 12, 2016

cheshire commented Jun 12, 2016

cheshire commented Jun 12, 2016

wintersteiger commented Jun 14, 2016

cheshire commented Jun 27, 2016