Test different heap sizes for Lucene benchy #37
Comments
Note that tantivy seems to use 1.5 GB resident:
And it memory-maps its index (like Lucene). We should compare the RAM requirements of each engine too! Since Rust's "GC" is immediate (as soon as something becomes garbage it is reclaimed, like Python's non-cyclic "collector"), it does not need the overhead of letting garbage accumulate and then reclaiming it with a complex GC implementation like Java's.
Well, freeing garbage is not free (lol, sorry for the pun). So for Rust, the latencies we see actually include garbage/memory management. It is doing extra work on top of what is strictly needed to finish the query.
Is it possible that Rust only marks the garbage during query execution (which should be quite lightweight), and then asynchronously clears the memory blocks and makes them available for allocation?
Ha! Love the pun.
Yeah that is true -- Rust must still do the memory management work. But it avoids all the cost of crawling all object references, inserting memory barriers, etc. (I think?)
Maybe :) I am far from a Rust GC expert! I found this recent Reddit discussion, but I remain confused :) It looks like at compile time
@jainankitk what you are talking about is basically a version of concurrent GC :) , which is not what Rust has. As @mikemccand pointed out, Rust statically knows when objects go out of scope (or more precisely, when data no longer has an owner), so that it can call the destructor. This helps not only with memory management (working with the allocator) but also with any clean-up actions (e.g. closing a file handle). @mikemccand to your confusion -- those reference-counted objects are really Smart Pointers. Say you have Maybe this helps:
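For readers following along, here is a minimal sketch of the two behaviors described in this comment: destructors running at a statically known point (end of scope), and a reference-counted smart pointer (`Rc`) freeing its value only when the last owner goes away. This is an illustration only, not the commenter's original example; the `Tracked` type and the `scope_demo` / `rc_demo` functions are hypothetical names made up for this sketch.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical type that records its name in a shared log when it is dropped.
struct Tracked {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Tracked {
    // The compiler inserts a call to this at the point where the value
    // loses its owner; no collector thread is involved.
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the order of events when values leave nested scopes.
fn scope_demo() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Tracked { name: "a", log: Rc::clone(&log) };
        {
            let _b = Tracked { name: "b", log: Rc::clone(&log) };
        } // `_b` is dropped here, immediately
        log.borrow_mut().push("after inner scope");
    } // `_a` is dropped here
    let events = log.borrow().clone();
    events
}

/// Returns (count with two owners, count with one owner,
/// whether nothing was dropped while an owner remained, final drop log).
fn rc_demo() -> (usize, usize, bool, Vec<&'static str>) {
    let log = Rc::new(RefCell::new(Vec::new()));
    let first = Rc::new(Tracked { name: "shared", log: Rc::clone(&log) });
    let second = Rc::clone(&first); // bumps the count, does not copy the value
    let with_two = Rc::strong_count(&first);
    drop(first); // one owner gone, the value is still alive
    let with_one = Rc::strong_count(&second);
    let nothing_dropped_yet = log.borrow().is_empty();
    drop(second); // last owner gone, the destructor runs now
    let events = log.borrow().clone();
    (with_two, with_one, nothing_dropped_yet, events)
}

fn main() {
    println!("{:?}", scope_demo());
    println!("{:?}", rc_demo());
}
```

In both cases the clean-up happens synchronously on the thread that gave up ownership, which is why the freeing cost shows up inside the measured query latency rather than in a separate collector.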
Spinoff from this discussion: apache/lucene#12358 (comment)
We should fix the heap size for the JVM running Lucene. That could save some of the cost of the JVM trying to grow/shrink/reallocate the heap, etc.?
I'm starting with 4 GB as a random guess, but we should test at what heap size GC cost is minimized.