
Page-based cache recycling #4559

Closed · wants to merge 1 commit

Conversation

@jpountz (Contributor) commented Dec 27, 2013

Here is an attempt to refactor cache recycling so that it only caches large
arrays (pages) that can later be used to build more complex data structures
such as hash tables.

  • QueueRecycler now takes a limit like other non-trivial recyclers.
  • A new PageCacheRecycler (inspired by CacheRecycler) can cache
    byte[], int[], long[], double[] or Object[] arrays using a fixed amount of
    memory (either globally or per-thread depending on the Recycler impl, e.g.
    queue is global while thread_local is per-thread).
  • Paged arrays in o.e.common.util can now optionally take a PageCacheRecycler
    to reuse existing pages (see the sketch below).
  • All aggregators' data structures now use PageCacheRecycler:
    • all arrays
    • LongHash can now take a PageCacheRecycler
    • there is a new BytesRefHash (inspired by Lucene's but still quite
      different; for instance, it cheats on BytesRef comparisons by using Unsafe)
      that also takes a PageCacheRecycler

Close #4557
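
As a rough illustration of the approach (not the actual Elasticsearch classes; PageRecyclerSketch and PagedLongArraySketch are made-up names for this sketch), a paged structure grows by borrowing fixed-size pages from a bounded recycler and hands every page back when it is released:

// Illustrative sketch only, not the PR's code: a bounded recycler of fixed-size
// pages and a paged long array that borrows pages from it and returns them on close.
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class PageRecyclerSketch {
    static final int PAGE_SIZE = 16384;                 // entries per page
    private final Deque<long[]> freePages = new ArrayDeque<>();
    private final int maxCachedPages;                   // fixed amount of cached memory

    PageRecyclerSketch(int maxCachedPages) {
        this.maxCachedPages = maxCachedPages;
    }

    long[] obtainPage() {
        long[] page = freePages.poll();                 // reuse a cached page when possible
        return page != null ? page : new long[PAGE_SIZE];
    }

    void releasePage(long[] page) {
        if (freePages.size() < maxCachedPages) {        // keep the cache bounded
            freePages.push(page);                       // a real recycler would also reset page contents
        }
    }
}

final class PagedLongArraySketch implements AutoCloseable {
    private final PageRecyclerSketch recycler;
    private final List<long[]> pages = new ArrayList<>();

    PagedLongArraySketch(PageRecyclerSketch recycler, long capacity) {
        this.recycler = recycler;
        while ((long) pages.size() * PageRecyclerSketch.PAGE_SIZE < capacity) {
            pages.add(recycler.obtainPage());           // grow by whole pages
        }
    }

    long get(long index) {
        return pages.get((int) (index / PageRecyclerSketch.PAGE_SIZE))
                [(int) (index % PageRecyclerSketch.PAGE_SIZE)];
    }

    void set(long index, long value) {
        pages.get((int) (index / PageRecyclerSketch.PAGE_SIZE))
                [(int) (index % PageRecyclerSketch.PAGE_SIZE)] = value;
    }

    @Override
    public void close() {                               // hand every page back for reuse
        for (long[] page : pages) {
            recycler.releasePage(page);
        }
        pages.clear();
    }
}

The point of recycling pages rather than whole data structures is that the same bounded pool of arrays can back hash tables, counters and any other paged structure the aggregators build.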

        }
        context.queryResult().aggregations(new InternalAggregations(aggregations));
    } finally {
        for (Aggregator aggregator : aggregators) {
A Contributor commented on this change:

do we have some utils for Releasable like IOUtils.close(Closeable) that ensures that all of them are released?
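
For what it's worth, a helper in that spirit could look like the sketch below. The Releasable interface and ReleasableUtils class here are stand-ins for illustration, not the project's actual types; the idea mirrors IOUtils.close(Closeable...): attempt to release everything, then rethrow the first failure.

// Stand-in types for illustration; mirrors the IOUtils.close(Closeable...) pattern:
// release every resource even if some fail, then rethrow the first failure.
interface Releasable {
    void release();
}

final class ReleasableUtils {
    private ReleasableUtils() {}

    static void releaseAll(Releasable... releasables) {
        RuntimeException first = null;
        for (Releasable releasable : releasables) {
            if (releasable == null) {
                continue;                       // tolerate nulls, like IOUtils does
            }
            try {
                releasable.release();
            } catch (RuntimeException e) {
                if (first == null) {
                    first = e;                  // remember the first failure
                } else {
                    first.addSuppressed(e);     // keep the rest as suppressed
                }
            }
        }
        if (first != null) {
            throw first;                        // everything was attempted before rethrowing
        }
    }
}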

@ghost assigned jpountz on Dec 31, 2013
@jpountz (Contributor, Author) commented Jan 2, 2014

@kimchy @s1monw I pushed a new commit that addresses your concerns:

  • uses componentSettings
  • use of Releasable and Releasable.release(Releasable...)
  • try/finally for release
  • better configuration of the cache size, by default 10% of the heap size

public static ByteSizeValue parseBytesSizeValueOrHeapRatio(String sValue) {
    if (sValue.endsWith("%")) {
        double percent = Double.parseDouble(sValue.substring(0, sValue.length() - 1));
        return new ByteSizeValue((long) ((percent / 100) * JvmInfo.jvmInfo().getMem().getHeapMax().bytes()), ByteSizeUnit.BYTES);
A Member commented on this change:

I think putting the heap knowledge in here is the wrong place, encapsulation-wise. Maybe add a getAsMemory to Settings that will do this?
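
To make the suggestion concrete, a settings-level helper could resolve a value either as an absolute size or as a ratio of the maximum heap, roughly as sketched below. The name getAsMemoryBytes and the parsing shown here are illustrative assumptions, not the method that was eventually added.

// Hypothetical helper: resolve "512mb" as an absolute size and "10%" as a ratio
// of the maximum heap. Real code would delegate to the existing ByteSizeValue parsing.
final class MemorySettingSketch {
    static long getAsMemoryBytes(String value) {
        if (value.endsWith("%")) {
            double percent = Double.parseDouble(value.substring(0, value.length() - 1));
            long heapMaxBytes = Runtime.getRuntime().maxMemory();   // stand-in for JvmInfo's heap max
            return (long) ((percent / 100) * heapMaxBytes);
        }
        if (value.endsWith("mb")) {                                  // crude absolute-size handling for the sketch
            return Long.parseLong(value.substring(0, value.length() - 2)) * 1024 * 1024;
        }
        return Long.parseLong(value);                                // plain byte count
    }
}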

@jpountz (Contributor, Author) commented Jan 3, 2014

New commit pushed:

  • added double-release detection (see the sketch below)
  • better separation of concerns: parseBytesSizeValueOrHeapRatio moved to MemorySizeValue
  • added comments about the weighting per data type
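
A minimal sketch of what double-release detection can look like (illustrative, not the committed code): guard the release with an atomic flag so the second call fails loudly instead of silently corrupting the recycler's free list.

import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative guard: release() may only succeed once; a second call is a bug.
final class ReleaseOnceSketch {
    private final AtomicBoolean released = new AtomicBoolean(false);
    private final Runnable doRelease;

    ReleaseOnceSketch(Runnable doRelease) {
        this.doRelease = doRelease;
    }

    void release() {
        if (!released.compareAndSet(false, true)) {
            throw new IllegalStateException("already released");   // double-release detected
        }
        doRelease.run();
    }
}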


private static int maximumSearchThreadPoolSize(ThreadPool threadPool) {
    final Executor executor = threadPool.executor(ThreadPool.Names.SEARCH);
    return ((ThreadPoolExecutor) executor).getMaximumPoolSize();
}
A Member commented on this change:

Can we use the ThreadPool#info API here, so we don't have to cast to ThreadPoolExecutor, and get the max back through the info class?
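
The gist of the suggestion, sketched with stand-in classes (the real ThreadPool#info API's accessors may differ): keep the ThreadPoolExecutor cast in one place and let callers read the maximum pool size from an info object.

import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

// Stand-in sketch: callers read pool metadata from an info object instead of
// casting the Executor themselves. Names are illustrative, not the actual API.
final class SearchPoolInfoSketch {
    static final class Info {
        private final int max;
        Info(int max) { this.max = max; }
        int getMax() { return max; }
    }

    private final ThreadPoolExecutor searchPool =
            (ThreadPoolExecutor) Executors.newFixedThreadPool(8);

    Info info() {
        return new Info(searchPool.getMaximumPoolSize());   // the only place that knows the concrete type
    }

    int maximumSearchThreadPoolSize() {
        return info().getMax();                              // no ThreadPoolExecutor cast at the call site
    }
}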

@Inject
public MockPageCacheRecycler(Settings settings, ThreadPool threadPool) {
    super(settings, threadPool);
    random = new Random(0);
A Contributor commented on this change:

The settings should get a random seed per index. Maybe I should pass on the node seed from the test cluster as well, so you can get it here?

@jpountz (Author) replied:

This would be nice.

A Contributor replied:

I added this here 602c63d
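
As an aside, the idea boils down to something like the sketch below: the mock derives its Random from a seed carried in the node settings rather than a constant. The setting key used here is made up; the actual wiring is in the referenced commit.

import java.util.Map;
import java.util.Random;

// Illustrative only: take the seed from per-node settings instead of new Random(0).
// The key "test.node.seed" is hypothetical.
final class SeededMockRecyclerSketch {
    private final Random random;

    SeededMockRecyclerSketch(Map<String, String> settings) {
        long seed = Long.parseLong(settings.getOrDefault("test.node.seed", "0"));
        this.random = new Random(seed);
    }

    Random random() {
        return random;
    }
}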

@s1monw (Contributor) commented Jan 3, 2014

Added some smallish comments - looks really good though. Let's get this in soon!

Refactor cache recycling so that it only caches large arrays (pages) that can
later be used to build more complex data structures such as hash tables.

 - QueueRecycler now takes a limit like other non-trivial recyclers.
 - A new PageCacheRecycler (inspired by CacheRecycler) can cache
   byte[], int[], long[], double[] or Object[] arrays using a fixed amount of
   memory (either globally or per-thread depending on the Recycler impl, e.g.
   queue is global while thread_local is per-thread).
 - Paged arrays in o.e.common.util can now optionally take a PageCacheRecycler
   to reuse existing pages.
 - All aggregators' data structures now use PageCacheRecycler:
   - for all arrays (counts, mins, maxes, ...)
   - LongHash can now take a PageCacheRecycler
   - there is a new BytesRefHash (inspired by Lucene's but still quite
     different; for instance, it cheats on BytesRef comparisons by using Unsafe)
     that also takes a PageCacheRecycler

Close elastic#4557
@jpountz (Contributor, Author) commented Jan 6, 2014

Thanks for the comments, @s1monw. I rebased and did the changes you suggested:

  • new Releasable.release(boolean success, Releasable... releasables) that forwards to release when success is true and to releaseWhileHandlingException otherwise (see the sketch below)
  • fixed style
  • MockPageCacheRecycler gets the seed from TestCluster
  • s/AssertionError/ElasticsearchIllegalStateException/
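
A sketch of that helper's contract, with a stand-in interface (the method names release and releaseWhileHandlingException come from the discussion above; everything else is illustrative): on the happy path resources are released normally, while on the error path release failures must not mask the exception that is already propagating.

// Stand-in sketch of release(boolean success, Releasable... releasables).
final class ReleasablesSketch {
    interface Releasable {
        void release();                          // may throw on failure
        void releaseWhileHandlingException();    // must never throw: an exception is already in flight
    }

    static void release(boolean success, Releasable... releasables) {
        for (Releasable releasable : releasables) {
            if (releasable == null) {
                continue;
            }
            if (success) {
                releasable.release();
            } else {
                releasable.releaseWhileHandlingException();
            }
        }
    }
}

Typical usage is a call to release(success, resources) in a finally block, where success is only set to true once the try block has completed.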

@s1monw (Contributor) commented Jan 6, 2014

LGTM +1 to push

@jpountz closed this Jan 6, 2014
@jpountz deleted the fix/cache_recycling branch on January 6, 2014 at 18:08
Successfully merging this pull request may close: Improving cache recycling (#4557)