
Refactor parallel indexing perfect rollup partitioning #8852

Merged: 18 commits merged into apache:master on Nov 21, 2019

Conversation

@ccaominh (Contributor) commented Nov 11, 2019:

Description

Refactoring to make it easier to later add range partitioning for perfect rollup parallel indexing. This is accomplished by adding several new base classes (e.g., PerfectRollupWorkerTask) and new classes for encapsulating logic that needs to be changed for different partitioning strategies (e.g., IndexTaskInputRowIteratorBuilder).

The code is functionally equivalent to before, except for the following small behavior changes:

  1. PartialSegmentMergeTask: Previously, this task had a priority of DEFAULT_TASK_PRIORITY. It now has a priority of DEFAULT_BATCH_INDEX_TASK_PRIORITY (via the new PerfectRollupWorkerTask base class), since it is a batch index task.

  2. ParallelIndexPhaseRunner: A decorator was added to subTaskSpecIterator to ensure the subtasks are generated with unique ids. Previously, only tests (i.e., MultiPhaseParallelIndexingTest) would have this decorator, but this behavior is desired for non-test code as well.

This PR has:

  • been self-reviewed.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths.

@suneet-s (Contributor) commented Nov 11, 2019:

@ccaominh Any recommendations on where to start reviewing this code for someone who doesn't have a deep understanding of the ingest system? Also, is there a design doc that talks about the different components involved?

@ccaominh (Contributor, Author) commented:

@suneet-amp The entry point for parallel native batch indexing is ParallelIndexSupervisorTask#runTask(), and the subsequent code path relevant to this PR is ParallelIndexSupervisorTask#runMultiPhaseParallel().

There is a "Proposal" issue label that'll list a bunch of design docs: https://github.com/apache/incubator-druid/labels/Proposal

In particular, #5543 and #8061 are related to the code paths that are refactored in this PR.

The relevant external documentation is: https://druid.apache.org/docs/latest/ingestion/native-batch.html#parallel-task

@suneet-s (Contributor) commented:

Thanks!

* the partitionSpec is compatible.
*/
@JsonIgnore
String getForceGuaranteedRollupIncompatiblityReason();

Contributor:

note to self: there are 3 ways of saying whether something is incompatible or not. Is there a better way to do this?

@suneet-s (Contributor) left a comment:

Finished reviewing stuff under core. More reviews coming today. Sorry it took me almost a week to get around to this.

private static final String MAX_PARTITION_SIZE = "maxPartitionSize";
private static final String FORCE_GUARANTEED_ROLLUP_COMPATIBLE = "";

Contributor:

nit: unused

Contributor (Author):

It's used in my follow-up PR that adds the implementation of range partitioning.

public String getForceGuaranteedRollupIncompatiblityReason()
{
  return NAME + " partitions unsupported";
}

Contributor:

Should isForceGuaranteedRollupCompatibleType also be implemented to return false?

Contributor (Author):

Perhaps it could be implemented in a clearer way, but currently the default implementation of isForceGuaranteedRollupCompatibleType in the PartitionsSpec interface returns true if this method returns an empty string.
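
For reference, the arrangement described amounts to something like this (a simplified sketch, not the PR's exact code; names follow the excerpts in this thread):

import com.fasterxml.jackson.annotation.JsonIgnore;

public interface PartitionsSpec
{
  // Returns an empty string when compatible; otherwise a human-readable reason.
  @JsonIgnore
  String getForceGuaranteedRollupIncompatiblityReason();

  // Default: compatible iff no incompatibility reason is given.
  @JsonIgnore
  default boolean isForceGuaranteedRollupCompatibleType()
  {
    return getForceGuaranteedRollupIncompatiblityReason().isEmpty();
  }
}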

@suneet-s (Contributor) left a comment:

Second round of comments - I haven't reviewed any of the tests in indexing-service, but I have read through everything else. Overall the refactoring looks good to me 👍

{
return allocateSpec;
}

Contributor:

Not your code - but getSequenceName below uses String#format. If this is called a bunch of times, I'd recommend switching to string concatenation or a StringBuilder. Here are some results from a micro-benchmark I ran comparing the three approaches over a million invocations: String.format takes 1.7 s vs roughly 300 ms for the other two methods. This is because String#format internally has to compile the format on each invocation.

I don't have enough context to know if this is important or not, but figured I'd flag it. Totally OK to skip this since it has nothing to do with your refactoring.

/**
     * String#format 1M times -
     *     INIT    Total: 1384.5 ms, average: 0.0 ms, stdev: 0.0 ms, median: 0.0 ms
     *     RUN     Total: 1720.3 ms, average: 0.0 ms, stdev: 0.0 ms, median: 0.0 ms
     * String plus 1M times -
     *     INIT    Total: 1323.1 ms, average: 0.0 ms, stdev: 0.0 ms, median: 0.0 ms
     *     RUN     Total: 235.5 ms, average: 0.0 ms, stdev: 0.0 ms, median: 0.0 ms
     * StringBuilder 1M times -
     *     INIT    Total: 1408.0 ms, average: 0.0 ms, stdev: 0.0 ms, median: 0.0 ms
     *     RUN     Total: 337.3 ms, average: 0.0 ms, stdev: 0.0 ms, median: 0.0 ms
     */
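
As an illustration of the suggestion (a hypothetical sketch; the real getSequenceName signature and format differ):

import org.joda.time.Interval;

// Plain concatenation skips re-parsing the format string on every call,
// which accounts for the 1.7 s vs ~300 ms gap above. (Method shape assumed.)
static String getSequenceName(String taskId, Interval interval, String version)
{
  // Before (slower): StringUtils.format("%s_%s_%s", taskId, interval, version)
  return taskId + "_" + interval + "_" + version;
}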

@jihoonson (Contributor) commented Nov 18, 2019:

Good point. StringUtils.format() should never be used in a performance-sensitive code path. I'm pretty sure getSequenceName() is not a performance bottleneck in indexing for now, but I guess this is good to have.

Contributor (Author):

I think we can create a general ticket to profile parallel indexing and then make additional optimizations as needed (this potentially being one of them).

Contributor:

I'm ok with not doing it in this PR. @suneet-amp are you interested in filing this issue?

Contributor:

@jihoonson Happy to file an issue! Just to clarify - this should be an issue to profile parallel indexing and look for optimizations, correct?

Contributor:

Uh, I think there could be two issues here. One is optimizing the performance of indexing tasks in general, and the other is not using StringUtils.format() in getSequenceName(). The latter is pretty obvious and good to have regardless of any profiling results.

Contributor:

Filed #8904 to replace StringUtils#format

Contributor:

Great, thanks!

);
}
return intervalToSegmentIds;
}

Contributor:

nit: Missing tests for creating the delegate using getIntervalToSegmentIds?

The logic appears non-trivial - it might be worth adding a very simple test just to ensure the delegate is created correctly.

Contributor (Author):

Added a unit test

.granularitySpec(granularitySpec)
.nullRowRunnable(buildSegmentsMeters::incrementThrownAway)
.absentBucketIntervalConsumer(inputRow -> buildSegmentsMeters.incrementThrownAway())
.build();

Contributor:

Very nice - I like how this reads a lot!

continue;
}

Optional<Interval> optInterval = granularitySpec.bucketInterval(inputRow.getTimestamp());
@SuppressWarnings("OptionalGetWithoutIsPresent") // always present via IndexTaskInputRowIteratorBuilder

Contributor:

I don't think I understand this. Both implementations of GranularitySpec can return absent

How is the granularitySpec that's passed to inputRowIteratorBuilder guaranteed to return a present Interval?

Contributor (Author):

DefaultIndexTaskInputRowIteratorBuilder has an absent bucket interval handler, which adds a HandlingInputRowIterator.InputRowHandler that returns true if the row has a timestamp that doesn't match the intervals in the granularitySpec. That means the row gets skipped by this loop, as the iterator will return null (i.e., the row was handled already).

I'll improve the comment.
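
Roughly, the handler described above behaves like this (a sketch assembled from the excerpts in this thread; the consumer wiring is assumed):

// Rows whose timestamp falls outside the granularitySpec's bucket intervals
// are "handled" (consumed), so HandlingInputRowIterator yields null for that
// iteration and downstream loops skip the row.
HandlingInputRowIterator.InputRowHandler absentBucketIntervalHandler = inputRow -> {
  if (!granularitySpec.bucketInterval(inputRow.getTimestamp()).isPresent()) {
    absentBucketIntervalConsumer.accept(inputRow);  // e.g., buildSegmentsMeters::incrementThrownAway
    return true;   // handled: iterator yields null instead of this row
  }
  return false;    // not handled: row is yielded normally
};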

}

abstract SubTaskSpec<T> createSubTaskSpec(

Contributor:

nit: javadoc please

abstract int getPartitionId();

abstract T getSecondaryPartition();

Contributor:

super nit: javadocs - these seem like obvious things that someone more familiar with the codebase will know just from the name. Feel free to ignore.

}

@Override
public final boolean isPerfectRollup()

Contributor:

nit: It looks like the intention of this is to indicate that we want to use the timeChunk for rollups instead of segment? Should we rename it to something like useTimeChunkForRollups? It's unclear to me why perfect rollup means using the timeChunk instead of segment.

Also, it looks like this is only used in AbstractBatchIndexTask; can we make this method protected instead?

Contributor:

I'm not sure what you mean by "using timeChunk for rollups", but perfect rollup is pre-aggregating rows across the entire input data. See https://druid.apache.org/docs/latest/ingestion/index.html#rollup for more details.

Contributor:

@jihoonson In AbstractBatchIndexTask, lines 268-271 say

if (isPerfectRollup()) {
    log.info("Using timeChunk lock for perfect rollup");
    ...
}

Contributor:

Ah, it's the timeChunk lock rather than the timeChunk. Tasks must use the timeChunk lock instead of the segment lock for perfect rollup, because the rollup happens across the entire input data. See #7491 for timeChunk lock vs segment lock.

/**
* @param inputRowHandler Optionally, append this input row handler to the required ones.
*/
DefaultIndexTaskInputRowIteratorBuilder appendInputRowHandler(HandlingInputRowIterator.InputRowHandler inputRowHandler)

Contributor:

nit: @VisibleForTesting

Or indicate that we can add additional row handlers in the top level javadoc

Contributor (Author):

Added to javadoc.

* If any of the handlers invoke their respective callback, the {@link HandlingInputRowIterator} will yield
* a null {@link InputRow} next; otherwise, the next {@link InputRow} is yielded.
* </pre>
*/

Contributor:

I like this abstraction a lot 😍

};

Consumer<InputRow> NOOP_CONSUMER = inputRow -> {
};

Contributor:

nit: Both of the NOOP variables are only used in tests. Do you plan to use them in a future PR? Otherwise, maybe move them to the tests folder.

Contributor (Author):

They're going to be used in the production code for my follow-up PR that adds the implementation for range partitioning.

.delegate(inputRowIterator)
.granularitySpec(granularitySpec)
.nullRowRunnable(buildSegmentsMeters::incrementThrownAway)
.absentBucketIntervalConsumer(inputRow -> buildSegmentsMeters.incrementThrownAway())

Contributor:

semi-related: I don't think this should be done in this PR, but I think it would be better to add a MetricsCollectingInputSourceReader or something, since this decoration will be the same across all task types.

Contributor:

Also semi-related: another place where I think refactoring is needed is having a consistent and systematic way to throw ParseException. In the current code, ParseException can be thrown anywhere, and callers must be careful not to miss it (it's a RuntimeException). I think we need some ParseErrorHandlingRunner or something to handle this in a single place.

Contributor (Author):

Good suggestions, but I won't address them in this PR.

*/
class PartialSegmentGenerateParallelIndexTaskRunner
extends ParallelIndexPhaseRunner<PartialSegmentGenerateTask, GeneratedPartitionsReport>
abstract class FirehoseSplitParallelIndexTaskRunner<T extends Task, R extends SubTaskReport>

Contributor:

InputSourceSplit?

Contributor (Author):

Renamed

return super.next();
}

private void ensureUniqueSubtaskId()

Contributor:

This was missing in production code since it rarely happens in practice. I think #8612 is a better way to fix this.

.travis.yml Outdated
@@ -277,6 +277,9 @@ jobs:
echo $v dmesg ======================== ;
docker exec -it druid-$v sh -c 'dmesg | tail -3' ;
done
- for v in ~/shared/tasklogs/*.log ; do
echo $v logtail ======================== ; tail -100 $v ;

Contributor:

This could cause a Travis failure, since it would fail once the number of output rows reaches the limit.

Contributor (Author):

I'll revert this for now

@Override
public final int getPriority()
{
  return getContextValue(Tasks.PRIORITY_KEY, Tasks.DEFAULT_BATCH_INDEX_TASK_PRIORITY);

Contributor:

Thanks for adding this! Wondering if we can move this to AbstractBatchIndexTask.

Contributor (Author):

I've moved it to AbstractBatchIndexTask

/**
* Decorated {@link CloseableIterator<InputRow>} that can process rows with {@link InputRowHandler}s.
*/
public class HandlingInputRowIterator implements Iterator<InputRow>

Contributor:

Why not implement CloseableIterator?

Contributor (Author):

I can make that addition. When I originally implemented the class, it was providing an iterator interface for a firehose.

InputRow inputRow = delegate.next();

for (InputRowHandler inputRowHandler : inputRowHandlers) {
  if (inputRowHandler.handle(inputRow)) {

@jihoonson (Contributor) commented Nov 19, 2019:

This loop will make virtual calls per inputRow, which could be slow with large inputs. I don't think this will lead to any performance degradation for now, because segment merge is the most prominent bottleneck. However, I would recommend adding a note about it here.

Contributor (Author):

I'll add comments warning about the overhead and potential future work.
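
For context, next() with such a note would look roughly like this (body assumed, not verbatim from the PR):

@Override
public InputRow next()
{
  InputRow inputRow = delegate.next();
  // NOTE: one virtual call per handler per row. Not a bottleneck today
  // (segment merge dominates), but worth revisiting for very large inputs.
  for (InputRowHandler inputRowHandler : inputRowHandlers) {
    if (inputRowHandler.handle(inputRow)) {
      return null;  // a handler consumed the row; callers treat null as "skip"
    }
  }
  return inputRow;
}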

import java.util.List;
import java.util.Map;

class Factory

Contributor:

I like this class!

Contributor:

I would suggest renaming the class to something more intuitive, or adding javadoc so that other people can also use this class. Maybe both would be best.

Contributor (Author):

I've renamed it to something more descriptive and added a javadoc.

*/
-public class CachingLocalSegmentAllocator implements IndexTaskSegmentAllocator
+class CachingLocalSegmentAllocator implements IndexTaskSegmentAllocator

Contributor:

What would be the non-default version of CachingLocalSegmentAllocator?

Contributor (Author):

Preview of RangePartitionCachingLocalSegmentAllocator: https://github.com/ccaominh/incubator-druid/blob/backup-superbatch-range-partitioning/indexing-service/src/main/java/org/apache/druid/indexing/common/task/RangePartitionCachingLocalSegmentAllocator.java

"Default" may not be the best name for the class, so I'm open to better suggestions.

Contributor:

Thanks. Maybe HashPartitioningCachingLocalSegmentAllocator?

Contributor (Author):

Renamed

{
private final String host;
private final int port;
private final boolean useHttps;
private final String subTaskId;
private final Interval interval;
private final int partitionId;
private final T secondaryPartition;

Contributor:

What would be secondaryPartition for range partitioning?

@ccaominh (Contributor, Author) commented Nov 19, 2019:

It uses the ShardSpec, so that it's generic enough for any kind of partitioning. Here's a preview of GenericPartitionLocation: https://github.com/ccaominh/incubator-druid/blob/backup-superbatch-range-partitioning/indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/GenericPartitionLocation.java

For hash partitions, the existing implementation (HashPartitionLocation) is preferable in that it has a smaller payload (since the partition dimensions do not need to be repeated for each partition).

Contributor:

Hmm, does it need the entire shardSpec for range partitioning? I thought a triple of (startKey, endKey, partitionId) would be enough.

Contributor (Author):

The discussion of what exactly is used is probably best saved for the follow-up PR that adds the implementation of range partitioning.

Contributor:

Sounds right.

@ccaominh (Contributor, Author) commented:

Travis "other modules" test failure is due to flaky EmitterTest

@ccaominh (Contributor, Author) commented:

Travis failure for "other integration test" is due to flaky ItBasicAuthConfigurationTest as described in #7021

static final String HOST = "host";
static final int PORT = 1;
static final String SUBTASK_ID = "subtask-id";
private static final ObjectMapper NESTED_OBJECT_MAPPER = TestHelper.makeJsonMapper();

Contributor:

Hmm, maybe TestUtils.getTestObjectMapper() is more useful. It's lame that we have a couple of helper classes to make an ObjectMapper though.

Contributor (Author):

What's the advantage of using the object mapper from TestUtils versus TestHelper? Both are probably more than what's actually needed for the current usages of NESTED_OBJECT_MAPPER.

@ccaominh (Contributor, Author) commented:

I've had to merge master a few times now to resolve merge conflicts, so getting this through code review soon would be greatly appreciated!


static ObjectMapper createObjectMapper()
{
  InjectableValues injectableValues = new InjectableValues.Std()

Contributor:

Hmm, sorry, I left a comment on the wrong line. This seems to duplicate TestUtils.getTestObjectMapper() except for the HttpClient. Probably better to reuse it.

Contributor (Author):

I've made changes to TestUtils and changed this class to use TestUtils


Map<String, Object> parser = NESTED_OBJECT_MAPPER.convertValue(
    new StringInputRowParser(
        new JSONParseSpec(

Contributor:

It can use the new InputFormat API.

Contributor (Author):

Done

}

static ParallelIndexIngestionSpec createIngestionSpec(
InlineFirehoseFactory inlineFirehoseFactory,

Contributor:

It can use the new InputSource API.

Contributor (Author):

Done

@Override
public void close()
{
  throw new UnsupportedOperationException();

Contributor:

What is this class for?

Contributor (Author):

It is for the convenience of not having to override close() when creating each of the iterators that are used for the tests.

Contributor:

Hmm, I think it would be clearer not to extend CloseableIterator in that case. If it has to extend CloseableIterator, it makes more sense to me to call close() properly.

Contributor (Author):

Changed this to use CloseableIterators.withEmptyBaggage()
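
That is, something like the following (test iterator contents assumed for illustration):

// Wrap a plain Iterator in a CloseableIterator whose close() is a no-op,
// instead of overriding close() to throw.
CloseableIterator<InputRow> rowIterator =
    CloseableIterators.withEmptyBaggage(rows.iterator());  // rows: a List<InputRow>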

import java.util.List;
import java.util.function.Supplier;

class Factory

Contributor:

Same here. I would suggest renaming the class to something more intuitive, or adding javadoc so that other people can also use this class. Maybe both would be best.

Contributor (Author):

I've renamed it to something more descriptive and added a javadoc.

@jihoonson (Contributor) left a comment:

+1 after CI.

> I've had to merge master a few times now to resolve merge conflicts, so getting this through code review soon would be greatly appreciated!

@ccaominh Sorry for those conflicts, and thanks for your patience! Please understand that merge conflicts sometimes happen, especially when different people work on related parts of the code.

@gianm gianm added this to the 0.17.0 milestone Nov 21, 2019
@gianm gianm merged commit ff62173 into apache:master Nov 21, 2019
@ccaominh ccaominh deleted the superbatch-partition-refactor branch November 21, 2019 01:25
jon-wei pushed a commit to jon-wei/druid that referenced this pull request Nov 26, 2019
* Refactor parallel indexing perfect rollup partitioning

Refactoring to make it easier to later add range partitioning for
perfect rollup parallel indexing. This is accomplished by adding several
new base classes (e.g., PerfectRollupWorkerTask) and new classes for
encapsulating logic that needs to be changed for different partitioning
strategies (e.g., IndexTaskInputRowIteratorBuilder).

The code is functionally equivalent to before except for the following
small behavior changes:

1) PartialSegmentMergeTask: Previously, this task had a priority of
   DEFAULT_TASK_PRIORITY. It now has a priority of
   DEFAULT_BATCH_INDEX_TASK_PRIORITY (via the new PerfectRollupWorkerTask
   base class), since it is a batch index task.

2) ParallelIndexPhaseRunner: A decorator was added to
   subTaskSpecIterator to ensure the subtasks are generated with unique
   ids. Previously, only tests (i.e., MultiPhaseParallelIndexingTest)
   would have this decorator, but this behavior is desired for non-test
   code as well.

* Fix forbidden apis and pmd warnings

* Fix analyze dependencies warnings

* Fix IndexTask json and add IT diags

* Fix parallel index supervisor<->worker serde

* Fix TeamCity inspection errors/warnings

* Fix TeamCity inspection errors/warnings again

* Integrate changes with those from apache#8823

* Address review comments

* Address more review comments

* Fix forbidden apis

* Address more review comments