[improve](routine load) adjust default values to make routine load more convenient to use (apache#42491)

A routine load job is divided into many tasks, each of which is a
transaction. Currently, the default maximum time a task spends consuming
(max_batch_interval) is 10 seconds. The benefits of increasing this
value are:
1. Larger batch consumption can lead to better performance.
2. Fewer transactions reduce compaction pressure and conflicts among
concurrently submitted transactions.

related doc: https://github.com/apache/doris-website/pull/1236/files
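
As a rough illustration of the second benefit in the commit message, the sketch below estimates how many task transactions a single consumer commits per hour. It assumes every task runs for the full max_batch_interval and ends only on the time limit; scheduling overhead and the row and size limits are ignored. The helper `transactions_per_hour` is hypothetical and not part of Doris.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not Doris code): approximate number of task
// transactions per hour for one consumer, ASSUMING each task always
// runs until max_batch_interval expires and nothing else ends it early.
std::int64_t transactions_per_hour(std::int64_t max_batch_interval_s) {
    assert(max_batch_interval_s > 0);  // an interval must be positive
    return 3600 / max_batch_interval_s;
}
```

Under that assumption, moving from a 10-second to a 60-second interval takes the estimate from 360 to 60 transactions per hour, a sixfold reduction.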
sollhui authored Oct 31, 2024
1 parent e9401c2 commit 19016b1
Showing 3 changed files with 6 additions and 5 deletions.
7 changes: 4 additions & 3 deletions be/src/runtime/stream_load/stream_load_context.h
@@ -164,9 +164,10 @@ class StreamLoadContext {

     // the following members control the max progress of a consuming
     // process. if any of them reach, the consuming will finish.
-    int64_t max_interval_s = 5;
-    int64_t max_batch_rows = 100000;
-    int64_t max_batch_size = 100 * 1024 * 1024; // 100MB
+    // same as values set in fe/fe-core/src/main/java/org/apache/doris/load/routineload/RoutineLoadJob.java
+    int64_t max_interval_s = 60;
+    int64_t max_batch_rows = 20000000;
+    int64_t max_batch_size = 1024 * 1024 * 1024; // 1GB

     // for parse json-data
     std::string data_format = "";
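
The comment in this hunk says that consumption finishes as soon as any one of the three limits is reached. A minimal sketch of that rule follows. The struct `BatchLimits` and the function `should_finish` are hypothetical names for illustration, and the use of `>=` for "reached" is an assumption; the actual check in the Doris backend may be written differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the three members shown in the hunk, using the
// new default values from this commit. Not the real StreamLoadContext.
struct BatchLimits {
    std::int64_t max_interval_s = 60;
    std::int64_t max_batch_rows = 20000000;
    std::int64_t max_batch_size = 1024LL * 1024 * 1024;  // 1GB
};

// Returns true once ANY limit is reached (assumed to mean >=).
bool should_finish(const BatchLimits& limits, std::int64_t elapsed_s,
                   std::int64_t rows, std::int64_t bytes) {
    assert(elapsed_s >= 0 && rows >= 0 && bytes >= 0);
    return elapsed_s >= limits.max_interval_s ||
           rows >= limits.max_batch_rows ||
           bytes >= limits.max_batch_size;
}
```

With the default limits, a task that has run 59 seconds with a small batch keeps consuming, while hitting 60 seconds, 20,000,000 rows, or 1GB each ends the task on its own.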
@@ -113,7 +113,7 @@ public abstract class RoutineLoadJob
     public static final long DEFAULT_MAX_ERROR_NUM = 0;
     public static final double DEFAULT_MAX_FILTER_RATIO = 1.0;

-    public static final long DEFAULT_MAX_INTERVAL_SECOND = 10;
+    public static final long DEFAULT_MAX_INTERVAL_SECOND = 60;
     public static final long DEFAULT_MAX_BATCH_ROWS = 20000000;
     public static final long DEFAULT_MAX_BATCH_SIZE = 1024 * 1024 * 1024; // 1GB
     public static final long DEFAULT_EXEC_MEM_LIMIT = 2 * 1024 * 1024 * 1024L;
@@ -362,7 +362,7 @@ public void testGetShowCreateInfo() throws UserException {
     + "\"desired_concurrent_number\" = \"0\",\n"
     + "\"max_error_number\" = \"10\",\n"
     + "\"max_filter_ratio\" = \"1.0\",\n"
-    + "\"max_batch_interval\" = \"10\",\n"
+    + "\"max_batch_interval\" = \"60\",\n"
     + "\"max_batch_rows\" = \"10\",\n"
     + "\"max_batch_size\" = \"1073741824\",\n"
     + "\"format\" = \"csv\",\n"