High tps test #13573

Closed
wants to merge 74 commits into from
Changes from 1 commit
9836994
Quorum store config
sitalkedia Jun 5, 2024
1829a96
Block size and gas limit tuning
sitalkedia Jun 5, 2024
daeed8b
100 validator test
sitalkedia Jun 5, 2024
ac0e4e1
enable consensus observer and mempool backlog tuning
sitalkedia Jun 5, 2024
82d4f71
Use crazy machines
sitalkedia Jun 5, 2024
89e2e66
Optimized state sync throughput
sitalkedia Jun 5, 2024
3dca531
100 node test
sitalkedia Jun 5, 2024
9b53d88
remove VFNs
sitalkedia Jun 5, 2024
bb85d11
Increase the duration to 20m
sitalkedia Jun 6, 2024
66f5846
Add increase fraction to qs backpressure config
vusirikala Jun 6, 2024
ec34e5d
Start with high dynamic_pull_txn_per_s
vusirikala Jun 6, 2024
e5860f0
Increase batch expiration time to 3 seconds
vusirikala Jun 7, 2024
942e2d6
increase duration
vusirikala Jun 7, 2024
2dd6fd9
Fixed the typo in batch generator
vusirikala Jun 7, 2024
635d541
Add counters
vusirikala Jun 7, 2024
63f13e6
Pull txns more frequently
vusirikala Jun 7, 2024
bc5eb70
Merge branch 'main' into high_tps_test
vusirikala Jun 7, 2024
a9c951c
Reduced the limits for pulling
vusirikala Jun 7, 2024
2d7a174
Add more counters
vusirikala Jun 8, 2024
336dc35
Add counters
vusirikala Jun 8, 2024
8db0ed0
Change priority order when pulling from mempool
vusirikala Jun 8, 2024
cbb2e7c
Update sorting order:
vusirikala Jun 8, 2024
bf71df1
Fixing a counter
vusirikala Jun 8, 2024
b43b466
Add more counters
vusirikala Jun 8, 2024
022ed02
Merge branch 'main' into high_tps_test
vusirikala Jun 11, 2024
d24337f
Remove inserted txns from skipped
vusirikala Jun 11, 2024
c033e27
Add a counter
vusirikala Jun 11, 2024
93d1898
Added more counters for sent batch requests and implemented increase …
vusirikala Jun 12, 2024
cc87685
Add more counters
vusirikala Jun 12, 2024
00b54d0
Merge branch 'main' into high_tps_test
vusirikala Jun 12, 2024
69b830e
remove some counters
vusirikala Jun 12, 2024
d8f7188
remove some counters
vusirikala Jun 12, 2024
52bb2b6
Update counters
vusirikala Jun 12, 2024
c7dbd14
Merge branch 'main' into high_tps_test
vusirikala Jun 12, 2024
b08df0a
Add counters
vusirikala Jun 12, 2024
f6409bd
Add latency counters
vusirikala Jun 12, 2024
1460da1
Add latency counters
vusirikala Jun 12, 2024
dbbf7c3
Use the old get_batch code
vusirikala Jun 12, 2024
83c9b92
Use durationhistogram
vusirikala Jun 13, 2024
dca1972
Fix
vusirikala Jun 13, 2024
c0efbd9
Add more counters
vusirikala Jun 13, 2024
5c70d3d
Add counters in mempool
vusirikala Jun 13, 2024
2728f93
Add transactions pulled total count counter
vusirikala Jun 14, 2024
bb281ed
Add transactions pulled total count counter
vusirikala Jun 14, 2024
76c6ad8
Add transactions pulled total count counter
vusirikala Jun 14, 2024
94e32a2
Merge branch 'main' into high_tps_test
vusirikala Jun 18, 2024
bc0ca52
Switch with load_sweep_env
vusirikala Jun 18, 2024
7f32471
Add some latency counters
vusirikala Jun 22, 2024
7d53ab1
Adding more counters
vusirikala Jun 24, 2024
2227114
Removing total_num_txns
vusirikala Jun 24, 2024
fbdeba9
Remove call to actual num transactions
vusirikala Jun 24, 2024
8e23dc9
Merge branch 'main' into high_tps_test
vusirikala Jun 24, 2024
4fa1d43
Remove subtraction
vusirikala Jun 24, 2024
489ae74
Add some counters in info
vusirikala Jun 24, 2024
7e9241b
Add info for skipped txns
vusirikala Jun 24, 2024
f9ab336
Check for account sequence number in excluded
vusirikala Jun 24, 2024
081a348
bypass if the previous seq number is excluded
vusirikala Jun 24, 2024
73631ee
Add more print statements
vusirikala Jun 25, 2024
e95a2cb
Add more counters
vusirikala Jun 25, 2024
a04b89c
Add more info statements
vusirikala Jun 25, 2024
d38c880
Replace btreeset with hashset for priority index
vusirikala Jun 25, 2024
e977065
Add info statements in transaction store
vusirikala Jun 26, 2024
7c384da
Print result length
vusirikala Jun 26, 2024
0db0057
Removing ordering for orderedkey
vusirikala Jun 26, 2024
f9a6c36
Print max txns as well
vusirikala Jun 27, 2024
d64e7d5
More info statements
vusirikala Jun 27, 2024
3336a64
Use pull txn window in batch generator
vusirikala Jun 27, 2024
236f86b
Add info statements in batch generator
vusirikala Jun 27, 2024
7244f40
Increasing to 20 validators
vusirikala Jun 27, 2024
5a87f1c
using btreeset again
vusirikala Jun 27, 2024
d5cfd8d
Use pull window
vusirikala Jun 27, 2024
eed9035
Merge branch 'main' into high_tps_test
vusirikala Jun 27, 2024
8b3f438
Change the ordering for priority index
vusirikala Jun 27, 2024
d6ebc54
Merge branch 'main' into high_tps_test
vusirikala Jun 28, 2024
Added more counters for sent batch requests and implemented increase fraction
vusirikala committed Jun 12, 2024

Verified

This commit was signed with the committer’s verified signature.
commit 93d189880f9394069e5a734e32fd2b2f64733bea
49 changes: 49 additions & 0 deletions consensus/src/counters.rs
@@ -1118,3 +1118,52 @@ pub static RAND_QUEUE_SIZE: Lazy<IntGauge> = Lazy::new(|| {
     )
     .unwrap()
 });
+
+pub static PAYLOAD_MANAGER_REQUEST_TRANSACTIONS_DURATION: Lazy<Histogram> = Lazy::new(|| {
+    register_histogram!(
+        "aptos_consensus_payload_manager_request_transactions_duration",
+        "Histogram of the time it takes to request transactions from the payload manager.",
+        [
+            0.005, 0.010, 0.015, 0.020, 0.025, 0.030, 0.035, 0.040, 0.045, 0.050, 0.055, 0.060,
+            0.065, 0.070, 0.075, 0.080, 0.085, 0.090, 0.095, 0.100, 0.110, 0.120, 0.130, 0.140,
+            0.150, 0.160, 0.170, 0.180, 0.190, 0.200, 0.225, 0.250
+        ]
+        .to_vec()
+    )
+    .unwrap()
+});
+
+pub static PAYLOAD_MANAGER_REQUEST_TRANSACTIONS_PROOF_COUNT: Lazy<IntCounter> = Lazy::new(|| {
+    register_int_counter!(
+        "aptos_consensus_payload_manager_request_transactions_proof_count",
+        "Count of the number of times a proof is requested for transactions."
+    )
+    .unwrap()
+});
+
+pub static PAYLOAD_MANAGER_REQUEST_TRANSACTIONS_PROOF_COUNT_PURPOSE_1: Lazy<IntCounter> =
+    Lazy::new(|| {
+        register_int_counter!(
+            "aptos_consensus_payload_manager_request_transactions_proof_count_purpose_1",
+            "Count of the number of times a proof is requested for transactions for purpose 1."
+        )
+        .unwrap()
+    });
+
+pub static PAYLOAD_MANAGER_REQUEST_TRANSACTIONS_PROOF_COUNT_PURPOSE_2: Lazy<IntCounter> =
+    Lazy::new(|| {
+        register_int_counter!(
+            "aptos_consensus_payload_manager_request_transactions_proof_count_purpose_2",
+            "Count of the number of times a proof is requested for transactions for purpose 2."
+        )
+        .unwrap()
+    });
+
+pub static PAYLOAD_MANAGER_REQUEST_TRANSACTIONS_PROOF_COUNT_PURPOSE_3: Lazy<IntCounter> =
+    Lazy::new(|| {
+        register_int_counter!(
+            "aptos_consensus_payload_manager_request_transactions_proof_count_purpose_3",
+            "Count of the number of times a proof is requested for transactions for purpose 3."
+        )
+        .unwrap()
+    });
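The bucket list in the new histogram tops out at 0.250 seconds, so it may help to see which bucket a given observation lands in. The following is a minimal std-only sketch, not the `prometheus` crate itself: it assumes the usual Prometheus convention that a bucket's upper bound is inclusive and that anything above the last finite bound is only counted in the implicit `+Inf` bucket. The names `BUCKETS` and `first_bucket` are made up for this illustration.

```rust
/// Upper bounds in seconds, copied from the bucket list in the diff above.
pub const BUCKETS: &[f64] = &[
    0.005, 0.010, 0.015, 0.020, 0.025, 0.030, 0.035, 0.040, 0.045, 0.050, 0.055, 0.060,
    0.065, 0.070, 0.075, 0.080, 0.085, 0.090, 0.095, 0.100, 0.110, 0.120, 0.130, 0.140,
    0.150, 0.160, 0.170, 0.180, 0.190, 0.200, 0.225, 0.250,
];

/// Hypothetical helper: index of the first bucket whose upper bound is
/// greater than or equal to `secs`. Returns `None` when the observation
/// exceeds every finite bound, i.e. it would only show up under `+Inf`.
pub fn first_bucket(secs: f64) -> Option<usize> {
    BUCKETS.iter().position(|&upper| secs <= upper)
}
```

Under these assumptions, a 12 ms observation falls into the 0.015 bucket, while any call that takes longer than 250 ms is indistinguishable from any other slow call. If long tails matter for this test, a few larger bounds would be needed to see them.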
19 changes: 19 additions & 0 deletions consensus/src/payload_manager.rs
@@ -58,11 +58,25 @@ impl PayloadManager {
         proofs: Vec<ProofOfStore>,
         block_timestamp: u64,
         batch_reader: Arc<dyn BatchReader>,
+        purpose: u64,
     ) -> Vec<(
         HashValue,
         oneshot::Receiver<ExecutorResult<Vec<SignedTransaction>>>,
     )> {
         let mut receivers = Vec::new();
+        counters::PAYLOAD_MANAGER_REQUEST_TRANSACTIONS_PROOF_COUNT.inc_by(proofs.len() as u64);
+        if purpose == 1 {
+            counters::PAYLOAD_MANAGER_REQUEST_TRANSACTIONS_PROOF_COUNT_PURPOSE_1
+                .inc_by(proofs.len() as u64);
+        } else if purpose == 2 {
+            counters::PAYLOAD_MANAGER_REQUEST_TRANSACTIONS_PROOF_COUNT_PURPOSE_2
+                .inc_by(proofs.len() as u64);
+        } else if purpose == 3 {
+            counters::PAYLOAD_MANAGER_REQUEST_TRANSACTIONS_PROOF_COUNT_PURPOSE_3
+                .inc_by(proofs.len() as u64);
+        }
+
+        let start_time = std::time::Instant::now();
         for pos in proofs {
             trace!(
                 "QSE: requesting pos {:?}, digest {}, time = {}",
@@ -76,6 +90,8 @@ impl PayloadManager {
                 debug!("QSE: skipped expired pos {}", pos.digest());
             }
         }
+        counters::PAYLOAD_MANAGER_REQUEST_TRANSACTIONS_DURATION
+            .observe(start_time.elapsed().as_secs_f64());
         receivers
     }

@@ -141,6 +157,7 @@ impl PayloadManager {
                     proof_with_status.proofs.clone(),
                     timestamp,
                     batch_reader.clone(),
+                    1,
                 );
                 proof_with_status
                     .status
@@ -257,6 +274,7 @@ impl PayloadManager {
                         proof_with_data.proofs.clone(),
                         block.timestamp_usecs(),
                         batch_reader.clone(),
+                        2,
                     );
                     // Could not get all data so requested again
                     proof_with_data
@@ -273,6 +291,7 @@ impl PayloadManager {
                         proof_with_data.proofs.clone(),
                         block.timestamp_usecs(),
                         batch_reader.clone(),
+                        3,
                     );
                     // Could not get all data so requested again
                     proof_with_data
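The three call sites above pass the bare integers 1, 2 and 3 as `purpose`, and any other value would silently skip the per-purpose counters. One possible alternative, sketched here with std only, is an enum plus a single labelled counter. Everything in this block is hypothetical: `Purpose`, `ProofCounters` and their methods do not exist in the PR, and the variant names are deliberately neutral because the diff does not say what each purpose means. In the real code base the `HashMap` would presumably be a labelled Prometheus counter instead.

```rust
use std::collections::HashMap;

/// Hypothetical replacement for the `purpose: u64` argument.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Purpose {
    One,
    Two,
    Three,
}

impl Purpose {
    /// Label value that would distinguish the call sites in a metric.
    pub fn label(self) -> &'static str {
        match self {
            Purpose::One => "1",
            Purpose::Two => "2",
            Purpose::Three => "3",
        }
    }
}

/// Std-only stand-in for one total counter plus one counter per label.
#[derive(Default)]
pub struct ProofCounters {
    total: u64,
    by_purpose: HashMap<&'static str, u64>,
}

impl ProofCounters {
    /// Mirrors the diff: bump the total and the per-purpose count by the
    /// number of proofs in the request.
    pub fn record(&mut self, purpose: Purpose, proofs: usize) {
        let n = proofs as u64;
        self.total += n;
        *self.by_purpose.entry(purpose.label()).or_insert(0) += n;
    }

    pub fn total(&self) -> u64 {
        self.total
    }

    pub fn for_purpose(&self, purpose: Purpose) -> u64 {
        self.by_purpose.get(purpose.label()).copied().unwrap_or(0)
    }
}
```

With an enum, an unknown purpose is a compile error rather than a missing data point, and adding a fourth call site means one new variant instead of a new static and a new `else if` branch.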
2 changes: 1 addition & 1 deletion consensus/src/quorum_store/batch_generator.rs
@@ -434,7 +434,7 @@ impl BatchGenerator {
                 if back_pressure_increase_latest.elapsed() >= back_pressure_increase_duration {
                     back_pressure_increase_latest = tick_start;
                     dynamic_pull_txn_per_s = std::cmp::min(
-                        dynamic_pull_txn_per_s + self.config.back_pressure.dynamic_min_txn_per_s,
+                        (dynamic_pull_txn_per_s as f64 * self.config.back_pressure.increase_fraction) as u64,
                         self.config.back_pressure.dynamic_max_txn_per_s,
                     );
                     trace!("QS: dynamic_max_pull_txn_per_s: {}", dynamic_pull_txn_per_s);
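The one-line change above swaps an additive ramp-up for a multiplicative one. The two rules can be compared side by side in the sketch below, which copies the arithmetic from the diff into free functions. The function names and all numeric values in the usage note are illustrative only, and the sketch assumes `increase_fraction` is meant to be a multiplier greater than 1.0, which the diff does not state.

```rust
/// Old rule from the diff: add a fixed step, then clamp to the maximum.
pub fn additive_increase(current: u64, step: u64, max: u64) -> u64 {
    std::cmp::min(current + step, max)
}

/// New rule from the diff: scale by `increase_fraction`, truncate the
/// result toward zero when casting back to u64, then clamp to the maximum.
pub fn multiplicative_increase(current: u64, increase_fraction: f64, max: u64) -> u64 {
    std::cmp::min((current as f64 * increase_fraction) as u64, max)
}
```

For example, with a current rate of 1000, a factor of 1.5 and a cap of 12000, the new rule yields 1500, and with a cap of 1200 it yields 1200. Because of the truncating cast, small rates can stall: a rate of 1 with factor 1.5 stays at 1, and a rate of 0 stays at 0 for any factor. Whether that matters depends on the lower bound enforced elsewhere in the batch generator, which this hunk does not show.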
2 changes: 1 addition & 1 deletion consensus/src/quorum_store/batch_requester.rs
@@ -48,7 +48,7 @@ impl BatchRequesterState {
             // make sure nodes request from the different set of nodes
             self.next_index = rng.gen::<usize>() % self.signers.len();
             counters::SENT_BATCH_REQUEST_COUNT.inc_by(num_peers as u64);
-            counters::SENT_INDIVIDUAL_BATCH_REQUEST_COUNT.inc(1);
+            counters::SENT_INDIVIDUAL_BATCH_REQUEST_COUNT.inc();
         } else {
             counters::SENT_BATCH_REQUEST_RETRY_COUNT.inc_by(num_peers as u64);
         }
1 change: 1 addition & 0 deletions consensus/src/quorum_store/batch_store.rs
@@ -428,6 +428,7 @@ impl<T: QuorumStoreSender + Clone + Send + Sync + 'static> BatchReader for Batch
         let batch_requester = self.batch_requester.clone();
         tokio::spawn(async move {
             if let Ok(mut value) = batch_store.get_batch_from_local(proof.digest()) {
+                counters::FOUND_BATCHES_LOCALLY_COUNT.inc();
                 if tx
                     .send(Ok(value.take_payload().expect("Must have payload")))
                     .is_err()
16 changes: 16 additions & 0 deletions consensus/src/quorum_store/counters.rs
@@ -518,6 +518,14 @@ pub static MISSED_BATCHES_COUNT: Lazy<IntCounter> = Lazy::new(|| {
     .unwrap()
 });

+pub static FOUND_BATCHES_LOCALLY_COUNT: Lazy<IntCounter> = Lazy::new(|| {
+    register_int_counter!(
+        "quorum_store_found_batches_locally_count",
+        "Count of the found batches locally."
+    )
+    .unwrap()
+});
+
 /// Count of the timeout batches at the sender side.
 pub static TIMEOUT_BATCHES_COUNT: Lazy<IntCounter> = Lazy::new(|| {
     register_int_counter!(
@@ -563,6 +571,14 @@ pub static SENT_BATCH_REQUEST_COUNT: Lazy<IntCounter> = Lazy::new(|| {
     .unwrap()
 });

+pub static SENT_INDIVIDUAL_BATCH_REQUEST_COUNT: Lazy<IntCounter> = Lazy::new(|| {
+    register_int_counter!(
+        "quorum_store_sent_individual_batch_request_count",
+        "Count of the number of individual batch request sent to other nodes."
+    )
+    .unwrap()
+});
+
 /// Count of the number of batch request retry sent to other nodes.
 pub static SENT_BATCH_REQUEST_RETRY_COUNT: Lazy<IntCounter> = Lazy::new(|| {
     register_int_counter!(