Skip to content

Commit

Permalink
Cost model 1.7 (solana-labs#20188)
Browse files Browse the repository at this point in the history
* Cost Model to limit transactions which are not parallelizeable (solana-labs#16694)

* * Add following to banking_stage:
  1. CostModel as immutable ref shared between threads, to provide estimated cost for transactions.
  2. CostTracker which is shared between threads, tracks transaction costs for each block.

* replace hard coded program ID with id() calls

* Add Account Access Cost as part of TransactionCost. Account Access cost are weighted differently between read and write, signed and non-signed.

* Establish instruction_execution_cost_table, add function to update or insert instruction cost, unit tested. It is read-only for now; it allows Replay to insert realtime instruction execution costs to the table.

* add test for cost_tracker atomically try_add operation, serves as safety guard for future changes

* check cost against local copy of cost_tracker, return transactions that would exceed limit as unprocessed transaction to be buffered; only apply bank processed transactions cost to tracker;

* bencher to new banking_stage with max cost limit to allow cost model being hit consistently during bench iterations

* replay stage feed back program cost (solana-labs#17731)

* replay stage feeds back realtime per-program execution cost to cost model;

* program cost execution table is initialized into empty table, no longer populated with hardcoded numbers;

* changed cost unit to microsecond, using value collected from mainnet;

* add ExecuteCostTable with fixed capacity for security concern, when its limit is reached, programs with old age AND less occurrence will be pushed out to make room for new programs.

* investigate system performance test degradation  (solana-labs#17919)

* Add stats and counter around cost model ops, mainly:
- calculate transaction cost
- check transaction can fit in a block
- update block cost tracker after transactions are added to block
- replay_stage to update/insert execution cost to table

* Change mutex on cost_tracker to RwLock

* removed cloning cost_tracker for local use, as the metrics show clone is very expensive.

* acquire and hold locks for block of TXs, instead of acquire and release per transaction;

* remove redundant would_fit check from cost_tracker update execution path

* refactor cost checking with less frequent lock acquiring

* avoid many Transaction_cost heap allocation when calculate cost, which
is in the hot path - executed per transaction.

* create hashmap with new_capacity to reduce runtime heap realloc.

* code review changes: categorize stats, replace explicit drop calls, concisely initiate to default

* address potential deadlock by acquiring locks one at time

* Persist cost table to blockstore (solana-labs#18123)

* Add `ProgramCosts` Column Family to blockstore, implement LedgerColumn; add `delete_cf` to Rocks
* Add ProgramCosts to compaction excluding list alone side with TransactionStatusIndex in one place: `excludes_from_compaction()`

* Write cost table to blockstore after `replay_stage` replayed active banks; add stats to measure persist time
* Deletes program from `ProgramCosts` in blockstore when they are removed from cost_table in memory
* Only try to persist to blockstore when cost_table is changed.
* Restore cost table during validator startup

* Offload `cost_model` related operations from replay main thread to dedicated service thread, add channel to send execute_timings between these threads;
* Move `cost_update_service` to its own module; replay_stage is now decoupled from cost_model.

* log warning when channel send fails (solana-labs#18391)

* Aggregate cost_model into cost_tracker (solana-labs#18374)

* * aggregate cost_model into cost_tracker, decouple it from banking_stage to prevent accidental deadlock. * Simplified code, removed unused functions

* review fixes

* update ledger tool to restore cost table from blockstore (solana-labs#18489)

* update ledger tool to restore cost model from blockstore when compute-slot-cost

* Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool

* refactor and simplify a test

* manually fix merge conflicts

* Per-program id timings (solana-labs#17554)

* more manual fixing

* solve a merge conflict

* featurize cost model

* more merge fix

* cost model uses compute_unit to replace microsecond as cost unit
(solana-labs#18934)

* Reject blocks for costs above the max block cost (solana-labs#18994)

* Update block max cost limit to fix performance regession (solana-labs#19276)

* replace function with const var for better readability (solana-labs#19285)

* Add few more metrics data points (solana-labs#19624)

* periodically report sigverify_stage stats (solana-labs#19674)

* manual merge

* cost model nits (solana-labs#18528)

* Accumulate consumed units (solana-labs#18714)

* tx wide compute budget (solana-labs#18631)

* more manual merge

* ignore zerorize drop security

* - update const cost values with data collected by solana-labs#19627
- update cost calculation to closely proposed fee schedule solana-labs#16984

* add transaction cost histogram metrics (solana-labs#20350)

* rebase to 1.7.15

* add tx count and thread id to stats (solana-labs#20451)
each stat reports and resets when slot changes

* remove cost_model feature_set

* ignore vote transactions from cost model

Co-authored-by: sakridge <[email protected]>
Co-authored-by: Jeff Biseda <[email protected]>
Co-authored-by: Jack May <[email protected]>
  • Loading branch information
4 people authored Oct 6, 2021
1 parent 414674e commit 1dd6dc3
Show file tree
Hide file tree
Showing 40 changed files with 3,205 additions and 263 deletions.
246 changes: 139 additions & 107 deletions Cargo.lock

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,7 @@ members = [
"poh-bench",
"program-test",
"programs/bpf_loader",
"programs/compute-budget",
"programs/config",
"programs/exchange",
"programs/failure",
Expand Down
7 changes: 5 additions & 2 deletions banking-bench/src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ use crossbeam_channel::unbounded;
use log::*;
use rand::{thread_rng, Rng};
use rayon::prelude::*;
use solana_core::banking_stage::BankingStage;
use solana_core::{banking_stage::BankingStage, cost_model::CostModel, cost_tracker::CostTracker};
use solana_gossip::{cluster_info::ClusterInfo, cluster_info::Node};
use solana_ledger::{
blockstore::Blockstore,
Expand All @@ -27,7 +27,7 @@ use solana_sdk::{
};
use solana_streamer::socket::SocketAddrSpace;
use std::{
sync::{atomic::Ordering, mpsc::Receiver, Arc, Mutex},
sync::{atomic::Ordering, mpsc::Receiver, Arc, Mutex, RwLock},
thread::sleep,
time::{Duration, Instant},
};
Expand Down Expand Up @@ -231,6 +231,9 @@ fn main() {
vote_receiver,
None,
replay_vote_sender,
Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
CostModel::default(),
))))),
);
poh_recorder.lock().unwrap().set_bank(&bank);

Expand Down
1 change: 1 addition & 0 deletions ci/do-audit.sh
Original file line number Diff line number Diff line change
Expand Up @@ -45,5 +45,6 @@ cargo_audit_ignores=(
# Blocked on jsonrpc removing dependency on unmaintained `websocket`
# https://github.com/paritytech/jsonrpc/issues/605
--ignore RUSTSEC-2021-0079

)
scripts/cargo-for-all-lock-files.sh stable audit "${cargo_audit_ignores[@]}"
7 changes: 4 additions & 3 deletions core/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -27,13 +27,14 @@ ed25519-dalek = "=1.0.1"
fs_extra = "1.2.0"
flate2 = "1.0"
indexmap = { version = "1.5", features = ["rayon"] }
itertools = "0.9.0"
libc = "0.2.81"
log = "0.4.11"
lru = "0.6.1"
miow = "0.2.2"
net2 = "0.2.37"
num-traits = "0.2"
histogram = "0.6.9"
itertools = "0.10.1"
log = "0.4.14"
lru = "0.6.6"
rand = "0.7.0"
rand_chacha = "0.2.2"
rand_core = "0.6.2"
Expand Down
12 changes: 11 additions & 1 deletion core/benches/banking_stage.rs
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,9 @@ use log::*;
use rand::{thread_rng, Rng};
use rayon::prelude::*;
use solana_core::banking_stage::{BankingStage, BankingStageStats};
use solana_core::cost_model::CostModel;
use solana_core::cost_tracker::CostTracker;
use solana_core::cost_tracker_stats::CostTrackerStats;
use solana_gossip::cluster_info::ClusterInfo;
use solana_gossip::cluster_info::Node;
use solana_ledger::blockstore_processor::process_entries;
Expand All @@ -33,7 +36,7 @@ use solana_streamer::socket::SocketAddrSpace;
use std::collections::VecDeque;
use std::sync::atomic::Ordering;
use std::sync::mpsc::Receiver;
use std::sync::Arc;
use std::sync::{Arc, RwLock};
use std::time::{Duration, Instant};
use test::Bencher;

Expand Down Expand Up @@ -92,6 +95,10 @@ fn bench_consume_buffered(bencher: &mut Bencher) {
None::<Box<dyn Fn()>>,
&BankingStageStats::default(),
&recorder,
&Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
CostModel::new(std::u64::MAX, std::u64::MAX),
))))),
&mut CostTrackerStats::default(),
);
});

Expand Down Expand Up @@ -218,6 +225,9 @@ fn bench_banking(bencher: &mut Bencher, tx_type: TransactionType) {
vote_receiver,
None,
s,
Arc::new(RwLock::new(CostTracker::new(Arc::new(RwLock::new(
CostModel::new(std::u64::MAX, std::u64::MAX),
))))),
);
poh_recorder.lock().unwrap().set_bank(&bank);

Expand Down
Loading

0 comments on commit 1dd6dc3

Please sign in to comment.