Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

disable qos adjustment logic when feature apply_cost_tracker_during_replay is activated #31671

Conversation

tao-stones
Copy link
Contributor

Problem

Enabling the mentioned feature without first removing the QoS Adjustment Logic (#29595 (comment)) breaks cluster test runs off head of Master, where all features are enabled by default.

Summary of Changes

Fixes #31340

@tao-stones
Copy link
Contributor Author

tao-stones commented May 16, 2023

tag @KirillLykov @jeffwashington @sakridge
This should fix kin-sim and other cluster test without needing to specifically disable the feature.

Also worth mentioning: if the TXs used in test request higher compute-unit-limit, the number of tx included in block will reduce with this change, therefore reduce throughput, or the # account created in kin-sim. To avoid that, request accurate CU 😄

@apfitzge
Copy link
Contributor

Made a comment on the feature-gate issue: #29595 (comment)

@tao-stones
Copy link
Contributor Author

Made a comment on the feature-gate issue: #29595 (comment)

made comment there too

@codecov
Copy link

codecov bot commented May 16, 2023

Codecov Report

Merging #31671 (b9cd201) into master (db4b76d) will decrease coverage by 0.1%.
The diff coverage is 100.0%.

@@            Coverage Diff            @@
##           master   #31671     +/-   ##
=========================================
- Coverage    81.9%    81.9%   -0.1%     
=========================================
  Files         737      737             
  Lines      205985   205989      +4     
=========================================
- Hits       168718   168712      -6     
- Misses      37267    37277     +10     

@KirillLykov
Copy link
Contributor

Cluster didn't panic any longer with this fix

@tao-stones tao-stones merged commit 692e1f2 into solana-labs:master May 17, 2023
@tao-stones tao-stones deleted the apply_feautre_gate_to_qos_adjustment_logic branch May 17, 2023 16:25
Copy link
Contributor

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, merged while I was reviewing 🐢

@@ -1074,7 +1082,9 @@ mod tests {
mint_keypair,
..
} = create_slow_genesis_config(10_000);
let bank = Arc::new(Bank::new_no_wallclock_throttle_for_tests(&genesis_config));
let mut bank = Bank::new_no_wallclock_throttle_for_tests(&genesis_config);
bank.deactivate_feature(&feature_set::apply_cost_tracker_during_replay::id());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to explicitly disable the feature for this test? I don't see anything in it that stands out as failing immediately.
If that's the case, imo it's better to have this test run the scenario twice with the feature enabled and disabled.

But obviously let me know if I'm missing something!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per the intention of the pr, with the feature enabled, the adjustment will be bypassed, which breaks this test.

But a good point to test it with out both feature on/off. Will PR with added test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what part of the test breaks without this? All the asserts check, afaict, is that there is some non-zero cost, and the tx was committed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It breaks here:

-            assert_eq!(get_block_cost(), 2 * single_transfer_cost);
-            assert_eq!(get_tx_count(), 2);
+            assert_eq!(get_block_cost(), expected_block_cost);
+            assert_eq!(get_tx_count(), expected_tracked_tx_count);

the adjustment logic removes failed tx (the one fail due to AccountInUse) from cost tracker. #31708 for it.

);
// once feature `apply_cost_tracker_during_replay` is activated, leader shall no longer
// adjust block with executed cost (a behavior more inline with bankless leader), instead
// will be exclusively using requested `compute_unit_limit` in cost tracking.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔

exclusively using requested compute_unit_limit in cost tracking

Sorry to nit-pick this comment, but I want to be really clear about the expectations and behaviors, since this kind of implies that we will require the request to be set (to me at least), and I don't beleive that is the case.

  1. We still have a default CU-limit which is used if not specified.
  2. If we have pure builtin txs, even if the request is set, we don't use that limit in the cost tracking, only fee calculations.

Are these 2 points still going to be true moving forward?

Might be better to just be a bit more vague in the comments, w/ details being in the cost model itself. Something along the lines of "instead calculated, requested, or default costs will be used in cost tracking".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes for point 1, default will be used if compute_unit_limit is not explicitly set. I implicitly included default when mentioning requested cu. But good point, #31700 clarifies it.

Point 2 is also correct, but it probably should change. Since all builtins now (in 1.16+) also consume requested CUs, probably no need to distinguish builtin and bpf programs in cost model.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since all builtins now (in 1.16+) also consume requested CUs, probably no need to distinguish builtin and bpf programs in cost model.

I suppose the difference is we know the static builtin costs up-front, compared to bpf. So we can accurately determine those costs, which are (if tx will succeed) strictly <= requested.

Actually, taking it a slight step further - if we found it to be a problem we could immediately reject (try taking fees only) if a builtin-only tx requested too few CUs.

wen-coding pushed a commit to wen-coding/solana that referenced this pull request May 18, 2023

Unverified

The email in this signature doesn’t match the committer email.
…eplay is activated (solana-labs#31671)

* disable qos adjustment logic when feature apply_cost_tracker_during_replay is activated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kin cluster stalled after seeing WouldExceedMaxAccountCostLimit error
3 participants