Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integration test for slash packet throttling #509

Closed
2 tasks done
shaspitz opened this issue Nov 22, 2022 · 7 comments · Fixed by #600
Closed
2 tasks done

Integration test for slash packet throttling #509

shaspitz opened this issue Nov 22, 2022 · 7 comments · Fixed by #600
Assignees
Labels
help wanted Open for all. You do not need permission to work on these. scope: testing Code review, testing, making sure the code is following the specification.

Comments

@shaspitz
Copy link
Contributor

shaspitz commented Nov 22, 2022

Problem

#462 passes the currently written integration tests, since only one validator is ever slashed from a consumer in existing steps (ie. no throttling happens). In order to get 462 out quickly, I'm gonna skip writing a specialized set of steps for validating the throttling logic. This test can be added in a separate PR.

Closing criteria

Implement a specialized set of integration test steps that validate slash packet throttling logic.

Problem details

I tried quickly implementing the integration test myself, but ran into this error. I can come back to this issue later to debug

=============== started slash-throttling tests ===============
running slash-throttling: step 1 == StartChainAction
=============== started democracy tests ===============
running democracy: step 1 == StartChainAction
=============== started default tests ===============
running default: step 1 == StartChainAction
=============== started multi-consumer tests ===============
running multi-consumer: step 1 == StartChainAction
2022/11/22 15:29:57 exit status 1
Error: post failed: Post "http://7.7.7.253:26658": dial tcp 7.7.7.253:26658: connect: no route to host
Usage:
  interchain-security-pd query bank balances [address] [flags]

Flags:
      --count-total       count total number of records in all balances to query for
      --denom string      The specific balance denomination to query for
      --height int        Use a specific height to query state at (this can error if the node is pruning state)
  -h, --help              help for balances
      --limit uint        pagination limit of all balances to query for (default 100)
      --node string       <host>:<port> to Tendermint RPC interface for this chain (default "tcp://localhost:26657")
      --offset uint       pagination offset of all balances to query for
  -o, --output string     Output format (text|json) (default "text")
      --page uint         pagination page of all balances to query for. This sets offset to a multiple of limit (default 1)
      --page-key string   pagination page-key of all balances to query for
      --reverse           results are sorted in descending order

Global Flags:
      --chain-id string     The network chain ID (default "interchain-security-p")
      --home string         directory for config and data (default "/root/.interchain-security-p")
      --log_format string   The logging format (json|plain) (default "plain")
      --log_level string    The logging level (trace|debug|info|warn|error|fatal|panic) (default "info")
      --trace               print out full stack trace on errors

exit status 1

Note: the two useful genesis changes you'll need to make to implement a slash packet throttling integration test is:

".app_state.ccv.params.slash_meter_replenish_period = \"60s\" | " +
".app_state.ccv.params.slash_meter_replenish_fraction = \"0.33\" | ",

TODOs

  • Labels have been added for issue
  • Issue has been added to the ICS project
@shaspitz shaspitz added help wanted Open for all. You do not need permission to work on these. scope: testing Code review, testing, making sure the code is following the specification. labels Nov 22, 2022
@shaspitz shaspitz moved this to Todo in Replicated Security Nov 22, 2022
@MSalopek
Copy link
Contributor

I'm looking into this. Have in mind that adding a new test run will probably make the integration tests run for about 20 mins.

Double signing was done in a separate test run, it took forever to finish so I merged it into the happy path.

@MSalopek
Copy link
Contributor

Can you please specify how the slash meter would come into play?

We could cause downtime for 3/4 of the validators pretty easily using existing actions. Would that work?

@shaspitz
Copy link
Contributor Author

Appreciate you looking into this! The new steps could probably be a part of the existing happy path steps, that's fine with me.

The slash throttling would come into play when more than one consumer-initiated slash occurs (two downtime actions on the consumer). The current set of steps only invokes one consumer initiated slash, which is always allowed by the slash throttling. The second consumer initiated slash will only be handled on the provider once the SlashMeterReplenishPeriod (provider param) elapses enough times to add SlashMeterReplenishFraction (provider param) * currentTotalVotingPower to the meter. Once the meter returns to a positive value, the second slash packet will be handled on the provider.

You can tweak the mentioned provider params to properly test throttling, and that the second slash packet is eventually handled once enough time elapses.

Lmk if I can make any of that more clear over a call! The description of #462 could help out with general understanding of the feature as well.

@danwt
Copy link
Contributor

danwt commented Nov 30, 2022

I would suggest that we try to use integration tests as much as possible to test things which cannot be tested any other way. I feel that the throttle can be tested very well with unit/e2e/diff tests so we should focus on those and not try to write integration tests for it.

What do you guys think?

@shaspitz
Copy link
Contributor Author

I think that a very simple integration test (and nothing more) for slash packet throttling would be useful as a sanity check. The test would look like this:

  1. Invoke two consumer initiated slash events
  2. Confirm first slash packet is handled immediately by provider, other slash packet is not
  3. Wait X blocks (can be 1-2 blocks if params are set in a useful way)
  4. confirm second slash packet is now handled, due to slash meter being replenished

@shaspitz
Copy link
Contributor Author

integration tests are useful for things which cannot be tested any other way as Dan has mentioned. But imo they're also useful just as a sanity check that the real system works. Any in-mem test will always be limited to some extent

@jtremback
Copy link
Contributor

Integration tests are the only way that you know that things actually work and you're not just testing some part of your test framework.

@mpoke mpoke moved this from Todo to In Progress in Replicated Security Dec 19, 2022
@mpoke mpoke moved this from In Progress to Waiting for review in Replicated Security Dec 20, 2022
Repository owner moved this from Waiting for review to Done in Replicated Security Dec 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Open for all. You do not need permission to work on these. scope: testing Code review, testing, making sure the code is following the specification.
Projects
No open projects
Status: Done
Development

Successfully merging a pull request may close this issue.

5 participants
@jtremback @danwt @MSalopek @shaspitz and others