
Asynchronous faucet. #1157

Merged
jbearer merged 4 commits into main from feat/faucet-scalability on Jun 22, 2022

Conversation

jbearer
Member

@jbearer jbearer commented Jun 21, 2022

Instead of processing requests directly in `tide` request handlers,
the request handler simply adds the request to a queue, and the
requests are dequeued and processed by a fixed number of worker
threads.

This solves two problems:

  • Requests that take a long time to process no longer time out the
    HTTP request.
  • CPU usage is bounded by limiting the number of worker threads, so
    the server does not get overloaded.

Closes #1151
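
As a rough illustration of the pattern described above, here is a minimal sketch using async-std channels; the struct, function, and parameter names are invented for the example and are not the faucet's actual API.

// Sketch only: one shared queue, a fixed pool of workers, and request
// handlers that merely enqueue. Everything here is illustrative.
use async_std::channel::{unbounded, Receiver, Sender};
use async_std::task;
use std::time::Duration;

#[derive(Clone, Debug)]
struct FaucetRequest {
    // e.g. the public key that asked for funds
    pub_key: String,
}

async fn worker(id: usize, requests: Receiver<FaucetRequest>) {
    // Each worker pulls requests off the shared queue until the channel closes.
    while let Ok(req) = requests.recv().await {
        println!("worker {} processing request for {}", id, req.pub_key);
        // ... build and submit the transfer here ...
    }
}

fn spawn_workers(num_workers: usize) -> Sender<FaucetRequest> {
    let (sender, receiver) = unbounded();
    for id in 0..num_workers {
        // The receivers share one queue, so each request is handled by exactly
        // one worker, and CPU usage is bounded by `num_workers`.
        task::spawn(worker(id, receiver.clone()));
    }
    sender
}

fn main() {
    task::block_on(async {
        // A request handler only enqueues and returns immediately, so the HTTP
        // request never waits for the transfer itself.
        let queue = spawn_workers(2);
        queue
            .send(FaucetRequest { pub_key: "demo-key".into() })
            .await
            .unwrap();
        // Give the workers a moment to drain the queue in this toy example.
        task::sleep(Duration::from_millis(100)).await;
    });
}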

This also fixes a race condition where we could be shut down in the middle
of transferring several grants to a key. Previously, after restarting,
we would start granting to this key from scratch, potentially granting
more than we intended and even, if we get shut down frequently, not
making progress at all.

Note, there is still a small race condition: if we get shut down just
after building a transfer but before incrementing the count of transfers
to that key (a very small window of time), we may repeat the grant the
next time we are restarted. This is a minor problem, as it means at
worst transferring one extra grant per restart. To fix this, we need
to synchronize between the faucet's storage and the wallet library's
storage, which I think requires some kind of composability between
AtomicStores.
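
To make the remaining window concrete, here is a schematic sketch of the ordering involved; all names are hypothetical, and the two persistence steps are reduced to comments.

// Illustrative only: in the real faucet these steps go through the faucet's
// own storage and the wallet library's storage, which are not synchronized.
struct GrantState {
    grants_made: u64,   // persisted by the faucet after each transfer
    grants_wanted: u64, // total grants each new key should receive
}

fn grant_loop(state: &mut GrantState) {
    while state.grants_made < state.grants_wanted {
        // 1. Build and submit one transfer; the wallet records it in its own
        //    storage.
        // 2. If the process is killed at this point, the transfer exists but
        //    the counter below was never persisted, so the next restart
        //    repeats at most this one grant.
        // 3. Persist the incremented counter in the faucet's storage.
        state.grants_made += 1;
    }
}

fn main() {
    let mut state = GrantState { grants_made: 0, grants_wanted: 3 };
    grant_loop(&mut state);
    assert_eq!(state.grants_made, 3);
}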
@jbearer
Member Author

jbearer commented Jun 22, 2022

Note that we aren't currently getting parallelism out of having multiple workers, since each worker locks the global wallet exclusively while it is transferring. So for now it doesn't make much sense to configure `num_workers` greater than 1. But this design makes it pretty easy to get even fancier later, like having `n` separate wallets for `n` workers, achieving true parallelism up to the constraints of the hardware.
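
A rough sketch of the serialization described here, assuming the workers share one wallet behind an async mutex (types and names are illustrative, not the real faucet code):

use async_std::sync::Mutex;
use async_std::task;
use std::sync::Arc;

struct Wallet;

impl Wallet {
    async fn transfer(&mut self, to: &str) {
        println!("transferring a grant to {}", to);
    }
}

async fn worker_step(wallet: Arc<Mutex<Wallet>>, to: String) {
    // Only one worker can hold this lock at a time, so transfers are
    // serialized even with `num_workers` greater than 1. Giving each worker
    // its own wallet would remove this bottleneck.
    let mut guard = wallet.lock().await;
    guard.transfer(&to).await;
}

fn main() {
    task::block_on(async {
        let wallet = Arc::new(Mutex::new(Wallet));
        let a = task::spawn(worker_step(wallet.clone(), "key-a".into()));
        let b = task::spawn(worker_step(wallet.clone(), "key-b".into()));
        a.await;
        b.await;
    });
}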

Comment on lines +373 to +382
if let Some(val) = val {
    // This is the most recent value for `key`, and it is an insert, which means
    // `key` is in the queue. Go ahead and add it to the index and the message
    // channel.
    index.insert(key.clone(), Some(val));
} else {
    // This is the most recent value for `key`, and it is a delete, which means
    // `key` is not in the queue. Remember this information in `index`.
    index.insert(key.clone(), None);
}
Contributor

@sveitser sveitser Jun 22, 2022


I think this is the same as `index.insert(key.clone(), val);`, so we could also write it in the more compact form without the if statement.

Member Author


Oh, good point. There used to be some extra logic in one of the branches, but I removed it.

/// form `UserPubKey -> Option<usize>`. An entry `key -> Some(n)` corresponds to updating the
/// counter associated with `key` to `n`. An entry `key -> None` corresponds to deleting the entry
/// for `key`. We can recover the in-memory index by simply replaying each log entry and inserting
/// or deleting into a `HashMap` as indicated.
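
One way to implement the replay this comment describes, with `String` standing in for `UserPubKey` (a sketch, not the PR's actual code):

use std::collections::HashMap;

// Applying each logged `(key, Option<usize>)` entry, in order, rebuilds the
// in-memory index.
fn replay<I>(log: I) -> HashMap<String, usize>
where
    I: IntoIterator<Item = (String, Option<usize>)>,
{
    let mut index = HashMap::new();
    for (key, val) in log {
        match val {
            // `key -> Some(n)`: set the counter associated with `key` to `n`.
            Some(n) => {
                index.insert(key, n);
            }
            // `key -> None`: `key` was deleted from the queue.
            None => {
                index.remove(&key);
            }
        }
    }
    index
}

fn main() {
    let log = vec![
        ("alice".to_string(), Some(0)),
        ("alice".to_string(), Some(1)),
        ("alice".to_string(), None),
        ("bob".to_string(), Some(0)),
    ];
    let index = replay(log);
    assert!(!index.contains_key("alice"));
    assert_eq!(index["bob"], 0);
}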
Contributor


It wasn't too hard to understand the code thanks to the explanations here, but I wonder if the code would be clearer on its own if, instead of using `Some(0)`, `Some(n > 0)`, and `None` to represent the state of a faucet request, we had an enum for that.

Member Author


Yep, good suggestion
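
For illustration, one possible shape for the suggested enum, replacing the `Some(0)` / `Some(n > 0)` / `None` encoding (names are invented, not code from this PR):

/// Status of a faucet request in the persisted queue index.
enum RequestStatus {
    /// The key is queued but no grants have been transferred yet
    /// (previously `Some(0)`).
    Queued,
    /// Some grants have already been transferred to this key
    /// (previously `Some(n)` with `n > 0`).
    InProgress(usize),
    /// The key is no longer in the queue (previously `None`).
    Removed,
}

fn describe(status: &RequestStatus) -> String {
    match status {
        RequestStatus::Queued => "queued, nothing granted yet".into(),
        RequestStatus::InProgress(n) => format!("{} grants made so far", n),
        RequestStatus::Removed => "no longer in the queue".into(),
    }
}

fn main() {
    println!("{}", describe(&RequestStatus::InProgress(2)));
}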

@sveitser
Contributor

Weird. I get 2 faucet test binaries so the new test runs twice. Also seems to happen on the CI.

Contributor

@sveitser sveitser left a comment


Looks correct to me. I also tried it with cape-demo-local and https://github.com/EspressoSystems/cape-ui/pull/529, and that worked fine.

@philippecamacho
Collaborator

I got an error when running the tests, but it might not be related to the changes in this PR.

test backend::test::test_anonymous_erc20_transfer ... FAILED <69.927s>

failures:

---- backend::test::test_anonymous_erc20_transfer stdout ----
Sent funding tx to deployer
xfr: 987 bytes
mint: 971 bytes
freeze: 979 bytes
Sent funding tx to deployer
thread 'backend::test::test_anonymous_erc20_transfer' panicked at 'called `Result::unwrap()` on an `Err` value: Failed { msg: "relayer error: response body fails to deserialize: io error: unexpected end of file" }', wallet/src/backend.rs:995:14
stack backtrace:
   0: rust_begin_unwind
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/panicking.rs:143:14
   2: core::result::unwrap_failed
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/result.rs:1749:5
   3: cape_wallet::backend::test::test_anonymous_erc20_transfer::{{closure}}
   4: std::thread::local::LocalKey<T>::with
   5: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
   6: async_io::driver::block_on
   7: std::thread::local::LocalKey<T>::with
   8: std::thread::local::LocalKey<T>::with
   9: async_std::task::builder::Builder::blocking
  10: core::ops::function::FnOnce::call_once
  11: core::ops::function::FnOnce::call_once
             at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.


failures:
    backend::test::test_anonymous_erc20_transfer

test result: FAILED. 14 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 69.93s

error: test failed, to rerun pass '-p cape_wallet --lib'
Sending HUP signal to run-geth: 14774
Cleaning up geth node
Removing geth data dir /run/user/1000/cap-ethereum-data-M56icRDG

@jbearer
Member Author

jbearer commented Jun 22, 2022

@philippecamacho that appears unrelated, but I've also never seen that error before. Can you check out main and see if you get a similar error? I'll try running this test locally a few times as well before I merge this.

@sveitser
Contributor

For me all tests pass.

@sveitser
Contributor

sveitser commented Jun 22, 2022

One thing I noticed is that sometimes the faucet process from the test does not seem to terminate after the tests are over.

@jbearer
Member Author

jbearer commented Jun 22, 2022

I'm getting the same error on main. Not sure what's going on, but I will go ahead and merge this and then try to figure it out.

@jbearer jbearer merged commit 529efca into main Jun 22, 2022
@jbearer jbearer deleted the feat/faucet-scalability branch June 22, 2022 16:21
@jbearer
Member Author

jbearer commented Jun 22, 2022

@philippecamacho @sveitser this failing test is because the relayer in this test uses port 60000, and I already had a process using port 60000... basically I think this test was sending relayer requests to the CAPE wallet API.
