Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

invalid nonce on parallel tests via transact() #254

Open
Tarnadas opened this issue Dec 9, 2022 · 7 comments
Open

invalid nonce on parallel tests via transact() #254

Tarnadas opened this issue Dec 9, 2022 · 7 comments

Comments

@Tarnadas
Copy link

Tarnadas commented Dec 9, 2022

Hey,

in my integration tests I have couple of tests that send very many transactions in parallel.
It's in fact so many, that I got an OS error, because it could not make that many requests in parallel, so I at most send 1k tx now in parallel and then wait until some of them have finished to send the next batch.

I found some issues (#163, #228), but am not sure how they relate.
I use transact() instead of transact_async(). Honestly I have noticed the _async function too late, which is why I still use transact().

The problem is that the bug is not easily reproducible, because it only happens about 1 out of 10 times maybe when running one of these long running tests.
The error looks as follows:

Error: handler error: [An error happened during transaction execution: InvalidNonce { tx_nonce: 17002007, ak_nonce: 17002007 }]

To make it run in parallel I spawn async tasks via tokio and put them into a task queue like this:

    let mut tasks: Vec<JoinHandle<anyhow::Result<()>>> = vec![];
    for item in items {
        tasks.push(tokio::spawn(async move {
            // ...
        }));

        if tasks.len() >= 1_000 {
            for task in tasks.drain(..10) {
                task.await??;
            }
        }
    }

    for task in tasks {
        task.await??;
    }

I might be able to give you the full repo soon™, since we're about to open source it at some point.

If you wan't to I can try to make it run with transact_async(), but it's quite some refactoring that I need to do for this to work, because in my integration tests it checks dynamically whether a transaction can be async or not based on a scenario I give it as an input, which is defined in a struct like so:

pub struct TestCase<'a> {
    pub base_token: Token<'a>,
    pub quote_token: Token<'a>,
    pub accounts: HashMap<AccountId, TestAccount>,
    pub orders: Vec<ExecutableOrderWithOutcome>,
}

pub struct ExecutableOrderWithOutcome {
    pub sender_id: AccountId,
    pub order: ExecutableOrder,
    pub outcome: Option<ExpectedOutcome>,
}

If ExecutableOrderWithOutcome#outcome is None, then it can run in parallel, otherwise it waits until all tx have finished up until that point

@ChaoticTempest
Copy link
Member

ChaoticTempest commented Dec 9, 2022

So a couple questions first:

  • what version of workspaces are you using?
    • if you're using version < 0.7.0, then there's potentially a bug you're running with reusing the same nonce depending on how your transactions are being performed in relation to new public-keys and access-keys.
  • what OS are you using? -- I mainly test on an M1 so other OS might have issues when it comes to OS-primitives used in rust.
  • How many worker_threads did you specify for tokio runtime if at all?

Also, the following processes transactions sequentially anyways for tokio spawned tasks:

    for task in tasks {
        task.await??;
    }

You can try using futures::future::join_all(tasks).await; instead for polling the tasks in intervals, and see if that helps. It's kinda weird that bug is presenting itself for sequential tasks

@Tarnadas
Copy link
Author

I'm still @workspaces 0.6.1, so that might possibly fix it. Sorry for not checking that I'm on the latest version before opening a ticket.

I'm on Linux Manjaro 22.

I did not specify worker threads, so it should be 16, 1 for each CPU.

The tasks are not run sequentially, since they are spawned via tokio::spawn or am I missing something? Otherwise a test would last forever.

@ChaoticTempest
Copy link
Member

I'm still @workspaces 0.6.1, so that might possibly fix it. Sorry for not checking that I'm on the latest version before opening a ticket.

No worries, let me know if that fixes things. If not I'll take a deeper look when i get the chance.

The tasks are not run sequentially, since they are spawned via tokio::spawn or am I missing something? Otherwise a test would last forever.

nevermind about this one. They should be running in the background.

@Tarnadas
Copy link
Author

So with workspaces >= 0.7 this error should never be displayed where tx_nonce and ak_nonce are the same?
Because I just got this failure after workspaces update

Error: unable to broadcast the transaction to the network

Caused by:
    handler error: [An error happened during transaction execution: InvalidNonce { tx_nonce: 68000016, ak_nonce: 68000016 }]

@ChaoticTempest
Copy link
Member

@Tarnadas hmm, now that's quite peculiar. If you don't mind, can I take a look at your repo to see what's up? And this is still for calls to transact correct?

@Tarnadas
Copy link
Author

Tarnadas commented Feb 6, 2023

Hey, sorry for late reply. I'm still looking into making this open source and will keep you updated once it's done.

Yes this is using transact

@ghost
Copy link

ghost commented Nov 7, 2023

Does this issue still persist @Tarnadas ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants