Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(iroh-gossip): connection loop misuses tokio::select! leading to read errors #2572

Merged
merged 1 commit into from
Aug 1, 2024

Conversation

Frando
Copy link
Member

@Frando Frando commented Aug 1, 2024

Description

The connection loop of iroh-gossip misused tokio-select by selecting over a future that is not cancellation safe. This means that if the timings are bad, the message reading future would be aborted midway in a message, and then restart by reading a length, which would then yield some random number because it would be reading some random bytes in the middle of a message. This means it would lead to random connection drops.

Breaking Changes

Backport from #2570

Notes & open questions

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.

Copy link

github-actions bot commented Aug 1, 2024

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/2572/docs/iroh/

Last updated: 2024-08-01T14:35:35Z

@Frando Frando marked this pull request as ready for review August 1, 2024 14:46
@Frando Frando changed the title fix: connection loop was not cancellation safe fix(iroh-gossip): connection loop was not cancellation safe Aug 1, 2024
@Frando Frando changed the title fix(iroh-gossip): connection loop was not cancellation safe fix(iroh-gossip): connection loop misuses tokio::select! leading to read errors Aug 1, 2024
Copy link
Contributor

@dignifiedquire dignifiedquire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we have this bug in other parts of the stack as well, iroh-net relay handshakes comes to mind

@Frando Frando added this pull request to the merge queue Aug 1, 2024
Merged via the queue into main with commit 32bb0f3 Aug 1, 2024
35 checks passed
@dignifiedquire dignifiedquire deleted the fix-gossip-cancel-safe branch August 1, 2024 19:03
@matheus23
Copy link
Contributor

We use tokio_util::codec::Framed everywhere else we do something like this. Perhaps we should do so here, too?
Framed implements Sink and Stream both in a cancel-safe manner, according to its docs.

I wonder if we have this bug in other parts of the stack as well, iroh-net relay handshakes comes to mind

Erm. Now I do, too. What exactly are you referring to, though?
We have iroh-net::relay::codec::read_frame, which is only used in recv_client_key, which is (IIUC), never used in a tokio::select!.
Is it something else you're thinking about?

@Frando
Copy link
Member Author

Frando commented Aug 2, 2024

Yeah, using framed makes sense here likely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants