network/litep2p: High number of connection errors during handshake #5239

Open
lexnv opened this issue Aug 5, 2024 · 0 comments

lexnv commented Aug 5, 2024

Litep2p reports a high number of connection errors with reason "other".
The count is considerably higher than with the libp2p network backend.

Screenshot 2024-08-05 at 13 30 58

lexnv added this to Networking Aug 5, 2024
lexnv added a commit to paritytech/litep2p that referenced this issue Aug 21, 2024
… error reporting (#206)

The purpose of this PR is to pave the way for making the Identify
protocol more robust. Its current behavior is linked to the low number of
peers and to connectivity issues over a long period of time:
- paritytech/polkadot-sdk#4925

This PR adds a coherent `DialError` that exposes the minimal information
users need to know about dial failures.
- paritytech/polkadot-sdk#5239

A new litep2p event is added for reporting multiple dial errors that
occur on different protocols back to the user:

```rust
    /// A list of multiple dial failures.
    ListDialFailures {
        /// List of errors.
        ///
        /// Depending on the transport, the address might be different for each error.
        errors: Vec<(Multiaddr, DialError)>,
    },
```

This event eases the debugging of Substrate connectivity issues. At the
same time, it can be used in a future PR to report back to the Identify
protocol which self-reported addresses of some peers are unreachable:
- #203
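
As a rough illustration of how a consumer might use such an event, the sketch below counts dial failures per reason and collects the addresses that failed. It is not litep2p's actual API: `Multiaddr` is replaced by a `String` alias so the snippet compiles without external crates, the `DialError` variants are hypothetical placeholders, and `Event` and `summarize` are names invented for this example.

```rust
use std::collections::BTreeMap;

// Stand-in for the real `Multiaddr` type (assumption: a plain string is
// enough for illustration).
type Multiaddr = String;

// Hypothetical, simplified dial error. The variants of the real
// `DialError` are not shown in this PR description.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum DialError {
    Timeout,
    ConnectionRefused,
    NegotiationFailed,
}

// Hypothetical event enum holding only the variant quoted above.
enum Event {
    ListDialFailures { errors: Vec<(Multiaddr, DialError)> },
}

/// Counts failures per error reason and collects every address that failed.
fn summarize(event: &Event) -> (BTreeMap<DialError, usize>, Vec<Multiaddr>) {
    let mut per_reason = BTreeMap::new();
    let mut unreachable = Vec::new();

    match event {
        Event::ListDialFailures { errors } => {
            for (address, error) in errors {
                // One counter per reason, instead of a single "other" bucket.
                *per_reason.entry(error.clone()).or_insert(0) += 1;
                // Candidate addresses to report as unreachable.
                unreachable.push(address.clone());
            }
        }
    }

    (per_reason, unreachable)
}
```

The per-reason counts could feed a metric with one label per failure kind, and the collected addresses are the kind of input a later change could pass back to the Identify protocol.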

### Next Steps
- Add more tests
- Run warp sync and sync full nodes, since this PR touches individual
transports

### Future Work
- The overarching `litep2p::Error` needs a closer look and a
refactoring:
  - #204
  - #128
  
- ConnectionError event for individual transports can be simplified:
  - #205
  
- I've observed some inconsistencies in handling TCP vs WebSocket
connection timeouts. I believe that we can have another pass and share
even more code between them:
  - #70

---------

Signed-off-by: Alexandru Vasile <[email protected]>
Co-authored-by: Dmitry Markin <[email protected]>