Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(iroh): Gossip client #2258

Merged
merged 30 commits into from
Jul 5, 2024
Merged

feat(iroh): Gossip client #2258

merged 30 commits into from
Jul 5, 2024

Conversation

rklaehn
Copy link
Contributor

@rklaehn rklaehn commented May 1, 2024

Description

This makes gossip available in iroh. Using iroh-gossip directly, while not horrible, is a bit verbose.

Breaking Changes

Not sure. It exports a few more things, and adds things. But in theory it should not modify existing things.

There should be none, and the semver test seems to agree...

Notes & open questions

  • There can be some scenarios where this can cause trouble. E.g. when subscribing and then unsubscribing to a topic that is also used for doc sync.

This ^ is taken care of, since docs no longer use subscribe_all!

How to deal with missed messages due to slow reader:

I think that relying on guaranteed delivery for gossip is not a good idea in any case. so maybe just sending a Lagged but then continuing is best? Forcing the client to resubscribe could be a bit annoying.

Change checklist

  • Self-review.
  • Documentation updates if relevant.
  • Tests if relevant.
  • All breaking changes documented.

and make a fuew things public
@rklaehn rklaehn requested a review from Frando May 1, 2024 13:23
@rklaehn
Copy link
Contributor Author

rklaehn commented May 1, 2024

I think it is safe to just ignore any messages that are not for us. See b88398e

Obviously you can mess with things by subscribing to a namespace that is currently being synced, but doing so accidentally is highly unlikely. And you can also mess with other stuff by using the low level blobs api, so 🤷 .

@rklaehn
Copy link
Contributor Author

rklaehn commented May 1, 2024

It seems that due to the fact that the rpc protocol is public, we can not ever do an additional rpc request without breaking compat.

--- failure constructible_struct_adds_field: externally-constructible struct adds field ---

Description:
A pub struct constructible with a struct literal has a new pub field. Existing struct literals must be updated to include the new field.
        ref: https://doc.rust-lang.org/reference/expressions/struct-expr.html
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.31.0/src/lints/constructible_struct_adds_field.ron

Failed in:
  field Iroh.gossip in /home/runner/work/iroh/iroh/iroh/src/client.rs:44

--- failure enum_variant_added: enum variant added on exhaustive enum ---

Description:
A publicly-visible enum without #[non_exhaustive] has a new variant.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#enum-variant-new
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.31.0/src/lints/enum_variant_added.ron

@dignifiedquire
Copy link
Contributor

we need to mark the rpc enums as non exhaustive, that should solve this

@dignifiedquire
Copy link
Contributor

longterm I wish the rpc types where all private anyway

@dignifiedquire
Copy link
Contributor

It seems that the sync engine thinks that it "owns" gossip, so it uses subscribe_all

@Frando can we change this? seems unfortunate to subscribe to everything by default

@Frando
Copy link
Member

Frando commented May 3, 2024

It seems that the sync engine thinks that it "owns" gossip, so it uses subscribe_all

@Frando can we change this? seems unfortunate to subscribe to everything by default

I think we can, yes. What is our primitive of choice these days to merge a changing list of streams?

with futures and thus its merge combinators phased out in iroh - maybe https://docs.rs/futures-concurrency/latest/futures_concurrency/stream/trait.Merge.html ?

@dignifiedquire
Copy link
Contributor

@Frando
Copy link
Member

Frando commented May 3, 2024

I did not yet give the new gossip dispatcher module a real review, but my first thought is: What about we move the gossip_dispatcher from iroh to iroh-gossip? And make it the main interface. Because then we don't need broadcast channels at all anymore. And save one channel with its buffers. The sync engine would then use the gossip dispatcher, as would the RPC handlers. And if we get things right, less code - because the sync engine also dances around the different state (but your dispatcher has the nicer impl with the state enum).

@Frando
Copy link
Member

Frando commented May 6, 2024

#2265 changes the sync engine usage of gossip to not use subscribe_all (based on this branch).
However as outlined in the previous comment I think I'd prefer to move the new gossip_dispatcher module to iroh_gossip, and make both the node's gossip client and the node's sync engine use that dispatcher.

@ramfox ramfox added this to the v0.18.0 milestone Jun 3, 2024
@rklaehn
Copy link
Contributor Author

rklaehn commented Jun 5, 2024

@Frando so the basic idea of the gossip dispatcher is just to have a thing that wraps the gossip and does some management so that it is less fragile to use. It does the join/unjoin etc. for you and deals with laggy subscribers.

If you agree with / can live with the public interface (which is just the subscribe function), we could move something with that signature into iroh-gossip as a higher level interface and then I would use it here, and you could update docs to use it (or not...).

The interface is basically this:

pub fn subscribe(
    &self,
    msg: GossipSubscribeRequest,
    updates: UpdateStream,
) -> impl Stream<Item = RpcResult<GossipSubscribeResponse>> {

where UpdateStream is just a stream of either broadcast or broadcast_neighbours:

/// Send a gossip message
#[derive(Serialize, Deserialize, Debug)]
pub enum GossipSubscribeUpdate {
    /// Broadcast a message to all nodes in the swarm
    Broadcast(Bytes),
    /// Broadcast a message to all direct neighbors
    BroadcastNeighbors(Bytes),
}

There is a hardcoded constant somewhere about how much you are allowed to lag before you are thrown out. We could make that configurable as well.

@Frando
Copy link
Member

Frando commented Jun 5, 2024

I think the API is fine. And still very much in favor of having only one and not 2 high level dispatch interfaces to gossip.

@rklaehn
Copy link
Contributor Author

rklaehn commented Jun 5, 2024

Ok, I will clean up the dispatcher a bit and make sure it is independent of the rpc protocol, in preparation for moving it into iroh-gossip.

@rklaehn
Copy link
Contributor Author

rklaehn commented Jun 5, 2024

Moved the dispatcher to iroh-gossip and moved the types into the dispatcher file.

I reexported some of the dispatcher types instead of making newtypes for them in the rpc protocol. Not 100% sure about this, but making so many single element newtypes is a drag...

One question is if the dispatcher should present a simplified interface for creation. Currently it takes a Gossip. But I don't really see how this can be made really nice. Take an endpoint and an options object?

# Conflicts:
#	Cargo.lock
#	iroh-docs/src/engine/gossip.rs
#	iroh-gossip/Cargo.toml
#	iroh/src/client.rs
#	iroh/src/node.rs
#	iroh/src/node/builder.rs
#	iroh/src/rpc_protocol.rs
@rklaehn rklaehn marked this pull request as ready for review June 6, 2024 09:11
@rklaehn
Copy link
Contributor Author

rklaehn commented Jun 6, 2024

This is missing adapting the docs stuff to the new gossip dispatcher. Other than that I think it is good to go.

- connect two nodes
- send a message from node 1
- recv it on node 2
otherwise it is impossible to write certain tests using just the client.
impl deref for backwards compat (maybe we can remove this at some point)
# Conflicts:
#	iroh/src/client.rs
this test is not meant to be a discovery test...
# Conflicts:
#	iroh/src/client.rs
@rklaehn
Copy link
Contributor Author

rklaehn commented Jul 1, 2024

@Frando I have added a test. It seems to work, now that I have added the ability to tell the endpoint about node addrs (see #2433 ).

I think this kind of test is nice, because it forces us to make the client API sufficiently rich that you can do tests with it. Also it is much less likely to fluke out than a full cli type tests.

But - do you think this test for gossip will reliably work? I am connecting two nodes and then send a msg from one to the next. Can there be a situation where I send the message too early and it gets lost?

iroh/tests/client.rs Outdated Show resolved Hide resolved
@Frando
Copy link
Member

Frando commented Jul 5, 2024

But - do you think this test for gossip will reliably work? I am connecting two nodes and then send a msg from one to the next. Can there be a situation where I send the message too early and it gets lost?

Yes, Gossip::join is awaited here, and that waits for at least one neighbor to come up before completing.

So the remaining TODO is to merge the two dispatchers (in dispatcher.rs and net.rs) so that messages flow only over a single channel and not two. We should do this, but IMO it can also happen in a followup.

@rklaehn rklaehn added this pull request to the merge queue Jul 5, 2024
Merged via the queue into main with commit b0d5413 Jul 5, 2024
31 checks passed
@rklaehn rklaehn deleted the gossip-client branch July 5, 2024 07:12
github-merge-queue bot pushed a commit that referenced this pull request Aug 5, 2024
…opic (#2570)

## Description

This PR changes the main public `iroh_gossip` to keep track of
client-side gossip subscriptions. The `net::Gossip` struct now keeps
track of client-side subscribers per topic, which are made up of a pair
of two streams/channels: from the client to the actor a stream of
updates (outgoing messages) and from the actor to the client a stream of
events (incoming messages). Once all client streams&sinks for a topic
are dropped, the topic is being quit.

This builds on the client API added in #2258, but completely removes the
`dispatcher` module, integrating its features directly into the gossip
actor. See below for a short list of the API changes. The new API can be
browsed
[here](https://n0-computer.github.io/iroh/pr/2570/docs/iroh/gossip/net/index.html).

The refactor turned out bigger than initially intended, sorry for that,
but I did not see a good way to reduce the scope.

What's still missing (can also be follow-ups)?:

- [ ] Review the new public API
- [ ] Align the client API to the iroh_gossip API. The `GossipTopic` can
be made to work on both the client and the native API, as it only deals
with streams and sinks.

## Breaking Changes

* `iroh_gossip::dispatcher` is removed with everything that was in it.
use the new API from `iroh_gossip::net::Gossip` instead (see below).
* `iroh_gossip::net::Gossip` methods changed:
  * changed: `join` now returns a `GossipTopic`
* removed: `broadcast`, `broadcast_neighbors`, `subscribe`,
`subscribe_all`, `quit`.
* for `subscribe` use `join` instead, which returns a `GossipTopic`
* for `broadcast` and `broadcast_neighbors` use the respective methods
on `GossipTopic` .
* `quit` is obsolete now, the topic will be quitted once all
`GossipTopic` handles are dropped.
      * `subscribe_all` is no longer available
* `iroh_gossip::net::JoinTopicFut` is removed (is now obsolete)

## Notes & open questions

<!-- Any notes, remarks or open questions you have to make about the PR.
-->

## Change checklist

- [x] Self-review.
- [x] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- [ ] Tests if relevant.
- [ ] All breaking changes documented.
rklaehn added a commit to n0-computer/iroh-blobs that referenced this pull request Oct 22, 2024
…and use ref-cast to eliminate them entirely (#2350)

## Description

With 4 different clients, the current approach might be OK. But we are
going to have more. E.g. gossip, see
n0-computer/iroh#2258 .

And in any case it feels weird to store the same thing multiple times.

So this replaces public fields in the iroh client with accessors and use
ref-cast to eliminate them entirely. There is now only one rpc field,
and you can switch from that to the different subsystem client views
without runtime overhead, not even an arc clone.

## Breaking Changes

Everything that uses the iroh client will have to switch from field
accesses to accessor fns.

## Notes & open questions

<!-- Any notes, remarks or open questions you have to make about the PR.
-->

## Change checklist

- [ ] Self-review.
- [ ] Documentation updates if relevant.
- [ ] Tests if relevant.
- [ ] All breaking changes documented.
matheus23 pushed a commit that referenced this pull request Nov 14, 2024
## Description

This makes gossip available in iroh. Using iroh-gossip directly, while
not horrible, is a bit verbose.

## Breaking Changes

~~Not sure. It exports a few more things, and adds things. But in theory
it should not modify existing things.~~

There should be none, and the semver test seems to agree...

## Notes & open questions

- ~~There can be some scenarios where this can cause trouble. E.g. when
subscribing and then unsubscribing to a topic that is also used for doc
sync.~~

This ^ is taken care of, since docs no longer use subscribe_all!

How to deal with missed messages due to slow reader:

I think that relying on guaranteed delivery for gossip is not a good
idea in any case. so maybe just sending a Lagged but then continuing is
best? Forcing the client to resubscribe could be a bit annoying.

## Change checklist

- [x] Self-review.
- [x] Documentation updates if relevant.
- [x] Tests if relevant.
- [x] All breaking changes documented.
matheus23 pushed a commit that referenced this pull request Nov 14, 2024
…opic (#2570)

## Description

This PR changes the main public `iroh_gossip` to keep track of
client-side gossip subscriptions. The `net::Gossip` struct now keeps
track of client-side subscribers per topic, which are made up of a pair
of two streams/channels: from the client to the actor a stream of
updates (outgoing messages) and from the actor to the client a stream of
events (incoming messages). Once all client streams&sinks for a topic
are dropped, the topic is being quit.

This builds on the client API added in #2258, but completely removes the
`dispatcher` module, integrating its features directly into the gossip
actor. See below for a short list of the API changes. The new API can be
browsed
[here](https://n0-computer.github.io/iroh/pr/2570/docs/iroh/gossip/net/index.html).

The refactor turned out bigger than initially intended, sorry for that,
but I did not see a good way to reduce the scope.

What's still missing (can also be follow-ups)?:

- [ ] Review the new public API
- [ ] Align the client API to the iroh_gossip API. The `GossipTopic` can
be made to work on both the client and the native API, as it only deals
with streams and sinks.

## Breaking Changes

* `iroh_gossip::dispatcher` is removed with everything that was in it.
use the new API from `iroh_gossip::net::Gossip` instead (see below).
* `iroh_gossip::net::Gossip` methods changed:
  * changed: `join` now returns a `GossipTopic`
* removed: `broadcast`, `broadcast_neighbors`, `subscribe`,
`subscribe_all`, `quit`.
* for `subscribe` use `join` instead, which returns a `GossipTopic`
* for `broadcast` and `broadcast_neighbors` use the respective methods
on `GossipTopic` .
* `quit` is obsolete now, the topic will be quitted once all
`GossipTopic` handles are dropped.
      * `subscribe_all` is no longer available
* `iroh_gossip::net::JoinTopicFut` is removed (is now obsolete)

## Notes & open questions

<!-- Any notes, remarks or open questions you have to make about the PR.
-->

## Change checklist

- [x] Self-review.
- [x] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- [ ] Tests if relevant.
- [ ] All breaking changes documented.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants