Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

swarm*/: Remove NetworkBehaviourEventProcess and generate OutEvent #2751

Closed
wants to merge 14 commits into from

Conversation

mxinden
Copy link
Member

@mxinden mxinden commented Jul 13, 2022

Description

This pull request does two things:

  1. Remove support for NetworkBehaviourEventProcess.

    I am of the opinion that NetworkBehaviourEventProcess is not worth the complexity, especially for newcomers. I am not aware of a use-case that can not be supported without.

  2. If user does not specify out_event = "", generate NetworkBehaviour::OutEvent enum

    Instead of requiring the user to provide an enum as the OutEvent of the derived NetworkBehaviour we can generate this enum in the procedural macro itself if not specified.

    See swarm-derive/tests/test.rs for examples how much this simplifies things.

Links to any relevant issues

Open Questions

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

1. Remove support for `NetworkBehaviourEventProcess`.

2. Generate `NetworkBehaviour::OutEvent` `enum`.
@mxinden mxinden requested a review from thomaseizinger July 13, 2022 06:05
Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been toying with a similar idea in my head. The only problem I see is that the user has to interact with generated code: the enum.

This is not ideal. Really good proc-macros like serde use a lot of generated code but it is hidden from the end-user. The problem here is that people need to effectively understand, what the macro does to imagine what the enum looks like.

I think this is an improvement on the whole but if we can somehow fix this part too, then I don't see any downsides!

What do you think of:

  • Letting the user write the enum themselves
  • Providing a custom-derive OutEvent that generates the From implementations

We can enforce the name of the event via a naming convention and remove the out_event parameter from the NB derive.

Coupled with good error messages in the custom derive, we should be able to guide the user towards the solution?

@mxinden
Copy link
Member Author

mxinden commented Jul 13, 2022

The only problem I see is that the user has to interact with generated code: the enum.

Good point. Agree that this adds mental complexity. That said, if documented well, I think it is worth the complexity.

Letting the user write the enum

I would like the user to not have to write the enum out manually. I find this tedious and error prone.

@thomaseizinger
Copy link
Contributor

Letting the user write the enum

I would like the user to not have to write the enum out manually. I find this tedious and error prone.

It is tedious, I agree. I do want to mention though that generating the enum for the user also hosts other problems:

  • It takes away the ability for the user to control the visibility: pub, pub(super), etc
  • It is harder for the user to add more derives, be it their own or well-known ones like serde
  • A user may not want to actually track all variants and discard some of them

All of these are perhaps solvable in one way or another but the bigger picture is that we are taking away a lot of control from the user by generating the enum for them.

This ties in with the "really good proc-macros" from above: If the use of a piece of generated code is completely internal to the generated-code, you are in full control of all requirements. Once you expose a piece of generated code to the user, you are opening the door to more and more feature requests like the above list.

I am not sure if the added benefit of not writing the enum (which is a one-time cost) is worth this complexity.

@dignifiedquire
Copy link
Member

Regarding the enum, I would prefer having a gradual approach. Full generation, custom enum with automatic from generations, full custom. This will reduce the boilerplate but still allow users to upgrade to fully customized versions if desired.

@mxinden
Copy link
Member Author

mxinden commented Jul 14, 2022

It takes away the ability for the user to control the visibility: pub, pub(super), etc

Why would a user want the event to have a different visibility than the behaviour?

It is harder for the user to add more derives, be it their own or well-known ones like serde

I can not think of a scenario where one would need additional derive statements. As far as I know none of the sub-behaviour events derive serde, thus deriving serde on the top level enum would be in vain, right?

(I think I am missing to add the standard (e.g. Debug, PartialEq, ...) on the generated enum. Thanks for triggering that thought!)

A user may not want to actually track all variants and discard some of them

They can do so when matching on the event returned by Swarm. Note that this wasn't possible with event-process=false either.

Regarding the enum, I would prefer having a gradual approach. Full generation, custom enum with automatic from generations, full custom. This will reduce the boilerplate but still allow users to upgrade to fully customized versions if desired.

Can you expand on the concrete use-cases for each of these @dignifiedquire?

I would like to make the easy things easy, and the hard things possible? In my opinion this pull request makes the common case really easy and thus far I don't see any uncommon use-cases being impossible.

Comment on lines 617 to 627
fn capitalize(ident: Ident) -> Ident {
let name = ident.to_string();
let mut chars = name.chars();
let capitalized = chars
.next()
.expect("Identifier to have at least one character.")
.to_uppercase()
.collect::<String>()
+ chars.as_str();
Ident::new(&capitalized, Span::call_site().into())
}
Copy link
Contributor

@elenaf9 elenaf9 Jul 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use capitalize for the Idents of our data struct fields, right? Since those are written in snake_case, shouldn't we iter through all chars, filter out "_" and capitalize the subsequent char? Otherwise we might end up with something like Gossipsub_behaviourEvent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦 Oh, good catch. Silly me.

}
_ => (),
}
let event = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not at least give the user the option to set their own OutEvent? Per default it could still be generated. This would also avoid breaking the implementation of current users that already use a custom OutEvent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which use-case would a user need their own OutEvent? Asked the other way around, which use-case would not be supported with a generated OutEvent?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think they strictly need their own OutEvent. But it may be more convenient. Especially because of the point that @thomaseizinger raised above, that is the interaction with a generated enum. As a user I'd always prefer to have my own type rather than one generated by a macro. Most likely I would parse the OutEvent into my custom type anyway, e.g. when bubbling the event up to a higher layer. So still having to deal with the generated one would probably just be extra work (understand how the enum looks like, where the variant names come from, handling changes in the enum if this variant names change because a field was renames, etc).

@rkuhn
Copy link
Contributor

rkuhn commented Jul 14, 2022

How about a hybrid approach regarding the event type: have the user spell out the precise enum they want to use (which is not much work and provides occasion for some doc comments) and do all the wiring in the proc macro.

pub(crate) enum MyEvent {
    Ping(PingEvent),
    ...
}

#[derive(NetworkBehaviour)]
#[behaviour(out_event = MyEvent)]
struct MyNet {
    #[event(Ping)]
    ping: ping::Behaviour,
    // events from this one will be ignored
    identify: Identify,
    ...
}

This could be extended to allow calling a function for turning an event into the out_event type (perhaps just form a path <out_event>::<ident> and invoke with a single argument, which covers all cases already).

@thomaseizinger
Copy link
Contributor

thomaseizinger commented Jul 15, 2022

It takes away the ability for the user to control the visibility: pub, pub(super), etc

Why would a user want the event to have a different visibility than the behaviour?

Like I tried to point out above, I raised these points as examples of usecases, there are surely more than what I currently think of but these examples show that not having control over the generated enum can be a blocker that can only be resolved by patching the derive, which will trigger PR discussions because a solution has to be designed, requiring a release, etc.

It is harder for the user to add more derives, be it their own or well-known ones like serde

I can not think of a scenario where one would need additional derive statements. As far as I know none of the sub-behaviour events derive serde, thus deriving serde on the top level enum would be in vain, right?

You can't say that for sure. I user could use this derive with only behaviours that they write themselves where all events derive serde. Again, this example was meant as a thought experiment rather than a concrete problem that we need to solve.

(I think I am missing to add the standard (e.g. Debug, PartialEq, ...) on the generated enum. Thanks for triggering that thought!)

I think the above point may actually result in a real problem here: We don't which behaviours a user will compose with this macro and thus, we cannot possibly know, which "standard" derives to add. E.g. I don't think all our events will implement Ord so we can't derive that. But what if a user wants to have their OutEvent implement Ord and all their events implement Ord?

As much as I'd love for ergonomics of this macro to get better, I don't think removing control over the out event completely is a good solution.

How about the following (I've outlined it above briefly already, but here in more detail):

  1. Define the convention on what the out-event should be called (we can infer this from the NB name)
  2. Provide a derive(OutEvent) proc-macro that implements From for all branches

For example:

#[derive(NetworkBehaviour)]
struct Foo {
    ping: libp2p::ping::Ping,
}

#[derive(OutEvent)]
enum FooEvent {
	Ping(libp2p::ping::PingEvent)
}
  • The boilerplate is minimal (writing the enum is a one-time cost)
  • The macro is simple because we don't need any additional configuration options
  • We can enforce the conversion to the enum via From in the NB-derive macro
  • derive(OutEvent) would generate the From implementations
  • The user has full control over the enum without having to bother us with feature-requests

@rkuhn
Copy link
Contributor

rkuhn commented Jul 15, 2022

What would be the difference between #[derive(OutEvent)] and #[derive(derive_more::From)]?

@thomaseizinger
Copy link
Contributor

What would be the difference between #[derive(OutEvent)] and #[derive(derive_more::From)]?

Functionally, nothing but it:

  • saves users from adding another dependency
  • allows us to "hide" some of the details on how things work internally which should make things easier to understand for newcomers

@mxinden
Copy link
Member Author

mxinden commented Jul 18, 2022

Meta level: Thanks for all the input here!

With the above arguments, agreed that there should be a way to bring your own OutEvent.

Define the convention on what the out-event should be called (we can infer this from the NB name)

I would prefer making this explicit instead of inferring the enum based on its name. Thus I am would be in favor of sticking with #[behaviour(out_event = MyEvent)].

Provide a derive(OutEvent) proc-macro that implements From for all branches

Haven't put enough thought into this. Off the top of my head I would prefer #[derive(derive_more::From)] as suggested by @rkuhn. I too want to reduce the amount of dependencies in rust-libp2p, but in this case the dependency is just a perfect match.

allows us to "hide" some of the details on how things work internally which should make things easier to understand for newcomers

I expect users to be very familiar with the From trait. Thus I am not sure the custom generation is easier to understand than the well known From concept.


I would like to make the following suggestion. Note that I am still suggesting to additionally support the lazy case, where the OutEvent is generated by the procedural macro.

  • Have the user write their own enum used as the OutEvent.
  • Have the user specify the custom OutEvent in #[behaviour(out_event = MyEvent)].
  • Require OutEvent to implement From. Document that users can use derive_more.
  • If the user does not specify an out_event, generate the enum for the user.

What do you all think? Would you be fine with the option to generate the OutEvent if there is the alternative to bring your own?

@rkuhn
Copy link
Contributor

rkuhn commented Jul 18, 2022

That proposal covers all aspects but one — and I’m not certain that it is an important one so it might be fine: impl From can only distinguish based on source type, so if there are two nested behaviours that emit the same event type (e.g. two instances of a behaviour, parameterised differently) then the only way to distinguish their events is to use the generated OutEvent option. One way to fix this retroactively (i.e. if the situation actually does come up frequently enough) is to consider using From the default and handling a new attribute in the derive macro to customise the transformation.

So +1 from me :-)

@thomaseizinger
Copy link
Contributor

thomaseizinger commented Jul 20, 2022

Define the convention on what the out-event should be called (we can infer this from the NB name)

I would prefer making this explicit instead of inferring the enum based on its name. Thus I am would be in favor of sticking with #[behaviour(out_event = MyEvent)].

Convention over configuration is a well established way of dealing with such problems but I if people feels strongly about wanting to configure the enum name, then I am happy to go with that.

As a user, I wouldn't care about the name, on the contrary, having the enum enforce it creates consistency!

Provide a derive(OutEvent) proc-macro that implements From for all branches

Haven't put enough thought into this. Off the top of my head I would prefer #[derive(derive_more::From)] as suggested by @rkuhn. I too want to reduce the amount of dependencies in rust-libp2p, but in this case the dependency is just a perfect match.

Implementation-wise it is a perfect match yes. I suggested this because it hides what people have to understand about this derive. From by itself is easy to understand. How the compiler or a macro composes together a tree of poll-style state machines is not. But to understand why you need From you need to think about that.

Contrarily, people are used to macros working in a way of: "Put this sticker here, this other one there and leave the rest to us!"

It also gives as flexibility to add things to the derive later if we ever need it. Not sure how good of an argument that is.

  • Have the user write their own enum used as the OutEvent.
  • Have the user specify the custom OutEvent in #[behaviour(out_event = MyEvent)].
  • Require OutEvent to implement From. Document that users can use derive_more.
  • If the user does not specify an out_event, generate the enum for the user.

What do you all think? Would you be fine with the option to generate the OutEvent if there is the alternative to bring your own?

I am okay with this solution too but I also think that it carries a higher complexity because there are more moving parts a user needs to understand.

Bottom line: These are my points, I'll leave the decision to you :)

@elenaf9
Copy link
Contributor

elenaf9 commented Jul 23, 2022

@mxinden:

I would like to make the following suggestion. Note that I am still suggesting to additionally support the lazy case, where the OutEvent is generated by the procedural macro.

  • Have the user write their own enum used as the OutEvent.
  • Have the user specify the custom OutEvent in #[behaviour(out_event = MyEvent)].
  • Require OutEvent to implement From. Document that users can use derive_more.
  • If the user does not specify an out_event, generate the enum for the user.

What do you all think? Would you be fine with the option to generate the OutEvent if there is the alternative to bring your own?

👍 sounds good to me.

@thomaseizinger:

The macro is simple because we don't need any additional configuration options

How about the following (I've outlined it above briefly already, but here in more detail):

1. Define the convention on what the out-event should be called (we can infer this from the NB name)

2. Provide a `derive(OutEvent)` proc-macro that implements `From` for all branches

For example:

#[derive(NetworkBehaviour)]
struct Foo {
    ping: libp2p::ping::Ping,
}

#[derive(OutEvent)]
enum FooEvent {
	Ping(libp2p::ping::PingEvent)
}

(...)
- The macro is simple because we don't need any additional configuration options

We would still need the configuration option to inform the macro that we use a custom event, no?
I am also more in favor of making the enum name explicit.

@mxinden
Copy link
Member Author

mxinden commented Jul 26, 2022

I would like to make the following suggestion. Note that I am still suggesting to additionally support the lazy case, where the OutEvent is generated by the procedural macro.

* Have the user write their own `enum` used as the `OutEvent`.

* Have the user specify the custom `OutEvent` in `#[behaviour(out_event = MyEvent)]`.

* Require `OutEvent` to implement `From`. Document that users can use `derive_more`.

* If the user does not specify an `out_event`, generate the `enum` for the user.

What do you all think? Would you be fine with the option to generate the OutEvent if there is the alternative to bring your own?

I adjusted the pull request to implement the suggestion above. I would very much appreciate reviews.


Regarding #[derive(OutEvent)] vs #[derive(derive_more::From)], I don't have a strong opinion on this. This pull request does not implement the former, nor does it use the latter. Still I think the pull request adds value in its status quo, thus I am suggesting to merge without either of the two. Is that OK for folks?

@thomaseizinger
Copy link
Contributor

Answering for completness, not to question the implementation

- The macro is simple because we don't need any additional configuration options

We would still need the configuration option to inform the macro that we use a custom event, no?

@elenaf9 The idea was to always require a custom event, i.e. to not auto-generate anything but expect the user to provide a type in the same module that follows the naming convention of NetworkBehaviourName + Event, i.e.FooBehaviourEvent for a NB FooBehaviour.

With this assumption, no configuration would be necessary.

Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a nice, coherent improvement.

@@ -20,6 +20,7 @@

#![recursion_limit = "256"]

use heck::ToUpperCamelCase;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment on lines +509 to +518
let generate_event_match_arm = {
// If the `NetworkBehaviour`'s `OutEvent` is generated by the derive macro, wrap the sub
// `NetworkBehaviour` `OutEvent` in the variant of the generated `OutEvent`. If the
// `NetworkBehaviour`'s `OutEvent` is provided by the user, use the corresponding `From`
// implementation.
let into_out_event = if out_event_definition.is_some() {
quote! { #out_event_name::#event_variant(event) }
} else {
quote! { event.into() }
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially I wanted to use From in both cases, though Rust has issues with the implementation below:

impl From<<SomeBehaviour as NetworkBehaviour>::OutEvent> for GeneratedOutEvent

Namely with <SomeBehaviour as NetworkBehaviour>::OutEvent not being specific enough.

Comment on lines 37 to 38
#[derive(NetworkBehaviour)]
struct MyBehaviour {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing the reference to the custom out event.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦 thanks for the catch.

@@ -201,18 +166,27 @@ fn custom_event_no_polling() {
fn custom_event_and_polling() {
#[allow(dead_code)]
#[derive(NetworkBehaviour)]
#[behaviour(poll_method = "foo", out_event = "String", event_process = true)]
#[behaviour(poll_method = "foo", out_event = "MyEvent")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need the poll_method config? I don't think there is a usecase now that we removed NetworkBehaviourEventProcess.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need neither poll_method nor #[behaviour(ignore)], though I would start this discussion once this is merged. Does that sound good to you @thomaseizinger or would you prefer removing the two within this pull request?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right.

We might also want to consider getting a release out first. With NetworkBehaviourEventProcess being removed, the removal of poll_method and ignore should be easy, i.e. most people will have changed their code already.

Actually, what do you think of kicking out a release before this PR to deprecate NetworkBehaviourEventProcess? Depending on the design of the application, changing this could be a serious effort and just removing it may block users from upgrading for a while.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, what do you think of kicking out a release before this PR to deprecate NetworkBehaviourEventProcess? Depending on the design of the application, changing this could be a serious effort and just removing it may block users from upgrading for a while.

How would the two scenarios below differ in terms of user experience?

  1. We deprecate NetworkBehaviourEventProcess, cut a release, users upgrade removing their NetworkBehaviourEventProcess implementations, we merge here, cut a second release, users update to the second release.
  2. We merge here, cut a release, users upgrade removing their NetworkBehaviourEventProcess implementations.

Is (2) not just simpler than (1)? Are there any changes in master that users need in a timely fashion where the removal of NetworkBehaviourEventProcess would add a painful delay @thomaseizinger?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference is that deprecation warnings can be resolved incrementally whereas compile errors have to be fixed in one go.

I don't have a strong opinion on it but I do think that deprecating first will be easier for users.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on the application, I could imagine this change to require some serious work. I think you actually once mentioned a case where a libp2p-powered app basically had its entire app logic in these callbacks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 makes sense to me. Thanks @thomaseizinger. In addition, it might surface a use-case that we have not thought about and that might not be supported by out_event.

Want to take a look at #2784?

mxinden added a commit to mxinden/rust-libp2p that referenced this pull request Jul 31, 2022
mxinden added a commit that referenced this pull request Aug 16, 2022
mxinden added a commit to mxinden/rust-libp2p that referenced this pull request Aug 24, 2022
Removes the `NetworkBehaviourEventProcess` and all its associated logic.

See deprecation pull request libp2p#2784.

Find rational in libp2p#2751.
@mxinden
Copy link
Member Author

mxinden commented Aug 24, 2022

Replacing this pull request with #2840 on top of the latest v0.47.0 release.

@mxinden mxinden closed this Aug 24, 2022
mxinden added a commit that referenced this pull request Aug 26, 2022
Removes the `NetworkBehaviourEventProcess` and all its associated logic.

See deprecation pull request #2784.

Find rational in #2751.
umgefahren pushed a commit to umgefahren/rust-libp2p that referenced this pull request Mar 8, 2024
…2p#2840)

Removes the `NetworkBehaviourEventProcess` and all its associated logic.

See deprecation pull request libp2p#2784.

Find rational in libp2p#2751.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants