Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CX_CLEANUP] Integrate new Task, in the code and remove the old #2493

Merged
merged 34 commits into from
Feb 10, 2024

Conversation

bfish713
Copy link
Collaborator

@bfish713 bfish713 commented Jan 31, 2024

Closes #2428

This PR:

The main goal of this PR is to replace the ChannelStream with async_broadcast channels. To do that it means we have to replace everything in crates/task because all of the task infrastructure was heavily tied to ChannelStream. The previous PR: #2500 is for the implementation of the Task replacement.

There were 3 areas that needed refactoring, Initialization and startup, the task_impls crate, and the testing architecture. I'll describe the changes below:

  • Initialization - We can completely get rid of the task builders and generators that were there before. Since a task is just a state struct and a Sender, Receiver pair all we need to do is create the tasks states for each task and give them a clone of the event stream senders and receivers. Further since each TaskState we create implements filter_event and handle_event we don't need to construct those function objects during task creation. Finally the TaskRegistry is simplified such that it just collects the JoinHandles of each task so we only have to do: task_registry.register(task.run()).await which spawns the task and registers the JoinHandle. Register is async not because it's waiting for the task to complete, but because it internally is locking the collection of tasks to insert the new one, the task is async_spawned.
  • Task_impls - The main changes here are to implement TaskState for each ...TaskState types (e.g. ConsensusTaskState). This is pretty simple, just calling handle_event function we already had implemented, and not returning the state anymore. Now we have an &mut self so we just modify the our state during handling. The other thing is to replace the event_stream.pulish with the new event_broadcast. This should is functionally equivalent from the tasks perspective.
  • Testing infra - This was probably the most tricky part. I got rid of the builder pattern of passing closures to create each task. Instead test runner just creates each test task directly using the metadata. Unlike with task_impls the actual test tasks were not all able to be converted to TestTaskState. For the more complex one we implement that trait, but for completion task and transaction task it's simpler to just create simple loop and spawn that. Finally for unit tests we are able finally not have to duplicate the input on the output. We simply create the channels such that the test sends into the task and the task we are testing sends back to the test. This means the task won't receive its own events but I think that's ok since we should be controlling the input events ourselves.

Some other things to note:

  • The mechanism for shutdown is pretty different. We do have the task_registry which could do shutdown but probably we shouldn't need that. In most case sending the the Shutdown event should cause the tasks to exit, may need to do something a bit more complex but right now we don't perform any extra steps on shutdown anyway.
  • There is now no more sleep inside the task architecture. This means tasks should yield to each other and wakeup with new events on the input stream naturally. Theoretically I'd expect this to faster than before. In actuality it's probably negligible since we are sleeping in the webserver between polls which probably dominated the task sleeps.
  • A Sender by default waits for there to be an active receiver. This is fine for the main event stream since every task will have an active receiver (all async_broadcast::Recerivers are active when created). However for the output events we have a choice. If we keep an active Receiver around then it will queue up messages until it overflows (or blocks until the somebody calls recv on it, we can choose this mode). The next person to receive from it (really clone from it will get an overflow error first then the oldest message in the queue. Or we can keep and InactiveReceiver and the sender will get an error if there is no active receiver. I choose this approach because it means if the application clones the receiver (calls get_event_stream) then it won't have to deal with an error and a bunch of arbitrarily old events. It's a bit in the weeds but seeing InactiveReceiver might be otherwise confusing.
  • I got rid of two test files that were empty/commented out. I think this changes things enough that if we want to implement those we're probably just was well starting from a clean slate then some broken commented out code
  • Task IDs and Names are not a thing anymore. The registry could handle those but I don't really see a use for them.
  • This will only slightly break the sequencer. Receiver impls Stream so they can continue to using the event_stream from get_event_stream like a stream, but it won't have an ID anymore. That's fine since they don't use that or could create their own ids if they really wanted to when they get the stream.

I was hoping to be able to break this up but it's kind of all or nothing since ChannelStream was the fabric that tied everything together and now it's replaced.

This PR does not:

The internal logic of the tests and consensus should not be changed logically, it should only be how we run tasks and pass messages between them.

Key places to review:

All the asks in crates/task_impl.rs and crates/testing/source.
All the places we create and spawn tasks: SystemContex::run_tasks, crates/hotshot/src/tasks/mods.rs, crates/testing/src/test_runner.rs, and the new event stream interface in crates/hotshot/src/types/handle.rs

DO NOT REVIEW /crates/task HERE. REVIEW THEM IN: #2500

@bfish713 bfish713 changed the title Bf/new task design [CX_CLEANUP] Integrate new Task, in the code and remove the old Feb 1, 2024
@bfish713 bfish713 marked this pull request as ready for review February 1, 2024 20:49
Copy link
Member

@shenkeyao shenkeyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Not really requesting changes, but having some questions I'd like to ask before approving the PR.

crates/task-impls/src/consensus.rs Show resolved Hide resolved
crates/task-impls/src/consensus.rs Show resolved Hide resolved
crates/task-impls/src/harness.rs Outdated Show resolved Hide resolved
crates/task-impls/src/network.rs Outdated Show resolved Hide resolved
crates/task-impls/src/network.rs Show resolved Hide resolved
crates/testing/src/test_launcher.rs Outdated Show resolved Hide resolved
crates/hotshot/src/tasks/mod.rs Outdated Show resolved Hide resolved
crates/hotshot/src/lib.rs Outdated Show resolved Hide resolved
crates/testing/tests/consensus_task.rs Show resolved Hide resolved
crates/hotshot/src/lib.rs Outdated Show resolved Hide resolved
crates/hotshot/src/tasks/mod.rs Show resolved Hide resolved
crates/hotshot/src/tasks/mod.rs Show resolved Hide resolved
crates/task-impls/src/consensus.rs Show resolved Hide resolved
crates/task-impls/src/consensus.rs Outdated Show resolved Hide resolved
crates/task-impls/src/view_sync.rs Show resolved Hide resolved
crates/task/Cargo.toml Outdated Show resolved Hide resolved
crates/task/src/dependency.rs Show resolved Hide resolved
crates/task/src/task.rs Show resolved Hide resolved
crates/task/src/task.rs Outdated Show resolved Hide resolved
@DieracDelta
Copy link
Contributor

DieracDelta commented Feb 9, 2024

Generally looks great!! Love the simplification, and the dependency trait logic. I only had a couple of small nits

@bfish713 bfish713 requested review from ss-es and DieracDelta February 9, 2024 16:48
shenkeyao
shenkeyao previously approved these changes Feb 9, 2024
Copy link
Member

@shenkeyao shenkeyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, though some nit comments from Justin aren't resolved yet.

event: Self::Event,
task: &mut Task<Self>,
) -> Option<HotShotTaskCompleted> {
// TODO: Don't clone the sender
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'm resolving this conversation since the TODO is still here.

shenkeyao
shenkeyao previously approved these changes Feb 9, 2024
@bfish713 bfish713 merged commit 5503a2c into main Feb 10, 2024
14 checks passed
@bfish713 bfish713 deleted the bf/new-task-design branch February 10, 2024 00:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CX-CLEANUP] - Replace ChannelStream And Task with new async_broadcast and new HotShotTask
4 participants