[CX_CLEANUP] Integrate new `Task`, in the code and remove the old #2493

bfish713 · 2024-01-31T22:41:15Z

This PR:

The main goal of this PR is to replace the ChannelStream with async_broadcast channels. To do that it means we have to replace everything in crates/task because all of the task infrastructure was heavily tied to ChannelStream. The previous PR: #2500 is for the implementation of the Task replacement.

There were 3 areas that needed refactoring, Initialization and startup, the task_impls crate, and the testing architecture. I'll describe the changes below:

Initialization - We can completely get rid of the task builders and generators that were there before. Since a task is just a state struct and a Sender, Receiver pair all we need to do is create the tasks states for each task and give them a clone of the event stream senders and receivers. Further since each TaskState we create implements filter_event and handle_event we don't need to construct those function objects during task creation. Finally the TaskRegistry is simplified such that it just collects the JoinHandles of each task so we only have to do: task_registry.register(task.run()).await which spawns the task and registers the JoinHandle. Register is async not because it's waiting for the task to complete, but because it internally is locking the collection of tasks to insert the new one, the task is async_spawned.
Task_impls - The main changes here are to implement TaskState for each ...TaskState types (e.g. ConsensusTaskState). This is pretty simple, just calling handle_event function we already had implemented, and not returning the state anymore. Now we have an &mut self so we just modify the our state during handling. The other thing is to replace the event_stream.pulish with the new event_broadcast. This should is functionally equivalent from the tasks perspective.
Testing infra - This was probably the most tricky part. I got rid of the builder pattern of passing closures to create each task. Instead test runner just creates each test task directly using the metadata. Unlike with task_impls the actual test tasks were not all able to be converted to TestTaskState. For the more complex one we implement that trait, but for completion task and transaction task it's simpler to just create simple loop and spawn that. Finally for unit tests we are able finally not have to duplicate the input on the output. We simply create the channels such that the test sends into the task and the task we are testing sends back to the test. This means the task won't receive its own events but I think that's ok since we should be controlling the input events ourselves.

Some other things to note:

The mechanism for shutdown is pretty different. We do have the task_registry which could do shutdown but probably we shouldn't need that. In most case sending the the Shutdown event should cause the tasks to exit, may need to do something a bit more complex but right now we don't perform any extra steps on shutdown anyway.
There is now no more sleep inside the task architecture. This means tasks should yield to each other and wakeup with new events on the input stream naturally. Theoretically I'd expect this to faster than before. In actuality it's probably negligible since we are sleeping in the webserver between polls which probably dominated the task sleeps.
A Sender by default waits for there to be an active receiver. This is fine for the main event stream since every task will have an active receiver (all async_broadcast::Recerivers are active when created). However for the output events we have a choice. If we keep an active Receiver around then it will queue up messages until it overflows (or blocks until the somebody calls recv on it, we can choose this mode). The next person to receive from it (really clone from it will get an overflow error first then the oldest message in the queue. Or we can keep and InactiveReceiver and the sender will get an error if there is no active receiver. I choose this approach because it means if the application clones the receiver (calls get_event_stream) then it won't have to deal with an error and a bunch of arbitrarily old events. It's a bit in the weeds but seeing InactiveReceiver might be otherwise confusing.
I got rid of two test files that were empty/commented out. I think this changes things enough that if we want to implement those we're probably just was well starting from a clean slate then some broken commented out code
Task IDs and Names are not a thing anymore. The registry could handle those but I don't really see a use for them.
This will only slightly break the sequencer. Receiver impls Stream so they can continue to using the event_stream from get_event_stream like a stream, but it won't have an ID anymore. That's fine since they don't use that or could create their own ids if they really wanted to when they get the stream.

I was hoping to be able to break this up but it's kind of all or nothing since ChannelStream was the fabric that tied everything together and now it's replaced.

This PR does not:

The internal logic of the tests and consensus should not be changed logically, it should only be how we run tasks and pass messages between them.

Key places to review:

All the asks in crates/task_impl.rs and crates/testing/source.
All the places we create and spawn tasks: SystemContex::run_tasks, crates/hotshot/src/tasks/mods.rs, crates/testing/src/test_runner.rs, and the new event stream interface in crates/hotshot/src/types/handle.rs

DO NOT REVIEW /crates/task HERE. REVIEW THEM IN: #2500

shenkeyao

Looks good! Not really requesting changes, but having some questions I'd like to ask before approving the PR.

crates/task-impls/src/consensus.rs

crates/task-impls/src/harness.rs

crates/task-impls/src/network.rs

crates/testing/src/test_launcher.rs

crates/hotshot/src/tasks/mod.rs

crates/hotshot/src/lib.rs

crates/testing/tests/consensus_task.rs

crates/hotshot/src/tasks/mod.rs

crates/hotshot/src/lib.rs

crates/hotshot/src/tasks/mod.rs

crates/task-impls/src/consensus.rs

crates/task-impls/src/view_sync.rs

crates/task/Cargo.toml

crates/task/src/dependency.rs

crates/task/src/task.rs

DieracDelta · 2024-02-09T10:29:49Z

Generally looks great!! Love the simplification, and the dependency trait logic. I only had a couple of small nits

shenkeyao

Looks good, though some nit comments from Justin aren't resolved yet.

shenkeyao · 2024-02-09T19:38:30Z

crates/task-impls/src/vid.rs

+        event: Self::Event,
+        task: &mut Task<Self>,
+    ) -> Option<HotShotTaskCompleted> {
+        // TODO: Don't clone the sender


Nit: I'm resolving this conversation since the TODO is still here.

bfish713 added 22 commits January 25, 2024 15:37

lint up to missing docs in task-impls

3cb5652

All builds but the tests~

1500c9e

All builds but the tests in testing

fa647ed

linting

ad49d46

Test can run, but they fail

4def208

some fixes

76e1590

cargo fix

da50e0a

replace broadcast with broadcast_direct

224fdb1

test almost running correctly, maybe not passing

016b36a

Add Tasks to project, success test passes

b30effd

test can pass

81a8e4d

fixing tests, basic passes except unit tests

4e1718f

fixed consensus task test

d37c03d

fixing other task tests

31dfea7

Remove more cruft

2b39c50

replace event sending with helper for better errors

a7cdad9

Merge remote-tracking branch 'origin/main' into bf/new-task-design

81871b4

lint after merge

b56f870

fixing unit tests and lints for both async types

ca58069

remove superfluous log and fix doc

7350604

Remove Cargo.lock from tasks

1c24942

cleanup

992f6b6

bfish713 requested review from DieracDelta, shenkeyao and rob-maron February 1, 2024 04:05

bfish713 mentioned this pull request Feb 1, 2024

[CX_CLEANUP] Create new Task Abstraction async_broadcast Channels #2500

Closed

bfish713 changed the title ~~Bf/new task design~~ [CX_CLEANUP] Integrate new Task, in the code and remove the old Feb 1, 2024

Merge remote-tracking branch 'origin/main' into bf/new-task-design

303aca3

bfish713 marked this pull request as ready for review February 1, 2024 20:49

bfish713 requested a review from elliedavidson as a code owner February 1, 2024 20:49

shenkeyao reviewed Feb 6, 2024

View reviewed changes

bfish713 added 2 commits February 6, 2024 15:09

Addressing comments

8a0dec0

Merge remote-tracking branch 'origin/main' into bf/new-task-design

5c9c19d

bfish713 requested a review from shenkeyao February 7, 2024 15:38

bfish713 mentioned this pull request Feb 7, 2024

[Tech Debt] - Remove # Panics from add_*_task functions #2542

Closed

ss-es reviewed Feb 7, 2024

View reviewed changes

crates/hotshot/src/tasks/mod.rs Outdated Show resolved Hide resolved

bfish713 added 4 commits February 7, 2024 13:19

Merge remote-tracking branch 'origin/main' into bf/new-task-design

e276c1b

Update tasks based on feedback

e0065a0

remove outdated # Panics doc comments

a02f0ec

missed on # Panics

93e7913

DieracDelta reviewed Feb 9, 2024

View reviewed changes

bfish713 added 5 commits February 9, 2024 05:56

Reveiw comments minus filters

e8cba7f

Lower a log level, fix task/cargo.toml

f200a95

Merge remote-tracking branch 'origin/main' into bf/new-task-design

edc820a

Remove todo

2fefff5

Merge remote-tracking branch 'origin/main' into bf/new-task-design

6c28e15

bfish713 requested review from ss-es and DieracDelta February 9, 2024 16:48

shenkeyao previously approved these changes Feb 9, 2024

View reviewed changes

bfish713 dismissed shenkeyao’s stale review via 2fefff5 February 9, 2024 20:32

shenkeyao previously approved these changes Feb 9, 2024

View reviewed changes

bfish713 dismissed shenkeyao’s stale review via 6c28e15 February 9, 2024 21:19

DieracDelta approved these changes Feb 9, 2024

View reviewed changes

bfish713 merged commit 5503a2c into main Feb 10, 2024
14 checks passed

bfish713 deleted the bf/new-task-design branch February 10, 2024 00:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[CX_CLEANUP] Integrate new `Task`, in the code and remove the old #2493

[CX_CLEANUP] Integrate new `Task`, in the code and remove the old #2493

bfish713 commented Jan 31, 2024 •

edited

Loading

shenkeyao left a comment

DieracDelta commented Feb 9, 2024 •

edited

Loading

shenkeyao left a comment

shenkeyao Feb 9, 2024

[CX_CLEANUP] Integrate new Task, in the code and remove the old #2493

[CX_CLEANUP] Integrate new Task, in the code and remove the old #2493

Conversation

bfish713 commented Jan 31, 2024 • edited Loading

This PR:

This PR does not:

Key places to review:

shenkeyao left a comment

Choose a reason for hiding this comment

DieracDelta commented Feb 9, 2024 • edited Loading

shenkeyao left a comment

Choose a reason for hiding this comment

shenkeyao Feb 9, 2024

Choose a reason for hiding this comment

[CX_CLEANUP] Integrate new `Task`, in the code and remove the old #2493

[CX_CLEANUP] Integrate new `Task`, in the code and remove the old #2493

bfish713 commented Jan 31, 2024 •

edited

Loading

DieracDelta commented Feb 9, 2024 •

edited

Loading