[CHORE] Refactor shuffles to use a unified ShuffleExchange PhysicalPlan variant #3083
Conversation
CodSpeed Performance Report: Merging #3083 will not alter performance.

Codecov Report. Attention: Patch coverage is

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##             main    #3083      +/-   ##
==========================================
+ Coverage   78.73%   78.77%   +0.03%
==========================================
  Files         623      619       -4
  Lines       74710    74621      -89
==========================================
- Hits        58826    58784      -42
+ Misses      15884    15837      -47
```
```rust
Self::Aggregate(Aggregate { input, groupby, .. }) => {
    let input_clustering_spec = input.clustering_spec();
    if groupby.is_empty() {
        ClusteringSpec::Unknown(Default::default()).into()
```
NOTE: The previous implementation seems to have been a bug that was somehow masked by the previous implementation of FanoutMap -> ReduceMerge. I implemented a fix which I think is appropriate.
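To make the fix concrete, here is a minimal, self-contained sketch (with hypothetical simplified types, not Daft's actual `ClusteringSpec` API) of the rule being applied: a global aggregate (empty `groupby`) collapses to a single output and cannot preserve any upstream clustering, so its spec must fall back to Unknown, while a group-by aggregate is clustered by its keys.

```rust
// Hypothetical simplified stand-in for Daft's ClusteringSpec.
#[derive(Debug, Clone, PartialEq)]
enum ClusteringSpec {
    Unknown,
    Hash(Vec<String>), // clustered by these column names
}

// Sketch of the fixed logic: empty groupby => upstream clustering is lost.
fn aggregate_clustering_spec(
    _input_spec: &ClusteringSpec,
    groupby: &[String],
) -> ClusteringSpec {
    if groupby.is_empty() {
        // Global aggregation produces a single result; any upstream
        // clustering no longer applies.
        ClusteringSpec::Unknown
    } else {
        // Group-by aggregation output is clustered by the groupby keys.
        ClusteringSpec::Hash(groupby.to_vec())
    }
}

fn main() {
    let upstream = ClusteringSpec::Hash(vec!["a".to_string()]);
    assert_eq!(
        aggregate_clustering_spec(&upstream, &[]),
        ClusteringSpec::Unknown
    );
    assert_eq!(
        aggregate_clustering_spec(&upstream, &["a".to_string()]),
        ClusteringSpec::Hash(vec!["a".to_string()])
    );
    println!("ok");
}
```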
(force-pushed b73745a to 5ea4356)
Love it!
```rust
///
/// This provides an abstraction where we can select the most appropriate strategies based on various
/// heuristics such as number of partitions and the currently targeted backend's available resources.
pub struct ShuffleExchangeBuilder {
```
Not sure if Builder is the most appropriate pattern for this, given that:
- Each ShuffleExchange configuration is distinct.
- There are no incremental steps involved in building a ShuffleExchange.
I think you could instead have distinct creation methods on the struct impl itself:
```rust
impl ShuffleExchange {
    pub fn hash_partitioned(
        input: PhysicalPlanRef,
        by: Vec<ExprRef>,
        num_partitions: usize,
    ) -> Self {
        ...
    }

    pub fn range_partitioned(
        input: PhysicalPlanRef,
        by: Vec<ExprRef>,
        descending: Vec<bool>,
        num_partitions: usize,
    ) -> Self {
        ...
    }

    ...
```
Good callout. I changed this to be a Factory instead of adding a bunch of ::new methods directly on ShuffleExchange, because I do foresee that we will want the factory to be configurable.
E.g. the factory should be parametrized by config variables as well as other things such as statistics/estimations on the number of partitions. Then when you create the shuffle exchange, it will dynamically choose the correct ShuffleExchange implementation based on the data available to it.
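As an illustration of that idea, here is a hedged sketch of a configurable factory. All names and the config knob (`max_cheap_repartition`) are hypothetical, not Daft's actual API; the point is that strategy selection is driven by data available at plan time (hash keys, partition counts) rather than hard-coded by the caller.

```rust
// Hypothetical strategy labels for illustration only.
#[derive(Debug, PartialEq)]
enum ShuffleStrategy {
    MapReduce,       // full fanout + reduce-merge
    SplitOrCoalesce, // cheap repartition to a target count
}

struct ShuffleExchangeFactory {
    // Illustrative config knob; a real factory could also consult environment
    // variables, cluster resources, or statistics/estimations on partitions.
    max_cheap_repartition: usize,
}

impl ShuffleExchangeFactory {
    // Dynamically choose a shuffle implementation from the data available.
    fn choose(&self, hash_keys: &[String], target_num_partitions: usize) -> ShuffleStrategy {
        if !hash_keys.is_empty() || target_num_partitions > self.max_cheap_repartition {
            // Hash partitioning (or very large targets) needs a full map-reduce.
            ShuffleStrategy::MapReduce
        } else {
            ShuffleStrategy::SplitOrCoalesce
        }
    }
}

fn main() {
    let factory = ShuffleExchangeFactory { max_cheap_repartition: 1000 };
    assert_eq!(factory.choose(&["a".to_string()], 10), ShuffleStrategy::MapReduce);
    assert_eq!(factory.choose(&[], 10), ShuffleStrategy::SplitOrCoalesce);
    println!("ok");
}
```

Clients then call `factory.choose(...)` (or a real construction method) and never need to know which concrete shuffle was picked.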
```diff
@@ -262,6 +262,8 @@ fn physical_plan_to_partition_tasks(
     psets: &HashMap<String, Vec<PyObject>>,
     actor_pool_manager: &PyObject,
 ) -> PyResult<PyObject> {
+    use daft_plan::physical_ops::{ShuffleExchange, ShuffleExchangeStrategy};
```
Any reason not to lump this with the other imports at the top?
Only because the stuff in here is #[cfg(feature = "python")]. Not crucial though.
```rust
    target_num_partitions,
} => {
    match target_num_partitions.cmp(&input_num_partitions) {
        std::cmp::Ordering::Equal => Ok(upstream_iter),
```
Should this also be unreachable? I believe these should have been dropped by the drop-repartition rules.
Technically yes, but I thought that having a branch here wouldn't hurt just in case something weird happened upstream.
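The defensive branch under discussion can be sketched in isolation. This is an illustrative stand-alone version (names like `RepartitionAction` and `plan_repartition` are hypothetical, not the actual scheduler code): the target partition count is compared against the input's, and the `Equal` arm simply passes the upstream through.

```rust
use std::cmp::Ordering;

#[derive(Debug, PartialEq)]
enum RepartitionAction {
    PassThrough, // counts already match; ideally removed earlier by optimizer rules
    Split,       // more output partitions than input: split inputs
    Coalesce,    // fewer output partitions than input: merge inputs
}

fn plan_repartition(
    input_num_partitions: usize,
    target_num_partitions: usize,
) -> RepartitionAction {
    // Keeping the Equal branch is cheap insurance in case a no-op
    // repartition slips past the optimizer upstream.
    match target_num_partitions.cmp(&input_num_partitions) {
        Ordering::Equal => RepartitionAction::PassThrough,
        Ordering::Greater => RepartitionAction::Split,
        Ordering::Less => RepartitionAction::Coalesce,
    }
}

fn main() {
    assert_eq!(plan_repartition(4, 4), RepartitionAction::PassThrough);
    assert_eq!(plan_repartition(4, 8), RepartitionAction::Split);
    assert_eq!(plan_repartition(8, 4), RepartitionAction::Coalesce);
    println!("ok");
}
```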
src/daft-scheduler/src/scheduler.rs
```rust
    py,
    "daft.execution.rust_physical_plan_shim"
))?
.getattr(pyo3::intern!(py, "split_by_hash"))?
```
Nit: Can we rename it to fanout_by_hash?
(force-pushed bd3dd9b to bf7e0e4)
Summary

Refactors our PhysicalPlan by adding a new `PhysicalPlan::ShuffleExchange` variant.

Why?

This PR makes it easier to add new shuffle implementations:

- New shuffles can be added by simply extending the `PhysicalPlan::ShuffleExchange` enum.
- The `ShuffleExchangeBuilder` builder can let us direct clients to use more complex shuffles by consulting environment variables, available cluster resources, number of partitions etc.

Elaboration

Our previous PhysicalPlan was a bit of a leaky abstraction, expressing 2 types of shuffles by exposing an unnecessarily low-level chain of operations to users of the abstraction, as well as users of Daft:

- `FanoutHash/Range/Random -> Flatten -> ReduceMerge`: this pattern was used to express a "NaiveFullyMaterializingMapReduce" shuffle (materialize all Map tasks, perform a "flatten" which makes no sense to users, then reduce and merge).
- `Split/Coalesce -> Flatten`: this pattern was used to express a "SplitOrCoalesceToTargetNum" shuffle.

This leaky abstraction makes it difficult to add new types of shuffles (e.g. the push-based shuffle implemented in #2883) as it involves adding new PhysicalPlan variant(s). Additionally, all of these shuffles share similar characteristics during plan optimization, and the actual implementation of "how" to execute these shuffles should be highly dependent on factors such as available cluster resources, expected complexity of the shuffle, and more.
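The shape of the unified design can be sketched as follows. This is a simplified illustration (field types are placeholders; the real `ShuffleExchange` holds an input `PhysicalPlanRef` and richer strategy data): one plan variant carries a strategy enum, so both of the old operator chains collapse into a single node that optimizer rules can treat uniformly.

```rust
// Simplified sketch of the unified variant; names follow the PR's
// strategy names, but fields are illustrative placeholders.
#[allow(dead_code)]
enum ShuffleExchangeStrategy {
    // Subsumes FanoutHash/Range/Random -> Flatten -> ReduceMerge.
    NaiveFullyMaterializingMapReduce { target_num_partitions: usize },
    // Subsumes Split/Coalesce -> Flatten.
    SplitOrCoalesceToTargetNum { target_num_partitions: usize },
}

struct ShuffleExchange {
    // In the real plan this would also hold the input PhysicalPlanRef.
    strategy: ShuffleExchangeStrategy,
}

impl ShuffleExchange {
    // Execution backends dispatch on the strategy; the rest of the plan
    // only ever sees one ShuffleExchange node.
    fn describe(&self) -> &'static str {
        match self.strategy {
            ShuffleExchangeStrategy::NaiveFullyMaterializingMapReduce { .. } => {
                "map-reduce shuffle"
            }
            ShuffleExchangeStrategy::SplitOrCoalesceToTargetNum { .. } => {
                "split-or-coalesce shuffle"
            }
        }
    }
}

fn main() {
    let ex = ShuffleExchange {
        strategy: ShuffleExchangeStrategy::SplitOrCoalesceToTargetNum {
            target_num_partitions: 8,
        },
    };
    assert_eq!(ex.describe(), "split-or-coalesce shuffle");
    println!("ok");
}
```

Adding a new shuffle (e.g. a push-based one) then means adding an enum arm and a backend implementation, not new PhysicalPlan variants.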