-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rename primitive functions. #2234
Conversation
rerun tests |
* vertex_partition_range_offsets[row_comm_size * i + row_comm_rank] and b_i = | ||
* vertex_partition_range_offsets[row_comm_size * i + row_comm_rank + 1]. c is | ||
* vertex_partition_range_offsets[row_comm_size * col_comm_rank] and d = | ||
* vertex_partition_range_offsets[row_comm_size * (col_comm_rank + 1)]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we rename this to specify that it is a range of vertex ids?
vertex_id_ranges perhaps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be more consistent with the rest, we may need to rename vertex_partition_range
to vertex_id_partition_range
but I feel like this is a bit too lengthy (I thought vertex_partition
somewhat implies partitioning of vertex ID ranges but it seems like this implication is not clear enough).
I added additional comments below to make this point clearer. Do you think this is sufficient or we need to further discuss about renaming?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this explanation is clear enough.
EdgeOp e_op, | ||
T init, | ||
VertexValueOutputIterator vertex_value_output_first) | ||
void transform_reduce_e_of_v(raft::handle_t const& handle, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could just use transform_reduce_edges
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See
@@ -985,7 +982,7 @@ template <typename GraphViewType, | |||
typename EdgeOp, | |||
typename T, | |||
typename VertexValueOutputIterator> | |||
void copy_v_transform_reduce_out_nbr( | |||
void transform_reduce_outgoing_e_of_v( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
transform_reduce_outgoing_edges
?
The of_v part seems unnecessary to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this is for per vertex reduction (previously copy_v_transform_reduce_in|out_nbr).
I thought this two level naming scheme is confusing,
So, for global reduction, we use _e
or _v
and per vertex reduction, we use incoming/outgoing_e_of_v
Besides this clarification, do you have other suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would read transform_reduce_outgoing_e_of_v as 'transform reduce outgoing edges of vertices'.
The 'of vertices' doesn't really communicate that a per source reduction is being done. We should think of something that does make this clear.
We could name it exactly as transform_reduce_e to denote that every edge is being iterated on. The fact that this variant is doing per source reduction could be explained by differing function signature. This could be confusing to some users.
We could also use transform_reduce_outgoing_edges_per_source
or '..._e_per_src'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another point was that if we are talking in terms of edges, in my opinion we should use the term source/destination (src/dst) instead of vertex.
Perhaps I am overlooking some nuances when dealing with transposed (csc) graphs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I like the per_src
idea. I will make this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will using src
nomenclature still make sense even when the graph is transposed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume so.
But transform_reduce_outgoing_e_of_v
and transform_reduce_e_per_src
may have slightly different mental models.
I initially hoped outgoing_e_of_v
to be interpreted as "for every vertex, iterate over outgoing edges and reduce the edge functor outputs per vertex"; so this is more vertex centric.
"e_per_src" might be interpreted as iterate over every directed edge, apply a functor to each edge, than reduce functor outputs per source vertex ID; so this is more edge centric.
Both are pretty much doing the same thing, but nuances might be slightly different, and I feel like the latter is a bit clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whether the graph is transposed or not affects efficiency of the primitive implementation. Per source reduction can be more efficient with a source-major format while per destination reduction can be more efficient with a destination-major format (by avoiding atomics, and some primitives allow only a source-major format... this is more to avoid implementing something that will not be used in the foreseeable future).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main concern about per_src
nomenclature is about 0-(in|out)-degree vertices.
So these primitives are designed to serve properties_of_v = some_initial_value + sum_over_incoming|outgoing_e (calculated per edge value)
(e.g. PageRank, Katz centrality, HITS, summing per vertex incoming/outgoing edge weights).
This doesn't align very well with the edge-centric mental model.
@@ -204,7 +202,7 @@ template <typename GraphViewType, | |||
typename ReduceOp, | |||
typename T, | |||
typename VertexValueOutputIterator> | |||
void copy_v_transform_reduce_key_aggregated_out_nbr( | |||
void transform_reduce_key_aggregated_outgoing_e_of_v( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
transform_reduce_key_aggregated_outgoing_edges
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See
cpp/src/link_analysis/hits_impl.cuh
Outdated
@@ -105,7 +105,7 @@ std::tuple<result_t, size_t> hits(raft::handle_t const& handle, | |||
} | |||
for (size_t iter = 0; iter < max_iterations; ++iter) { | |||
// Update current destination authorities property | |||
copy_v_transform_reduce_in_nbr( | |||
transform_reduce_incoming_e_of_v( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
transform_reduce_incoming_edges
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See
cpp/src/traversal/bfs_impl.cuh
Outdated
@@ -222,7 +222,7 @@ void bfs(raft::handle_t const& handle, | |||
e_op.prev_visited_flags = prev_visited_flags.data(); | |||
} | |||
|
|||
update_frontier_v_push_if_out_nbr( | |||
update_frontier_from_outgoing_e_of_v( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are using the sources/destinations nomenclature then this could be renamed to be
update_frontier_from_destinations
because the frontier will only contain vertices that are destinations as opposed to sources.
Having said that update_frontier_from_outgoing_edges
is fine too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about update_frontier_from_outgoing_e
to be more consistent with the rest? (we are using e
to denote edges
so I hope this abbreviation is not too confusing).
I initially used e_of_v
instead of just e
borrowing this convention from other primitives performing per vertex reduction, but here, we are not performing per vertex reduction. Starting from an input frontier, iterate over all the outgoing edges from the vertices in the frontier, then, compute a new frontier by filtering out destination vertices of the outgoing edges.
Something like compute_new_frontier_from_outgoing_e_of_v_in_input_frontier
might be more descriptive (but too verbose) and I assume update_frontier
part reasonably covers both compute_new_frontier
and of_v_in_input_frontier
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about update_frontier_from_outgoing_e to be more consistent with the rest?
Yes, this works
@ChuckHastings @kaatish Added "do not merge" label. Still needs to figure out how to properly re-name
I'm now thinking more about |
Codecov Report
@@ Coverage Diff @@
## branch-22.06 #2234 +/- ##
================================================
+ Coverage 70.82% 71.30% +0.48%
================================================
Files 170 171 +1
Lines 11036 11169 +133
================================================
+ Hits 7816 7964 +148
+ Misses 3220 3205 -15
Continue to review full report at Codecov.
|
@gpucibot merge |
Partially address #2003
Rename