-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add SG TSP #1360
Add SG TSP #1360
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't review the TSP details as much, but did notice a few things that'll need to be addressed.
cpp/include/algorithms.hpp
Outdated
* @param[in] verbose Logs configuration and iterative improvement. | ||
* | ||
*/ | ||
float traveling_salesman(raft::handle_t &handle, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why a float return type versus void?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently to get my outputs I return the minimum cost and the path is passed as a pointer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- This isn't documented
- How do you pick what gets returned as an argument and what doesn't? It seems we are mixing both at the moment. Consider using a consistent output mechanism.
- Another possible option for routes: New APIs are returning rmm::uvector which is often easier to deal with than raw pointers.
rerun tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just a few more items I noticed.
cpp/src/traversal/tsp_solver.hpp
Outdated
best_route = nullptr; | ||
} | ||
|
||
__global__ __launch_bounds__(2048, 2) void two_opt_search(int *mylock, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a quite large hand-written kernel. Two issues:
(1.) Couldn't be this designed using Thrust/CUB primitives?
(2.) Is there a possibility to split this kernel into smaller __device__
functions, so that reading and maintaining this code later would become possible / easier ?
Codecov Report
@@ Coverage Diff @@
## branch-0.18 #1360 +/- ##
===============================================
+ Coverage 60.38% 60.75% +0.37%
===============================================
Files 67 70 +3
Lines 3029 3134 +105
===============================================
+ Hits 1829 1904 +75
- Misses 1200 1230 +30
Continue to review full report at Codecov.
|
rerun tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool code. Just reviewed C++ part for now.
In addition to the comments, I'd like to see some performance data and a basic profile (ie. what's the perf bottleneck).
Kernels are nice but I would have preferred if there was more of Graph Prims, Thrust, and RAFT in this PR. API/Doc needs polishing. I also added some comments regarding integration and testing, for instance, I see an API that doesn't take graphs but graphs are passed in tests in a custom format that really looks like regular mtx/csv, so I think this should be reconciled.
cpp/include/algorithms.hpp
Outdated
* handle, the multi GPU version will be selected. | ||
* @param[in] vtx_ptr Device array containing the vertex identifiers used | ||
* to initialize the route. | ||
* @param[out] route Device array containing the returned route. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No graph_t
API?
cpp/include/algorithms.hpp
Outdated
* @param[in] verbose Logs configuration and iterative improvement. | ||
* | ||
*/ | ||
float traveling_salesman(raft::handle_t &handle, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider const raft::handle_t &handle
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created a PR to make the brute_force_knn
handle const for 0.19
cpp/include/algorithms.hpp
Outdated
* @param[in] verbose Logs configuration and iterative improvement. | ||
* | ||
*/ | ||
float traveling_salesman(raft::handle_t &handle, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- This isn't documented
- How do you pick what gets returned as an argument and what doesn't? It seems we are mixing both at the moment. Consider using a consistent output mechanism.
- Another possible option for routes: New APIs are returning rmm::uvector which is often easier to deal with than raw pointers.
h_y_pos.data(), soln + nodes_ + 1, sizeof(float) * (nodes_ + 1), cudaMemcpyDeviceToHost)); | ||
cudaDeviceSynchronize(); | ||
|
||
for (int i = 0; i < nodes_; ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So nodes_ is expected to always be smaller than 2B in the future or is this something we should template like in the experimental APIs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now the problem size we target is up to 2k nodes
return tokens; | ||
} | ||
|
||
int load_tsp(const char* fname, Route* input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use csv files and I/O?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to compare against TSPLIB which solves TSP as a city problem, so the format corresponds to 2D positions as input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see how it would be relevant in the context of a demo/benchmark. Not sure if that's a strong enough motivation in the context of cugraph testing though. It seems to me it is going in the opposite direction than what @rlratzel is trying to do for making testing more generic. I am ok not blocking the PR for this but we should probably have a FiXme/Todo for test refactoring work (if that's ok with Joseph/Rick).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A FIXME as Alex suggested isn't a bad idea, at the very least to capture that TSP deviates from the standard testing I/O pattern and we may want to revisit adopting a more universal approach (instead of or in addition to comparing using TSPLIB's input format).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the fixme in this commit: f85cb0d
datasets/tsplib/a280.tsp
Outdated
DIMENSION: 280 | ||
EDGE_WEIGHT_TYPE : EUC_2D | ||
NODE_COORD_SECTION | ||
1 288 149 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like all these files are really graph/sparse matrices rather than general euclidian space positions.
Are we supporting only positive integers within the same range?
If yes this should be mentioned in the API doc.
If not other test data should be added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're supporting float data, the tsp225 file contains purely floats. There are no constraints on the range between the cities
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. Should we test negative coordinates? parallel edges?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added file with negative weights
* @param[in] vtx_ptr Device array containing the vertex identifiers used | ||
* to initialize the route. | ||
* @param[out] route Device array containing the returned route. | ||
* @param[in] x_pos Device array containing starting x-axis positions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the argument ordering logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was following a priority ordering on the parameters for the Python API. The cpp layer follows the Python API except for the route that is a returned parameter. I will work on updating the return and use a float and rmm::uvector tuple instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool code. Just reviewed C++ part for now.
In addition to the comments, I'd like to see some performance data and a basic profile (ie. what's the perf bottleneck).
Kernels are nice but I would have preferred if there was more of Graph Prims, Thrust, and RAFT in this PR. API/Doc needs polishing. I also added some comments regarding integration and testing, for instance, I see an API that doesn't take graphs but graphs are passed in tests in a custom format that really looks like regular mtx/csv, so I think this should be reconciled.
Thanks for the review. I'll work on adding timers. The current TSP implementation considers the graph is fully connected when looking for the best path. If we used a graph representation this means we would be O(V^2) in memory. We also consider than recomputing the distances is faster than storing them in shared memory and allows us to run on problems with larger size (more than 2k cities). One of the next step is indeed to reconcile with the rest of the API and take a graph as input but this would mean we have to refactor the kernel logic to look only at specific edges and have an efficient look up table (Cuco should have that) to get the weights. These optimizations are on the roadmap for next releases.
rerun tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just a few minor change requests.
return tokens; | ||
} | ||
|
||
int load_tsp(const char* fname, Route* input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A FIXME as Alex suggested isn't a bad idea, at the very least to capture that TSP deviates from the standard testing I/O pattern and we may want to revisit adopting a more universal approach (instead of or in addition to comparing using TSPLIB's input format).
rerun test |
rerun tests |
@gpucibot merge |
This PR implements an approximated solution to the Traveling Salesperson Problem (TSP).
The algorithm is exposed under
traversal
through a Python API taking 2D pos as input and returning a route.This PR relies on RAFT KNN: rapidsai/raft#126
Solves: #1185