-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Is dask.grblas
ready for a connected components algorithm?
#3
Comments
Hey, thanks for leaving a message @ParticularMiner, this is very interesting! To answer your most pressing questions first: no, Other important missing functionality is assignment. I think And, yes, I think the next step (regardless of the state of Once we have a working example, we could consider what to do next. If I just need to add a couple things to Btw, feel free to join and find me on the GraphBLAS slack channel: https://thegraphblas.slack.com |
Many thanks for your interest @eriknw! That is better than I could have hoped for.
It is gnarly isn’t it? Actually, to begin with, I intend to ignore a large part (≈40%) of that code, namely, the ‘sample phase’ part of the code (lines 400 - 692), since I’m not sure what exactly it is for, nor is it referred to in any of the authors’ papers. 396 //--------------------------------------------------------------------------
397 // sample phase
398 //--------------------------------------------------------------------------
399
400 if (sampling)
401 {
402
403 // et cetera At first glance though, it seems to be a way of reducing the number of edges if that number is deemed to be too large (line 349). 349 bool sampling = (n * FASTSV_SAMPLES * 2 < nnz) ; But the crucial part of the code, I believe, is barely 15 lines long and is what they call the “final phase” (lines 694-), which is really what I’m keen to implement using 694 //--------------------------------------------------------------------------
695 // final phase
696 //--------------------------------------------------------------------------
697
698 GrB_TRY (GrB_Matrix_nvals (&nnz, T)) ;
699
700 bool change = true ;
701 while (change && nnz > 0)
702 {
703 // hooking & shortcutting
704 GrB_TRY (GrB_mxv (mngp, NULL, GrB_MIN_UINT32,
705 GrB_MIN_SECOND_SEMIRING_UINT32, T, gp, NULL)) ;
706 // et cetera
Good idea. I'll try using
I always find it instructive discovering alternative ways of solving a problem. So yes, of course it would be great if you/we could explore a custom
Excellent! Admittedly, my knowledge/skills in
Sure. I'll be creating an account there soon. Looking forward to a fruitful collaboration! 😄 |
Hi @eriknw,
Many thanks for pioneering this and for all your impressive work on
grblas
!Searching for "dask" and "GraphBLAS" together brought me here.
As part of an open-source home-project of mine I'm attempting to write a "connected components algorithm" for graphs too large to fit into the RAM of a standard laptop.
This makes
dask.arrays
the natural choice of backend.My starting point is "FastSV" (a distributed memory connected components algorithm) which can be found at LAGraph and written using the GraphBLAS API (the C implementation).
I'm about to peruse the source code of
dask.grblas
to figure out how it works. Still, if you don't mind me asking: in your opinion, isdask.grblas
already at a stage to be used for my purposes? If not, and you don't mind me possibly contributing in the future, could you then share with me roughly what is outstanding?Also, I'm guessing
dask-grblas
is intended to be used together withgrblas
, right?If on the other hand, you feel there are other better ways of approaching my problem please do feel free to advise me.
Thanks!
The text was updated successfully, but these errors were encountered: