-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Count number of connected components more efficiently than length(connected_components(g))
#407
Open
thchr
wants to merge
11
commits into
JuliaGraphs:master
Choose a base branch
from
thchr:connected_components
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 7 commits
Commits
Show all changes
11 commits
Select commit
Hold shift + click to select a range
59270de
implement `count_connected_components(g)` as faster version of `lengt…
thchr a8f8d19
export `connected_components!`
thchr 2316b0f
allow resetting `label` inside `count_connected_components`
thchr 05e3b7e
fix unsaved test
thchr 0153724
fix copy bug and nits
thchr ce66be5
JuliaFormatter
thchr f440525
lingering copy typo
thchr da1d31e
updates from code-review
thchr 21b1854
simplify `count_unique`
thchr ead687e
improve comments for `count_unique`
thchr d5e0e37
use `BitSet` in `count_unique`
thchr File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am all for performance improvements. But I am a bit skeptical if it is worth making the interface more complicated.
Almost all graph algorithms need some kind of of work buffer, so we could have something like in al algorithms but in the end it should be the job for Julia's allocator to verify if there is some suitable piece of memory lying around. We can help it by using
sizehint!
with a suitable heuristic.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this will usually not be relevant; in my case it is though, and is the main reason I made the changes. I also agree that there is a trade off between performance improvements and complications of the API. On the other hand, I think passing such work buffers as optional arguments is a good solution to such trade-offs: for most users, the complication can be safely ignored and shouldn't complicate their lives much.
As you say, there are potentially many algorithms in Graphs.jl that could take a work buffer; in light of that, maybe this could be more palatable if we settle on a unified name for these kinds of optional buffers, so that it lowers the complications by standardizing across methods.
Maybe just
work_buffer
(and, if there are multiple,work_buffer1
,work_buffer2
, etc?)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do this then all functions should take exactly one
work_buffer
(possibly a tuple) and have an appropriate function to initialize the buffer. I think it is a major change which should be discussed separately.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think if this is really important for your use case you can either
Experimental
submodule. Currently we don't guarantee semantic versioning there - this allows use to remove things in the future without breaking the API.But just to clarify - your problem is not that you are building graphs by adding edges until they are connected? Because if that is the issue, there is a much better algorithm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I won't be able to find the time for factoring this out into the Experimental submodule, unfortunately.
I'm happy to e.g., add an admonition to the docstring, indicating that the work buffer arguments are unstable API which are subject to breakage though. Factoring this into a submodule, piecing it back together, and adding multiple doc-strings across modules, and eventually loading this behind a
Graphs.Experimental
call is more fiddling than I'm up for.Indeed, I have and will just continue to do that, yep.
No, I'm not doing that; appreciate the check though.
As a general side note - and please know that I appreciate these reviews very much (!) and your efforts on what is no doubt spare time (!!) - but I wonder if the level of scrutiny and optimization that many PRs here go through is optimal: I understand the intent and the aim of making stable API and of getting good, maintainable code. But I think there's a risk of trading off too much towards these goals, at the cost of vibrancy and community engagement. From my experience here, there's room for leaning more towards "is this better than what we previously had" over "could this be even better".
PRs like this usually happen on time "stolen away" from our day-jobs, and the odds of returning to PRs for edits, however small, go down very quickly with time; similarly, the expectation that a PR will be a multi-iteration process reduces the likelihood it will be made.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate the feedback, and I lean in the same direction. The JuliaGraphs community calls have pretty much dried up recently and we don't have much of a community to start with, so it's in our best interest to make engagement rewarding instead of tiresome.
I'm expecting an answer this month on funding I applied to which could serve to revitalize this ecosystem, hopefully I'll have good news to share soon.