Sparse vector/matrix: add fast implementation of find_next and find_prev (fixed) #23317

tkluck · 2017-08-18T09:22:25Z

(Opening a new pull request instead of re-opening #23312 because I force-pushed and pull requests don't seem to like that.)

Before this commit, findnext() just uses the default implementation, which loops over every element. When findnext is called without a filter function as its first argument, we know that the semantics are to find elements x satisfying x != 0, so for sparse matrices/vectors we only need to loop over the stored elements.

Some care must be taken with stored zero values; that is the reason for the indirection between _sparse_findnext (which only finds the next stored element) and the actual findnext (which performs the actual non-zero checks).
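To illustrate the two-level design described above, here is a minimal sketch in current Julia syntax. The helper names `next_stored` and `next_nonzero` are hypothetical and are not the names used in this pull request; the sketch also returns `nothing` for "not found", which is an assumption based on current Julia rather than the `0` convention used at the time.

```julia
using SparseArrays

# Hypothetical helper: index of the first *stored* entry at or after `i`,
# found by binary search over the sorted index vector.
function next_stored(v::SparseVector, i::Integer)
    n = searchsortedfirst(v.nzind, i)
    return n <= length(v.nzind) ? v.nzind[n] : nothing
end

# Hypothetical wrapper: skip stored entries whose value happens to be zero.
function next_nonzero(v::SparseVector, i::Integer)
    j = next_stored(v, i)
    while j !== nothing && iszero(v[j])
        j = next_stored(v, j + 1)
    end
    return j
end

# Index 2 holds an explicitly stored zero.
v = SparseVector(8, [2, 5, 7], [0.0, 3.0, 1.0])
```

With this vector, `next_stored(v, 1)` lands on the stored zero at index 2, while `next_nonzero(v, 1)` skips it and continues to index 5.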


fredrikekre · 2017-08-18T09:28:06Z

> (Opening a new pull request instead of re-opening #23312 because I force-pushed and pull requests don't seem to like that.)

For next time, it is fine to leave incomplete PRs open :)

fredrikekre · 2017-08-18T09:35:24Z

base/sparse/sparsematrix.jl

+   end
+   row, col = ind2sub(m, i)
+   lo, hi = m.colptr[col], m.colptr[col+1]
+   n = searchsortedfirst(@view(m.rowval[lo:hi-1]), row)


You can specify start and stop index to searchsortedfirst instead of using a view:

julia/base/sort.jl

Lines 160 to 172 in a945af3

    function searchsortedfirst(v::AbstractVector, x, lo::Int, hi::Int, o::Ordering)
        lo = lo-1
        hi = hi+1
        @inbounds while lo < hi-1
            m = (lo+hi)>>>1
            if lt(o, v[m], x)
                lo = m
            else
                hi = m
            end
        end
        return hi
    end
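A small sketch of the difference between the two approaches, assuming the five-argument method quoted above is still callable in the Julia version at hand (it is an internal method rather than documented API, so treat this as an assumption). The data is made up for illustration: `rowval[3:5]` plays the role of one column's sorted row indices.

```julia
rowval = [1, 4, 2, 3, 5, 1]   # only the range lo:hi needs to be sorted
lo, hi = 3, 5                 # rowval[3:5] == [2, 3, 5]

# Searching a view gives a position relative to the start of the view.
relative = searchsortedfirst(@view(rowval[lo:hi]), 3)

# Passing lo and hi searches the same range without creating a view,
# and gives a position in the parent vector.
absolute = searchsortedfirst(rowval, 3, lo, hi, Base.Order.Forward)
```

Note that the two results differ by the offset `lo - 1`, so code switching from the view to the `lo`/`hi` form has to adjust any index arithmetic that follows.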

Ooooh good find, thanks. Pushing an update.


tkelman · 2017-08-19T05:34:05Z

base/sparse/sparsematrix.jl

@@ -1332,6 +1332,42 @@ function findnz(S::SparseMatrixCSC{Tv,Ti}) where {Tv,Ti}
    return (I, J, V)
 end

+function _sparse_findnext(m::SparseMatrixCSC, i::Int)
+   if i > length(m)


these are slightly under-indented, should be 4 spaces rather than 3

tkelman · 2017-08-19T05:35:32Z

test/sparse/sparse.jl

@@ -1899,3 +1899,26 @@ end
        @test isfinite.(cov_sparse) == isfinite.(cov_dense)
    end
 end
+
+@testset "sparse findprev/findnext operations" begin


would be good to include some sparse test arrays with stored zeros


tkluck · 2017-08-20T09:11:42Z

That AppVeyor failure doesn't seem to be related at all; the code that's failing is

    tfile = tempname()
    io = open(tfile, "w")
    ccall(:jl_dump_compiles, Void, (Ptr{Void},), io.handle)
    eval(expand(Main, :(for i in 1:10 end)))
    ccall(:jl_dump_compiles, Void, (Ptr{Void},), C_NULL)
    close(io)
    tstats = stat(tfile)
    tempty = tstats.size == 0
    rm(tfile)
    @test tempty == true

My guess would be that it's a race condition between close() and stat(). If so, we may be able to fix it by just calling stat(io) instead, which calls fstat(2) under the hood.

tkluck · 2017-08-20T09:37:34Z

Just opened #23365 for that.

tkluck · 2017-08-29T19:51:46Z

Given the resolution of #23365, this should be good to go!

KristofferC · 2017-08-29T20:09:56Z

Needs a rebase.

tkluck · 2017-08-29T20:30:55Z

@KristofferC you got it :) let's see what CI says.

KristofferC · 2017-08-29T20:31:57Z

Seems like there is still a conflict here.

tkluck · 2017-08-29T20:32:34Z

Oops, merged master instead of origin/master. Testing+pushing that now.

Sacha0 · 2017-10-01T03:43:28Z

Linking to #23812 and #23120 in case this pull request need be updated in accord. Best!


…e explicit

Since we now need explicit predicates [1], this optimization only works if we know that the predicate is a function that is false for zero values. As suggested in that pull request, we could find out by calling `f(zero(eltype(array)))` and hoping that `f` is pure, but I like being a bit more conservative and applying this optimization only to the case where we *know* `f` is equal to `!iszero`.

For clarity, this commit also renames the helper method _sparse_findnext() to _sparse_findnextnz(), because now that the predicate-less version doesn't exist anymore, the `nz` part isn't implicit anymore either.

[1]: JuliaLang#23812
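The conservative choice described in this commit message relies on `!iszero` having its own type that a method can dispatch on. A minimal sketch of that mechanism follows; the function names `strategy` and `false_at_zero` are hypothetical and exist only for this illustration.

```julia
# Fallback for arbitrary predicates: nothing is known about f(0).
strategy(f, xs) = "generic scan over all elements"

# Specialization selected only when the predicate is exactly `!iszero`.
strategy(f::typeof(!iszero), xs) = "fast scan over stored elements"

# The alternative mentioned above: probe the predicate at zero.
# This is only trustworthy if `f` is pure.
false_at_zero(f, xs) = !f(zero(eltype(xs)))

xs = [0, 3, 0, 7]
```

A hand-written closure such as `x -> x != 0` is semantically the same predicate but has a different type, so it falls back to the generic method; that is the price of not probing `f` at zero.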

tkluck · 2017-11-03T17:02:10Z

Updated this branch to align with the new semantics for find and friends as merged to master from #23812. Very interested to hear your thoughts!

@fredrikekre I accidentally merged from current master without first pulling your prior merge from 30th of September. That merge commit got lost in the process. As far as I can tell, no work of yours got lost through that (it was a clean merge) but if I missed anything, let me know and I'll recover.

tkluck · 2017-11-05T11:14:00Z

CI failures are libgit related and almost surely unrelated to the work in this branch. Should be good to merge!

fredrikekre · 2017-11-05T11:27:27Z

@Sacha0 wanna take a look? :)

Sacha0 · 2017-11-05T18:44:06Z

> @Sacha0 wanna take a look? :)

I have blocked some time this evening for reviews, and will try to work this pull request in then :). Best!

Sacha0 · 2017-11-06T04:13:05Z

base/sparse/abstractsparse.jl

+        end
+    end
+    return j
+end


Perhaps the following? :)

    function findnext(f::typeof(!iszero), v::AbstractSparseArray, i::Int)
        j = _sparse_findnextnz(v, i)
        while j != 0 && iszero(v[j])
            j = _sparse_findnextnz(v, j+1)
        end
        return j
    end

Looks good! In the interest of expediency, feel free to commit+merge that if you have time.

Sacha0 · 2017-11-06T04:14:46Z

base/sparse/abstractsparse.jl

+        end
+    end
+    return j
+end


Similarly? :)

    function findprev(f::typeof(!iszero), v::AbstractSparseArray, i::Int)
        j = _sparse_findprevnz(v, i)
        while j != 0 && iszero(v[j])
            j = _sparse_findprevnz(v, j-1)
        end
        return j
    end

Sacha0 · 2017-11-06T04:51:30Z

base/sparse/sparsevector.jl

+    else
+        return v.nzind[n]
+    end
+end


A compact alternative:

    function _sparse_findnextnz(v::SparseVector, i::Int)
        n = searchsortedfirst(v.nzind, i)
        return n <= length(v.nzind) ? v.nzind[n] : 0
    end

Sacha0

lgtm modulo the minor comments! :)


tkluck · 2017-11-10T17:19:53Z

Just updated with your comments @Sacha0 . I hope you don't mind I skipped replacing if/else/end by ?: (which is arguably a matter of taste), but the code duplication removal is an obvious improvement in clarity. Thanks again!

Sacha0 · 2017-11-13T20:50:12Z

base/sparse/sparsevector.jl

+function _sparse_findnextnz(v::SparseVector, i::Int)
+    n = searchsortedfirst(v.nzind, i)
+    if n > length(v.nzind)
+        return 0


For type stability, return zero(indtype(v))? (Likewise below.)

Sacha0

Modulo the minor type stability comment above, lgtm! Thanks @tkluck! :)

…types

This mostly means returning `zero(indtype(...))` instead of `0` in the not-found case. In addition, this commit replaces a few `== 0` checks by `iszero()` to avoid unnecessary promotions. We could similarly replace `+ 1` by `+ one(...)` but that becomes cumbersome very quickly.
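A sketch of why the literal `0` matters when the index type is not `Int`. The function names are hypothetical, and `eltype(v.nzind)` stands in for the internal `indtype` helper mentioned in the review; both are assumptions for illustration.

```julia
using SparseArrays

# Returns the literal 0 (an Int) when nothing is found, so the return
# type depends on which branch is taken.
function next_stored_unstable(v::SparseVector, i::Integer)
    n = searchsortedfirst(v.nzind, i)
    return n > length(v.nzind) ? 0 : v.nzind[n]
end

# Returns a zero of the index type, so both branches agree.
function next_stored_stable(v::SparseVector, i::Integer)
    n = searchsortedfirst(v.nzind, i)
    return n > length(v.nzind) ? zero(eltype(v.nzind)) : v.nzind[n]
end

# A vector whose indices are stored as Int16 rather than Int.
v = SparseVector(4, Int16[2], [1.0])
```

Here the unstable version returns an `Int16` when an entry is found and an `Int` otherwise, whereas the stable version returns `Int16` in both cases.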

tkluck · 2017-11-14T19:40:56Z

Thanks for the review @Sacha0 ! Pushed those updates just now.

tkluck · 2018-01-04T21:58:24Z

Just added a commit that should resolve the merge conflicts. Do you think we could merge this?

Sacha0 · 2018-01-04T22:22:07Z

Looks like! Perhaps @fredrikekre and I should make a brief sweep of the rebased version and then merge?

Sacha0

Test failures appear related?

tkluck · 2018-01-06T11:04:47Z

Looks like. Keeping a pull request open for half a year really makes master a bit of a moving target 😕 I'm happy to fix it, but what can we do to make sure it gets merged quickly after that?


fredrikekre · 2018-01-06T13:03:26Z

Sorry this has taken so much time, it looks good to me but @Sacha0 should probably take a look and then we should merge!

Sacha0 · 2018-01-06T18:39:48Z

If you do not see movement within a few days on a pull request that is approved, passing CI, and not otherwise blocked, please feel welcome to bump! :) Chances are it simply fell off the collective radar and a bump would be appreciated. (Tangentially, having a bot that bumps pull requests under those conditions could work well.)

Sacha0

Thanks for seeing this work through @tkluck! :)


…artesian coordinates (#32007)

Revert "sparse findnext findprev hash performance improved (#31354)"

This seems to duplicate work from #23317 and it causes performance degradation in the cases that one was designed for. See #31354 (comment)

This reverts commit e0bef65.

Thanks to @mbauman for spotting this issue in #32007 (comment).


…artesian coordinates (#32007)

Revert "sparse findnext findprev hash performance improved (#31354)"

This seems to duplicate work from #23317 and it causes performance degradation in the cases that one was designed for. See #31354 (comment)

This reverts commit e0bef65.

Thanks to @mbauman for spotting this issue in #32007 (comment).

(cherry picked from commit ec797ef)

fredrikekre added the sparse Sparse arrays label Aug 18, 2017

fredrikekre reviewed Aug 18, 2017


Sparse findprev/findnext: replace searchsorted(@view...) by searchsorted(...,lo=,hi=)

14f443a

fredrikekre requested a review from Sacha0 August 18, 2017 22:43

tkelman reviewed Aug 19, 2017


tkluck added 2 commits August 19, 2017 09:13

_sparse_findnext/prev: fix indentation

92558be

(Use git blame -w for finding the non-whitespace edits to this code.)

sparse findprev/findnext: add doctest with stored zeros

4b54020

Merge branch 'master' into sparse-find-next

5dde4af

Merge remote-tracking branch 'origin/master' into sparse-find-next

daee267

fredrikekre requested review from Sacha0 and removed request for Sacha0 September 30, 2017 21:42

tkluck added 2 commits November 3, 2017 09:26

Merge remote-tracking branch 'origin/master' into sparse-find-next

1ac4141

This fixes a few merge conflicts resulting from other additions to the sparse codebase.

tkluck force-pushed the sparse-find-next branch from 026e479 to fe4b76e Compare November 3, 2017 17:00

Sacha0 reviewed Nov 6, 2017

Sacha0 approved these changes Nov 6, 2017

sparse findnext()/findprev(): remove code duplication

132ff27

Thanks to @Sacha0 for the suggestion!

Sacha0 reviewed Nov 13, 2017

Sacha0 approved these changes Nov 14, 2017

fredrikekre approved these changes Nov 14, 2017

Merge branch 'master' into sparse-find-next

a33abbe

Sacha0 reviewed Jan 6, 2018

Fix sparse findprev()/findnext() for sub2ind deprecation

db62ae4

This is needed in response to JuliaLang#24715.

Sacha0 approved these changes Jan 6, 2018

Sacha0 merged commit 2cfa6a5 into JuliaLang:master Jan 6, 2018

tkluck mentioned this pull request Nov 14, 2018

Implement optimizations for sparse findnext/findprev #28313

Closed

tkluck mentioned this pull request May 11, 2019

sparse findnext findprev hash performance improved #31354

Merged

tkluck mentioned this pull request May 12, 2019

Sparse matrix: fix fast implementation of findnext and findprev for cartesian coordinates #32007

Merged
