improve cat design / performance #49322
Seems like SparseArrays has been broken on main for a while, so we are waiting for this to be fixed before we can move ahead with this PR: JuliaSparse/SparseArrays.jl#363 (comment). It recently switched to using the JLL, but it still seems to be vendoring its own copy of the JLL at the same time, and that copy is broken.
I'm going to try to fix the SuiteSparse issue this weekend. I'm not sure when Tim Davis will release the update with my fix, but he's pretty responsive.
#48977 is merged.
364419e to 92ec9d4
I am a bit skeptical of these numbers, but initial measurements indicate this saves about 3 seconds of load time (15%).
@vtjnash Is this ok to backport to 1.10? We may need to do that, since the sparse hvcat changes will get pulled in with all the other SparseArrays PRs on SparseArrays master.
Sure. It was ready before that, but was just waiting on SparseArrays. The performance boost to loading and running seemed worth it for backport.
Since we are seeing some errors when testing Oscar with julia nightly after this was merged, is it intended that this will cause anything that
to be converted to
We have a custom boolean sparse matrix (it stores only the true values and prints as their indices) in Oscar:
julia> IM = IncidenceMatrix([[1, 3, 7], [4, 5, 6]])
2×7 IncidenceMatrix
[1, 3, 7]
[4, 5, 6]
julia> IM isa SparseArrays.AbstractSparseArray
true
julia> vcat(IM,IM)
4×7 SparseMatrixCSC{UInt8, Int64} with 12 stored entries:
0x01 ⋅ 0x01 ⋅ ⋅ ⋅ 0x01
⋅ ⋅ ⋅ 0x01 0x01 0x01 ⋅
0x01 ⋅ 0x01 ⋅ ⋅ ⋅ 0x01
⋅ ⋅ ⋅ 0x01 0x01 0x01 ⋅
Previously this went through the base implementation for abstract arrays and returned an IncidenceMatrix (e.g. on Julia 1.10-alpha1):
It is expected, though it is type piracy by SparseArrays, so it is an issue for that repo, not here.
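To make the dispatch behaviour in this exchange concrete, here is a minimal sketch using toy types. Every name in it (`AbstractToySparse`, `ToyIncidence`, `stack_rows`) is hypothetical and stands in for the real pieces; it is not the SparseArrays or Oscar code. It only shows that a method written against an abstract supertype will claim every subtype, and that a more specific method owned by the subtype's package can restore the original return type.

```julia
# All names below are hypothetical stand-ins, not Base, SparseArrays, or Oscar APIs.
abstract type AbstractToySparse{T} <: AbstractMatrix{T} end

# A boolean matrix that stores, per row, the column indices holding `true`.
struct ToyIncidence <: AbstractToySparse{Bool}
    rows::Vector{Vector{Int}}
    ncols::Int
end

Base.size(A::ToyIncidence) = (length(A.rows), A.ncols)
Base.getindex(A::ToyIncidence, i::Int, j::Int) = j in A.rows[i]

# Generic method written against the abstract supertype: it accepts every
# subtype and returns a dense Matrix, so the wrapper type is lost.
stack_rows(A::AbstractToySparse, B::AbstractToySparse) = vcat(Matrix(A), Matrix(B))

# More specific method owned by the wrapper type: dispatch prefers it, so the
# result keeps the ToyIncidence type.
function stack_rows(A::ToyIncidence, B::ToyIncidence)
    A.ncols == B.ncols || throw(DimensionMismatch("column counts differ"))
    return ToyIncidence(vcat(A.rows, B.rows), A.ncols)
end

inc = ToyIncidence([[1, 3, 7], [4, 5, 6]], 7)

# Normal dispatch picks the specific method.
stacked = stack_rows(inc, inc)

# Forcing the generic method shows what happens without the specific one.
dense = invoke(stack_rows, Tuple{AbstractToySparse,AbstractToySparse}, inc, inc)
```

In this sketch `stacked` is a `ToyIncidence` while `dense` is a plain `Matrix{Bool}`. Whether defining a more specific `vcat` method is the right fix for a real `AbstractSparseArray` subtype is a question for the packages involved.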
Backported PRs:
- [x] #50411 <!-- Fix weird dispatch of * with zero arguments -->
- [x] #50202 <!-- Remove dynamic dispatch from _wait/wait2 -->
- [x] #50064 <!-- Fix numbered prompt with input only with comment -->
- [x] #50026 <!-- Store heapsnapshot files in tempdir() instead of current directory -->
- [x] #50402 <!-- Add CPU feature helper function -->
- [x] #50387 <!-- update newpages pointer after actually sweeping pages -->
- [x] #50424 <!-- avoid potential type-instability in _replace_(str, ...) -->
- [x] #50444 <!-- Optimize getfield lowering to avoid boxing in some cases -->
- [x] #50474 <!-- docs: Fix a `!!! note` which was miscapitalized -->
- [x] #50466 <!-- relax assertion involving pg->nold to reflect that it may be a bit in… -->
- [x] #50490 <!-- Fix compat annotation for italic printstyled -->
- [x] #50488 <!-- fix typo in `Base.isassigned` with `Tridiagonal` -->
- [x] #50476 <!-- Profile: Add specifying dir for `take_heap_snapshot` and handling if current dir is unwritable -->
- [x] #50461 <!-- fix typo in the --gcthreads argument description -->
- [x] #50528 <!-- ssair: Correctly handle stmt insertion at end of basic block -->
- [x] #50533 <!-- ensure internal_obj_base_ptr checks whether objects past freelist pointer are in freelist -->
- [x] #49322 <!-- improve cat design / performance -->
- [x] #50540 <!-- gc: remove over-eager assertion -->
- [x] #50542 <!-- gf: remove unnecessary assert cycle==depth -->
- [x] #50559 <!-- Expand kwcall lowering positional default check to vararg -->
- [x] #50058 <!-- Add unwrapping mechanism for triangular mul and solves -->
- [x] #50551 <!-- typeintersect: also record chained `innervars` -->
- [x] #50552 <!-- read(io, Char): fix read with too many leading ones -->
- [x] #50541 <!-- precompile: ensure globals are not accidentally created where disallowed -->
- [x] #50576 <!-- use atomic compare exchange when setting the GC mark-bit -->
- [x] #50578 <!-- gf: make method overwrite/delete an error during precompile -->
- [x] #50516 <!-- Fix visibility of assert on GCC12/13 -->
- [x] #50597 <!-- Fix memory corruption if task is launched inside finalizer -->
- [x] #50591 <!-- build: fix various makefile bugs -->
- [x] #50599 <!-- faster invalid object lookup in conservative gc -->
- [x] #50634 <!-- 🤖 [master] Bump the SparseArrays stdlib from b4b0e72 to 99c99b4 -->
- [x] #50639 <!-- Backport LLVM patches to fix various issues. -->
- [x] #50546 <!-- Revert storage of method instance in LineInfoNode -->
- [x] #50631 <!-- Shift DCE pass to optimize imaging mode code better -->
- [x] #50525 <!-- only check that values are finite in `generic_lufact` when `check=true` -->
- [x] #50587 <!-- isassigned for ranges with BigInt indices -->
- [x] #50144 <!-- Page based heap size heuristics -->

Need manual backport:
- [ ] #50595 <!-- Rename ENV variable `JULIA_USE_NEW_PARSER` -> `JULIA_USE_FLISP_PARSER` -->

Non-merged PRs with backport label:
- [ ] #50637 <!-- Remove SparseArrays legacy code -->
- [ ] #50618 <!-- inference: continue const-prop' when concrete-eval returns non-inlineable -->
- [ ] #50598 <!-- only limit types in stack traces in the REPL -->
- [ ] #50594 <!-- Disallow non-index Integer types in isassigned -->
- [ ] #50568 <!-- `Array(::AbstractRange)` should return an `Array` -->
- [ ] #50523 <!-- Avoid generic call in most cases for getproperty -->
- [ ] #50172 <!-- print feature flags used for matching pkgimage -->
This used to make a lot of references to design issues with the SparseArrays package (#2326 / #20815), which result in a nonsensical dispatch arrangement and contribute to a slow loading experience due to the illogical Unions that must be checked by subtyping.
Requires those issues to be fixed similarly in SparseArrays first (JuliaSparse/SparseArrays.jl#384) before merging this.
It is hard to get a reliable measure of the exact impact, since that measurement fluctuates a bit between builds due to other factors. But we can see this uses a bit less memory now. I had also instrumented it previously and measured that this cost 0.5s of load time; that cost went down to pretty much zero after this change.