-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add unwrapping mechanism for triangular mul and solves #50058
Conversation
@nanosoldier @nanosoldier |
Your benchmark job has completed - no performance regressions were detected. A full report can be found here. |
Your package evaluation job has completed - possible new issues were detected. |
@nanosoldier |
Your package evaluation job has completed - possible new issues were detected. |
@nanosoldier |
Your benchmark job has completed, but no benchmarks were actually executed. Perhaps your tag predicate contains misspelled tags? cc @ |
@nanosoldier |
Your benchmark job has completed - no performance regressions were detected. A full report can be found here. |
Excellent, thank you. So, after the renaming the function to be overloaded is now |
Hm, the "failure" of the exact equality is, in more detail, this: julia> ATa \ sparse(B) - ATa \ B
10×2 Matrix{Float64}:
-1.11022e-16 -1.11022e-16
-1.04083e-16 -1.04083e-16
5.55112e-17 5.55112e-17
-2.77556e-17 -2.77556e-17
0.0 0.0
0.0 0.0
0.0 0.0
0.0 0.0
0.0 0.0
0.0 0.0
julia> norm(ATa \ sparse(B) - ATa \ B)/norm(ATa \ B)
1.451210901831765e-16 How can we make this pass, without breaking |
a215494
to
c533ca7
Compare
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. |
With this PR, we are down to 31 |
b5f2b0f
to
e67ddaa
Compare
Backported PRs: - [x] #50411 <!-- Fix weird dispatch of * with zero arguments --> - [x] #50202 <!-- Remove dynamic dispatch from _wait/wait2 --> - [x] #50064 <!-- Fix numbered prompt with input only with comment --> - [x] #50026 <!-- Store heapsnapshot files in tempdir() instead of current directory --> - [x] #50402 <!-- Add CPU feature helper function --> - [x] #50387 <!-- update newpages pointer after actually sweeping pages --> - [x] #50424 <!-- avoid potential type-instability in _replace_(str, ...) --> - [x] #50444 <!-- Optimize getfield lowering to avoid boxing in some cases --> - [x] #50474 <!-- docs: Fix a `!!! note` which was miscapitalized --> - [x] #50466 <!-- relax assertion involving pg->nold to reflect that it may be a bit in… --> - [x] #50490 <!-- Fix compat annotation for italic printstyled --> - [x] #50488 <!-- fix typo in `Base.isassigned` with `Tridiagonal` --> - [x] #50476 <!-- Profile: Add specifying dir for `take_heap_snapshot` and handling if current dir is unwritable --> - [x] #50461 <!-- fix typo in the --gcthreads argument description --> - [x] #50528 <!-- ssair: Correctly handle stmt insertion at end of basic block --> - [x] #50533 <!-- ensure internal_obj_base_ptr checks whether objects past freelist pointer are in freelist --> - [x] #49322 <!-- improve cat design / performance --> - [x] #50540 <!-- gc: remove over-eager assertion --> - [x] #50542 <!-- gf: remove unnecessary assert cycle==depth --> - [x] #50559 <!-- Expand kwcall lowering positional default check to vararg --> - [x] #50058 <!-- Add unwrapping mechanism for triangular mul and solves --> - [x] #50551 <!-- typeintersect: also record chained `innervars` --> - [x] #50552 <!-- read(io, Char): fix read with too many leading ones --> - [x] #50541 <!-- precompile: ensure globals are not accidentally created where disallowed --> - [x] #50576 <!-- use atomic compare exchange when setting the GC mark-bit --> - [x] #50578 <!-- gf: make method overwrite/delete an error during precompile --> - [x] #50516 <!-- Fix visibility of assert on GCC12/13 --> - [x] #50597 <!-- Fix memory corruption if task is launched inside finalizer --> - [x] #50591 <!-- build: fix various makefile bugs --> - [x] #50599 <!-- faster invalid object lookup in conservative gc --> - [x] #50634 <!-- 🤖 [master] Bump the SparseArrays stdlib from b4b0e72 to 99c99b4 --> - [x] #50639 <!-- Backport LLVM patches to fix various issues. --> - [x] #50546 <!-- Revert storage of method instance in LineInfoNode --> - [x] #50631 <!-- Shift DCE pass to optimize imaging mode code better --> - [x] #50525 <!-- only check that values are finite in `generic_lufact` when `check=true` --> - [x] #50587 <!-- isassigned for ranges with BigInt indices --> - [x] #50144 <!-- Page based heap size heuristics --> Need manual backport: - [ ] #50595 <!-- Rename ENV variable `JULIA_USE_NEW_PARSER` -> `JULIA_USE_FLISP_PARSER` --> Non-merged PRs with backport label: - [ ] #50637 <!-- Remove SparseArrays legacy code --> - [ ] #50618 <!-- inference: continue const-prop' when concrete-eval returns non-inlineable --> - [ ] #50598 <!-- only limit types in stack traces in the REPL --> - [ ] #50594 <!-- Disallow non-index Integer types in isassigned --> - [ ] #50568 <!-- `Array(::AbstractRange)` should return an `Array` --> - [ ] #50523 <!-- Avoid generic call in most cases for getproperty --> - [ ] #50172 <!-- print feature flags used for matching pkgimage -->
This adds an unwrapping mechanism to triangular matrices, basically following the BLAS example in terms of characters encoding wrappers. It mirrors the
AdjOrTransOrHermOrSym
mechanism closely. Packages that want to overload by storage type can overloadgeneric_trimatmul!
(and potentiallygeneric_matrimul!
). Note the similarity togeneric_matvecmul!
andgeneric_matmatmul!
. There is, unfortunately, some added code due to the fact that lazy conjugate wrappers have a different "wrapper depth" compared to the classic, e.g.,*Triangular{<:Any,<:Adjoint}
. I believe that with this PR we cover all wrappers of typically dense matrices with the unwrapping mechanism.An analogous approach could be applied toldiv!
, if that's of interest and of benefit to the ecosystem.