This repository has been archived by the owner on Dec 16, 2024. It is now read-only.

Allocate umfpack's workspace on julia's side and reuse it #67

Closed
wants to merge 4 commits

Conversation

@SobhanMP
Member

closes JuliaSparse/SparseArrays.jl#112, passes all tests (and some new). The only possible pain point is that it makes ldiv!(x, l, b) mutate l, which makes it thread unsafe. I could add a lock...
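
A minimal sketch of the lock idea (the lock object and wrapper name below are hypothetical, not part of the PR as written):

using LinearAlgebra: ldiv!

# Hypothetical: serialize solves that share one factorization's workspace.
const solve_lock = ReentrantLock()

function locked_ldiv!(x, l, b)
    # ldiv!(x, l, b) mutates l's workspace, so guard the call with a lock.
    lock(solve_lock) do
        ldiv!(x, l, b)
    end
end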

@codecov-commenter

codecov-commenter commented May 28, 2022

Codecov Report

Merging #67 (8443aeb) into master (8e83ebc) will decrease coverage by 0.12%.
The diff coverage is 91.11%.

@@            Coverage Diff             @@
##           master      JuliaSparse/SuiteSparse.jl#67      +/-   ##
==========================================
- Coverage   89.27%   89.14%   -0.13%     
==========================================
  Files           5        5              
  Lines        1361     1373      +12     
==========================================
+ Hits         1215     1224       +9     
- Misses        146      149       +3     
Impacted Files    Coverage Δ
src/umfpack.jl    86.09% <91.11%> (-0.34%) ⬇️


@SobhanMP
Member Author

added a test for matrix rhs

@SobhanMP
Member Author

and complex rhs with real lhs

@SobhanMP
Member Author

maybe we should just add a new ldiv!() with four arguments?
ldiv!(x, working_buffer, factorization, rhs)?

@rayegun
Member

rayegun commented May 28, 2022

> makes it thread unsafe

umfpack.jl is already not really thread safe, in fact you can't use multiple separate factorizations safely right now (see lines 123 and 125 of that file). I'm not sure how unsafe it is, but it's likely not safe. I will fix that in another PR but we don't need to worry about using the same UMFPACKFactorization object from multiple threads.

I have several changes I'll make directly if that's alright with you.

@SobhanMP
Member Author

SobhanMP commented May 28, 2022

sure, feel free to add changes, but what do you mean by directly?

i do think that the control array doesn't change from the umfpack side, though i can see that show_umf_ctrl isn't thread safe. I imagine storing a 20-element float array per factorization isn't a big deal.
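
For reference, UMFPACK's control settings live in a fixed-length double array (UMFPACK_CONTROL is 20 in umfpack.h). A per-factorization copy along those lines could look like this sketch; the struct and field names are hypothetical:

# Hypothetical sketch: give each factorization a private copy of UMFPACK's
# control array instead of sharing one global, so show_umf_ctrl and the
# solve ccalls can't race across factorizations.
const UMFPACK_CONTROL = 20  # length of the control array, from umfpack.h

struct UmfpackLUSketch
    control::Vector{Float64}  # the 20 Float64 settings passed to each ccall
end

UmfpackLUSketch() = UmfpackLUSketch(Vector{Float64}(undef, UMFPACK_CONTROL))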

@SobhanMP
Member Author

rip old solve! :P, i think it would have been good to keep it for testing purposes.

@rayegun
Member

rayegun commented May 28, 2022

It's the writing to Info that I think is more problematic. We haven't received any complaints about it yet though, so it may be fine. Regardless, it should be stored with each factorization, in a separate PR.

I just pushed some changes that I think simplify some things. It seems to speed things up while reducing the number of allocations just as dramatically as before. Which is very nice!

Hmm, you might be right about old solve! I'm not sure. I don't want the user to carry around workspaces that aren't used if they don't pass them explicitly. We could add a noworkspace kwarg that does the original solve ccalls.

I don't think there will be any significant differences. It's the same codepath in UMFPACK.
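
A sketch of that kwarg idea (solve_allocating! and solve_with_workspace! are hypothetical helper names standing in for the two ccall paths, and the workspace field is the one this PR adds):

# Hypothetical: one entry point that either reuses the stored workspace
# or falls back to the original allocating ccall path.
function solve!(x, lu, b; noworkspace::Bool = false)
    if noworkspace
        solve_allocating!(x, lu, b)                    # original solve ccalls
    else
        solve_with_workspace!(x, lu, b, lu.workspace)  # reuse the buffers
    end
end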

@SobhanMP
Member Author

SobhanMP commented May 28, 2022

the solve w/o the workspace argument that i had left behind would have done the job; unless explicitly asked for, it wouldn't have done anything, though it's your call. i would opt for keeping the old solve just for testing: it's not exposed, so the "normal" user won't even notice that it's there. i wrote a bunch of tests that compare the workspace-full and workspace-less versions.
maybe just change the name? like alloc_solve!?

also your changes do make things a bit prettier 👍

@rayegun
Member

rayegun commented May 28, 2022

Oh I see what the intent was. I hesitate to keep dead codepaths around for the purposes of testing. If we were to, for instance, add thread-local workspaces, or make other modifications to solve!, we would have to duplicate those changes.

You could add those solve! methods to the test suite. Let me think about it.

Edit: I also prefer the workspaces as kwargs, so we're fixing the performance issue without changing signatures.

@SobhanMP
Member Author

i see, i guess making everyone carry code only used for testing isn't optimal. i'll add them back to the test file.

i'll give it another shot, this time with the workspace in a separate structure, but i have to think a bit. do you think a four-arg ldiv! is a good idea?

@rayegun
Member

rayegun commented May 29, 2022

I like the current iteration if you add the test methods. If I were to change anything, I would just make W and Wi global but thread-local (and lazy).

Something like:

using Base.Threads: nthreads, threadid

# One lazily initialized workspace per thread.
const W = Vector{Vector{Float64}}(undef, nthreads())

function getworkspace(n)
    id = threadid()
    # Create this thread's workspace on first use; grow it if it is too small.
    if !isassigned(W, id) || length(W[id]) < n
        W[id] = Vector{Float64}(undef, n)
    end
    return W[id]
end

solve!(...; W = getworkspace(n))

Same with Wi, does that make sense?

This (with moving Control and Info into UMFPACKFactorization in a separate PR) finally makes UMFPACK thread safe and fast!

What's your proposed ldiv! signature? I think the global/thread-local workspaces are cleaner, and match what CHOLMOD does for instance.

@SobhanMP
Member Author

SobhanMP commented May 29, 2022

The problem with this is that after the large matrices are gone, it's not gonna free the memory. also, is this compatible with Polyester threads? it somehow feels unsafe, as tasks can (will be able to) jump between threads (i'm not sure, i just keep hearing that this per-thread local storage pattern is broken).

my suggested ldiv! signature is

ldiv!(x, lu, b; workspace=lu.workspace)

and workspace is a struct containing both W and Wi.

edit:
workspace can also be nothing for the old solve!?
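
A sketch of what that struct could look like (UmfpackWS is the name that comes up later in this thread; the buffer sizes follow UMFPACK's wsolve requirements, n integers for Wi and up to 10n floats for W in the complex-with-refinement case, and the constructor is an assumption):

# Hypothetical sketch: bundle UMFPACK's two scratch buffers in one struct.
struct UmfpackWS{T<:Integer}
    Wi::Vector{T}        # integer workspace, length n
    W::Vector{Float64}   # float workspace, up to 10n for complex systems
end

UmfpackWS(::Type{Ti}, n::Integer) where {Ti<:Integer} =
    UmfpackWS(Vector{Ti}(undef, n), Vector{Float64}(undef, 10n))

# The proposed call, with the factorization's own workspace as the default:
# ldiv!(x, lu, b; workspace = lu.workspace)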

@SobhanMP
Member Author

this should take care of the tests.

@rayegun
Member

rayegun commented May 29, 2022

Tests lgtm, the ldiv! seems fine as well. You're right about the thread-local stuff; let's keep it within the factorization. I think once you add the ldiv overload this should be ready.

@SobhanMP
Member Author

Should be good now :)

@SobhanMP
Member Author

Now we either need to add a lock to UmfpackWS or add a big red warning somewhere.

@rayegun
Member

rayegun commented May 29, 2022

I think it should just be documented for now (i.e., users must pass in a new workspace to use this factorization from multiple threads). Let's see if someone more senior chimes in; if they don't in the next week or so, we can go ahead and merge as is with some extra docs.
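
A sketch of what such a docs example might show, using the hypothetical workspace kwarg and UmfpackWS constructor from above (not a confirmed API):

using SparseArrays, LinearAlgebra
using Base.Threads: @threads

A = sprand(100, 100, 0.05) + 100I   # well-conditioned sparse test matrix
F = lu(A)
B = [rand(100) for _ in 1:8]
X = [similar(b) for b in B]

# Give each task its own workspace so no two concurrent solves share buffers.
@threads for i in 1:8
    ws = UmfpackWS(Int64, 100)          # hypothetical per-task workspace
    ldiv!(X[i], F, B[i]; workspace=ws)  # hypothetical kwarg from this PR
end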

@SobhanMP
Member Author

looking forward to finally being able to use it.
