Faster cache lookup in broadcast! via nested Dicts and get! macro #6107
Conversation
FWIW, changing this to use the existing `get!` with do-block syntax also works, though it is somewhat slower in my tests. Diff:
diff --git a/base/broadcast.jl b/base/broadcast.jl
index f467df6..89050ae 100644
--- a/base/broadcast.jl
+++ b/base/broadcast.jl
@@ -219,9 +219,15 @@ for (Bsig, Asig, gbf, gbb) in
function broadcast!(f::Function, B::$Bsig, As::$Asig...)
nd = ndims(B)
narrays = length(As)
- cache_f = @get!(broadcast_cache, f, Dict{Int,Dict{Int,Function}}())
- cache_f_nd = @get!(cache_f, nd, Dict{Int,Function}())
- func = @get!(cache_f_nd, narrays, $gbf($gbb, nd, narrays, f))
+ cache_f = get!(broadcast_cache, f ) do
+ Dict{Int,Dict{Int,Function}}()
+ end
+ cache_f_nd = get!(cache_f, nd ) do
+ Dict{Int,Function}()
+ end
+ func = get!(cache_f_nd, narrays) do
+ $gbf($gbb, nd, narrays, f)
+ end
func(B, As...)
B
end
@kmsquire: Yes, I had tested it at some point; I was under the impression that the gap was bigger, but in fact I see a performance difference of the same order as what you report, i.e. operations on 10x10 matrices with preallocation get a 1.2x to 1.3x penalty for using `get!` with a do-block. Maybe that's acceptable. But I actually still lean towards using the macro, since it's for internal use only (until passing around functions becomes faster) and 25% is not really negligible. Opinions?
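The trade-off being debated can be seen in a small sketch (illustrative names, not the actual Base internals): the three-argument `get!` evaluates its default argument eagerly, even on a cache hit, while the do-block form only runs the block on a miss.

```julia
# Sketch: eager vs. lazy defaults in get!. `make_default` and `calls`
# are assumptions of this example, used to count default construction.
calls = Ref(0)
make_default() = (calls[] += 1; Dict{Int,Int}())

cache = Dict{Symbol,Dict{Int,Int}}()
cache[:a] = Dict{Int,Int}()

# Eager: the default is constructed even though :a already exists.
get!(cache, :a, make_default())
@assert calls[] == 1

# Lazy: the do-block runs only on a miss.
get!(cache, :a) do       # hit -> block not executed
    make_default()
end
@assert calls[] == 1

get!(cache, :b) do       # miss -> block executed once
    make_default()
end
@assert calls[] == 2
```

The do-block form avoids the wasted allocation, but at the time of this PR passing an anonymous function carried its own overhead, which is the 1.2x–1.3x penalty discussed above.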
It seems fine to me. One small comment: we may want to think about indexing things in terms of […]. Regarding […].
You're right, I have updated the PR. |
Travis failure is unrelated (it's in the arpack test). Also, here's a link to the issue where `@get!` was discussed.
Yeah, I think I like the macro syntax better as well (and it is faster). Cheers! |
Since there seem to be no objections, I'll rebase and merge. |
Same as the `get!` function, but does not evaluate the default argument unless needed.
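The `@get!` macro itself is not shown in this thread; a hypothetical short-circuiting version (named `@lazyget!` here, to avoid claiming it matches the Base implementation) could be sketched as:

```julia
# Hypothetical sketch of a short-circuiting Dict lookup macro in the
# spirit of @get!; the default expression is spliced directly into the
# else branch, so it is evaluated only on a miss.
macro lazyget!(d, k, default)
    quote
        local dict = $(esc(d))
        local key  = $(esc(k))
        if haskey(dict, key)
            dict[key]
        else
            dict[key] = $(esc(default))  # runs only when key is absent
        end
    end
end

d = Dict{Int,String}()
@lazyget!(d, 1, "one")                   # miss: stores and returns "one"
@assert d[1] == "one"
@lazyget!(d, 1, error("not evaluated"))  # hit: default never runs
```

This sketch does two hash lookups (`haskey` then `getindex`); the real macro can avoid that by using the hash table's internal key index, which is part of why it is faster than the do-block form.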
Change the broadcast! cache to use nested Dicts instead of a Dict with Tuple keys; use the @get! macro to access its fields. Related to #6041. Also reduces code duplication in broadcast! definitions via the @eval macro.
See JuliaLang/LinearAlgebra.jl#89. Uses nested Dicts instead of indexing with a Tuple. Ugly but effective. The commit also avoids some code duplication via @eval tricks. My only concern with this is the introduction of the @get! macro for Dicts (not exported, though); however, it is critical for the performance boost. I'll merge if there are no objections.
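The layout change can be illustrated side by side (illustrative names only; the real broadcast_cache and kernel generator live in Base and are not shown in this thread):

```julia
# `build_kernel` stands in for the generated broadcast kernel and is an
# assumption of this example.
build_kernel(f, nd, narrays) = (B, As...) -> nothing

# Before: one Dict keyed by a Tuple; every lookup must hash the whole
# (f, nd, narrays) key.
tuple_cache = Dict{Tuple{Function,Int,Int},Function}()
func = get!(tuple_cache, (+, 2, 3)) do
    build_kernel(+, 2, 3)
end

# After: nested Dicts, as in this PR's diff; each level is a cheap
# lookup on a small table, and no Tuple key is built on the hot path.
nested_cache = Dict{Function,Dict{Int,Dict{Int,Function}}}()
cache_f = get!(nested_cache, +) do
    Dict{Int,Dict{Int,Function}}()
end
cache_f_nd = get!(cache_f, 2) do
    Dict{Int,Function}()
end
func2 = get!(cache_f_nd, 3) do
    build_kernel(+, 2, 3)
end
```

The do-block form is used here for readability; the merged code uses @get! for the same three-level lookup to avoid the anonymous-function overhead.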