Batch evaluations #819

Merged (15 commits into master, Oct 16, 2023)
Conversation

termi-official (Member) commented Oct 12, 2023

This PR introduces internal batched evaluations where possible, which is interesting for matrix-free and high-order stuff (and maybe threading?). We might want to expose the functionality to users in #766. It should also help with eval times in #790. In interpolations.jl there is an outline for #389.

Due to the interaction, this PR might precede #790 and #798. Furthermore, it is part of the infrastructure for #766.
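For orientation, here is a minimal usage sketch of the batched entry points (shape_values! and shape_gradients_and_values!, as they appear in the diff); the concrete buffer types are just one choice, any AbstractArray of matching length should do:

using Ferrite

ip = Lagrange{RefQuadrilateral, 2}()
ξ  = Vec{2}((0.1, -0.3))

# Preallocate buffers once, then fill them in a single batched call instead of
# looping over shape_value(ip, ξ, i) for every base function separately.
N    = zeros(getnbasefunctions(ip))
dNdξ = zeros(Vec{2, Float64}, getnbasefunctions(ip))

shape_values!(N, ip, ξ)                      # all shape function values at ξ
shape_gradients_and_values!(dNdξ, N, ip, ξ)  # gradients and values in one pass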

TODOs

  • exports for new functions
  • better docs

codecov-commenter commented Oct 13, 2023

Codecov Report

Attention: 5 lines in your changes are missing coverage. Please review.

Comparison is base (e9ae940) 92.77% compared to head (a736e65) 92.33%.
Report is 1 commit behind head on master.


Additional details and impacted files
@@            Coverage Diff             @@
##           master     #819      +/-   ##
==========================================
- Coverage   92.77%   92.33%   -0.44%     
==========================================
  Files          33       33              
  Lines        4952     4955       +3     
==========================================
- Hits         4594     4575      -19     
- Misses        358      380      +22     
Files Coverage Δ
src/FEValues/cell_values.jl 100.00% <100.00%> (ø)
src/FEValues/face_values.jl 98.21% <100.00%> (-0.07%) ⬇️
src/PointEval/PointEvalHandler.jl 92.21% <ø> (-0.60%) ⬇️
src/PointEval/point_values.jl 96.96% <100.00%> (-0.40%) ⬇️
src/interpolations.jl 96.23% <68.75%> (-0.78%) ⬇️

... and 3 files with indirect coverage changes


termi-official marked this pull request as ready for review October 13, 2023 01:10
AbdAlazezAhmed (Collaborator) left a comment:

LGTM

Comment on lines 212 to 213
function shape_values!(values::AT, ip::IP, ξ::Vec) where {IP <: Interpolation, AT <: AbstractArray}
@inbounds for i in 1:getnbasefunctions(ip)
Member:

Shouldn't there be a checkbounds here as default, which can be removed by calling @inbounds shape_values!(...)?

Suggested change:
- function shape_values!(values::AT, ip::IP, ξ::Vec) where {IP <: Interpolation, AT <: AbstractArray}
-     @inbounds for i in 1:getnbasefunctions(ip)
+ @propagate_inbounds function shape_values!(values::AT, ip::IP, ξ::Vec) where {IP <: Interpolation, AT <: AbstractArray}
+     @boundscheck checkbounds(values, 1:getnbasefunctions(ip))
+     @inbounds for i in 1:getnbasefunctions(ip)
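With @propagate_inbounds the caller decides whether the check runs; a minimal sketch of the resulting calling pattern (the buffer name is illustrative):

N = zeros(getnbasefunctions(ip))

shape_values!(N, ip, ξ)            # @boundscheck runs, N is checked against getnbasefunctions(ip)
@inbounds shape_values!(N, ip, ξ)  # the @boundscheck block is elided inside the callee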

Comment on lines 493 to 561
# struct TensorProductLagrange{shape, order} <: ScalarInterpolation{shape, order}
# end

# adjust_dofs_during_distribution(::TensorProductLagrange) = true
# adjust_dofs_during_distribution(::TensorProductLagrange{<:Any, 2}) = false
# adjust_dofs_during_distribution(::TensorProductLagrange{<:Any, 1}) = false

# TODO support these orderings in the dof handler close
# vertexdof_indices(::TensorProductLagrange{RefLine,order}) where {order} = ((1,),(order+1,))
# vertexdof_indices(::TensorProductLagrange{RefQuadrilateral,order}) where {order} = ((1,),(order+1,), (order*(order+1),), ((order+1)*(order+1),))
# ...

# Sum factorized eval for tensor product elements
# function shape_values!(values::AT, ip::Lagrange{RefHypercube{2},order}, ξ::Vec{2,T}) where {T, AT <: AbstractArray, order}
# # Auxiliary 1D interpolation
# ip_1d = Lagrange{RefLine,order}()
# n_basefunctions_1d = getnbasefunctions(ip_1d)
# # Evaluate 1D
# evals_x_1d = ntuple(i->shape_value(ip_1d, Vec{1,T}((ξ[1],)), i), order+1)
# evals_y_1d = ntuple(i->shape_value(ip_1d, Vec{1,T}((ξ[2],)), i), order+1)
# # Sum factorization
# values_2d = reshape(values, (n_basefunctions_1d, n_basefunctions_1d))
# for ix in 1:n_basefunctions_1d, iy in 1:n_basefunctions_1d
# values_2d[ix, iy] = evals_x_1d[ix] * evals_y_1d[iy]
# end
# TODO permute values_2d
# end

# function shape_gradients!(gradients_2d::AT, ip::Lagrange{RefHypercube{2},order}, ξ::Vec{2,T}) where {T, AT <: AbstractArray, order}
# # Auxiliary 1D interpolation
# ip_1d = Lagrange{RefLine,order}()
# n_basefunctions_1d = getnbasefunctions(ip_1d)
# # Evaluate 1D
# evals_x_1d = ntuple(i->shape_value(ip_1d, Vec{1,T}((ξ[1],)), i), order+1)
# evals_y_1d = ntuple(i->shape_value(ip_1d, Vec{1,T}((ξ[2],)), i), order+1)
# evals_dx_1d = ntuple(i->shape_gradient(ip_1d, Vec{1,T}((ξ[1],)), i), order+1)
# evals_dy_1d = ntuple(i->shape_gradient(ip_1d, Vec{1,T}((ξ[2],)), i), order+1)
# # Sum factorization
# gradients_2d_2d = reshape(gradients_2d, (n_basefunctions_1d, n_basefunctions_1d))
# for ix in 1:n_basefunctions_1d, iy in 1:n_basefunctions_1d
# gradients_2d_2d[ix, iy, 1] = ...
# gradients_2d_2d[ix, iy, 2] = ...
# end
# TODO permute gradients_2d_2d
# end

# function shape_gradients_and_values!(gradients::GAT, shapes::SAT, ip::Lagrange{RefHypercube{2},order}, ξ::Vec ) where {T, SAT <: AbstractArray, GAT <: AbstractArray, dim, order}
# # Auxiliary 1D interpolation
# ip_1d = Lagrange{RefLine,order}()
# n_basefunctions_1d = getnbasefunctions(ip_1d)
# # Evaluate 1D
# evals_x_1d = ntuple(i->shape_value(ip_1d, Vec{1,T}((ξ[1],)), i), order+1)
# evals_y_1d = ntuple(i->shape_value(ip_1d, Vec{1,T}((ξ[2],)), i), order+1)
# evals_dx_1d = ntuple(i->shape_gradient(ip_1d, Vec{1,T}((ξ[1],)), i), order+1)
# evals_dy_1d = ntuple(i->shape_gradient(ip_1d, Vec{1,T}((ξ[2],)), i), order+1)
# # Sum factorization
# values_2d = reshape(values, (n_basefunctions_1d, n_basefunctions_1d))
# gradients_2d = reshape(gradients, (n_basefunctions_1d, n_basefunctions_1d))
# for ix in 1:n_basefunctions_1d, iy in 1:n_basefunctions_1d
# values_2d[ix, iy, 1] = ...
# values_2d[ix, iy, 2] = ...
# end
# TODO permute values_2d
# TODO permute gradients_2d_2d
# end

# TODO implement kernels above for 3D
# TODO implement vectorized interpolation case

KnutAM (Member) commented Oct 13, 2023

Should this be removed before merge?
Seems better to add as a comment to #389 ?

termi-official (Member, Author) replied:

I think so. This is basically a template to guide how the evals might be implemented in #790.

ip::IP
end

function PointValuesInternal(ξ::Vec{dim, T}, ip::IP) where {dim, T, shape <: AbstractRefShape{dim}, IP <: Interpolation{shape}}
n_func_basefuncs = getnbasefunctions(ip)
N = [shape_value(ip, ξ, i) for i in 1:n_func_basefuncs]
return PointValuesInternal{IP, eltype(N)}(N, ip)
N = MVector(ntuple(i->shape_value(ip, ξ, i), n_func_basefuncs))
Member:

Is there any performance benefit of using MVector here? (Nice to document why since we don't use it elsewhere)

termi-official (Member, Author) replied:

Not sure. I am just trying to get stuff ready for GPU mode. :)

Member:

OK, would it make sense to leave that for a separate issue then, instead of hard-coding a different vector type?

My impression is that other efforts (at least those I'm involved in) are more about allowing GPU-compatible arrays instead of enforcing them?
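For illustration, keeping the signatures generic over AbstractArray (as they are in this PR) already lets the caller pick the storage without hard-coding MVector; a small sketch, where the StaticArrays buffer is only an example:

using StaticArrays

ip = Lagrange{RefQuadrilateral, 1}()
ξ  = Vec{2}((0.0, 0.0))

# Heap-allocated buffer for regular CPU assembly
N_vec = zeros(getnbasefunctions(ip))
shape_values!(N_vec, ip, ξ)

# Stack-allocated buffer, e.g. for GPU kernels or hot inner loops
N_mvec = zero(MVector{4, Float64})  # 4 base functions for linear quadrilaterals
shape_values!(N_mvec, ip, ξ)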

KnutAM (Member) commented Oct 13, 2023

These functions are really nice!
Will use them in #764 if this gets merged!

termi-official (Member, Author) commented

So this one precedes #764?

KnutAM (Member) commented Oct 13, 2023

So this one precedes #764?

If this could be merged quickly, then yes.
Most of the changes in this PR would get reverted though.
But in that case I can use these functions internally, which is nice (I kind of did something similar, but not as nice/general, with

function precompute_values!(fv::FunctionValues, qr::QuadratureRule)
    n_shape = getnbasefunctions(fv.ip)
    for (qp, ξ) in pairs(getpoints(qr))
        for i in 1:n_shape
            fv.dNdξ[i, qp], fv.N_ξ[i, qp] = shape_gradient_and_value(fv.ip, ξ, i)
        end
    end
end

Seeing this, I'll at least use this internally, and potentially an even better option would be to extend to shape_gradient_value!(dNdxi::AbstractMatrix, N_xi::AbstractMatrix, ip, qr::QuadratureRule) and skip the precompute_values! function.
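A rough sketch of what such a quadrature-batched method could look like, reusing the shape_gradients_and_values! name from this PR (this QuadratureRule method is hypothetical and not part of the PR):

function shape_gradients_and_values!(dNdξ::AbstractMatrix, N::AbstractMatrix, ip::Interpolation, qr::QuadratureRule)
    # One row per base function, one column per quadrature point.
    for (qp, ξ) in pairs(getpoints(qr))
        for i in 1:getnbasefunctions(ip)
            dNdξ[i, qp], N[i, qp] = shape_gradient_and_value(ip, ξ, i)
        end
    end
    return dNdξ, N
end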

termi-official (Member, Author) commented

So this one precedes #764?

If this could be merged quickly, then yes. Most of the changes in this PR would get reverted though. But in that case I can use these functions internally, which is nice (I kind of did something similar, but not as nice/general, with

The purpose of this PR is also more to get the discussion going on these changes, so I don't end up with a 3k-line diff for the GPU PR. Even if large parts are reverted, I am fine with it, no worries. :)

Seeing this, I'll at least use this internally, and potentially an even better option would be to extend to shape_gradient_value!(dNdxi::AbstractMatrix, N_xi::AbstractMatrix, ip, qr::QuadratureRule) and skip the precompute_values! function.

Out of curiosity, do you have a use case where we need such large batches? But yes, we can easily extend this one here to the full quadrature batch.

KnutAM (Member) commented Oct 14, 2023

Out of curiosity, do you have a use case where we need such large batches? But yes, we can easily extend this one here to the full quadrature batch.

Nah, I was perhaps overly enthusiastic at first :) (because it is only used in one location, so it would just be moving code around for now). I think just having these should suffice, and I'll use views in precompute_values!. In that respect, it doesn't matter much either way which gets merged first.

One case I thought of, for future work, could be to actually define shape_values(ip) = @inbounds shape_values!(Vector{shape_value_type(ip)}(undef, getnbasefunctions(ip)), ip) and similar for shape_gradients_values and potentially shape_hessians_gradients_values. But I think this should be left for the future for now. :) Internally, this would be used in the creation of FunctionValues and GeometryMapping, but it would make things convenient for users wanting to extend things.
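For illustration, such an allocating convenience wrapper could look roughly like this; shape_value_type does not exist yet, so this sketch derives the element type from shape_value instead, and the ξ argument is my assumption about the eventual signature:

# Hypothetical allocating wrapper around the in-place batched evaluation.
function shape_values(ip::Interpolation, ξ::Vec)
    N = Vector{typeof(shape_value(ip, ξ, 1))}(undef, getnbasefunctions(ip))
    @inbounds shape_values!(N, ip, ξ)
    return N
end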

fredrikekre (Member) commented

I added the bounds checks and removed the extra stuff. I backed up the PR on the branch fe/do/ip-batched-eval in case you need it. Looks like shape_gradients! is untested, by the way; it used an undefined variable before my push.

fredrikekre merged commit fe44e7f into master Oct 16, 2023
7 checks passed
fredrikekre deleted the do/ip-batched-eval branch October 16, 2023 07:43
fredrikekre removed their request for review October 16, 2023 07:44
Evaluate all shape function gradients and values of `ip` at once at the reference point `ξ`
and store them in `gradients` and `values`, respectively.
"""
function shape_gradients_and_values!(gradients::GAT, values::SAT, ip::IP, ξ::Vec) where {IP <: Interpolation, SAT <: AbstractArray, GAT <: AbstractArray}
Member:

Isn't @propagate_inbounds required here (and for shape_gradients! above) if we want to avoid the bounds checks, @fredrikekre?

Member:

Yes, missed these ones. It would also be good to benchmark whether it is useful or not, I suppose.
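A quick way to check could be something along these lines (BenchmarkTools-based sketch; the interpolation and buffers are illustrative):

using BenchmarkTools, Ferrite

ip   = Lagrange{RefHexahedron, 2}()
ξ    = Vec{3}((0.1, 0.2, 0.3))
N    = zeros(getnbasefunctions(ip))
dNdξ = zeros(Vec{3, Float64}, getnbasefunctions(ip))

checked!(g, v, ip, ξ)   = shape_gradients_and_values!(g, v, ip, ξ)
unchecked!(g, v, ip, ξ) = @inbounds shape_gradients_and_values!(g, v, ip, ξ)

@btime checked!($dNdξ, $N, $ip, $ξ)    # with bounds checks
@btime unchecked!($dNdξ, $N, $ip, $ξ)  # checks elided via @propagate_inbounds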
