In both v1.6.3 and v1.7-rc1, the `dot` product between a dense matrix and a same-size `SubArray` produced by a `view` that slices along the last dimension falls back on the general `dot` product for `AbstractArray`s rather than efficiently using `BLAS.dot`. This can be improved by adding a new `dot` method tailored to such `SubArray` views, as illustrated in the code below. However, the hack below does not yet seem worthy of a PR because it is very specific to slicing a 3D array along the last dimension, like `@view array3d[:,:,slice]`. That is a pretty common way to slice, but it would be better for any new method to be more general. I would make a PR if I knew how to write a type like `SubArray{T, N-1, Array{T, N}, Tuple{Base.Slice{Base.OneTo{Int}}, ..., Base.Slice{Base.OneTo{Int}}, Int}, true}` to express "last dimension sliced".
    using LinearAlgebra: dot
    import LinearAlgebra # BLAS.dot
    using BenchmarkTools: @btime

    x = rand(100,200)
    y = rand(100,200,2)
    y = @view y[:,:,1] # view of slice along last dim - a common use case

    function f1(x, y) # basic dot product
        dot(x, y)
    end

    function f2(x, y) # dot product with vec()
        dot(vec(x), vec(y)) # this is almost optimally fast, but allocates per `@btime`
    end

    function f3(x, y) # call BLAS.dot directly
        LinearAlgebra.BLAS.dot(length(x), x, 1, y, 1) # this works because the SubArray data is contiguous
    end

    Slice2{T} = SubArray{T, 2, Array{T, 3}, Tuple{Base.Slice{Base.OneTo{Int}}, Base.Slice{Base.OneTo{Int}}, Int}, true}
    mydot(x::Array{S,2}, y::Slice2) where {S} = LinearAlgebra.BLAS.dot(length(x), x, 1, y, 1) # a hack that solves it

    function f4(x, y) # proposed
        mydot(x, y)
    end

    @assert f1(x, y) ≈ f2(x, y) ≈ f3(x, y) ≈ f4(x, y)
    @assert f1(x, y) != f2(x, y) # they call different dot methods !?
    @assert f2(x, y) == f3(x, y)

    @btime f1($x, $y) # 17.9 μs (0 allocations: 0 bytes)
    @btime f2($x, $y) # 1.2 μs (2 allocations: 80 bytes)
    @btime f3($x, $y) # 1.1 μs (0 allocations: 0 bytes)
    @btime f4($x, $y) # 1.1 μs (0 allocations: 0 bytes) <= this is the goal
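One possible way to generalize the `Slice2` alias above to any number of dimensions is to constrain only the first index type and the final `true` (linear-indexing) parameter of `SubArray`, leaving the remaining indices open with `Vararg{Any}`. The sketch below is untested against Base's own dispatch tables; the names `ContiguousView` and `contigdot` are hypothetical, and it assumes that a view whose first index is a `Base.Slice` and whose last type parameter is `true` occupies a contiguous block of its parent `Array`.

```julia
using LinearAlgebra

# Hypothetical alias: any-dimensional view of an Array whose first index is a
# full slice and which supports fast linear indexing (last parameter `true`).
# This is assumed to cover `@view a[:, :, k]`, `@view a[:, :, :, k]`, etc.
const ContiguousView{T,N} =
    SubArray{T,N,<:Array{T},<:Tuple{Base.Slice,Vararg{Any}},true}

# Hypothetical method: call BLAS.dot directly on the contiguous data.
function contigdot(x::Array{T}, y::ContiguousView{T}) where {T<:Union{Float32,Float64}}
    n = length(x)
    n == length(y) || throw(DimensionMismatch("x and y must have equal length"))
    return BLAS.dot(n, x, 1, y, 1)
end

x = rand(100, 200)
y3 = rand(100, 200, 2)
y = @view y3[:, :, 1]              # last dimension sliced
@assert y isa ContiguousView
@assert contigdot(x, y) ≈ dot(x, y)

# A strided view does not match the alias, so it would not reach BLAS here.
@assert !((@view y3[1:2:100, :, 1]) isa ContiguousView)
```

The alias is deliberately broader than "last dimension sliced": it would also match views such as `@view a[:, 2:3, 1]`, which are likewise contiguous.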
Pinging @dkarrasch as being one of the most recent people who committed to the general dot methods, albeit 2 years ago in JuliaLang/julia#32739 😄
We can extend `BLAS.dot` to arbitrary `StridedArray`s.
If `IndexStyle(x, y) isa IndexLinear`, then we can call the low-level API safely.
Otherwise, invoke the general version.
Since `IndexStyle(x, y)` is determined by the types, this won't harm run-time performance.
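A minimal sketch of that dispatch idea follows. The names `fastdot`, `_fastdot`, `_unitstride`, `RealBlasFloat`, and `ArrayOrView` are hypothetical, and the argument types are narrowed to `Array`s and `Array`-backed views to keep the example small. One caveat: `IndexLinear` by itself does not guarantee unit stride (for example, `view(v, 1:2:9)` is also `IndexLinear`), so this sketch adds a cheap unit-stride check before calling the low-level routine.

```julia
using LinearAlgebra

# Hypothetical names for illustration; none of these are LinearAlgebra API.
const RealBlasFloat = Union{Float32,Float64}
const ArrayOrView{T} = Union{
    Array{T},
    SubArray{T,<:Any,<:Array{T},<:Tuple{Vararg{Union{Int,AbstractRange{Int}}}}},
}

# Entry point: the index style is derived from the argument types, so the
# choice between the two `_fastdot` methods needs no run-time branching.
fastdot(x::ArrayOrView{T}, y::ArrayOrView{T}) where {T<:RealBlasFloat} =
    _fastdot(IndexStyle(x, y), x, y)

# True when consecutive linear indices are adjacent in memory.
_unitstride(a::Array) = true
_unitstride(a::SubArray) = ndims(a) > 0 && stride(a, 1) == 1

# Both arguments support fast linear indexing.
function _fastdot(::IndexLinear, x, y)
    n = length(x)
    n == length(y) || throw(DimensionMismatch("x and y must have equal length"))
    if _unitstride(x) && _unitstride(y)
        return BLAS.dot(n, x, 1, y, 1)   # contiguous data: low-level call
    end
    return dot(x, y)                     # linear but not unit stride
end

# Any other index style: use the general method.
_fastdot(::IndexStyle, x, y) = dot(x, y)

x = rand(100, 200)
y3 = rand(100, 200, 2)
@assert fastdot(x, @view y3[:, :, 1]) ≈ dot(x, @view y3[:, :, 1])
```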
BTW, `view(randn(100, 100), 1:2:100, :)` could in theory also be computed with `BLAS.dot`.
The solution above does not help with it.
Maybe a faster general `dot`, with `@simd`, a shared iterator, etc., would be better.
(I doubt it is worth doing layout checks at run time.)
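To illustrate how a view like `view(randn(100, 100), 1:2:100, :)` could still reach BLAS, here is a rough sketch that works one column at a time, since each column of a strided matrix is a strided vector. The function name `coldot` is hypothetical, and whether this beats a well-optimized general `dot` has not been measured here.

```julia
using LinearAlgebra

# Hypothetical helper: dot product of two same-size strided matrices,
# accumulated column by column.
function coldot(x::StridedMatrix{T}, y::StridedMatrix{T}) where {T<:Union{Float32,Float64}}
    size(x) == size(y) || throw(DimensionMismatch("x and y must have the same size"))
    s = zero(T)
    for j in axes(x, 2)
        # Each column is a strided vector; BLAS.dot picks up its stride.
        s += BLAS.dot(view(x, :, j), view(y, :, j))
    end
    return s
end

a = randn(100, 100)
x = view(a, 1:2:100, :)    # 50×100 view with stride 2 along the first dimension
y = randn(50, 100)
@assert coldot(x, y) ≈ dot(x, y)
```

This makes one BLAS call per column, so it is likely to pay off only when the columns are reasonably long.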