Hi guys,
I’m computing a derivative from a summation. I thought vectorizing the function would be faster than a hand-written loop, but that isn’t the case.
using BenchmarkTools, Random
ωc = 0.05
chi = 0.01
const k = ωc^2
const couple = sqrt(2/ωc^3) * chi
const coeff = reshape([1.359e+00, 8.143e-01, 3.142e+00, 7.304e-02, 3.391e+00,
                       3.142e+00, 3.099e-01, 2.009e+00, -3.142e+00, 1.714e-02, 4.847e+00,
                       3.142e+00, 4.495e-03, 6.291e+00, 3.142e+00], 3, 5)
function fitted(x::T) where T<:Real
    u = 0.0
    for i in 1:5
        u += coeff[1, i] * sin(coeff[2, i] * x + coeff[3, i])
    end
    return u
end
function ∂U∂q1(x::AbstractVector{T}) where T<:Real
    ∑μ = 0.0
    for i in 1:length(x)-1
        ∑μ += fitted(x[i])
    end
    return k * (couple * ∑μ + x[end])
end
function ∂U∂q2(x::AbstractVector{T}) where T<:Real
    ∑μ = sum(fitted.(view(x, 1:length(x)-1)))
    return k * (couple * ∑μ + x[end])
end
n = 100
x = Random.rand(n)
@btime ∂U∂q1(x)
@btime ∂U∂q2(x)
Output for n = 10
254.728 ns (1 allocation: 16 bytes)
302.381 ns (3 allocations: 224 bytes)
Output for n = 100
2.711 μs (1 allocation: 16 bytes)
2.767 μs (3 allocations: 960 bytes)
Output for n = 100
5.567 μs (1 allocation: 16 bytes)
5.667 μs (3 allocations: 1.83 KiB)
So the first version with a hand-written loop is both faster (very slightly maybe) and more memory efficient.
Is there a way to make sum faster and also avoid the additional allocations? Or is there an even better way of implementing the summation?
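For reference, here is a minimal sketch of one thing that could be tried. The extra allocations in ∂U∂q2 most likely come from fitted.(...) building a temporary array before sum runs. Base's sum(f, itr) method applies the function while accumulating, so no temporary is needed. The name ∂U∂q3 is hypothetical (it is not in the code above), and the constants are simply copied from the post so the block runs on its own.

```julia
# Constants and fitted() copied from the post so this block is self-contained.
ωc = 0.05
chi = 0.01
const k = ωc^2
const couple = sqrt(2 / ωc^3) * chi
const coeff = reshape([1.359e+00, 8.143e-01, 3.142e+00, 7.304e-02, 3.391e+00,
                       3.142e+00, 3.099e-01, 2.009e+00, -3.142e+00, 1.714e-02, 4.847e+00,
                       3.142e+00, 4.495e-03, 6.291e+00, 3.142e+00], 3, 5)

function fitted(x::Real)
    u = 0.0
    for i in 1:5
        u += coeff[1, i] * sin(coeff[2, i] * x + coeff[3, i])
    end
    return u
end

# Hypothetical third variant: sum(f, itr) calls `fitted` on each element of
# the view while accumulating, so no intermediate array is materialized.
function ∂U∂q3(x::AbstractVector{<:Real})
    ∑μ = sum(fitted, view(x, 1:length(x)-1))
    return k * (couple * ∑μ + x[end])
end

x = rand(100)
∂U∂q3(x)
```

Two caveats. First, sum may add the terms in a different order than the explicit loop, so the two results can differ in the last few bits. Second, when timing with BenchmarkTools it is usually recommended to interpolate global variables, as in `@btime ∂U∂q3($x)`; the single 16-byte allocation reported for the loop version is probably an artifact of benchmarking with a non-constant global x rather than something the loop itself does.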