Channel: First steps - JuliaLang

Sum isn't faster than a loop?

Hi guys,

I’m computing a derivative that involves a summation. I expected a vectorized version (broadcasting followed by sum) to be faster than a hand-written loop, but that isn’t the case.

using BenchmarkTools, Random

ωc = 0.05
chi = 0.01
const k = ωc^2
const couple = sqrt(2/ωc^3) * chi
const coeff = reshape([1.359e+00, 8.143e-01, 3.142e+00, 7.304e-02, 3.391e+00,
    3.142e+00, 3.099e-01, 2.009e+00, -3.142e+00, 1.714e-02, 4.847e+00,
    3.142e+00, 4.495e-03, 6.291e+00, 3.142e+00], 3, 5)

function fitted(x::T) where T<:Real
    u = 0.0
    for i in 1:5
        u += coeff[1, i] * sin(coeff[2, i] * x + coeff[3, i])
    end
    return u
end

function ∂U∂q1(x::AbstractVector{T}) where T<:Real
    ∑μ = 0.0
    for i in 1:length(x)-1
        ∑μ += fitted(x[i])
    end
    return k * (couple * ∑μ + x[end])
end

function ∂U∂q2(x::AbstractVector{T}) where T<:Real
    ∑μ = sum(fitted.(view(x, 1:length(x)-1)))
    return k * (couple * ∑μ + x[end])
end


n = 100
x = Random.rand(n)
@btime ∂U∂q1(x)
@btime ∂U∂q2(x)

Output for n = 10

  254.728 ns (1 allocation: 16 bytes)
  302.381 ns (3 allocations: 224 bytes)

Output for n = 100

  2.711 μs (1 allocation: 16 bytes)
  2.767 μs (3 allocations: 960 bytes)

Output for n = 200

  5.567 μs (1 allocation: 16 bytes)
  5.667 μs (3 allocations: 1.83 KiB)

So the first version, with the hand-written loop, is both faster (though only very slightly) and more memory-efficient.

Is there a way to make sum faster and also avoid the additional allocations? Or is there an even better way to implement the summation?
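
One possible direction, as a minimal sketch rather than a definitive answer: the broadcast `fitted.(...)` builds a temporary array before `sum` reduces it, which would explain why the allocations grow with n. Passing the function to `sum` directly applies it element by element during the reduction, so no temporary array is needed. The name `∂U∂q3` is hypothetical (it is not in the code above), and the sketch assumes a fresh Julia session so that the `const` definitions do not clash with the earlier ones.

```julia
# Same constants and coefficients as in the post, all declared const.
const ωc = 0.05
const chi = 0.01
const k = ωc^2
const couple = sqrt(2 / ωc^3) * chi
const coeff = reshape([1.359e+00, 8.143e-01, 3.142e+00, 7.304e-02, 3.391e+00,
    3.142e+00, 3.099e-01, 2.009e+00, -3.142e+00, 1.714e-02, 4.847e+00,
    3.142e+00, 4.495e-03, 6.291e+00, 3.142e+00], 3, 5)

# Sum of five sine terms, as in the post.
function fitted(x::Real)
    u = 0.0
    for i in 1:5
        u += coeff[1, i] * sin(coeff[2, i] * x + coeff[3, i])
    end
    return u
end

# Hypothetical variant: sum(f, itr) calls `fitted` on each element while
# reducing, instead of materializing the array that `fitted.(...)` creates.
function ∂U∂q3(x::AbstractVector{<:Real})
    ∑μ = sum(fitted, view(x, 1:length(x)-1))
    return k * (couple * ∑μ + x[end])
end

x = rand(100)
∂U∂q3(x)
```

Because `sum` may add the terms in a different order than the plain loop, the result can differ from the loop version in the last few bits. When timing this, it may also be worth interpolating the argument, as in `@btime ∂U∂q3($x)`: BenchmarkTools recommends this for non-constant globals, and the single 16-byte allocation reported for the loop version is likely a benchmarking artifact of not doing so.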

12 posts - 7 participants
