Quantcast
Channel: First steps - JuliaLang
Viewing all articles
Browse latest Browse all 2795

Inefficiency of add_edge! in SimpleWeightedGraphs

$
0
0

I have a master graph with 250,000 nodes, and a variable called homes which is a Vector of Vectors of size 4, with values between 1 and 100,000. The idea is that many vector of size 4 has a corresponding dense graph, which is embeded in the master graph. The master graph can either be a SimpleGraph or a SimpleWeightedGraph. In the former case, the code runs super fast in about 0.15 sec. In the latter case (with SimpleWeightedGraph), the code takes 24 sec. And yet, when there are only a low number of lists of 4 elements, the SimpleWeightGraphs take only about twice as long as the SimpleGraphs. My conclusion: add_edge! for SimpleWeightedGraph becomes less and less efficient as more and more edges are added. If this is not the case, I have no idea why the code becomes so inefficient. Needless to say, I cannot accept this for my use case.

function ccreateHomeGraph(nb_nodes)
    #mgh = SimpleWeightedGraph(nb_nodes)
    mgh = SimpleGraph(nb_nodes)   # <<<< Should be weighted graph?
    homes = rand(1:100000, 100000)
    homes = reshape(homes, div(length(homes), 4), 4)
    homes = [homes[i,:] for i in 1:size(homes,1)]
    @show size(homes)
    #homes = [[1,2,3,4,5] for i in 1:100000]
    weight = 0.3  # need more generality
    for r in 1:length(homes)
        person_ids = homes[r]
        sz = length(person_ids)
        @show r, sz
        for i in 1:sz
            for j in i+1:sz
                #add_edge!(mgh, person_ids[i]+1, person_ids[j]+1, weight)
                add_edge!(mgh, person_ids[i]+1, person_ids[j]+1)
            end
        end
    end
    return mgh
end
# SimpleGraph about 2x speed of Weighted SimpleGraphs
@time graph = ccreateHomeGraph(250000)

Here is the add_edge! function that is called. Does it become more and more inefficient the more edges are added? If so, why? This function is found in simplegraphs.jl:

# add_edge! will overwrite weights.
function add_edge!(g::SimpleWeightedGraph, e::SimpleWeightedGraphEdge)
    T = eltype(g)
    U = weighttype(g)
    s_, d_, w = Tuple(e)

    if w == zero(U)
        @warn "Note: adding edges with a zero weight to this graph type has no effect." maxlog=1 _id=:swg_add_edge_zero
        return false
    end

    s = T(s_)
    d = T(d_)
    (s in vertices(g) && d in vertices(g)) || return false
    @inbounds g.weights[d, s] = w
    @inbounds g.weights[s, d] = w
    return true
end

After reading more threads on this subject, I recoded to work with MetaGraphs, which runs the full problem in about 40 sec, although it took about 3 seconds with SimpleGraphs. So scaling as a function of the number of edges is improved with MetaGraphs; however, the time is substantially more expensive than with SimpleGraphs. Looking at !add_edge() in MetaGraphs.jl, I fell there is lots of room for improvement if the library contained optimized versions of the add_edge! function. For example, in many use cases, mine, in particular, I know that my end nodes are in the vertex set, so there is no need to check. How expensive is the check? Is it an O(1) check? Retrieving a property from each edge also strikes me as expensive. Just keeping a dictionary{Edge, Property} strikes me as far more efficient. Perhaps that is what MetaGraphs does already?

8 posts - 2 participants

Read full topic


Viewing all articles
Browse latest Browse all 2795

Trending Articles