Embedding Graph in Euclidean Space

Question

I have a total undirected graph where nodes represent points on points on a plane, and edges are approximate euclidean distances between the points. I would like to "embed" this graph in a two dimensional space. That is, I want to convert each vertex to an (x,y) position tuple so that for any two two vertices v and w, the edge (v,w) has weight close to dist(v,w).

For example, if I had the graph with nodes A, B, C, and D and edges with weights (A,B): 20; (A,C): 22; (A,D): 26; (B, C): 30; (B, D): 20, (C, D): 19, then you could assign the points A: (0,0); B: (10, 0); C: (0, 10); D: (10, 10). Clearly this is imperfect, but it is a reasonable approximation.

I don't care about getting the best possible solution, I just want a reasonable one in a reasonable amount of time.

(In case you want the motivation for this problem. I have a physical system where I have noisy measurements of distances from all pairs of points. Distance measurements are noisy, but tend to be within a factor of two of the true value. I have made all of these measurements, and now have a graph with several thousand nodes, and several million edges, and want to place the points on a plane.)

Darren Engwirda · Accepted Answer · 2013-01-15 00:50:06Z

You may be able to adapt the force-based graph drawing algorithm for your needs.

This algorithm attempts to find a good layout for an undirected graph G(V,E) by treating each vertex in V as a Cartesian point and each edge in E as a linear spring. Additionally, a pair-wise repulsive force (i.e. Coulomb's law) is calculated between vertices globally - this prevents the clustering of vertices in Cartesian space that are non-adjacent in G(V,E).

In your case you could set the equilibrium length of the springs equal to your edge weights - this should give a layout with pair-wise Euclidean vertex distances close to your edge weights.

The algorithm updates an initial distribution (possibly random) in a pseudo-time stepping fashion based on the sum of forces at each vertex. The algorithm terminates when an approximate steady-state is reached. A simplified pseudo-code:

while(not converged)
    for i = vertices in V
        F(i) = sum of spring + repulsive forces on ith vertex
    endfor
    Update vertex positions based on force vector F 
    if (vertex positions not changing much)
        converged = true
    endif
endwhile

There are a number of optimisations that can be applied to reduce the complexity of the algorithm. For instance, a spatial index (such as a quadtree) can be used to allow for efficient calculation of an approximate repulsive force between "near-by" vertices, rather than the slow global calculation. It's also possible to use multi-level graph agglomeration techniques to improve convergence and optimality.

Finally, note that there are several good libraries for graph drawing that implement optimised versions of this algorithm - you might want to check out Graphviz for instance.

amit · Accepted Answer · 2013-01-14 22:51:59Z

For starters, I think I'd go for a heuristic search approach.

You actually want to find a set of point p1,p2,...,p_n that minimizes the function:

f(X) = Sum (|dist(p_i,p_j) - weight(n_i,n_j)|) [for each i,j ]

The problem can be heuristically solved by some algorithms including Hill Climbing and Genetic Algorithms.

I personally like Hill Climbing, and the approach is as follows:

best <- [(0,0),(0,0),...,(0,0)]
while there is still time:
    S <- random initialized vector of points
    flag <- true
    while (flag):
        flag <- false
        candidates <- next(S) (*)
        S <- X in candidates such that f(X) <= f(Y) for each X in candidates (**)
        if f(S) was improved:
            flag <- true
    if f(S) <= f(best):
        best <- S
return best

(*) next() generates a list of candidates. It can utilize information about gradient of function (and basically decay into something similar to gradient descent) for example, or sample a few random 'directions' and put them as candidates (all in the multi-dimensional vector, where each point is a dimension).
(**) In here, you basically chose the "best" candidate, and store it in S, so you will continue with it in next iteration.

Note, the algorithm is any-time, so it is expected to get better the more time you have to give it. This behavior is achieved by the random initialization of starting point - which is likely to change the ending result, and by the random selection of points for candidates.

Collectives™ on Stack Overflow

Embedding Graph in Euclidean Space

2 Answers 2

Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related