6

I have a total undirected graph where nodes represent points on points on a plane, and edges are approximate euclidean distances between the points. I would like to "embed" this graph in a two dimensional space. That is, I want to convert each vertex to an (x,y) position tuple so that for any two two vertices v and w, the edge (v,w) has weight close to dist(v,w).

For example, if I had the graph with nodes A, B, C, and D and edges with weights (A,B): 20; (A,C): 22; (A,D): 26; (B, C): 30; (B, D): 20, (C, D): 19, then you could assign the points A: (0,0); B: (10, 0); C: (0, 10); D: (10, 10). Clearly this is imperfect, but it is a reasonable approximation.

I don't care about getting the best possible solution, I just want a reasonable one in a reasonable amount of time.

(In case you want the motivation for this problem. I have a physical system where I have noisy measurements of distances from all pairs of points. Distance measurements are noisy, but tend to be within a factor of two of the true value. I have made all of these measurements, and now have a graph with several thousand nodes, and several million edges, and want to place the points on a plane.)

2 Answers 2

3

You may be able to adapt the force-based graph drawing algorithm for your needs.

This algorithm attempts to find a good layout for an undirected graph G(V,E) by treating each vertex in V as a Cartesian point and each edge in E as a linear spring. Additionally, a pair-wise repulsive force (i.e. Coulomb's law) is calculated between vertices globally - this prevents the clustering of vertices in Cartesian space that are non-adjacent in G(V,E).

In your case you could set the equilibrium length of the springs equal to your edge weights - this should give a layout with pair-wise Euclidean vertex distances close to your edge weights.

The algorithm updates an initial distribution (possibly random) in a pseudo-time stepping fashion based on the sum of forces at each vertex. The algorithm terminates when an approximate steady-state is reached. A simplified pseudo-code:

while(not converged)
    for i = vertices in V
        F(i) = sum of spring + repulsive forces on ith vertex
    endfor
    Update vertex positions based on force vector F 
    if (vertex positions not changing much)
        converged = true
    endif
endwhile

There are a number of optimisations that can be applied to reduce the complexity of the algorithm. For instance, a spatial index (such as a quadtree) can be used to allow for efficient calculation of an approximate repulsive force between "near-by" vertices, rather than the slow global calculation. It's also possible to use multi-level graph agglomeration techniques to improve convergence and optimality.

Finally, note that there are several good libraries for graph drawing that implement optimised versions of this algorithm - you might want to check out Graphviz for instance.

Sign up to request clarification or add additional context in comments.

Comments

1

For starters, I think I'd go for a heuristic search approach.

You actually want to find a set of point p1,p2,...,p_n that minimizes the function:

f(X) = Sum (|dist(p_i,p_j) - weight(n_i,n_j)|) [for each i,j ]

The problem can be heuristically solved by some algorithms including Hill Climbing and Genetic Algorithms.

I personally like Hill Climbing, and the approach is as follows:

best <- [(0,0),(0,0),...,(0,0)]
while there is still time:
    S <- random initialized vector of points
    flag <- true
    while (flag):
        flag <- false
        candidates <- next(S) (*)
        S <- X in candidates such that f(X) <= f(Y) for each X in candidates (**)
        if f(S) was improved:
            flag <- true
    if f(S) <= f(best):
        best <- S
return best

(*) next() generates a list of candidates. It can utilize information about gradient of function (and basically decay into something similar to gradient descent) for example, or sample a few random 'directions' and put them as candidates (all in the multi-dimensional vector, where each point is a dimension).
(**) In here, you basically chose the "best" candidate, and store it in S, so you will continue with it in next iteration.

Note, the algorithm is any-time, so it is expected to get better the more time you have to give it. This behavior is achieved by the random initialization of starting point - which is likely to change the ending result, and by the random selection of points for candidates.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.