4

I'm looking for an efficient connection grouping (I'm not sure this is proper name..) algorithm or implementation by python.

For example, I have this nested list:

connection_data = [
   ...:     ["A", "B", "C"],
   ...:     ["B", "D"],
   ...:     ["A", "C"],
   ...:     ["E", "F"],
   ...:     ["C", "D"],
   ...:     ]

This data means each list in the nested list shows connections. For example, the first connection ["A", "B", "C"] means A and B and C are having connection each other. The nested list has multiple connections information.

I would like to calculate connection groupings from a nested list. For example, when I have the upper connection_data, I would like to get

grouped_connection = [
   ...:     ["A", "B", "C", "D"],
   ...:     ["E", "F"],
   ...:     ]

Because, A, B, C, D have connections in these connection data in the connection_data: ["A", "B", "C"], ["B", "D"], ["A", "C"], ["C", "D"], E and F have connection by ["E", "F"].

To summarize my questions:

  1. What is this type of problem called in general?
  2. I think I can implement a many for-loop based solver. But is there any efficient algorithm or implementation in python for this type of problem?
1

2 Answers 2

5

Networkx provides an implementation of a union find datastructure [1] [2] that solves this problem efficiently:

from networkx.utils.union_find import UnionFind

groups = [
    ["A", "B", "C"],
    ["B", "D"],
    ["A", "C"],
    ["E", "F"],
    ["C", "D"],
]

ds = UnionFind()
for gp in groups:
  ds.union(*gp)
for s in ds.to_sets():
  print(s)

# {'B', 'C', 'D', 'A'}
# {'E', 'F'}
Sign up to request clarification or add additional context in comments.

Comments

2

Note: This answer is actually slower than the union-find algorithm given by hilberts_drinking_problem. If all of the input connections are pairs (i.e. of size 2), the runtime of both algorithms is essentially the same. However, this is not the case for OP's question. See the comments for more details.


You can construct a graph where the nodes are lettered A, B, C ..., and place an undirected, unweighted edge between two nodes if the initial groupings dictate that they should be in the same group. Then, the final groups are the connected components of the constructed graph. (This is the closest thing to what your problem is generally called.)

This can be done using a graph traversal algorithm, such as BFS or DFS, but if you don't want to code this up by hand, networkx can take care of everything once you've made the graph. You need to do a little bit of preprocessing on the groupings since some of them are of size greater than two, but otherwise networkx is well-suited for this problem:

from itertools import combinations
import networkx as nx

groups = [
    ["A", "B", "C"],
    ["B", "D"],
    ["A", "C"],
    ["E", "F"],
    ["C", "D"],
]

G = nx.Graph()

# Handle initial groupings of size greater than two by iterating over
# each possible pair of letters in the group.
for group in groups:
    G.add_edges_from(combinations(sorted(group), 2))

# Result should look something like [['B', 'C', 'A', 'D'], ['F', 'E']],
# but the ordering may be nondeterministic.
print(list(list(group) for group in nx.connected_components(G)))

9 Comments

I suggest from itertools import combinations then you can replace your triple loop with for group in groups: G.add_edges_from(combinations(sorted(group), 2))
Also note that you can write nx.connected_components directly without knowing that it's in nx.algorithms.components. This has three advantages: you don't need to know where it is exactly; it's shorter and more readable; if in a future version, networkx is organized differently, your code won't break.
Also, personally when I read answers on StackOverflow, I really like to be able to see the output. So in this case I suggest adding the result # [['E', 'F'], ['B', 'A', 'C', 'D']] just below the final print statement.
I think union find is the way to approach this problem, although this reference provides some relevant discussion.
@Stef Revised answer.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.