I am trying to create an undirected graph from a DataFrame, formatted_unique_edges. The 'weight' column will be used only for edge colouring in a downstream visualisation with plotly:

    source      target      weight
0   protein_2   protein_3   3
1   protein_2   protein_6   2
2   protein_3   protein_6   2
3   protein_2   protein_4   2
4   protein_2   protein_5   2
5   protein_3   protein_4   2
6   protein_3   protein_5   2
7   protein_4   protein_5   2
8   protein_4   protein_6   1
9   protein_5   protein_6   1

The first lines of the linked plotly example, which I am trying to emulate, are:

G = nx.random_geometric_graph(200, 0.125)
edge_x = []
edge_y = []
for edge in G.edges():
    x0, y0 = G.nodes[edge[0]]['pos']
    x1, y1 = G.nodes[edge[1]]['pos']
    edge_x.append(x0)
    edge_x.append(x1)
    edge_x.append(None)
    edge_y.append(y0)
    edge_y.append(y1)
    edge_y.append(None)

I first convert formatted_unique_edges to a Graph, then try to emulate the code above, with some diagnostic print statements:

G = nx.from_pandas_edgelist(formatted_unique_edges, 
                            edge_attr=True) 
#also tried G = nx.random_geometric_graph(200, 0.125) as per plotly example

edge_x = []
edge_y = []
for edge in G.edges():
    print(edge) #('proteinN', 'proteinM')
    print(G.nodes[edge[0]]) #{}
    print(G.nodes[edge[1]]) #{}
    x0, y0 = G.nodes[edge[0]]['pos']
    #####
    #THROWS KeyError: 'pos' if G is from formatted_unique_edges
    #####
    #prints {'pos': [float, float]} if G is from nx.random_geometric_graph
    x1, y1 = G.nodes[edge[1]]['pos']
    edge_x.append(x0)
    edge_x.append(x1)
    edge_x.append(None)
    edge_y.append(y0)
    edge_y.append(y1)
    edge_y.append(None)

As stated in the comments, I am getting a KeyError from G.nodes[edge[0]]['pos']. I had a look in the Spyder variable explorer, and G.nodes._nodes from nx.random_geometric_graph has the format:

{0   : {'pos' : [pos_float, pos_float]}, 
 1   : {'pos' : [pos_float, pos_float]},
 ...
 199 : {'pos' : [pos_float, pos_float]}
}

Whereas G.nodes._nodes from formatted_unique_edges has the format:

{'protein_2' : {},
 'protein_3' : {},
 'protein_4' : {},
 'protein_5' : {},
 'protein_6' : {}}

This all suggests I am making my Graph object from formatted_unique_edges incorrectly with nx.from_pandas_edgelist - can someone advise how I should be doing it?

Thanks! Tim

1 Answer

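You did not generate a layout for your graph. random_geometric_graph does more than build a graph: it also places each node at a random position and stores the coordinates in the 'pos' node attribute. from_pandas_edgelist only creates nodes and edges, so you have to compute the positions yourself, for example with a layout function.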
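
The difference can be checked directly. The following is a minimal sketch, assuming a small two-row stand-in for formatted_unique_edges (the variable names rgg, edges, rgg_pos and g_pos are illustrative only):

```python
import networkx as nx
import pandas as pd

# Graph from the plotly example: node positions are created with the graph.
rgg = nx.random_geometric_graph(20, 0.3, seed=1)
rgg_pos = nx.get_node_attributes(rgg, "pos")

# Graph from an edge list: a shortened, illustrative stand-in for the
# DataFrame in the question.
edges = pd.DataFrame(
    {
        "source": ["protein_2", "protein_2"],
        "target": ["protein_3", "protein_6"],
        "weight": [3, 2],
    }
)
G = nx.from_pandas_edgelist(edges, edge_attr=True)
g_pos = nx.get_node_attributes(G, "pos")

print(len(rgg_pos) == rgg.number_of_nodes())  # every generated node has 'pos'
print(g_pos)  # no node has 'pos' yet
```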

import networkx as nx

# Convert your DataFrame to a graph
G = nx.from_pandas_edgelist(formatted_unique_edges, edge_attr=True)

# Generate the layout and set the 'pos' attribute
pos = nx.drawing.layout.spring_layout(G)
nx.set_node_attributes(G, pos, 'pos')

edge_x = []
edge_y = []
for edge in G.edges():
    x0, y0 = G.nodes[edge[0]]['pos']
    x1, y1 = G.nodes[edge[1]]['pos']
    edge_x.append(x0)
    edge_x.append(x1)
    edge_x.append(None)
    edge_y.append(y0)
    edge_y.append(y1)
    edge_y.append(None)

Output:

>>> G.nodes._nodes
{'protein_2': {'pos': array([0.5830424, 0.0301945])},
 'protein_3': {'pos': array([-0.42158911,  0.33654032])},
 'protein_6': {'pos': array([0.30069049, 1.        ])},
 'protein_4': {'pos': array([-0.71990583, -0.51877307])},
 'protein_5': {'pos': array([ 0.25776204, -0.84796174])}}
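
Since the 'weight' column is meant for edge colouring, one possible follow-up is to collect the edge coordinates per weight rather than in a single pair of lists. This is only a sketch under some assumptions: the DataFrame is rebuilt from the table in the question, seed=42 is an arbitrary choice to make the layout reproducible, and edges_by_weight is a hypothetical helper name.

```python
from collections import defaultdict

import networkx as nx
import pandas as pd

# Rebuilt from the table in the question.
formatted_unique_edges = pd.DataFrame(
    {
        "source": ["protein_2", "protein_2", "protein_3", "protein_2", "protein_2",
                   "protein_3", "protein_3", "protein_4", "protein_4", "protein_5"],
        "target": ["protein_3", "protein_6", "protein_6", "protein_4", "protein_5",
                   "protein_4", "protein_5", "protein_5", "protein_6", "protein_6"],
        "weight": [3, 2, 2, 2, 2, 2, 2, 2, 1, 1],
    }
)

G = nx.from_pandas_edgelist(formatted_unique_edges, edge_attr=True)

# Compute a layout and store it as the 'pos' node attribute.
# The seed is arbitrary; it only makes the positions repeatable.
pos = nx.spring_layout(G, seed=42)
nx.set_node_attributes(G, pos, "pos")

# Group edge coordinates by weight, keeping the None separators
# used in the plotly example to break the line between edges.
edges_by_weight = defaultdict(lambda: {"x": [], "y": []})
for u, v, weight in G.edges(data="weight"):
    x0, y0 = G.nodes[u]["pos"]
    x1, y1 = G.nodes[v]["pos"]
    edges_by_weight[weight]["x"] += [x0, x1, None]
    edges_by_weight[weight]["y"] += [y0, y1, None]
```

Each entry of edges_by_weight could then feed its own line trace in plotly with a colour chosen per weight; the colour mapping itself is left out here.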