
I'm taking my first steps in implementing graph theory concepts in Python using the networkx library.

I've loaded an xlsx file with two columns into a pandas DataFrame. The columns contain users who liked each other (for example, in some social network).

Then I created the graph structure, calculated the main measures (degree, PageRank, betweenness), and drew the plot.

Here is the code:

import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt

users=pd.read_excel('users.xlsx')
users.head()

user1 user2
Bob   Adam  
Adam  John
John  Bob

g=nx.Graph()
g.add_edges_from(zip(users.user1, users.user2))

cc=sorted(nx.connected_components(g),key=len, reverse=True)
G = g.subgraph(cc[0])

centrality = pd.DataFrame({'user': list(G.nodes())})

# dict(...) keeps the lookup working when nx.degree returns a view rather than a dict
centrality['degree'] = centrality.user.map(dict(nx.degree(G)))
centrality['pagerank'] = centrality.user.map(nx.pagerank(G))
centrality['betweenness'] = centrality.user.map(nx.betweenness_centrality(G))

nx.draw(G)
# save before show(); once the window is closed the figure is empty
plt.savefig("path.png")
plt.show()

So far, everything works fine. But my goal is to create more complex structures. My idea is to do something like LinkedIn: for example the user1 connected to user2 because of the same working place.

In other words, I think I should somehow add a third column to the dataframe and pass it into the graph. But when I try to do that using zip (as for two columns), add_edges_from raises an error saying it can only handle two parameters.

Can you please help me understand how to build a graph from a structure like this:

User1   User2   Company
Bob     Adam     Vilco
Adam    John     Darrel
John    Bob      Vilco
  • please give us the code needed to reproduce your error Commented Oct 27, 2016 at 15:17
  • g=nx.Graph() a=g.add_edges_from(zip(users.user1,users.user2, users.user3)). The error appears once I add the third column to the users dataframe and to the zip call. Commented Oct 27, 2016 at 15:21

1 Answer


The problem is that you are trying to generate one edge between three elements.

The add_edges_from() function takes a list of tuples and creates edges between the two elements of each tuple. For example

g = nx.Graph()
g.add_edges_from([(1,2), (3,4)])

would generate two edges: one between nodes 1 and 2 and one between nodes 3 and 4.

The zip function, as called in your code over the columns users.user1 and users.user2, yields exactly such tuples (to be precise, it returns a zip object rather than a list, but add_edges_from accepts any iterable of edges, so it behaves the same way here). In your example, the list would look like this:

[('Bob', 'Adam'), ('Adam', 'John'), ('John', 'Bob')]

This is no problem for add_edges_from. It just generates an edge between both names of each tuple.

As you have stated in a comment, you are now trying to execute

g.add_edges_from(zip(users.user1,users.user2, users.user3))

This however generates a "list" of triples:

[('Bob', 'Adam', 'Vilco'), ('Adam', 'John', 'Darrel'), ('John', 'Bob', 'Vilco')]

This is what causes the problem. We cannot generate one edge between three elements; only between two.
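To see the difference concretely, here is a minimal sketch that feeds both shapes to add_edges_from. The names are taken from the question; the exact exception type and message are not assumed, because they may differ between networkx versions. As far as I know, networkx expects the third element of a 3-tuple to be a dict of edge data, which is why a plain string is rejected.

```python
import networkx as nx

pairs = [("Bob", "Adam"), ("Adam", "John"), ("John", "Bob")]
triples = [("Bob", "Adam", "Vilco"),
           ("Adam", "John", "Darrel"),
           ("John", "Bob", "Vilco")]

# Pairs are fine: each tuple becomes one edge.
g_pairs = nx.Graph()
g_pairs.add_edges_from(pairs)

# Triples whose third element is a plain string are rejected.
g_triples = nx.Graph()
try:
    g_triples.add_edges_from(triples)
    failed = False
except Exception as exc:  # deliberately broad: the type is version-dependent
    failed = True
    print(type(exc).__name__, exc)
```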

One possibility to achieve what you are looking for:

for example the user1 connected to user2 because of the same working place

would be to add the name of the working place to the edge between the two users as an attribute:

g.add_edge('Bob', 'Adam', working_place='Vilco')
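Applied to every row, the same idea might look like the sketch below. The DataFrame is built inline as a stand-in for users.xlsx, the column name user3 comes from the comment under the question, and working_place is just an attribute name chosen for illustration.

```python
import networkx as nx
import pandas as pd

# Stand-in for pd.read_excel('users.xlsx'); column names are assumptions.
users = pd.DataFrame({
    "user1": ["Bob", "Adam", "John"],
    "user2": ["Adam", "John", "Bob"],
    "user3": ["Vilco", "Darrel", "Vilco"],
})

g = nx.Graph()
# zip still produces the triples; we unpack them ourselves and
# attach the third value to the edge as an attribute.
for u, v, company in zip(users.user1, users.user2, users.user3):
    g.add_edge(u, v, working_place=company)

# The attribute can later be used to filter edges, e.g. all Vilco links.
vilco_edges = [(u, v) for u, v, data in g.edges(data=True)
               if data["working_place"] == "Vilco"]
```

The attribute is stored on the edge itself, so g['Bob']['Adam']['working_place'] gives back the company for that pair.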

2 Comments

Cool, thanks for the explanation, but how can it be done for the whole dataframe? Should we use zip as well or not?
It depends on what you want to do with the graph afterwards. Does the third value (working place) have to be its own node in the graph? Or is it enough to label the edges between users with the working place? In any case, you can use zip to get all triples and to later iterate over them to perform whatever you want to do (add edges between the users, label the edge with the working place). That's no problem. Just trying to add an edge directly over all three elements is not possible.
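A rough sketch of both options mentioned above, assuming the column names User1, User2 and Company from the table in the question. Option 1 uses nx.from_pandas_edgelist, which is available in networkx 2.0 and later; the kind node attribute in option 2 is a hypothetical label added only to tell users and companies apart.

```python
import networkx as nx
import pandas as pd

# Stand-in for the Excel file; column names follow the question's table.
users = pd.DataFrame({
    "User1": ["Bob", "Adam", "John"],
    "User2": ["Adam", "John", "Bob"],
    "Company": ["Vilco", "Darrel", "Vilco"],
})

# Option 1: keep user-user edges and label each edge with the company.
labelled = nx.from_pandas_edgelist(
    users, source="User1", target="User2", edge_attr="Company"
)

# Option 2: make every company its own node and link both users to it.
with_company_nodes = nx.Graph()
for u, v, company in zip(users.User1, users.User2, users.Company):
    with_company_nodes.add_node(u, kind="user")
    with_company_nodes.add_node(v, kind="user")
    with_company_nodes.add_node(company, kind="company")
    with_company_nodes.add_edge(u, company)
    with_company_nodes.add_edge(v, company)
```

Option 1 keeps the graph purely about users, so measures such as degree or betweenness still describe user-to-user links. Option 2 changes what those measures mean, because companies now take part in the paths.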
