I'm taking my first steps in implementing graph theory concepts in Python using the networkx library.
I've loaded an xlsx file with two columns into a pandas DataFrame. Each row is a pair of users who liked each other (for example, in some social network).
I then built the graph, calculated the main measures (degree, PageRank, betweenness), and drew the plot.
Here is the code:
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
users=pd.read_excel('users.xlsx')
users.head()
user1 user2
Bob Adam
Adam John
John Bob
g=nx.Graph()
g.add_edges_from(zip(users.user1, users.user2))  # returns None, so there is nothing to assign
cc=sorted(nx.connected_components(g),key=len, reverse=True)
G = g.subgraph(cc[0])
centrality = pd.DataFrame({'user':G.nodes()})
centrality['degree'] = centrality.user.map(dict(G.degree()))  # dict() so map() can look up each user
centrality['pagerank'] = centrality.user.map(nx.pagerank(G))
centrality['betweenness'] = centrality.user.map(nx.betweenness_centrality(G))
nx.draw(G)
plt.savefig("path.png")  # save before show(); saving afterwards can write an empty figure
plt.show()
So far, everything works fine. But my goal is to create more complex structures. For example, I'd like to do something similar to LinkedIn, where user1 is connected to user2 because they share the same workplace.
In other words, I think I should somehow add a third column to the DataFrame and include it in the graph. But when I try to do that using zip (as with the two columns), add_edges_from raises an error saying that it can only handle two parameters.
Could you please help me understand how to build a graph from a structure like this:
User1 User2 Company
Bob Adam Vilco
Adam John Darrel
John Bob Vilco
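
For reference, here is a minimal sketch of what I think the target graph could look like, assuming the company is stored as an edge attribute. The DataFrame is built inline to mirror the table above instead of being read from the xlsx file, and the attribute names `company` / `Company` are just my own choices. As far as I understand, add_edges_from accepts 3-tuples only when the third element is a dict of attributes, and nx.from_pandas_edgelist can take the column name directly.

```python
import networkx as nx
import pandas as pd

# Sample data mirroring the table above (built inline, not read from Excel).
users = pd.DataFrame({
    "User1": ["Bob", "Adam", "John"],
    "User2": ["Adam", "John", "Bob"],
    "Company": ["Vilco", "Darrel", "Vilco"],
})

# Option 1: pass each edge as (u, v, attribute_dict).
# The third element has to be a dict, not a bare string.
g1 = nx.Graph()
g1.add_edges_from(
    (u, v, {"company": c})
    for u, v, c in zip(users.User1, users.User2, users.Company)
)

# Option 2: let networkx build the graph straight from the DataFrame.
# Here the attribute key is the column name, "Company".
g2 = nx.from_pandas_edgelist(
    users, source="User1", target="User2", edge_attr="Company"
)

# Read the attribute back from a single edge.
print(g1["Bob"]["Adam"]["company"])
print(g2.edges["Adam", "John"]["Company"])

# Collect all edges that belong to one company.
vilco_edges = [(u, v) for u, v, c in g1.edges(data="company") if c == "Vilco"]
print(vilco_edges)
```

If this is the right direction, the company could then be used for filtering or coloring edges, but I'm not sure whether an edge attribute is the best way to model it.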