Reading a networkx graph from a csv file with row and column header

Question

I have a CSV file that represents the adjacency matrix of a graph. However, both the first row and the first column of the file contain the node labels. How can I read this file into a networkx graph object? Is there a neat, Pythonic way to do it without hacking around?

My trial so far:

x = np.loadtxt('file.mtx', delimiter='\t', dtype=str)  # np.str was removed in NumPy 1.24; the builtin str works
row_headers = x[0,:]
col_headers = x[:,0]
A = x[1:, 1:]
A = np.array(A, dtype='int')

But of course this doesn't solve the problem since I need the labels for the nodes in the graph creation.

Example of the data:

Attribute,A,B,C
A,0,1,1
B,1,0,0
C,1,0,0

In the actual file the delimiter is a tab, not a comma.

So these labels are duplicated in the first row and the first column, which makes them redundant? You could just use pandas, which will use the labels as column names, and then build the graph.
– EdChum, Commented Jul 15, 2014 at 10:46

unutbu · Accepted Answer · 2014-07-15 12:10:03Z

You could read the data into a structured array. The labels can be obtained from x.dtype.names, and then the networkx graph can be generated using nx.from_numpy_matrix (removed in NetworkX 3.0, where nx.from_numpy_array is the replacement):

import numpy as np
import networkx as nx
import matplotlib.pyplot as plt

# read the first line to determine the number of columns
# open in text mode: in Python 3, splitting a bytes line on the str '\t' raises TypeError
with open('file.mtx', 'r') as f:
    ncols = len(next(f).split('\t'))

x = np.genfromtxt('file.mtx', delimiter='\t', dtype=None, names=True,
                  usecols=range(1,ncols) # skip the first column
                  )
labels = x.dtype.names

# y is a view of x, so it will not require much additional memory
y = x.view(dtype=('int', len(x.dtype)))

G = nx.from_numpy_matrix(y)
G = nx.relabel_nodes(G, dict(zip(range(ncols-1), labels)))

print(G.edges(data=True))
# [('A', 'C', {'weight': 1}), ('A', 'B', {'weight': 1})]

nx.from_numpy_matrix has a create_using parameter that specifies the type of networkx graph to create. For example,

G = nx.from_numpy_matrix(y, create_using=nx.DiGraph())

makes G a DiGraph.
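
For readers on NetworkX 3.x, where nx.from_numpy_matrix is no longer available, here is a minimal sketch of the same idea using nx.from_numpy_array. It is not the code from the answer above: it reads everything as strings with np.loadtxt and slices the labels off, rather than using a structured array. The inline sample data and the names data, raw, A, and D are assumptions made for illustration; with a real file you would pass the file name to np.loadtxt instead of the StringIO object.

```python
import io

import networkx as nx
import numpy as np

# Stand-in for the tab-delimited file from the question (assumed sample data).
data = "Attribute\tA\tB\tC\nA\t0\t1\t1\nB\t1\t0\t0\nC\t1\t0\t0\n"

# Read every cell as a string so the header row and label column survive.
raw = np.loadtxt(io.StringIO(data), delimiter="\t", dtype=str)

# The first row, minus the corner cell, holds the node labels.
labels = raw[0, 1:].tolist()

# Everything below the header and right of the label column is the matrix.
A = raw[1:, 1:].astype(int)

# Build an undirected graph on nodes 0..n-1, then swap in the labels.
G = nx.from_numpy_array(A)
G = nx.relabel_nodes(G, dict(enumerate(labels)))

# The same call with create_using gives a directed graph.
D = nx.from_numpy_array(A, create_using=nx.DiGraph)
D = nx.relabel_nodes(D, dict(enumerate(labels)))

print(sorted(G.edges()))
```

Reading the whole file as strings costs some extra memory compared with the structured-array view, which is unlikely to matter for small adjacency matrices.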

EdChum · Accepted Answer · 2014-07-15 11:21:49Z


This would work, not sure it is the best way:

In [23]:

import pandas as pd
import io
import networkx as nx
temp = """Attribute,A,B,C
A,0,1,1
B,1,0,0
C,1,0,0"""
# for your case just load the csv like you would do, use sep='\t'
df = pd.read_csv(io.StringIO(temp))
df
Out[23]:
  Attribute  A  B  C
0         A  0  1  1
1         B  1  0  0
2         C  1  0  0

In [39]:

G = nx.DiGraph()
for col in df:
    for x in list(df.loc[df[col] == 1,'Attribute']):
        G.add_edge(col,x)

G.edges()
Out[39]:
[('C', 'A'), ('B', 'A'), ('A', 'C'), ('A', 'B')]

In [40]:

nx.draw(G)
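
The explicit loop above can probably be avoided. As a sketch, assuming a NetworkX version that provides nx.from_pandas_adjacency (it is not part of the original answer), loading the file with index_col=0 turns the first column into the row index, so the DataFrame itself becomes a labelled adjacency matrix. The inline sample data is an assumption; for the real file, pass its path to pd.read_csv.

```python
import io

import networkx as nx
import pandas as pd

# Stand-in for the tab-delimited file from the question (assumed sample data).
data = "Attribute\tA\tB\tC\nA\t0\t1\t1\nB\t1\t0\t0\nC\t1\t0\t0\n"

# index_col=0 makes the first column the row labels; the header row
# supplies the column labels, so both axes carry the node names.
df = pd.read_csv(io.StringIO(data), sep="\t", index_col=0)

# Row and column labels must name the same set of nodes.
G = nx.from_pandas_adjacency(df, create_using=nx.DiGraph)

print(sorted(G.edges()))
```

Note that this treats a nonzero entry in row r, column c as an edge from r to c, whereas the loop above adds the edge from the column label to the row label. For a symmetric matrix such as the example, both give the same edge set.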

