Create an adjacency matrix in Python

Question

I want to load a CSV or text file of a signed (weighted) graph and create its adjacency matrix. The file contains three columns named "FromNodeId", "ToNodeId" and "Sign". The code I used is as follows:

G = nx.read_edgelist('soc-sign-epinions.txt', data = [('Sign', int)])
#print(G.edges(data = True))

A = nx.adjacency_matrix(G)
print(A.todense())

I encountered the following error:

ValueError: array is too big; `arr.size * arr.dtype.itemsize` is larger than the maximum possible size

How can I solve this problem? Please suggest a way to create the adjacency matrix.

Is nx the networkx library? If so, what does len(G.nodes()) print, i.e. how many nodes are in the graph? And what is len(G.edges())? Also, which line of your code above gives the error: the adjacency_matrix() call or the todense() call?
– GaryO, Commented Sep 8, 2018 at 11:11
nx is the networkx library. The graph has 131828 nodes and 711783 edges. The todense() call gives the error.
– mina, Commented Sep 8, 2018 at 11:37
Right, you'll never be able to make a dense matrix of that size (131k^2). Think of how many cells that would be! Keep it sparse.
– GaryO, Commented Sep 8, 2018 at 13:42

jan1892 · Accepted Answer · 2018-09-09 21:17:15Z

5

The memory needed to store a big matrix can easily get out of hand, which is why nx.adjacency_matrix(G) returns a "sparse matrix", which is stored more efficiently (exploiting the fact that most entries are 0).

Since your graph has about 131000 vertices, the whole dense adjacency matrix would need around 131000^2 * 8 bytes (assuming 64-bit integer entries in a NumPy array), which is about 137 GB, far more than your RAM. However, your graph has fewer than 0.01% of all possible edges; in other words, it is very sparse, and sparse matrices will work for you.
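The estimate can be checked with a few lines of arithmetic. The exact figure depends on the node count and on how many bytes each entry is assumed to take; the sketch below uses the node and edge counts reported in the comments and assumes 8-byte values and 4-byte indices, which is an assumption about the dtypes rather than something networkx guarantees.

```python
# Back-of-the-envelope memory estimate for the graph in the question.
n_nodes = 131_828   # reported in the comments
n_edges = 711_783   # reported in the comments

bytes_per_entry = 8  # assumption: dense array of 64-bit integers

# A dense matrix stores every cell, zero or not.
dense_bytes = n_nodes ** 2 * bytes_per_entry

# An undirected edge fills two cells: (u, v) and (v, u).
nonzeros = 2 * n_edges
fill_ratio = nonzeros / n_nodes ** 2

# Rough CSR cost, assuming 8-byte values and 4-byte indices:
# one value and one column index per nonzero, plus one row pointer per row.
sparse_bytes = nonzeros * (8 + 4) + (n_nodes + 1) * 4

print(f"dense:  {dense_bytes / 1e9:.0f} GB")
print(f"sparse: {sparse_bytes / 1e6:.0f} MB")
print(f"fill:   {fill_ratio:.4%}")
```

Under these assumptions the dense matrix needs well over 100 GB, while the sparse representation stays under 20 MB.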

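To get the sparse matrix, just use A = nx.adjacency_matrix(G) without calling A.todense() afterwards (todense() tries to convert it back to a regular dense matrix). Note that adjacency_matrix reads the edge attribute named 'weight' by default, so pass weight='Sign' if the matrix should contain the +1/-1 values rather than all ones.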
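If networkx is not otherwise needed, the sparse matrix can also be built directly from the three columns of the edge file with scipy.sparse. This is a different route from the one in the answer, shown only as a minimal sketch: the in-memory sample stands in for the real file, its rows are made up for illustration, and it assumes whitespace-separated integer columns with '#' comment lines.

```python
import io

import numpy as np
import scipy.sparse as sp

# Hypothetical stand-in for the edge file: FromNodeId, ToNodeId, Sign.
sample = io.StringIO(
    "# FromNodeId\tToNodeId\tSign\n"
    "0\t1\t-1\n"
    "1\t128552\t-1\n"
    "4\t1\t1\n"
)

# Each row is (from, to, sign); ndmin=2 keeps a 2-D array even for one edge.
edges = np.loadtxt(sample, dtype=np.int64, comments="#", ndmin=2)
src, dst, sign = edges[:, 0], edges[:, 1], edges[:, 2]

# Node IDs need not be contiguous, so map them to 0..n-1.
node_ids, inverse = np.unique(np.concatenate([src, dst]), return_inverse=True)
row, col = inverse[: len(src)], inverse[len(src):]

n = len(node_ids)
# Entry (i, j) holds the sign of the edge from node_ids[i] to node_ids[j].
A = sp.csr_matrix((sign, (row, col)), shape=(n, n))
print(A.shape, A.nnz)
```

This builds a directed matrix, one entry per line of the file. nx.read_edgelist creates an undirected graph by default, so the matrix from networkx is symmetric; to mirror that here, the reversed (col, row) entries would have to be added as well.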

scipy.sparse has built-in functions to efficiently save and load sparse matrices (save_npz and load_npz). For example, to save your sparse matrix A, use:

scipy.sparse.save_npz('filename.npz', A)
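A round trip through save_npz and load_npz might look like the following sketch. The tiny matrix and the temporary file path are placeholders for illustration, not the data from the question.

```python
import os
import tempfile

import numpy as np
import scipy.sparse as sp

# Tiny signed adjacency matrix standing in for A (hypothetical data).
rows = np.array([0, 1, 1, 2])
cols = np.array([1, 0, 2, 1])
signs = np.array([1, 1, -1, -1])
A = sp.csr_matrix((signs, (rows, cols)), shape=(3, 3))

# Save to a temporary location, then load it back.
path = os.path.join(tempfile.mkdtemp(), "adjacency.npz")
sp.save_npz(path, A)
B = sp.load_npz(path)

# Compare without densifying: count the cells where the two differ.
same = (A != B).nnz == 0
print(same)
```

The comparison stays sparse on purpose, so the same check would still be feasible on a matrix of the size in the question.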

If it is important for you to use txt or CSV, you will have to do it manually. This can be done by iterating through every row of your matrix and writing these one by one to your file:

for i in range(A.shape[0]):
    row = A.getrow(i).todense()
    # write row to file using your preferred method

This might take a few minutes to run, but should work (I tested with a path graph of the same size).
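An alternative to the row-by-row loop above is to write only the nonzero entries as row,col,value lines, which keeps the text file roughly proportional to the number of edges instead of the number of cells. This is a sketch of a different technique, not the method described in this answer; write_triples is a hypothetical helper name, and the code assumes the values are integers such as +1 and -1.

```python
import csv
import io

import numpy as np
import scipy.sparse as sp


def write_triples(A, handle):
    """Write the nonzero entries of sparse matrix A as row,col,value lines."""
    coo = A.tocoo()  # COO format exposes parallel row/col/data arrays
    writer = csv.writer(handle)
    writer.writerow(["row", "col", "value"])
    for i, j, v in zip(coo.row, coo.col, coo.data):
        # Assumes integer values (edge signs); cast to plain Python ints.
        writer.writerow([int(i), int(j), int(v)])


# Tiny signed matrix standing in for the real adjacency matrix (hypothetical data).
A = sp.csr_matrix(
    (np.array([1, 1, -1, -1]), (np.array([0, 1, 1, 2]), np.array([1, 0, 2, 1]))),
    shape=(3, 3),
)

buffer = io.StringIO()  # replace with open("adjacency.csv", "w", newline="") for a real file
write_triples(A, buffer)
lines = buffer.getvalue().splitlines()
print(lines)
```

The indices written here are matrix positions, so if the original node IDs are needed in the file, they have to be looked up from the node ordering used to build the matrix.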

edited Sep 9, 2018 at 21:17

answered Sep 8, 2018 at 11:36


6 Comments

mina Over a year ago

My system has a Core i5 and 6 GB of RAM. I also tested on a system with a Core i7 and 8 GB of RAM, but the problem was not resolved. My graph is sparse. I used A = nx.to_numpy_matrix(G), but the problem was not resolved.

mina Over a year ago

The weight is 1 or -1 (here, the weight is the sign of the edge). How can I create a sparse matrix from a CSV file? And its adjacency matrix?

jan1892 Over a year ago

The method you originally used, nx.adjacency_matrix(G), will return the adjacency matrix stored as a sparse matrix. However, A.todense() tries to convert it to a normal array, which will not be possible due to its size. My recommendation is to try to work with A as a sparse matrix. Your weights are already stored very efficiently if they are +-1; there is nothing more to save there.

mina Over a year ago

Now I have the adjacency matrix stored as a sparse matrix in variable A. How can I store variable A in a text or CSV file?

jan1892 Over a year ago

Is CSV or txt necessary? If yes, you will have to do it manually. Otherwise, this might be useful. Have you read through here?
