Make NetworkX node attributes into Pandas Dataframe columns

Question

I have a NetworkX graph called G, created below:

import networkx as nx
G = nx.Graph()
G.add_node(1,job= 'teacher', boss = 'dee')
G.add_node(2,job= 'teacher', boss = 'foo')
G.add_node(3,job= 'admin', boss = 'dee')
G.add_node(4,job= 'admin', boss = 'lopez')

I would like to store the node number along with attributes, job and boss in separate columns of a pandas dataframe.

I have attempted to do this with the below code but it produces a dataframe with 2 columns, 1 with node number and one with all of the attributes:

graph = G.nodes(data = True)
import pandas as pd
df = pd.DataFrame(graph)

df
Out[19]: 
    0                                      1
0  1  {u'job': u'teacher', u'boss': u'dee'}
1  2  {u'job': u'teacher', u'boss': u'foo'}
2  3    {u'job': u'admin', u'boss': u'dee'}
3  4  {u'job': u'admin', u'boss': u'lopez'}

Note: I acknowledge that NetworkX has a to_pandas_dataframe function but it does not provide a dataframe with the output I am looking for.

ndmeiri · Accepted Answer · 2018-06-09 16:06:47Z

35

Here's a one-liner.

pd.DataFrame.from_dict(dict(graph.nodes(data=True)), orient='index')
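As a hedged illustration, the sketch below applies the one-liner to the graph from the question. It assumes NetworkX 2.x or later, where `G.nodes(data=True)` yields `(node, attribute_dict)` pairs, and it passes the graph object `G` itself rather than the `graph` view defined in the question.

```python
import networkx as nx
import pandas as pd

# Rebuild the example graph from the question.
G = nx.Graph()
G.add_node(1, job='teacher', boss='dee')
G.add_node(2, job='teacher', boss='foo')
G.add_node(3, job='admin', boss='dee')
G.add_node(4, job='admin', boss='lopez')

# dict(...) turns the (node, attrs) pairs into {node: attrs};
# orient='index' makes each node a row and each attribute a column.
df = pd.DataFrame.from_dict(dict(G.nodes(data=True)), orient='index')
print(df)
```

The node labels end up in the index, and `job` and `boss` become ordinary columns.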

edited Jun 9, 2018 at 16:06

ndmeiri

answered Jun 9, 2018 at 15:46

iamjli

4 Comments

Ahmed Al-haddad Over a year ago

This is the more pythonic answer.

Mitar Over a year ago

This does not work though if nodes have no attributes, then you get an empty DataFrame out.

Ufos Over a year ago

@Mitar what would be the expected output for a graph with no attributes? A dataframe with only index?

Mitar Over a year ago

Yes, ideally only index then.
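One possible way to get the index-only result discussed in these comments is to reindex the frame on the node list afterwards. This is only a sketch, not part of the original answer: the graph `H` is a made-up example, and it assumes `from_dict` returns an empty frame for attribute-less nodes, as the comment above reports.

```python
import networkx as nx
import pandas as pd

# Hypothetical graph whose nodes carry no attributes.
H = nx.Graph()
H.add_nodes_from([1, 2, 3])

df = pd.DataFrame.from_dict(dict(H.nodes(data=True)), orient='index')

# Reindexing on the node list restores one row per node,
# even when no attribute columns were created.
df = df.reindex(list(H.nodes))
print(df.shape)
```

If the nodes do have attributes, the reindex leaves the frame unchanged apart from enforcing the node order.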

Mitar · Accepted Answer · 2019-04-18 10:54:33Z

6

I think this is even simpler:

pandas.DataFrame.from_dict(graph.nodes, orient='index')

Without having to convert to another dict.

answered Apr 18, 2019 at 10:54

Mitar

2 Comments

Mitar Over a year ago

This does not work though if nodes have no attributes, then you get an empty DataFrame out.

Ben Lindsay Over a year ago

I know this answer came 2 years late, but it should be the accepted answer

EdChum · Accepted Answer · 2016-12-25 09:37:11Z

2

I don't know how representative your data is but it should be straightforward to modify my code to work on your real network:

In [32]:
data={}
data['node']=[x[0] for x in graph]
data['boss'] = [x[1]['boss'] for x in graph]
data['job'] = [x[1]['job'] for x in graph]
df1 = pd.DataFrame(data)
df1

Out[32]:
    boss      job  node
0    dee  teacher     1
1    foo  teacher     2
2    dee    admin     3
3  lopez    admin     4

All I'm doing here is constructing a dict from the graph data. pandas accepts a dict as data, where the keys are the column names and the values must be array-like (in this case, lists of values).

A more dynamic method:

In [42]:
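def func(graph):
    # Materialize the (node, attrs) pairs so positional indexing also works
    # on NetworkX 2.x, where G.nodes(data=True) returns a view, not a list.
    pairs = list(graph)
    data={}
    data['node']=[x[0] for x in pairs]
    other_cols = pairs[0][1].keys()
    for key in other_cols:
        data[key] = [x[1][key] for x in pairs]
    return data
pd.DataFrame(func(graph))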

Out[42]:
    boss      job  node
0    dee  teacher     1
1    foo  teacher     2
2    dee    admin     3
3  lopez    admin     4
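The dynamic method above takes its column names from the first node only, so it would fail if a later node lacks one of those attributes. A possible alternative, sketched here under the assumption of NetworkX 2.x or later and with a made-up node that has no `boss`, is to build one record per node and let pandas fill the gaps. It also assumes no attribute is itself named `node`, since that key would overwrite the label.

```python
import networkx as nx
import pandas as pd

G = nx.Graph()
G.add_node(1, job='teacher', boss='dee')
G.add_node(2, job='teacher')  # hypothetical node with no 'boss' attribute
G.add_node(3, job='admin', boss='dee')

# One record per node: the node label plus whatever attributes it has.
records = [{'node': n, **attrs} for n, attrs in G.nodes(data=True)]

# pandas takes the union of keys as columns; missing values become NaN.
df = pd.DataFrame(records)
print(df)
```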

edited Dec 25, 2016 at 9:37

answered Jan 27, 2016 at 19:58

EdChum

3 Comments

BeeGee Over a year ago

Thank you for your solution. The only part of the solution I do not understand is the x[0] for x in graph. I understand that graph is a list but what is happening in x[0] of x in graph?

EdChum Over a year ago

You have a list of tuples. The first element of each tuple is the node value, hence x[0]; the second element is a dict, x[1].

MERose Over a year ago

There is a mistake. It should be def func(graph):.
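To make the `x[0]` and `x[1]` indexing from the comments above concrete, here is a small sketch. It assumes NetworkX 2.x or later, where the pairs are wrapped in `list(...)` because `G.nodes(data=True)` returns a view rather than a list; the variable names are illustrative only.

```python
import networkx as nx

G = nx.Graph()
G.add_node(1, job='teacher', boss='dee')
G.add_node(2, job='teacher', boss='foo')

# Each element is a (node, attribute_dict) tuple.
pairs = list(G.nodes(data=True))

first = pairs[0]
node_id = first[0]  # the node label
attrs = first[1]    # the dict of attributes for that node
print(node_id, attrs)

# The list comprehension from the answer collects every node label.
node_ids = [x[0] for x in pairs]
print(node_ids)
```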

LuisZaman · Accepted Answer · 2017-09-26 15:36:13Z

1

I updated this solution to work with my updated version of NetworkX (2.0) and thought I would share. I also had the function return a Pandas DataFrame.

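def nodes_to_df(graph):
    import pandas as pd
    # Materialize the (node, attrs) pairs once.
    nodes = list(graph.nodes(data=True))
    data={}
    data['node']=[x[0] for x in nodes]
    # Read the column names from the first node instead of assuming
    # that a node labeled 0 exists.
    other_cols = nodes[0][1].keys()
    for key in other_cols:
        data[key] = [x[1][key] for x in nodes]
    return pd.DataFrame(data)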

answered Sep 26, 2017 at 15:36

LuisZaman

Aneho · Accepted Answer · 2023-01-05 16:48:37Z

0

I have solved this with a dictionary comprehension.

d = {n:dag.nodes[n] for n in dag.nodes}

df = pd.DataFrame.from_dict(d, orient='index')

Your dictionary d maps the nodes n to dag.nodes[n]. Each value of that dictionary dag.nodes[n] is a dictionary itself and contains all attributes: {attribute_name:attribute_value}

So your dictionary d has the form:

{node_id : {attribute_name : attribute_value} }

The advantage I see is that you do not need to know the names of your attributes.

If you wanted to have the node-IDs not as index but in a column, you could add as the last command:

df.reset_index(drop=False, inplace=True)
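A plain `reset_index` names the new column `index`. If a more descriptive name is wanted, one option is to name the index first with `rename_axis`. The sketch below uses a made-up two-node `dag`, and the column name `node` is just an illustrative choice.

```python
import networkx as nx
import pandas as pd

# Hypothetical stand-in for the `dag` used in this answer.
dag = nx.DiGraph()
dag.add_node('a', job='teacher', boss='dee')
dag.add_node('b', job='admin', boss='lopez')

d = {n: dag.nodes[n] for n in dag.nodes}
df = pd.DataFrame.from_dict(d, orient='index')

# Name the index, then move it into a regular column called 'node'.
df = df.rename_axis('node').reset_index()
print(df)
```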

edited Jan 5, 2023 at 16:48

answered Dec 14, 2022 at 12:30

Aneho
