I have some data in Excel that describes a graph, and it looks like this:

1  2  4.5
1  3  6.6
2  4  7.3
3  4  5.1

The first two elements in each row are nodes of the graph and the last element is the weight of the edge between those two nodes. For example, node "1" is connected to node "2" and the weight is 4.5.

I import this data into Python with the following code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

training_data_x = pd.read_excel("/Users/mac/Downloads/navid.xlsx",header=None)

# as_matrix() was removed in pandas 1.0; .values returns the same NumPy array
x = training_data_x.values

So "x" here is the adjacency matrix of the graph. What I am trying to do is convert x to a nested dictionary in Python, which I need in other code. I am fairly new to Python, but I think a suitable dictionary looks like this:

gr = {'1': {'2': 4.5, '3': 6.6},
      '2': {'4': 7.3},
      '3': {'4':5.1}}

In fact, "gr" should be the output of my code. I think I should use "pandas.DataFrame.to_dict", but I am having a hard time using it. I would really appreciate your help.

  • I'm not sure x is actually an adjacency matrix, as it is commonly understood. Commented Jan 2, 2017 at 21:35
  • Yes. I see what you mean. But my question still exists which is how to convert x here to dictionary as above? Commented Jan 2, 2017 at 21:41

2 Answers

In case you want to rely on pandas' great groupby (split-apply-combine) functionality in addition to the pandas.DataFrame.to_dict method, you could do the following:

import pandas as pd

file_path = "/Users/mac/Downloads/navid.xlsx"
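# Column 0 becomes the index; columns 1 and 2 hold the neighbour and the weight.
gr = (
    pd.read_excel(file_path, header=None, index_col=0)
    .groupby(level=0)
    # Turn each group's (neighbour, weight) rows into an inner dict.
    .apply(lambda x: dict(x.to_records(index=False)))
    .to_dict()
)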

This should work for all pandas versions above 0.17.
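
To try the groupby idea without the Excel file, here is a minimal sketch that builds the sample rows in memory. The DataFrame `df` and the helper name `edges_to_nested_dict` are assumptions made for illustration, and the helper uses a plain dict comprehension rather than `to_records`, so it is a variant of the approach above and not the same code. It also converts the keys to strings to match the dictionary shown in the question.

```python
import pandas as pd

# Hypothetical in-memory copy of the sample rows; stands in for read_excel.
df = pd.DataFrame([[1, 2, 4.5], [1, 3, 6.6], [2, 4, 7.3], [3, 4, 5.1]])


def edges_to_nested_dict(frame):
    """Group rows by source node (column 0) and map each neighbour to its weight."""
    nested = {}
    for source, group in frame.groupby(0):
        # Columns 1 and 2 hold the neighbour and the weight, respectively.
        nested[str(source)] = {
            str(neighbour): float(weight)
            for neighbour, weight in zip(group[1], group[2])
        }
    return nested


gr = edges_to_nested_dict(df)
print(gr)
```

If integer keys are acceptable, the `str(...)` calls can be dropped.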

1 Comment

Your apply was very clever.

My advice: save your xlsx file as a csv. Now, using vanilla Python:

import csv
gr = {}
with open('data.csv') as f:
    reader = csv.reader(f)
    for row in reader:
        e1, e2, w = row
        gr.setdefault(e1, {})[e2] = float(w)

Perhaps even better, use a defaultdict:

import csv
from collections import defaultdict
gr = defaultdict(dict)
with open('data.csv') as f:
    reader = csv.reader(f)
    for row in reader:
        e1, e2, w = row
        gr[e1][e2] = float(w)

EDIT: Note, I have converted to float manually, but you can probably get away with simply passing the following argument to csv.reader: csv.reader(f, quoting=csv.QUOTE_NONNUMERIC) if you don't mind having your keys be floats as well.
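
To make the difference concrete, here is a small sketch that runs both variants on the same input. The `sample` string is a hypothetical stand-in for data.csv, read through `io.StringIO` so the example is self-contained.

```python
import csv
import io

# Hypothetical CSV text standing in for data.csv.
sample = "1,2,4.5\n1,3,6.6\n2,4,7.3\n3,4,5.1\n"

# Manual conversion: keys stay strings, only the weight becomes a float.
manual = {}
for e1, e2, w in csv.reader(io.StringIO(sample)):
    manual.setdefault(e1, {})[e2] = float(w)

# QUOTE_NONNUMERIC: every unquoted field is converted to float, keys included.
auto = {}
for e1, e2, w in csv.reader(io.StringIO(sample), quoting=csv.QUOTE_NONNUMERIC):
    auto.setdefault(e1, {})[e2] = w

print(manual)  # {'1': {'2': 4.5, '3': 6.6}, '2': {'4': 7.3}, '3': {'4': 5.1}}
print(auto)    # {1.0: {2.0: 4.5, 3.0: 6.6}, 2.0: {4.0: 7.3}, 3.0: {4.0: 5.1}}
```

The manual version matches the dictionary in the question; the quoting version only fits if float keys are fine for the code that consumes the graph.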

3 Comments

Thank you for your answer. I missed the last part that you mentioned about the float. What do you mean by converting to float? I would also appreciate it if you could explain how to bypass this conversion and what I should do.
@navid because you want string keys and float values for the innermost dictionaries, right? That is exactly what you have written. The csv module doesn't do automatic conversion, so you can either get everything converted to floats by passing the quoting parameter or do it manually as I demonstrated. I would use the second version.
@navid also, you should check out the current version; I cleaned up a bug that you might not have noticed.
