1

What is an elegant way to convert a list of tuples into a table of the following form?

Input:

from pandas import DataFrame
mytup = [('a','b',1), ('a','c',2), ('b','a',2), ('c','a',3), ('c','c',1)]

a       b       1
a       c       2
b       a       2
c       a       3
c       c       1

mydf = DataFrame(mytup, columns = ['from', 'to', 'val'])

Output (- may be replaced with a blank or NaN):

     a    b    c
a    -    1    2
b    2    -    -
c    3    -    1
  • str.format() is your new best friend Commented Oct 28, 2014 at 21:40
  • @wnnmaw, The tables are different. I modified the OP Commented Oct 28, 2014 at 21:57

2 Answers

7

pivot and fillna are what you want:

import pandas as pd

mytup = [('a','b',1), ('a','c',2), ('b','a',2), ('c','a',3), ('c','c',1)]
mydf = pd.DataFrame(mytup, columns=['from', 'to', 'val'])
mydf.pivot(index='from', columns='to', values='val').fillna(value='-')

to    a  b  c
from         
a     -  1  2
b     2  -  -
c     3  -  1
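
One caveat worth sketching: pivot raises a ValueError when the same (from, to) pair occurs more than once. The snippet below assumes such a duplicate exists (the extra tuple is hypothetical, not part of the question's data) and uses pivot_table with an aggregation function to combine the repeated values instead. Summing is only one possible choice.

```python
import pandas as pd

mytup = [('a', 'b', 1), ('a', 'c', 2), ('b', 'a', 2), ('c', 'a', 3), ('c', 'c', 1)]

# Hypothetical extra row that repeats the ('a', 'b') pair.
dup = mytup + [('a', 'b', 10)]
df = pd.DataFrame(dup, columns=['from', 'to', 'val'])

# pivot cannot reshape when an index/column pair is repeated.
try:
    df.pivot(index='from', columns='to', values='val')
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table aggregates the repeated pair instead (here: sum).
table = df.pivot_table(index='from', columns='to', values='val', aggfunc='sum')
print(pivot_failed)
print(table)
```

Cells with no tuple are still NaN here, so fillna can be chained on as before.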
3 Comments

Thanks. I was trying pivot but somehow couldn't find a suitable example. Thanks again.
@learner, glad it helps :) you may read more about pivot here.
as a side note, if you don't mind showing NaN, you can simply omit fillna(...)
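
A minimal sketch of the NaN variant from the last comment, using the same data as the answer. Because NaN is a float, the pivoted columns come out as float64 rather than int; the missing-cell count is only there to make the result easy to check.

```python
import pandas as pd

mytup = [('a', 'b', 1), ('a', 'c', 2), ('b', 'a', 2), ('c', 'a', 3), ('c', 'c', 1)]
mydf = pd.DataFrame(mytup, columns=['from', 'to', 'val'])

# Without fillna, cells that have no tuple stay NaN.
table = mydf.pivot(index='from', columns='to', values='val')

# Count the empty cells: 9 cells in total, 5 filled by the tuples.
missing = int(table.isna().sum().sum())
print(table)
print(missing)
```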
0

Hopefully I'm wrong and there's a more direct way to do this, but if not, you can always loop over the tuples:

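>>> # dtype=object lets each cell hold either '-' or a number
>>> df = DataFrame([['-'] * 3] * 3, columns=['a', 'b', 'c'], index=['a', 'b', 'c'], dtype=object)
>>> for row, col, val in mytup:
...     df.loc[row, col] = val  # .loc avoids chained assignment (df[col][row] = ...)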
>>> df
   a  b  c
a  -  1  2
b  2  -  -
c  3  -  1

If you were just dealing with numpy/scipy rather than pandas, I'd note that your tuple format is pretty close to the COO sparse matrix format, so:

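>>> import numpy as np
>>> import scipy.sparse
>>> # map 'a', 'b', 'c' to 0, 1, 2 so they can serve as matrix coordinates
>>> tup = [(ord(x)-ord('a'), ord(y)-ord('a'), z) for x,y,z in mytup]
>>> x, y, values = zip(*tup)
>>> m = scipy.sparse.coo_matrix((values, (x, y))).toarray()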
>>> print(m)
[[0 1 2]
 [2 0 0]
 [3 0 1]]

However, I don't think pandas has the equivalent of "sparse data frames", and I don't know that it would be more "elegant" to convert to a raw array just to build the resulting array and convert it back to a data frame. (It might be more efficient if you could do the letter-to-number mapping vectorized, but that likely doesn't matter here.)
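
For what it's worth, the vectorized letter-to-number mapping mentioned above could be sketched with np.unique and return_inverse, which also works for labels that aren't single letters. This is only one way to do it. The variable names are illustrative, and the dense array is wrapped back into a DataFrame at the end so the labels survive.

```python
import numpy as np
import pandas as pd
from scipy import sparse

mytup = [('a', 'b', 1), ('a', 'c', 2), ('b', 'a', 2), ('c', 'a', 3), ('c', 'c', 1)]
rows, cols, vals = zip(*mytup)

# One shared label set for both axes; codes are positions into `labels`.
labels, codes = np.unique(np.array(rows + cols), return_inverse=True)
row_codes, col_codes = codes[:len(rows)], codes[len(rows):]

# Build the COO matrix, then densify; cells with no tuple become 0.
n = len(labels)
m = sparse.coo_matrix((vals, (row_codes, col_codes)), shape=(n, n))
df = pd.DataFrame(m.toarray(), index=labels, columns=labels)
print(df)
```

Note that empty cells come out as 0 rather than '-' or NaN, and that converting a COO matrix to dense sums any duplicate coordinates.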
