convert data table format in python

Question

Let's say I have a table in Python in the following format:

id1 V1 100
id1 V2 200
id1 V3 0
id2 V1 300
id2 V2 400
id2 V3 1

and I want to convert it to this format:

id1 100 200 0
id2 300 400 1

V1, V2 and V3 are variable names, and I do not need to keep them; I just need the ids and the value of each variable. The number of variables (Vs) is unknown, but their order is always the same. I'm looking for a smarter way to convert than looping. Is there a built-in function?

Any builtin will use looping. No?

– Jason, Commented Apr 6, 2016 at 0:50

Community · Accepted Answer · 2017-05-23 11:59:41Z

There is no built-in method for this, but you can group using a dict:

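from collections import defaultdict

# Map each id to the list of its values, in the order they appear.
d = defaultdict(list)

s = """id1 V1 100
id1 V2 200
id1 V3 0
id2 V1 30
id2 V2 400
id2 V3 1"""

# Each line splits into (id, variable name, value); the name is discarded.
for a, _, c in map(str.split, s.splitlines()):
    d[a].append(c)

print(["{},{}".format(k, ",".join(v)) for k, v in d.items()])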
['id2,30,400,1', 'id1,100,200,0']

If you want the output aligned nicely, you can do something like this
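
A minimal sketch of one way to do that: pad each field to the width of the widest entry in its column using str.ljust. The `grouped` dict below stands in for the `d` built above, and the helper name `format_aligned` is hypothetical, not something from the standard library.

```python
def format_aligned(grouped):
    """Return one space-separated line per id, with columns padded to equal width."""
    # Build rows as [id, value1, value2, ...].
    rows = [[key] + list(values) for key, values in grouped.items()]
    if not rows:
        return []
    # The width of each column is the length of its longest entry.
    n_cols = max(len(row) for row in rows)
    widths = [
        max(len(row[i]) for row in rows if i < len(row))
        for i in range(n_cols)
    ]
    # Left-justify every field, then drop padding at the end of the line.
    return [
        " ".join(field.ljust(widths[i]) for i, field in enumerate(row)).rstrip()
        for row in rows
    ]


# Stand-in for the dict produced by the grouping loop.
grouped = {"id1": ["100", "200", "0"], "id2": ["30", "400", "1"]}
for line in format_aligned(grouped):
    print(line)
```

With the sample data this prints `id1 100 200 0` and `id2 30  400 1`, so the second column lines up even though `30` is shorter than `100`.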

If you want to maintain order, use an OrderedDict (a plain dict only guarantees insertion order from Python 3.7 onward):

from collections import OrderedDict

d = OrderedDict()

s = """id1 V1 100
id1 V2 200
id1 V3 0
id2 V1 30
id2 V2 400
id2 V3 1"""

for a, _, c in map(str.split, s.splitlines()):
    d.setdefault(a, []).append(c)

If it is in a file, you can parse it with the csv lib:

import csv
from collections import OrderedDict

d = OrderedDict()

with open("in.txt") as f:
    for k, _, v in csv.reader(f, delimiter=" "):
        d.setdefault(k, []).append(v)

The OrderedDict output:

OrderedDict([('id1', ['100', '200', '0']), ('id2', ['30', '400', '1'])])
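
If the rows for each id are always contiguous, as they are in the sample data, itertools.groupby from the standard library is another option. This is an alternative to the dict-based grouping above rather than part of it, and the function name `rows_by_id` is hypothetical. It still loops internally, and rows with the same id that are not adjacent would end up in separate groups.

```python
from itertools import groupby
from operator import itemgetter


def rows_by_id(text):
    """Collapse 'id name value' lines into 'id value value ...' lines."""
    # Split every non-empty line into its three fields.
    records = (line.split() for line in text.splitlines() if line.strip())
    out = []
    # groupby only merges *adjacent* records that share the same id.
    for key, group in groupby(records, key=itemgetter(0)):
        values = [rec[2] for rec in group]
        out.append(" ".join([key] + values))
    return out


s = """id1 V1 100
id1 V2 200
id1 V3 0
id2 V1 30
id2 V2 400
id2 V3 1"""

print("\n".join(rows_by_id(s)))
```

For the sample string this yields the two rows `id1 100 200 0` and `id2 30 400 1`, which matches the layout asked for in the question.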
