Let's say I have a table in Python in the following format:

id1 V1 100
id1 V2 200
id1 V3 0
id2 V1 300
id2 V2 400
id2 V3 1   

and I want to convert it into this format:

id1 100 200 0
id2 300 400 1

V1, V2 and V3 are variable names, and I do not need to keep them. I just need the ids and the values of each variable. The number of variables (Vs) is unknown, but the order of the variables is always the same. I'm looking for a smarter way to convert this than looping. Is there a built-in function?

  • Any builtin will use looping. No? Commented Apr 6, 2016 at 0:50

1 Answer

There is no built-in function that does this directly, but you can group the values by id using a dict:

from collections import defaultdict

d  = defaultdict(list)

s = """id1 V1 100
id1 V2 200
id1 V3 0
id2 V1 30
id2 V2 400
id2 V3 1"""

# Split each line into (id, variable name, value) and collect the values per id.
for a, _, c in map(str.split, s.splitlines()):
    d[a].append(c)

print(["{},{}".format(k, ",".join(v)) for k, v in d.items()])
['id2,30,400,1', 'id1,100,200,0']

If you want the output aligned nicely, you can pad each field to a fixed width, for example with str.ljust, before joining.
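As a minimal sketch of aligned output, assuming the values have already been grouped into a dict like `d` above: find the widest cell in each column and left-justify every cell to that width. The helper name `format_aligned` is made up for this illustration and is not part of the original answer.

```python
from collections import OrderedDict


def format_aligned(groups):
    """Return one line per id, with every column padded to a common width.

    `groups` maps an id to its list of string values (hypothetical helper).
    """
    rows = [[key] + list(values) for key, values in groups.items()]
    if not rows:
        return []
    # Pad short rows with empty cells so every row has the same column count.
    ncols = max(len(row) for row in rows)
    rows = [row + [""] * (ncols - len(row)) for row in rows]
    # The width of a column is the length of its longest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(ncols)]
    return [
        " ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    ]


d = OrderedDict([("id1", ["100", "200", "0"]), ("id2", ["30", "400", "1"])])
aligned = format_aligned(d)
for line in aligned:
    print(line)
```

With the sample data, the shorter value `30` is padded so that `400` lines up under `200`.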

If you want to maintain order, use an OrderedDict (on Python 3.7 and later, a plain dict also preserves insertion order):

from collections import OrderedDict

d  = OrderedDict()

s = """id1 V1 100
id1 V2 200
id1 V3 0
id2 V1 30
id2 V2 400
id2 V3 1"""

for a, _, c in map(str.split, s.splitlines()):
        d.setdefault(a, []).append(c)

If it is in a file, you can parse it with the csv lib:

import csv
from collections import OrderedDict

d = OrderedDict()

with open("in.txt") as f:
    for k, _, v in csv.reader(f, delimiter=" "):
        d.setdefault(k, []).append(v)

The OrderedDict output:

OrderedDict([('id1', ['100', '200', '0']), ('id2', ['30', '400', '1'])])
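The question asks for space-separated rows rather than a dict, so the grouped values still have to be joined back together. The sketch below is one alternative, not the method used above: it relies on itertools.groupby, which only merges adjacent items, so it assumes that all rows for an id are consecutive, as they are in the question's sample. The function name `to_wide` is hypothetical.

```python
from itertools import groupby

s = """id1 V1 100
id1 V2 200
id1 V3 0
id2 V1 30
id2 V2 400
id2 V3 1"""


def to_wide(text):
    """Collapse 'id name value' lines into 'id value value ...' lines.

    Assumes rows that share an id are consecutive; groupby only merges
    neighbouring records with equal keys.
    """
    # Skip blank lines and split each remaining line into its three fields.
    records = (line.split() for line in text.splitlines() if line.strip())
    return [
        " ".join([key] + [value for _, _, value in group])
        for key, group in groupby(records, key=lambda rec: rec[0])
    ]


wide = to_wide(s)
print("\n".join(wide))
```

As the comment on the question points out, this still loops internally; it only moves the loop into the standard library. If the ids may be interleaved, the dict-based grouping is the safer choice.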