3

I have at my disposal huge amount of data, in the form of a list of tuples. Each tuple has a specified format like (a, b, c, d, e). The list of tuples looks like:

tupleList = [('a1', 'b1', 'c1', 'd1', 'e1'),
             ('a2', 'b2', 'c2', 'd2', 'e2'),
             ...
             ('a10000', 'b10000', 'c10000', 'd10000', 'e100000')]

What I want is, to convert each of these tuples to a dictionary, and append the dictionary to a a final list of dictionaries. Can all this be done in a loop? The final list of dictionaries should look like:

finalDictList = [{'key1': 'a1', 'key2': 'b1', 'key3': 'c1', 'key4': 'd1', 'key5': 'e1'},
                 {'key1': 'a2', 'key2': 'b2', 'key3': 'c2', 'key4': 'd2', 'key5': 'e2'},
                 {'key1': 'a3', 'key2': 'b3', 'key3': 'c3', 'key4': 'd3', 'key5': 'e3'},
                 ...
                 {'key1': 'a10000', 'key2': 'b10000', 'key3': 'c10000', 'key4': 'd10000', 'key5': 'e10000'}]

The format of the tuples is fixed. I want to compare afterwords, value of each key of a dictionary with all others. This is why the conversion of tuple to dictionary made sense to me. Please correct me if the design paradigm itself seems wrong. Also, there are >10000 tuples. Declaring that many dictionaries is just not done.

Is there anyway to append dictionary to a list in a loop? Also, if that is possible, can we access each dictionary by it's key values, say, like finalDictList[0]['key1']?

7 Answers 7

10

We're going to mix three important concepts to make this code really small and beautiful. First, a list comprehension, then, the zip method, and finally, the dict method, to build a dictionary out of a list of tuples:

my_list = [('a1', 'b1', 'c1', 'd1', 'e1'), ('a2', 'b2', 'c2', 'd2', 'e2')]
keys = ('key1', 'key2', 'key3', 'key4', 'key5')
final = [dict(zip(keys, elems)) for elems in my_list]

After that, the value of the final variable is:

>>> final
[{'key3': 'c1', 'key2': 'b1', 'key1': 'a1', 'key5': 'e1', 'key4': 'd1'},
{'key3': 'c2', 'key2': 'b2', 'key1': 'a2', 'key5': 'e2', 'key4': 'd2'}]

Also, you can get elements of a certain dictionary using the position of the dictionary in the list and the key you're looking for, i.e.:

>>> final[0]['key1']
'a1'
Sign up to request clarification or add additional context in comments.

1 Comment

@sneha: glad I helped, if this post solved your problem, you could choose it as accepted (green check on the left)
6

Use zip to combine a pre-defined list of key names with each tuple in your input list, then pass the results to dict to make them into dicts. Wrap the whole thing in a list comprehension to process them all in one batch:

keys = ('key1', 'key2', 'key3', 'key4', 'key5')
finalDictList = [dict(zip(keys, values)) for values in tupleList]

Comments

4

I'm not sure I see why you need to convert everything to a dictionary, when you've already got a list of tuples.

>>> tupleList = [('a1', 'b1', 'c1', 'd1', 'e1'),
...              ('a2', 'b2', 'c2', 'd2', 'e2'),
...              ('a10000', 'b10000', 'c10000', 'd10000', 'e100000')]
>>> [x[1] for x in tupleList]
['b1', 'b2', 'b10000']

Using Python's list comprehension syntax, you can get a list of all the n-th elements of each tuple.

3 Comments

I know of this, but the entries in the tuple are related to each other - that is, (a,b,c,d,e) are values that make sense together only. I understand that for comparing this is fine, but i have to work on the complete tuple in case during comparison a match occurs/doesn't occur. I can make this work, but it'll be in a round about way. Thank you! :)
@sneha: Accessing elements of a tuple is almost exactly the same syntax as accessing elements of a dictionary: x[2] vs x["key2"]. If you want, you can define constants like KEY2 = 2, so you can do x[KEY2]. Converting to a list of dictionaries will take up a lot more memory for little benefit.
Hmmm...I guess you have a pretty good point there. Let me reconsider my design. :)
3

If the fields are fix you can do this:

fields = ['key1', 'key2', 'key3', 'key4', 'key5']

newList = [dict(zip(fields, vals)) for vals in oldList]

Comments

2

If as you say you have a lot of entries, remember that python has namedtuples:

>>> tupleList = [('a1', 'b1', 'c1', 'd1', 'e1'),
...              ('a2', 'b2', 'c2', 'd2', 'e2'),
...              ('a10000', 'b10000', 'c10000', 'd10000', 'e100000')]
>>>
>>> from collections import namedtuple
>>> fv = namedtuple('fivevals', ('key1', 'key2', 'key3', 'key4', 'key5'))
>>> tuplelist = [fv(*item) for item in tupleList]
>>> 
>>> tuplelist[0].key1
'a1'
>>>

Namedtuples can be accesed by key but at the same time they are lightweight and require no more memory than regular tuples.

1 Comment

Ack! I'm ashamed that I didn't remember that. Good call!
0
from itertools import izip

keys = ['key1', 'key2', 'key3', 'key4', 'key5']
finalDictList = [dict(izip(names, x)) for x in tupleList]

2 Comments

To avoid building a temporary list in each iteration of course.
That seems like you'd be trading a tiny memory allocation for the overhead of a generator.
0
finalDictList = []
for t in tupleList:
    finalDictList.append({
        'key1': t[0],
        'key2': t[1],
        'key3': t[2],
        'key4': t[3],
        'key5': t[4],
    })

Also, if that is possible, can we access each dictionary by it's key values, say, like finalDictList[0]['key1']?

Absolutely, that is exactly how you would do it.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.