41

I have a tuple of tuples - for example:

tupleOfTuples = ((1, 2), (3, 4), (5,))

I want to convert this into a flat, one-dimensional list of all the elements in order:

[1, 2, 3, 4, 5]

I've been trying to accomplish this with list comprehension. But I can't seem to figure it out. I was able to accomplish it with a for-each loop:

myList = []
for tuple in tupleOfTuples:
   myList = myList + list(tuple)

But I feel like there must be a way to do this with a list comprehension.

A simple [list(tuple) for tuple in tupleOfTuples] just gives you a list of lists, instead of individual elements. I thought I could perhaps build on this by using the unpacking operator to then unpack the list, like so:

[*list(tuple) for tuple in tupleOfTuples]

or

[*(list(tuple)) for tuple in tupleOfTuples]

... but that didn't work. Any ideas? Or should I just stick to the loop?

0

7 Answers 7

78

it's typically referred to as flattening a nested structure.

>>> tupleOfTuples = ((1, 2), (3, 4), (5,))
>>> [element for tupl in tupleOfTuples for element in tupl]
[1, 2, 3, 4, 5]

Just to demonstrate efficiency:

>>> import timeit
>>> it = lambda: list(chain(*tupleOfTuples))
>>> timeit.timeit(it)
2.1475738355700913
>>> lc = lambda: [element for tupl in tupleOfTuples for element in tupl]
>>> timeit.timeit(lc)
1.5745135182887857

ETA: Please don't use tuple as a variable name, it shadows built-in.

Sign up to request clarification or add additional context in comments.

8 Comments

thanks... do you mind giving i and j more meaningful names so I can more easily follow this logic?
@SilentGhost - thanks! is this widely accepted? I find it harder to understand at first glance than the longer loop... but if programmers recognize the pattern then I would use it
@froadie: it's an idiomatic way of flattening a shallow list.
I wondered how it looks for longer lists, ie tupleOfTuples=tuple(zip(range(0,100,2), range(1,100,2))) -- in that case, chain is slightly faster than the LC (and Fabian's chain.from_iterable is the fastest)
Struggling to understand how it works. Is there any resource you can recommend with a step-by-step explanation of it?
|
47

Just use sum if you don't have a lot of tuples.

>>> tupleOfTuples = ((1, 2), (3, 4), (5,))
>>> sum(tupleOfTuples, ())
(1, 2, 3, 4, 5)
>>> list(sum(tupleOfTuples, ())) # if you really need a list
[1, 2, 3, 4, 5]

If you do have a lot of tuples, use list comprehension or chain.from_iterable to prevent the quadratic behavior of sum.


Micro-benchmarks:

  • Python 2.6

    • Long tuple of short tuples

      $ python2.6 -m timeit -s 'tot = ((1, 2), )*500' '[element for tupl in tot for element in tupl]'
      10000 loops, best of 3: 134 usec per loop
      $ python2.6 -m timeit -s 'tot = ((1, 2), )*500' 'list(sum(tot, ()))'
      1000 loops, best of 3: 1.1 msec per loop
      $ python2.6 -m timeit -s 'tot = ((1, 2), )*500; from itertools import chain; ci = chain.from_iterable' 'list(ci(tot))'
      10000 loops, best of 3: 60.1 usec per loop
      $ python2.6 -m timeit -s 'tot = ((1, 2), )*500; from itertools import chain' 'list(chain(*tot))'
      10000 loops, best of 3: 64.8 usec per loop
      
    • Short tuple of long tuples

      $ python2.6 -m timeit -s 'tot = ((1, )*500, (2, )*500)' '[element for tupl in tot for element in tupl]'
      10000 loops, best of 3: 65.6 usec per loop
      $ python2.6 -m timeit -s 'tot = ((1, )*500, (2, )*500)' 'list(sum(tot, ()))'
      100000 loops, best of 3: 16.9 usec per loop
      $ python2.6 -m timeit -s 'tot = ((1, )*500, (2, )*500); from itertools import chain; ci = chain.from_iterable' 'list(ci(tot))'
      10000 loops, best of 3: 25.8 usec per loop
      $ python2.6 -m timeit -s 'tot = ((1, )*500, (2, )*500); from itertools import chain' 'list(chain(*tot))'
      10000 loops, best of 3: 26.5 usec per loop
      
  • Python 3.1

    • Long tuple of short tuples

      $ python3.1 -m timeit -s 'tot = ((1, 2), )*500' '[element for tupl in tot for element in tupl]'
      10000 loops, best of 3: 121 usec per loop
      $ python3.1 -m timeit -s 'tot = ((1, 2), )*500' 'list(sum(tot, ()))'
      1000 loops, best of 3: 1.09 msec per loop
      $ python3.1 -m timeit -s 'tot = ((1, 2), )*500; from itertools import chain; ci = chain.from_iterable' 'list(ci(tot))'
      10000 loops, best of 3: 59.5 usec per loop
      $ python3.1 -m timeit -s 'tot = ((1, 2), )*500; from itertools import chain' 'list(chain(*tot))'
      10000 loops, best of 3: 63.2 usec per loop
      
    • Short tuple of long tuples

      $ python3.1 -m timeit -s 'tot = ((1, )*500, (2, )*500)' '[element for tupl in tot for element in tupl]'
      10000 loops, best of 3: 66.1 usec per loop
      $ python3.1 -m timeit -s 'tot = ((1, )*500, (2, )*500)' 'list(sum(tot, ()))'
      100000 loops, best of 3: 16.3 usec per loop
      $ python3.1 -m timeit -s 'tot = ((1, )*500, (2, )*500); from itertools import chain; ci = chain.from_iterable' 'list(ci(tot))'
      10000 loops, best of 3: 25.4 usec per loop
      $ python3.1 -m timeit -s 'tot = ((1, )*500, (2, )*500); from itertools import chain' 'list(chain(*tot))'
      10000 loops, best of 3: 25.6 usec per loop
      

Observation:

  • sum is faster if the outer tuple is short.
  • list(chain.from_iterable(x)) is faster if the outer tuple is long.

3 Comments

Wow... wouldn't have thought. For others who are also surprised note that if b = ((1, 2,3), (4, 5), (5,6,7,8)) then sum(b) is (1, 2, 3, 4, 5, 5, 6, 7, 8).... and if c = [(1, 2,3), (4, 5), (5,6,7,8)] then sum(c) is (1, 2, 3, 4, 5, 5, 6, 7, 8). I think that this is just extremely useful to know!
This only works for tuples that have summable data types.
This doesn't work if there's a chance tupleOfTuples doesn't contain all tuples :)
13

You're chaining the tuples together:

from itertools import chain
print list(chain(*listOfTuples))

Should be pretty readable if you're familiar with itertools, and without the explicit list you even have your result in generator form.

Comments

9

I like using 'reduce' in this situation (this is what reduce made for!)

lot = ((1, 2), (3, 4), (5,))
print list(reduce(lambda t1, t2: t1 + t2, lot))

 > [1,2,3,4,5]

2 Comments

In before Alex votes you down with his hatred-for-map/reduce-wrath
Interested in knowing how this compares (in efficiency) with a python list comprehension for large lists
9

Most of these answers will only work for a single level of flattening. For a more comprehensive solution, try this (from http://rightfootin.blogspot.com/2006/09/more-on-python-flatten.html):

def flatten(l, ltypes=(list, tuple)):
    ltype = type(l)
    l = list(l)
    i = 0
    while i < len(l):
        while isinstance(l[i], ltypes):
            if not l[i]:
                l.pop(i)
                i -= 1
                break
            else:
                l[i:i + 1] = l[i]
        i += 1
    return ltype(l)

2 Comments

worked for me to flatten a Tree data structure, where the other solutions didn't go deep enough!
I like this solution - I prefer a general solutions to a specific one.
4

Another solution using itertools.chain

>>> tupleOfTuples = ((1, 2), (3, 4), (5,))
>>> from itertools import chain
>>> [x for x in chain.from_iterable(tupleOfTuples)]
[1, 2, 3, 4, 5]

Comments

4

For multilevel, and readable code:

def flatten(bla):
    output = []
    for item in bla:
        output += flatten(item) if hasattr (item, "__iter__") or hasattr (item, "__len__") else [item]
    return output

I could not get this to fit in one line (and remain readable, even by far)

4 Comments

Some code just wasn't meant to be in one line.
Very helpful for list with varying depth.
isinstance(item, tuple) instead of hasattr (item, "iter") or hasattr (item, "len") would make it shorter
Yes, but then it would work only for tuples - the expression above works for any iterable element in the list - including generators.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.