Filter a tuple with another tuple in Python

Question

I have a list of tuples that is created with the zip function. zip is bringing together four lists: narrative, subject, activity, and filer, each of which is just a list of 0s and 1s. Let's say those four lists look like this:

narrative = [0, 0, 0, 0]
subject = [1, 1, 0, 1]
activity = [0, 0, 0, 1]
filer = [0, 1, 1, 0]

Now, I'm zipping them together to get a list of boolean values indicating whether any of them is True.

ny_nexus = [True if sum(x) > 0 else False for x in zip(narrative, subject, activity, filer)]

The problem I'm having now is getting a second list that gives, for each iteration, the names of the variables that had a 1. I imagine it would look something like this:

variables = ("narrative", "subject", "activity", "filer")
reason = [", ".join([some code to filter a tuple]) for x in zip(narrative, subject, activity, filer)]

I just can't figure out how I'd go about this. My desired output would look like this:

reason
# ["subject", "subject, filer", "filer", "subject, activity"]

I'm somewhat new to Python, so I apologize if the solution is easy.

By the way, you can say ny_nexus = [sum(x) > 0 for x in zip...]
– zondo, Commented Feb 22, 2016 at 16:47
Even better, use the any() built-in function ;) any([0, 0, 0]) == False, any([0, 1, 0]) == True. So, ny_nexus = [any(x) for x in zip...]
– AkiRoss, Commented Feb 22, 2016 at 17:21
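
For reference, here is a minimal sketch of the any() form suggested in the comments, using the sample lists from the question. The loop variable name row is only illustrative.

```python
narrative = [0, 0, 0, 0]
subject = [1, 1, 0, 1]
activity = [0, 0, 0, 1]
filer = [0, 1, 1, 0]

# any() is True as soon as one flag in the row is truthy,
# so there is no need for sum(...) > 0 or "True if ... else False".
ny_nexus = [any(row) for row in zip(narrative, subject, activity, filer)]
print(ny_nexus)  # [True, True, True, True]
```

With this sample data every row has at least one 1, which is why all four values come out True.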

vaultah · Accepted Answer · 2016-02-22 17:39:10Z

3

Store tuples in a dictionary for a cleaner solution:

tups = {'narrative': narrative,
        'subject': subject,
        'activity': activity,
        'filer': filer}

The solution:

reason = [', '.join(k for k, b in zip(tups, x) if b) for x in zip(*tups.values())]

It can also be written using itertools.compress:

from itertools import compress
reason = [', '.join(compress(tups, x)) for x in zip(*tups.values())]

The dictionary-based solutions above depend on the dict's iteration order. Before Python 3.7 a plain dict does not guarantee insertion order, so they can return something like

['subject', 'filer, subject', 'filer', 'activity, subject']

If you need the order to be preserved, use collections.OrderedDict as shown below:

from collections import OrderedDict

tups = OrderedDict([
    ('narrative', narrative),
    ('subject', subject),
    ('activity', activity),
    ('filer', filer)
])

# The result is ['subject', 'subject, filer', 'filer', 'subject, activity']

EDIT: The solution that doesn't involve dictionaries:

from itertools import compress
reason = [', '.join(compress(variables, x))
          for x in zip(narrative, subject, activity, filer)]

Consider using dictionaries if the zip(...) call no longer fits on one line.
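
Since the two pieces doing the work here are itertools.compress and the * in zip(*...), a small sketch of each in isolation may help. The names picked, columns and rows are only illustrative; the data is the sample from the question.

```python
from itertools import compress

variables = ("narrative", "subject", "activity", "filer")

# compress(data, selectors) yields the items of data whose
# corresponding selector is truthy.
picked = list(compress(variables, (0, 1, 0, 1)))
print(picked)  # ['subject', 'filer']

# The * operator unpacks an iterable into separate positional arguments,
# so zip(*columns) is the same call as zip(columns[0], columns[1], ...).
columns = [[0, 0, 0, 0], [1, 1, 0, 1], [0, 0, 0, 1], [0, 1, 1, 0]]
rows = list(zip(*columns))
print(rows)  # [(0, 1, 0, 0), (0, 1, 0, 1), (0, 0, 0, 1), (0, 1, 1, 0)]

# Putting both together gives one comma-separated string per row.
reason = [", ".join(compress(variables, row)) for row in rows]
print(reason)  # ['subject', 'subject, filer', 'filer', 'subject, activity']
```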

edited Feb 22, 2016 at 17:39

answered Feb 22, 2016 at 16:57

vaultah


7 Comments

tblznbits Over a year ago

Thus far, this is the only solution that is working. However, it is the one I understand the least. Can you explain what compress does, along with what the role * plays in zip(*tups.values()) please?

John Y Over a year ago

@vaultah: I don't see how a dictionary or OrderedDict makes this any cleaner. You've already got compress(), why not just feed it OP's variables tuple?

John Y Over a year ago

@brittenb: If you look at the documentation for compress(), you will see that it does pretty much exactly what you're looking for. It selects elements of one sequence based on whether the corresponding elements of another sequence are true.

tblznbits Over a year ago

@JohnY Yeah, I just pulled up the documentation for it and it's pretty self explanatory. Based on that it seems the right solution to this problem is reason = [", ".join(compress(variables, x)) for x in zip(narrative, subject, activity, filer)]

John Y Over a year ago

@brittenb: Exactly right. No need in this case to mess with dictionaries or the asterisk operator. (It will be handy to know about, and the official tutorial goes over it, but you can worry about it later.)


Alexander · Accepted Answer · 2016-02-22 17:44:37Z

1

Using zip(narrative, subject, activity, filer) basically transposes the matrix (your four equal-length lists are its rows). You then enumerate each transposed row to find every position n where the flag is true, and use n to index the corresponding name in variables.

narrative = [0, 0, 0, 0]
subject = [1, 1, 0, 1]
activity = [0, 0, 0, 1]
filer = [0, 1, 1, 0]
variables = ("narrative", "subject", "activity", "filer")
# ========================================================

new_list = [[variables[n] for n, flag in enumerate(indicators) if flag]
            for indicators in zip(narrative, subject, activity, filer)]
print(new_list)
# [['subject'], ['subject', 'filer'], ['filer'], ['subject', 'activity']]

To see the transpose:

>>> [i for i in zip(narrative, subject, activity, filer)]
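
As a runnable sketch of the above: the snippet below prints the transposed rows and then joins each list of names into the comma-separated strings the question asks for. The names transposed and reason are only illustrative.

```python
narrative = [0, 0, 0, 0]
subject = [1, 1, 0, 1]
activity = [0, 0, 0, 1]
filer = [0, 1, 1, 0]
variables = ("narrative", "subject", "activity", "filer")

# Each tuple holds the four flags that belong to one position.
transposed = list(zip(narrative, subject, activity, filer))
print(transposed)
# [(0, 1, 0, 0), (0, 1, 0, 1), (0, 0, 0, 1), (0, 1, 1, 0)]

# Pick the names by index, then join them into one string per row.
reason = [", ".join(variables[n] for n, flag in enumerate(indicators) if flag)
          for indicators in transposed]
print(reason)
# ['subject', 'subject, filer', 'filer', 'subject, activity']
```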

edited Feb 22, 2016 at 17:44

answered Feb 22, 2016 at 16:56

Alexander


jsbueno · Accepted Answer · 2016-02-22 16:54:04Z

0

You can just use the filtering aspect of the comprehension syntax to get your variable's English name only if the respective flag is True:

variables = ("narrative", "subject", "activity", "filer")
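# Test the per-name flag; testing x itself would keep every name,
# because a non-empty tuple is always truthy.
[tuple(name for flag, name in zip(x, variables) if flag) for x in zip(narrative, subject, activity, filer)]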

That said, there is something fishy about your approach - you'd probably be (far) better off with an object-oriented approach there, instead of trying to manually coordinate independent sequences of variables for each of your subjects.
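
The answer does not spell out what an object-oriented version would look like, so the following is only one possible sketch, not the author's design. The class name Record and the members flagged, ny_nexus and reason are hypothetical; the idea is simply to keep the four flags of one subject together in a single object.

```python
from dataclasses import dataclass, fields


@dataclass
class Record:  # hypothetical name, not from the question
    narrative: bool = False
    subject: bool = False
    activity: bool = False
    filer: bool = False

    def flagged(self):
        """Names of the fields that are set, in declaration order."""
        return [f.name for f in fields(self) if getattr(self, f.name)]

    @property
    def ny_nexus(self):
        # True if at least one flag is set.
        return bool(self.flagged())

    @property
    def reason(self):
        return ", ".join(self.flagged())


narrative = [0, 0, 0, 0]
subject = [1, 1, 0, 1]
activity = [0, 0, 0, 1]
filer = [0, 1, 1, 0]

# Build one Record per position from the existing parallel lists.
records = [Record(*map(bool, row))
           for row in zip(narrative, subject, activity, filer)]
print([r.reason for r in records])
# ['subject', 'subject, filer', 'filer', 'subject, activity']
```

The parallel lists are only used here to build the objects; in a design like this the records would normally be created directly.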

answered Feb 22, 2016 at 16:54

jsbueno


1 Comment

tblznbits Over a year ago

This approach makes sense to me, but since x will always evaluate as Truthy, it returns all values in variables. Can you elaborate on your object-oriented approach? I'm not married to the way I'm currently doing it and am always willing to try new methods.

Sid · Accepted Answer · 2016-02-22 17:23:04Z

0

    narrative = [0, 0, 0, 0]
    subject = [1, 1, 0, 1]
    activity = [0, 0, 0, 1]
    filer = [0, 1, 1, 0]
    variables = ("narrative", "subject", "activity", "filer")
    ny_nexus = [True if sum(x) > 0 else False for x in zip(narrative, subject, activity, filer)]
    output = []
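    # The comprehension is run only for its side effect on output;
    # its own value (nested lists of None) is discarded.
    [[output.append(variables[j]) if t==1 else None for j,t in enumerate(x)] for x in zip(narrative, subject, activity, filer)]
    print(ny_nexus)
    print(output)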

Of course you could just do the following without using list comprehensions:

    narrative = [0, 0, 0, 0]
    subject = [1, 1, 0, 1]
    activity = [0, 0, 0, 1]
    filer = [0, 1, 1, 0]
    variables = ("narrative", "subject", "activity", "filer")
    ny_nexus = [True if sum(x) > 0 else False for x in zip(narrative, subject, activity, filer)]
    output = []
    for x in zip(narrative, subject, activity, filer):
        for j,t in enumerate(x):
            # Only record the name when its flag is set.
            if t == 1:
                output.append(variables[j])
    print(ny_nexus)
    print(output)

edited Feb 22, 2016 at 17:23

answered Feb 22, 2016 at 16:56

Sid


3 Comments

tblznbits Over a year ago

This approach also makes sense to me, but it is returning None for all values. Any idea why?

Sid Over a year ago

The results of ny_nexus in the above code are actually useless. The useful results are only in output. In fact, you could change the name of ny_nexus to tmp or something and disregard it. The above code would be required in addition to your following statement: ny_nexus = [True if sum(x) > 0 else False for x in zip(narrative, subject, activity, filer)]

Sid Over a year ago

Updated answer to reflect
