I have a DataFrame with several columns. For each value in my "Linked ..." columns, I want to check whether it appears in the "New Names" column of the same row. If it does, the cell should be set to "<value> - YES"; if not, to "<value> - NO". Cells containing "None" should stay unchanged.

import pandas as pd

d = {'New Names': ['a,b,c', 'a', 'c,d,e,f', 'a'],
     'Linked Letter 0': ['a', 'b', 'c', 'd'],
     'Linked Letter 1': ['c', 's', 'v', 'None'],
     'Linked Letter 2': ['None', 'None', 'd', 's']}

df_new = pd.DataFrame(data=d)

df_new


    Index   New Names   Linked Letter 0   Linked Letter 1   Linked Letter 2
    ------- ----------- ----------------- ----------------- -----------------
    0       a,b,c       a                 c                 None
    1       a           b                 s                 None
    2       c,d,e,f     c                 v                 d
    3       a           d                 None              s

The expected result is the following table:

    Index   New Names   Linked Letter 0   Linked Letter 1   Linked Letter 2
    ------- ----------- ----------------- ----------------- -----------------
    0       a,b,c       a - YES           c - YES           None
    1       a           b - NO            s - NO            None
    2       c,d,e,f     c - YES           v - NO            d - YES
    3       a           d - NO            None              s - NO

One problem with the solution provided below:

The mapping of values to YES and NO sometimes doesn't work as expected. For instance, a value that gets YES in one row can get NO in the next row, even though the value in the New Names column is the same in both rows.

Why do you think this would occur?

1 Answer

You can use pd.DataFrame.filter to filter your Linked columns, a list comprehension to construct a Boolean array, and finally loc with np.where for your conditional logic:

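import numpy as np

df = pd.DataFrame(data=d)

for col in df.filter(like='Linked'):
    # True where the linked value occurs in the same row's 'New Names' string
    bools = [link in new_names for link, new_names in zip(df[col], df['New Names'])]
    # Append the tag to every cell except the 'None' placeholders
    df.loc[df[col] != 'None', col] += pd.Series(np.where(bools, ' - YES', ' - NO'))

print(df)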

  Linked Letter 0 Linked Letter 1 Linked Letter 2 New Names
0         a - YES         c - YES            None     a,b,c
1          b - NO          s - NO            None         a
2         c - YES          v - NO         d - YES   c,d,e,f
3          d - NO            None          s - NO         a
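
One possible explanation for inconsistent tags, offered as an assumption since the failing data was never shared: `pd.Series(np.where(...))` gets a default 0..n-1 index, and `+=` aligns on index labels rather than on row position. If earlier processing (sorting, filtering, concatenating) has left `df` with a different index, tags can end up on the wrong rows. A minimal sketch, where reversing the frame is a hypothetical stand-in for such processing:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'New Names': ['a,b,c', 'a', 'c,d,e,f', 'a'],
    'Linked Letter 0': ['a', 'b', 'c', 'd'],
})

# Hypothetical preprocessing: reordering rows keeps the old labels,
# so the index is now [3, 2, 1, 0] instead of [0, 1, 2, 3].
df = df.iloc[::-1]

col = 'Linked Letter 0'
# Computed by position, in the current row order
bools = [link in names for link, names in zip(df[col], df['New Names'])]
tags = np.where(bools, ' - YES', ' - NO')

# Default RangeIndex: addition aligns on labels, so tags land on the wrong rows
misaligned = df[col] + pd.Series(tags)

# Reusing df.index keeps each tag with the row it was computed from
aligned = df[col] + pd.Series(tags, index=df.index)

print(misaligned.to_dict())
print(aligned.to_dict())
```

If this is the cause, passing `index=df.index` as above, or calling `df.reset_index(drop=True)` before the loop, should make the tags consistent.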

7 Comments

Although it works as it should, there are some cases where the Boolean array produces the wrong tag (YES, NO), and this happens even for exactly the same values. For example, in some rows an item is found in the New Names column and gets "YES", while in a different row the same item gets "NO" even though the value is in the New Names column.
@iSerd, Sorry, I don't follow. Would you like to edit your question with data which does not work with my proposed solution?
Hi @jpp, I just updated the entry with my question. Thanks for your consideration.
@iSerd, Sorry, I meant change your input, give a minimal reproducible example, i.e. input where my logic does not work. It works with the input you've provided.
Actually, if I reconstruct the data myself, your solution works. However, I need to process the data beforehand to get the version I provided in my question, and I can't share the actual data (confidential). So I believe it must be related to a data type issue or something similar (I already checked the type of the data where I get the wrong mapping for YES and NO). Can you think of any other possible problem here? It's weird that it sometimes works and sometimes doesn't.
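
Another possible cause, again an assumption rather than something confirmed in this thread: `link in new_names` is a substring test on the whole string, not a membership test on the comma-separated items, so it is sensitive to partial matches and stray whitespace. The sketch below compares it with an exact token match; `is_listed` and the sample rows are hypothetical.

```python
import pandas as pd

def is_listed(link, names):
    """Exact-match test: split on commas and strip whitespace before comparing."""
    tokens = [t.strip() for t in str(names).split(',')]
    return str(link).strip() in tokens

# Hypothetical rows where the substring test and the token test disagree
rows = pd.DataFrame({
    'New Names': ['ab,c', 'a ,b', 'a,b'],
    'Linked': ['a', 'a ', ' b'],
})

pairs = list(zip(rows['Linked'], rows['New Names']))
substring = [link in names for link, names in pairs]
exact = [is_listed(link, names) for link, names in pairs]

print(substring)  # [True, True, False]
print(exact)      # [False, True, True]
```

The first row is a false YES from a partial match ('a' inside 'ab'), and the last row is a false NO caused by a leading space, which would look like the same letter being tagged differently in different rows.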