I know how to use np.where() to add one column based on one condition:

import pandas as pd
import numpy as np
df=pd.read_csv(file,nrows=5)
df['new_col1']= np.where(df['col1'] < '100', 1,2)
df.head()

Output:

   col1  col2  new_col1
0     1     3         1
1     2     4         1

What if I want to add two columns based on the same condition?

df['new_col1'],df['new_col2']= np.where(df['col1'] < '100', (1,2),(3,4))

I want to add new_col1 and new_col2, taking their values from (1,2) and (3,4).

When I tried this code, I received:

ValueError: too many values to unpack (expected 2)

The output should be:

   col1  col2  new_col1  new_col2
0     1     3         1         3
1     2     4         1         3
  • np.where returns one value. Could you elaborate on how you want to generate two values to add instead? Commented Jun 16, 2021 at 21:14
  • Thank you for your reply. What if I want to add 2 columns based on 1 condition? What else do I need to use? Commented Jun 16, 2021 at 21:14
  • I don't understand what you mean by 'add 2 columns by 1 condition'. Could you give an example of this? Commented Jun 16, 2021 at 21:15
  • df['column1'],df['column2']= np.where(df['contract'] > '0L000099', 1,2) Commented Jun 16, 2021 at 21:16
  • Just use df['column2'] = df['column1'] after defining column1 with the np.where above? Commented Jun 16, 2021 at 21:21

1 Answer

You can use the condition multiple times:

mask = df['contract'] > '0L000099'
df['column1'] = np.where(mask, 1, 2)
df['column2'] = np.where(mask, 3, 4)

or even invert the condition:

df['column2'] = np.where(~mask, 1, 2)
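
For reference, here is a minimal self-contained sketch of the mask-reuse approach. The contract values are made up for illustration and do not come from the question's CSV. The try block shows, under the same assumption, why unpacking a single np.where result into two columns raises a ValueError with the same message as in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; these 'contract' values are made up for illustration.
df = pd.DataFrame({'contract': ['0L000001', '0L000100', '0L000050', '0L000200']})

# Evaluate the condition once and reuse it for both columns.
mask = df['contract'] > '0L000099'   # element-wise lexicographic string comparison
df['column1'] = np.where(mask, 1, 2)
df['column2'] = np.where(mask, 3, 4)

# A single np.where call returns one array with one entry per row, so
# unpacking it into two targets fails whenever there are more than two rows.
try:
    a, b = np.where(mask, 1, 2)
    unpack_error = None
except ValueError as exc:
    unpack_error = str(exc)

print(df)
print(unpack_error)
```

Storing the condition in mask keeps the two assignments in sync: if the threshold changes, it only has to be edited in one place.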

Since your question was updated, here is the updated answer, although I am not sure it is actually useful:

import numpy as np
import pandas as pd
df = pd.DataFrame({'test':range(0,10)})
mask  = df['test'] > 3
m_len = len(mask)

df['column1'], df['column2'] = np.where([mask, mask], [[1]*m_len, [3]*m_len], [[2]*m_len, [4]*m_len])

   test  column1  column2
0     0        2        4
1     1        2        4
2     2        2        4
3     3        2        4
4     4        1        3
5     5        1        3
6     6        1        3
7     7        1        3
8     8        1        3
9     9        1        3
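
If a single np.where call is preferred, the repeated lists can probably be avoided. The following is an alternative sketch, a swapped-in variant rather than the code above: it reshapes the condition into a column vector so that NumPy broadcasting pairs every row with both choices. The name values is introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'test': range(0, 10)})
mask = df['test'] > 3

# Reshape the condition to a column vector of shape (n, 1) so that NumPy
# broadcasts it against the two-element choices and returns an (n, 2) array.
values = np.where(mask.to_numpy()[:, None], [1, 3], [2, 4])

# Transposing gives shape (2, n), which unpacks into one row per new column.
df['column1'], df['column2'] = values.T

print(df)
```

With the same test data this should give the same two columns as the table above, without building lists of length m_len by hand.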

4 Comments

Thank you for your answer. Can you please explain what this part means: [[1]*m_len, [3]*m_len], [[2]*m_len, [4]*m_len]
@William, of course. np.where needs the values to line up with the shape of the condition, in this case one 1 for each boolean in the condition. [1]*m_len is plain Python list repetition: it repeats the value once per element of the condition series, so each list is as long as the mask. NumPy is efficient because of vectorization, and the rules it uses to match shapes are called broadcasting, if you want to look it up. Hope that makes sense.
Hi friend, can you help me with this question? stackoverflow.com/questions/68476193/…
@William, hey William, see my answer under your question. Let me know how it went. Happy coding!
