New column in pandas df based on array

Question

Say I have a df like the one below.

    num value
0   1   229
1   2   203
2   3   244
3   4   243
4   5   230

And an array: array([ 2, 4]).

I would like to create a new binary column that is 1 when num equals any value in the array and 0 otherwise.

    num value binary
0   1   229   0
1   2   203   1
2   3   244   0
3   4   243   1
4   5   230   0

I wanted to use df["binary"] = np.where(df["num"] == dtemp.num.unique(), 1, 0), where dtemp.num.unique() is the aforementioned array. But since df and the array have different lengths, I get the "Lengths must match to compare" error.

G. Anderson · Accepted Answer · 2021-02-26 00:17:24Z

2

No need to iterate; you can use isin() to check membership in your array:

import numpy as np

arr = np.array([2, 4])

# isin() returns a boolean Series; astype(int) converts True/False to 1/0
df['bin'] = df['num'].isin(arr).astype(int)

    num value   bin
0   1   229     0
1   2   203     1
2   3   244     0
3   4   243     1
4   5   230     0
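For reference, here is a self-contained sketch of the isin() approach that also reproduces the error from the question. The DataFrame is rebuilt from the sample data shown in the question; the names comparison_failed and binary_where are illustrative only and do not appear in the original posts.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"num": [1, 2, 3, 4, 5],
                   "value": [229, 203, 244, 243, 230]})
arr = np.array([2, 4])

# Element-wise == expects operands of equal length, so comparing a
# 5-row Series with a 2-element array raises ValueError.
try:
    df["num"] == arr
    comparison_failed = False
except ValueError:
    comparison_failed = True

# isin() tests each row for membership in arr instead.
df["binary"] = df["num"].isin(arr).astype(int)

# Equivalent spelling that keeps the np.where form from the question.
binary_where = np.where(df["num"].isin(arr), 1, 0)
```

Either form should give the same column; isin() followed by astype(int) is simply the shorter of the two.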

maziyank · Accepted Answer · 2021-02-26 00:13:10Z

1

You can do this with Series.apply() and a lambda:

df['binary'] = df['num'].apply(lambda x: 1 if x in [2,4] else 0)
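The line above hard-codes [2, 4]. A minimal sketch of the same idea using the array from the question might look like the following; the sample DataFrame is rebuilt from the question, and the name allowed is a hypothetical helper, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"num": [1, 2, 3, 4, 5],
                   "value": [229, 203, 244, 243, 230]})
arr = np.array([2, 4])

# Convert the array to a set of plain Python ints once, so each
# per-row membership test is a constant-time set lookup.
allowed = set(arr.tolist())

df["binary"] = df["num"].apply(lambda x: 1 if x in allowed else 0)
```

Note that apply() calls the lambda once per row in Python, so on large frames it will generally be slower than a vectorized isin().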

1 Comment

bajun65537 Over a year ago

I was trying map but it did not work. But apply worked like a charm!

BigHeadEd · Accepted Answer · 2021-02-26 00:10:32Z

1

You should be able to use itertuples() here.

binary = []
for row in df.itertuples():
    if row[1] in array:
        binary.append(1)
    else:
        binary.append(0)

df['binary'] = binary
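As a variation, the loop can read the column by name rather than by position, which avoids depending on row[1] being the num column. This is only a sketch under the assumption that the column is called num, as in the question's sample data; allowed and matches_isin are illustrative names introduced here.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"num": [1, 2, 3, 4, 5],
                   "value": [229, 203, 244, 243, 230]})
arr = np.array([2, 4])
allowed = set(arr.tolist())

# itertuples(index=False) yields namedtuples whose fields are the
# column names, so row.num is the value of the num column.
binary = [1 if row.num in allowed else 0
          for row in df.itertuples(index=False)]
df["binary"] = binary

# Sanity check against the vectorized isin() result.
matches_isin = (df["binary"] == df["num"].isin(arr).astype(int)).all()
```

Comparing against isin() this way is one quick check that the loop and the vectorized version agree on a given dataset.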

2 Comments

bajun65537 Over a year ago

This solution does not give the correct answer, though. I haven't tried it on my minimal working example, but on the original problem it comes up two ones short.

BigHeadEd Over a year ago

That's pretty curious. The only reason I can readily think of is that something is odd with the data types, but I doubt that's the case. Thanks for letting me know.
