Say I have a df like the one below.

    num value
0   1   229
1   2   203
2   3   244
3   4   243
4   5   230

And an array: array([ 2, 4]).

I would like to create a new binary column that is 1 when num equals any of the values in the array and 0 otherwise.

    num value binary
0   1   229   0
1   2   203   1
2   3   244   0
3   4   243   1
4   5   230   0

I wanted to use df["binary"] = np.where(df["num"] == dtemp.num.unique(), 1, 0), where dtemp.num.unique() is the aforementioned array. But since df and the array have different lengths, I get the "Lengths must match to compare" error.

3 Answers

No need to iterate; you can use isin() to check membership in your array:

import numpy as np

arr = np.array([2, 4])

# isin() returns a boolean mask; astype(int) converts it to 0/1
df['bin'] = df['num'].isin(arr).astype(int)

    num value   bin
0   1   229     0
1   2   203     1
2   3   244     0
3   4   243     1
4   5   230     0
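
For anyone who wants to keep the np.where form from the question, here is a minimal sketch under the assumption that the frame and array look like the sample data above (arr stands in for dtemp.num.unique()). The == comparison fails because it tries to line the two sides up element by element, whereas isin() returns a mask with one entry per row of df:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"num": [1, 2, 3, 4, 5], "value": [229, 203, 244, 243, 230]})
arr = np.array([2, 4])  # stands in for dtemp.num.unique()

# The mask has the same length as df, so np.where has nothing to mismatch
df["binary"] = np.where(df["num"].isin(arr), 1, 0)
print(df)
```

The result should match the astype(int) version; which one to use is mostly a matter of taste.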

You can do this with Series.apply() and a lambda:

df['binary'] = df['num'].apply(lambda x: 1 if x in [2,4] else 0)

1 Comment

I was trying map but it did not work. But apply worked like a charm!

You should be able to use itertuples() here.

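# array is the NumPy array from the question, e.g. np.array([2, 4])
binary = []
for row in df.itertuples():
    # row[1] is 'num' only when 'num' is the first column, so access it by name
    if row.num in array:
        binary.append(1)
    else:
        binary.append(0)

df['binary'] = binary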

2 Comments

This solution does not give the correct answer though. I haven't tried it on my minimal working example, but on the original problem it comes up two ones short.
That's pretty curious. The only reason I can readily think of is that something is off with the data types, but I doubt that's the case. Thanks for letting me know.
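
To illustrate the data type guess in the comment above, here is a small sketch. It is not a diagnosis of the original problem, which isn't shown; df_str is a hypothetical variant of the sample frame in which num was read in as strings. A quick check of df['num'].dtype against the array's dtype would reveal this kind of mismatch:

```python
import numpy as np
import pandas as pd

arr = np.array([2, 4])

# Hypothetical: same sample data, but "num" holds strings instead of integers
df_str = pd.DataFrame({"num": ["1", "2", "3", "4", "5"],
                       "value": [229, 203, 244, 243, 230]})

# The string "2" is not equal to the integer 2, so nothing matches
mismatched = df_str["num"].isin(arr).astype(int)

# Casting the column to int first restores the expected matches
fixed = df_str["num"].astype(int).isin(arr).astype(int)

print(df_str["num"].dtype, arr.dtype)
print(mismatched.tolist(), fixed.tolist())
```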
