Gurus, we are looking for a Pythonic way (Python 2.7) to convert the categorical values in a column into binary values, keeping them in a single column. For example, the "Loan_Status" column contains:

 Loan_Status
 Charged Off
 Default
 Fully Paid
 Current
 Does not meet the credit policy. Status:1
 Does not meet the credit policy. Status:0

We are trying to map "Charged Off" and "Default" to 0, map "Fully Paid" and "Current" to 1, and delete any row that contains "Does not meet the credit policy. Status:1" or "Does not meet the credit policy. Status:0".

Desired Output:

 Loan_Status
 0
 0
 1
 1

Is there a Pythonic way to do this? pandas get_dummies generates one column per category, so it doesn't seem to fit. Thanks!

1 Answer

Let's define a list of positive and negative class labels.

positive = ['Fully Paid', 'Current']
negative = ['Charged Off', 'Default']

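First, filter out the rows that are invalid for your model. We can use isin to keep only the rows whose value appears in either list. The .copy() makes the result an independent dataframe, so the assignment in the next step does not raise a SettingWithCopyWarning.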

filtered_df = df[df['Loan_Status'].isin(positive + negative)].copy()
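
As a minimal sketch of what this filter does, the snippet below builds a small sample dataframe (hypothetical data, assumed here to mirror the six statuses listed in the question) and shows the boolean mask that isin produces before it is used for indexing.

```python
import pandas as pd

# Hypothetical sample data mirroring the statuses listed in the question.
df = pd.DataFrame({
    "Loan_Status": [
        "Charged Off",
        "Default",
        "Fully Paid",
        "Current",
        "Does not meet the credit policy. Status:1",
        "Does not meet the credit policy. Status:0",
    ]
})

positive = ["Fully Paid", "Current"]
negative = ["Charged Off", "Default"]

# isin returns a boolean Series: True where the value is in the given list.
mask = df["Loan_Status"].isin(positive + negative)

# Boolean indexing keeps only the rows where the mask is True.
# .copy() yields an independent dataframe that is safe to modify later.
filtered_df = df[mask].copy()

print(mask.tolist())
print(filtered_df["Loan_Status"].tolist())
```

The two "Does not meet the credit policy" rows are in neither list, so the mask is False for them and they are dropped.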

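Second, overwrite the column with the binary label. isin(positive) gives True for positive labels and False for everything else, which at this point is only the negative labels. If it needs to be 0 or 1, we can cast the boolean result to int.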

filtered_df['Loan_Status'] = filtered_df['Loan_Status'].isin(positive).astype(int)
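
Putting both steps together, here is a minimal end-to-end sketch. The helper name binarize_loan_status and the sample dataframe are assumptions made for illustration; they are not part of the question or of pandas itself.

```python
import pandas as pd


def binarize_loan_status(df, positive, negative, column="Loan_Status"):
    """Drop rows outside both label lists, then encode positive as 1 and negative as 0.

    Hypothetical helper: it only wraps the two steps described above.
    """
    # Step 1: keep only rows whose label is in either list.
    kept = df[df[column].isin(positive + negative)].copy()
    # Step 2: True/False for membership in the positive list, cast to 1/0.
    kept[column] = kept[column].isin(positive).astype(int)
    return kept


# Hypothetical sample data mirroring the statuses listed in the question.
sample = pd.DataFrame({
    "Loan_Status": [
        "Charged Off",
        "Default",
        "Fully Paid",
        "Current",
        "Does not meet the credit policy. Status:1",
        "Does not meet the credit policy. Status:0",
    ]
})

result = binarize_loan_status(
    sample,
    positive=["Fully Paid", "Current"],
    negative=["Charged Off", "Default"],
)

print(result["Loan_Status"].tolist())  # [0, 0, 1, 1]
```

The result keeps the original row index of the surviving rows. If a fresh 0..n-1 index is preferred, calling result.reset_index(drop=True) afterwards should do it.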

2 Comments

great answer! Thank you!
Hmm, I've run into a new question: if I have filtered multiple columns like this, is there an easy way to combine them into one big, clean table?
