I keep getting the warning "A value is trying to be set on a copy of a slice from a DataFrame". How can I fix it? Is there an alternative approach?

# check for NaN
# capitalise first letter
# assign 'Male' for 'm'
# assign 'Female' for 'f'

myDataFrame.to_csv('new_H.csv')
genderList = myDataFrame.loc[:, "Gender"]  # extract Gender column

for i in range(0, len(genderList)):
    if type(genderList[i]) == float:  # check for empty cells (NaN is a float)
        genderList[i] = 'NAN'
    elif genderList[i].startswith('f'):
        genderList[i] = 'Female'
    elif genderList[i].startswith('m'):
        genderList[i] = 'Male'

1 Answer
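for row in myDataFrame.itertuples():
    value = row.Gender
    # Rows from itertuples are read-only namedtuples, so write back
    # through .loc on the DataFrame itself, using the row's index label.
    if type(value) == float:  # NaN (missing value) is a float
        myDataFrame.loc[row.Index, "Gender"] = 'NAN'
    elif value.startswith('f'):
        myDataFrame.loc[row.Index, "Gender"] = 'Female'
    elif value.startswith('m'):
        myDataFrame.loc[row.Index, "Gender"] = 'Male'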

The line genderList = myDataFrame.loc[:,"Gender"] causes the warning because genderList is a slice of your DataFrame. A slice may be a copy, so values assigned to it may not be applied to the original DataFrame. The code above iterates with the itertuples method, which is a more idiomatic way to go through rows in pandas. To act on a specific row you do not need to create a slice first; you just update the value of the column for that row in the original DataFrame.
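If a separate Series is still wanted for the loop, one way to make the intent explicit is to copy the column and assign it back at the end. This is a minimal sketch, assuming a small made-up frame in place of the real myDataFrame; pd.isna is used here instead of the float type check.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real myDataFrame comes from the question.
myDataFrame = pd.DataFrame({"Gender": ["f", "m", np.nan, "female"]})

# .copy() makes genderList an independent Series, so writing to it is unambiguous.
genderList = myDataFrame["Gender"].copy()

for i in range(len(genderList)):
    value = genderList.iloc[i]
    if pd.isna(value):
        genderList.iloc[i] = "NAN"
    elif value.startswith("f"):
        genderList.iloc[i] = "Female"
    elif value.startswith("m"):
        genderList.iloc[i] = "Male"

# Write the cleaned column back to the original DataFrame in one step.
myDataFrame["Gender"] = genderList
```

The loop is kept close to the one in the question; only the explicit copy and the final assignment are new.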

From what I see, your goal is to replace the values in Gender based on their previous values. In that case I recommend looking at pandas's replace method, which is made for exactly this purpose, combined with a filter. But since your filter is quite simple, you can do the following:

myDataFrame.loc[myDataFrame["Gender"].str.contains('^f', na=False), "Gender"] = "Female"

This updates every row whose Gender starts with 'f'. The DataFrame is indexed with a boolean mask, and the condition is myDataFrame["Gender"].str.contains('^f').
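The same idea can be extended to all three cases from the question without a loop. The following is a sketch under assumptions: the sample frame is made up, and lower-casing before the prefix check is an addition so that values such as 'Female' are also matched.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for myDataFrame.
myDataFrame = pd.DataFrame({"Gender": ["f", "male", np.nan, "Female", "m"]})

# Lower-case first so the prefix check ignores case; NaN stays NaN.
lowered = myDataFrame["Gender"].str.lower()

# Build all masks before modifying anything.
# na=False makes missing values count as "no match".
is_female = lowered.str.startswith("f", na=False)
is_male = lowered.str.startswith("m", na=False)
is_missing = myDataFrame["Gender"].isna()

# Assign through .loc with a row mask and a column label,
# so only the Gender column of the matching rows is changed.
myDataFrame.loc[is_female, "Gender"] = "Female"
myDataFrame.loc[is_male, "Gender"] = "Male"
myDataFrame.loc[is_missing, "Gender"] = "NAN"
```

Because every assignment goes through myDataFrame.loc directly, there is no intermediate slice for pandas to warn about.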


7 Comments

I used myDataFrame['Gender'] = myDataFrame['Gender'].replace('f', 'Female') and it worked. However, I can't find a way to replace white space with NaN. This does not work: myDataFrame['Gender'].apply(lambda element: element.replace('', np.NaN))
@bibscy Are you trying to replace white space with NaN? In your code you checked whether the value is a float, and right now your replace has no characters in it (replace('', np.NaN)).
Yes, I want to replace white space with NaN. I also tried this, and it still does not work: myDataFrame['Gender'] = myDataFrame['Gender'].replace('', np.nan)
It works for me. Notice that in your comment you wrote '' and not ' ' (the white space is missing). Maybe that is the problem? Also, are you sure it's white space?
Just to clarify, I want to replace the empty string, which in theory has a length of zero. This is not working: myDataFrame['Gender'] = myDataFrame['Gender'].replace('', 'NaN'). How do I fix this?
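The thread does not establish why the replace call failed. One possibility, not confirmed here, is that the cells hold whitespace rather than a zero-length string. A regular expression covers both cases. A minimal sketch, assuming a made-up frame:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one empty string and one whitespace-only string.
myDataFrame = pd.DataFrame({"Gender": ["f", "", "   ", "m"]})

# ^\s*$ matches a string that is empty or contains only whitespace.
# regex=True tells Series.replace to treat the pattern as a regular expression.
myDataFrame["Gender"] = myDataFrame["Gender"].replace(r"^\s*$", np.nan, regex=True)
```

Note that np.nan is the missing-value marker, while 'NaN' in quotes is an ordinary string and is not detected by isna().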