The DataFrame stu_alcol looks like this:

school  sex age address famsize Pstatus Medu    Fedu    Mjob    Fjob    reason  guardian
0   GP  F   18  U   GT3 A   4   4   at_home teacher course  mother
1   GP  F   17  U   GT3 T   1   1   at_home other   course  father
2   GP  F   15  U   LE3 T   1   1   at_home other   other   mother
3   GP  F   15  U   GT3 T   4   2   health  services    home    mother
4   GP  F   16  U   GT3 T   3   3   other   other   home    father

The goal is to multiply all integer values by 10 (I am just playing with the data).

However, this code throws an 'invalid syntax' error:

stu_alcol.transform(lambda x: x*10 if isinstance(x, int))

Can anyone help? Please understand that I am aware of other possible solutions. I just want to understand what is wrong here.

3 Answers

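You can convert the entire df to numeric and let errors='coerce' turn the non-numeric values into NaN. Multiply the result by 10 and use it to update the original df. Because update() skips NaN values, the non-numeric cells keep their original contents.

This should allow you to handle mixed-type columns properly as well.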

df.update(df.apply(pd.to_numeric, errors='coerce').mul(10))
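To make the one-liner above easier to follow, here is a minimal sketch that runs it step by step. The small frame is a stand-in built from a few values in the question, and the name scaled is only illustrative.

```python
import pandas as pd

# Small stand-in for the question's data (values taken from its first rows).
df = pd.DataFrame({
    "school": ["GP", "GP", "GP"],
    "age": [18, 17, 15],
    "Medu": [4, 1, 1],
    "Mjob": ["at_home", "at_home", "at_home"],
})

# Non-numeric cells become NaN; numeric cells are multiplied by 10.
scaled = df.apply(pd.to_numeric, errors="coerce").mul(10)

# update() modifies df in place and ignores NaN cells in `scaled`,
# so the text columns keep their original values.
df.update(scaled)
print(df)
```

Note that update() works in place and returns None, so its result should not be assigned back to df.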

You can select the columns by name and multiply them by a value.

stu_alcol[['age', 'Medu', 'Fedu']] *= 10
# stu_alcol[['age', 'Medu', 'Fedu']] = stu_alcol[['age', 'Medu', 'Fedu']]*10
# stu_alcol[['age', 'Medu', 'Fedu']] = stu_alcol[['age', 'Medu', 'Fedu']].multiply(10)

All three examples give the same result, just written in different notations.
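When there are many columns, the list of names does not have to be typed by hand. As a sketch, assuming every numeric column should be scaled, select_dtypes can build the list; include="number" also matches float columns, which the question's sample does not contain. The frame below is a small stand-in and numeric_cols is an illustrative name.

```python
import pandas as pd

# Small stand-in for the question's data.
stu_alcol = pd.DataFrame({
    "school": ["GP", "GP", "GP"],
    "age": [18, 17, 15],
    "Medu": [4, 1, 1],
    "Mjob": ["at_home", "at_home", "at_home"],
})

# Collect the names of all numeric columns instead of listing them manually.
numeric_cols = stu_alcol.select_dtypes(include="number").columns

# Multiply only those columns; the text columns are left untouched.
stu_alcol[numeric_cols] = stu_alcol[numeric_cols] * 10
print(stu_alcol)
```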

You can use apply() with a function that walks through each column element by element, like below:

stu_alcol = stu_alcol.apply(lambda x: [xx*10 if isinstance(xx,int) else xx for xx in x])

However, this is not easy to read and can be slow on large DataFrames, because it loops over every value in Python.

1 Comment

Thank you! The apply( ) function worked and I found it more flexible. Individually selecting columns will also work. But when there are many columns, it will get strenuous.

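The reason this isn't working is that the body of a lambda function must be a single expression, and a conditional expression needs both branches: value_if_true if condition else value_if_false. Your if has no else, so the expression is incomplete, hence the 'invalid syntax' error.

Adding else x fixes the syntax, but transform passes each column to the function as a Series rather than passing individual values, so isinstance(x, int) would never be true. The check has to be made per element, for example in a function of its own (also note that the type that you probably want to be checking for is numpy.int64, not int).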
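To see exactly which part the parser rejects, the two variants can be compiled as strings. This is only a demonstration using the built-in compile() and eval(); the names broken, fixed and times_ten are illustrative.

```python
# A conditional expression has the form: A if condition else B.
broken = "lambda x: x*10 if isinstance(x, int)"        # no else branch
fixed = "lambda x: x*10 if isinstance(x, int) else x"  # complete expression

try:
    compile(broken, "<demo>", "eval")
    broken_compiles = True
except SyntaxError:
    # The parser rejects the expression because the else branch is missing.
    broken_compiles = False

# The complete version compiles and behaves as expected on single values.
times_ten = eval(compile(fixed, "<demo>", "eval"))

print(broken_compiles)   # False
print(times_ten(4))      # 40
print(times_ten("GP"))   # GP
```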

As an example, the following will work (although the mult_ints_by_10 function is just some example code to make the point, and certainly isn't optimised!):

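import numpy

def mult_ints_by_10(data_series):
    # Work on a copy so the original column is left untouched.
    return_series = data_series.copy()
    for loop in range(len(data_series)):
        # .iloc is positional, so this also works when the index is not 0..n-1.
        element = data_series.iloc[loop]
        return_series.iloc[loop] = element * 10 if isinstance(element, numpy.int64) else element
    return return_series

# transform returns a new DataFrame; stu_alcol itself is not modified.
stu_alcol.transform(mult_ints_by_10)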
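For comparison, the same per-element check can be written without the explicit loop by using Series.map inside transform. This is a sketch rather than the answer's own method: times_ten_if_int is a hypothetical helper, the frame is a small stand-in, and numbers.Integral is used so that both Python int and NumPy integer values match.

```python
import numbers

import pandas as pd

# Small stand-in for the question's data.
stu_alcol = pd.DataFrame({
    "school": ["GP", "GP"],
    "age": [18, 17],
    "Medu": [4, 1],
})

def times_ten_if_int(value):
    # numbers.Integral covers Python int and NumPy integer scalars
    # (bool values would match as well).
    return value * 10 if isinstance(value, numbers.Integral) else value

# transform hands over one column at a time; map applies the check per value.
result = stu_alcol.transform(lambda column: column.map(times_ten_if_int))
print(result)
```

As with the loop version, the result is a new DataFrame, so it needs to be assigned to a name to be kept.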

1 Comment

Thank you! I now understand why that piece of code did not work.
