0

I have a excel file payments with approximately 50,000 rows with the next structure :

enter image description here

I import the data from excel to python through this code:

test= pd.read_excel(r'D:\\Users\\Desktop\\test_stack.xlsx')

However, when I want execute the custom functions indicated below, the following error is generated :

TypeError: unsupported operand type(s) for /: 'str' and 'str'

Apparently, the column E is read as string (str), so these functions cannot be executed. It should be noted that the functions are iterative, so they go through each element enclosed in {} and separated by commas for the column payments, to execute the operations, and later create the corresponding columns.

Likewise, when I execute the functions creating the dataframe manually, the functions execute without problems, but I need to execute them in the format of the mentioned excel file.

test= pd.DataFrame({'id':['F8510004123','A3100002543','Z3510002123'],
                     'product':['retail','retail','others'],
                     'type':['E','E','D'],
                     'quantity':[25,34,150],
                     'nro_ope':[2,3,26],
                     'payments':[[1030.97,1030.97,584.91],[1610.74,1610.74,1610.74,1610.74,1611.14],[1007.52,1007.52,1007.52,1007.52,500,500,852.95]]
                     })

;

def var_payments (x) :
    variation  = [round(abs(a/b -1),3) for a,b in permutations(x,2)]
    return variation 

;

def count_var_pay (x) :
    count = 0
    for element in x:
        if element >= 0.05 :
            count += 1
        else:
             0                    
    return count

;

def flag_var_payments (x) : 
    if x >= 2 :
        return 'Yes'
    else : 
        return 'No'

;

test['var_payments'] = test.payments.apply(lambda x:var_payments(x))

test['count_p'] = test.var_payments.apply(lambda x:count_var_pay(x))

test['flag'] = test.count_p.apply(lambda x:flag_var_payments(x))

How could I do it? Change the format of the column payments from the excel file? Transform column data?

I am attentive to your comments.

Thanks for support.

2 Answers 2

1

Convert {} into [] then use pd.eval to transform as a list:

df['payments'] = pd.eval(df['payments'].replace({r'{': '[', r'}': ']'}, regex=True))

Output:

>>> df
            id product type  quantity  nro_ope                                           payments
0  F8510004123  retail    E        25        2                         [1030.97, 1030.97, 584.91]
1  A3100002543  retail    E        34        3      [1610.74, 1610.74, 1610.74, 1610.74, 1611.14]
2  Z3510002123  others    D       150       26  [1007.52, 1007.52, 1007.52, 1007.52, 500, 500,...

>>> df.iloc[0, 5]
[1030.97, 1030.97, 584.91]

>>> type(df.iloc[0, 5])
list
Sign up to request clarification or add additional context in comments.

2 Comments

It does not allow me to execute pd.eval , it generates the following error: ValueError: unknown type object, the column payments is the type object in the python dataframe.
@Jean. Can you try test['payments'].apply(lambda x: [float(i) for i in x[1:-1].split(',')])?
0

I don't know how do you handle the "{}" when reading the excel file but if you changed payment in your test DataFrame, you can encounter the same error:

'payments':["{1030.97,1030.97,584.91}","{1610.74,1610.74,1610.74,1610.74,1611.14}","{1007.52,1007.52,1007.52,1007.52,500,500,852.95}"]

TypeError: unsupported operand type(s) for /: 'str' and 'str'

Hope that can help something.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.