I have a pandas DataFrame with two float columns, col_x and col_y.
I want to return the sum of col_x * col_y divided by the sum of col_x.
Can this be done with a custom aggregate function?
I am trying to do something like this:
import pandas as pd
def aggregation_function(x, y):
    return sum(x * y) / sum(x)
df = pd.DataFrame([(0.1, 0.2), (0.3, 0.4), (0.5, 0.6)], columns=["col_x", "col_y"])
result = df.agg(aggregation_function, axis="columns", args=("col_x", "col_y"))
I know the aggregation function probably doesn't make sense as written, but I can't get far enough to try anything else because I get this error:
TypeError: apply() got multiple values for keyword argument 'args'
I don't know how else to specify the args for my aggregation function. I've tried using kwargs too, but nothing I try works. The docs have no example of this, although they seem to say it is possible.
How can you specify the args for the aggregation function?
The desired output of the aggregation is a single value.
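To make the target concrete, here is a minimal sketch of the value I expect, computed directly on the columns rather than through agg. The name `expected` is just for illustration, and this is only meant to show the number I am after, not the agg-based approach I am asking about:

```python
import pandas as pd

df = pd.DataFrame([(0.1, 0.2), (0.3, 0.4), (0.5, 0.6)], columns=["col_x", "col_y"])

# Sum of the row-wise products col_x * col_y, divided by the sum of col_x
# (i.e. the average of col_y weighted by col_x).
expected = (df["col_x"] * df["col_y"]).sum() / df["col_x"].sum()

print(expected)  # approximately 0.4889 (0.44 / 0.9)
```

So for the sample data above, the aggregation should return roughly 0.4889.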