I have a DataFrame with the columns City, Wind direction, and Temperature. Each City occurs only once and has a single data point for Wind direction and Temperature. For instance: 0 New York 252.0 22.0

How can I create my own method and use it on a DataFrame? For example, I would like to create a method "aa" that returns the Temperature in a City minus the mean Temperature of the entire "Temperature" column, and use it while aggregating my DataFrame. I have written "aa" as shown below and use it in the aggregation, but it returns 0 everywhere. Could you show me the appropriate code? Did I make a mistake in def aa(x)?

def aa(x):
    return x - np.mean(x)

file.groupby(["City"]).agg({"Wind direction":[np.mean, aa], "Temperature":["mean", aa]})

Sample data: (Taken from comments provided by OP)

file = pd.DataFrame({"City":["New York", "Berlin", "London"], "Wind direction":[225.0, 252.0, 310.0], "Temperature":[21.0, 18.5, 22.0]})
  • No, I would like to create my own method "aa" and use it in the .agg() function. I need to do it this way and I know that it is possible, but I do not know where the mistake in my code is. Commented Oct 21, 2019 at 9:10
  • Add some sample data. Commented Oct 21, 2019 at 9:13
  • I used x - np.mean(x) in my function because I would like to subtract the average temperature across all cities from the temperature in a given city, and then use .agg() to create a table like this: City | Temperature in City | Temperature in City - average temperature, and likewise for Wind direction. Commented Oct 21, 2019 at 9:44
  • For that, you need to replace np.mean(x) with the mean of the whole column, np.mean(file[x.name]). Commented Oct 21, 2019 at 9:47

1 Answer

You are getting zeros because aa receives the values of one group at a time, not the full column, and the mean of a single-element group is that element itself, so x - np.mean(x) is always 0.
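To see this concretely, here is a minimal sketch that applies the original aa to each group by hand, using the sample data from the question. The dictionaries group_sizes and group_results are names introduced only for this illustration.

```python
import numpy as np
import pandas as pd

file = pd.DataFrame({
    "City": ["New York", "Berlin", "London"],
    "Wind direction": [225.0, 252.0, 310.0],
    "Temperature": [21.0, 18.5, 22.0],
})

def aa(x):
    # Original version from the question: the mean is taken over x only.
    return x - np.mean(x)

# Iterate over the groups to inspect what aa is actually given.
group_sizes = {}
group_results = {}
for city, group in file.groupby("City")["Temperature"]:
    group_sizes[city] = len(group)            # one row per city
    group_results[city] = aa(group).iloc[0]   # value minus itself
    print(city, group.tolist(), "->", group_results[city])
```

Every group holds exactly one value, so subtracting the group mean leaves nothing.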

Now, it's a bit weird to use groupby when you know that each group has only a single element, but you can force it through using something like

def aa(x):
    return x - file[x.name].mean()

With your given example:

In [23]: file.groupby(["City"]).agg({"Wind direction":[np.mean, aa], "Temperature":["mean", aa]})
Out[23]:
         Wind direction            Temperature
                   mean         aa        mean   aa
City
Berlin            252.0 -10.333333        18.5 -2.0
London            310.0  47.666667        22.0  1.5
New York          225.0 -37.333333        21.0  0.5

Much more straightforward would be to simply operate on the data frame directly, e.g.

In [26]: file['Wind direction aa'] = file['Wind direction'] - file['Wind direction'].mean()

In [27]: file['Temperature aa'] = file['Temperature'] - file['Temperature'].mean()

In [28]: file
Out[28]:
       City  Wind direction  Temperature  Wind direction aa  Temperature aa
0  New York           225.0         21.0         -37.333333             0.5
1    Berlin           252.0         18.5         -10.333333            -2.0
2    London           310.0         22.0          47.666667             1.5
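If you want the result indexed by City, as in the groupby output, the same subtraction can be applied to both columns at once. This is only a sketch under the assumption that each City appears exactly once; the names cols, indexed, deviation and result are made up for the example.

```python
import pandas as pd

file = pd.DataFrame({
    "City": ["New York", "Berlin", "London"],
    "Wind direction": [225.0, 252.0, 310.0],
    "Temperature": [21.0, 18.5, 22.0],
})

cols = ["Wind direction", "Temperature"]
indexed = file.set_index("City")[cols]

# indexed.mean() is a Series of column means; subtracting it from the
# DataFrame aligns on column labels, so each column loses its own mean.
deviation = indexed - indexed.mean()

# Put the original values and the deviations side by side.
result = indexed.join(deviation.add_suffix(" aa"))
print(result)
```

This avoids both groupby and the reference to the global file inside aa.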