4

I was using a function that calculates wind speed from the longitudinal and latitudinal wind components:

def wind_speed(u, v):
    return np.sqrt(u ** 2 + v ** 2)

and calling it to calculate a new pandas column from two existing ones:

df['wspeed'] = map(wind_speed, df['lonwind'], df['latwind'])

Since I switched from Python 2.7 to Python 3.5, the function no longer works. Could the change be the cause?

For a single-argument (single-column) function:

def celsius(T):
    return round(T - 273, 1)

I am now using:

df['temp'] = df['t2m'].map(celsius)

And it works fine.

Could you help me?

1
  • But was the function map changed? Commented Jun 25, 2016 at 15:07

2 Answers

3

If you want to keep using map, wrap it in list(), because in Python 3 map returns an iterator instead of a list:

import numpy as np
import pandas as pd

df = pd.DataFrame({'lonwind':[1,2,3],
                   'latwind':[4,5,6]})

print(df)
   latwind  lonwind
0        4        1
1        5        2
2        6        3

def wind_speed(u, v):
    return np.sqrt(u ** 2 + v ** 2)

df['wspeed'] = list(map(wind_speed, df['lonwind'], df['latwind']))

print(df)
   latwind  lonwind    wspeed
0        4        1  4.123106
1        5        2  5.385165
2        6        3  6.708204

Without list, every row ends up holding the map object itself:

df['wspeed'] = map(wind_speed, df['lonwind'], df['latwind'])
print(df)
   latwind  lonwind                              wspeed
0        4        1  <map object at 0x000000000AC42DA0>
1        5        2  <map object at 0x000000000AC42DA0>
2        6        3  <map object at 0x000000000AC42DA0>

The Python 3 documentation describes map as follows:

map(function, iterable, ...)

Return an iterator that applies function to every item of iterable, yielding the results. If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel. With multiple iterables, the iterator stops when the shortest iterable is exhausted. For cases where the function inputs are already arranged into argument tuples, see itertools.starmap().
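The behaviour described in that excerpt can be seen without pandas. The following is a minimal sketch using plain lists and math.sqrt instead of np.sqrt; the variable names (lazy, speeds, leftover, truncated) are made up for illustration:

```python
import math

def wind_speed(u, v):
    # Same formula as in the question, applied to plain scalars.
    return math.sqrt(u ** 2 + v ** 2)

lonwind = [1, 2, 3]
latwind = [4, 5, 6]

# In Python 3, map returns a lazy iterator, not a list.
lazy = map(wind_speed, lonwind, latwind)
is_list = isinstance(lazy, list)

# Consuming the iterator produces the values...
speeds = list(lazy)

# ...and a second pass finds it already exhausted.
leftover = list(lazy)

# With iterables of different lengths, map stops at the shortest one.
truncated = list(map(wind_speed, [1, 2, 3], [4, 5]))
```

Here is_list is False, speeds holds three values, leftover is empty, and truncated holds two values. That is why the iterator has to be materialized with list() before it is assigned to a column.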

Another solution is to operate on the whole columns directly, without map:

df['wspeed'] = (df['lonwind'] ** 2 + df['latwind'] ** 2) ** 0.5
print(df)
   latwind  lonwind    wspeed
0        4        1  4.123106
1        5        2  5.385165
2        6        3  6.708204

1

I would try to stick to existing NumPy/SciPy functions, as they are fast and well optimized (here, numpy.hypot):

df['wspeed'] = np.hypot(df.latwind, df.lonwind)

Timing against a 300K-row DataFrame:

In [47]: df = pd.concat([df] * 10**5, ignore_index=True)

In [48]: df.shape
Out[48]: (300000, 2)

In [49]: %paste
def wind_speed(u, v):
    return np.sqrt(u ** 2 + v ** 2)

## -- End pasted text --

In [50]: %timeit list(map(wind_speed, df['lonwind'], df['latwind']))
1 loop, best of 3: 922 ms per loop

In [51]: %timeit np.hypot(df.latwind, df.lonwind)
100 loops, best of 3: 4.08 ms per loop

Conclusion: the vectorized approach was about 226 times faster (922 ms vs. 4.08 ms).

If you have to write your own function, try to use vectorized math (working with whole vectors / columns instead of scalars):

def wind_speed(u, v):
    # vectorized approach: math on whole columns instead of scalars
    return np.sqrt(u * u + v * v)

df['wspeed'] = wind_speed(df['lonwind'] , df['latwind'])

Demo:

In [39]: df['wspeed'] = wind_speed(df['lonwind'] , df['latwind'])

In [40]: df
Out[40]:
   latwind  lonwind    wspeed
0        4        1  4.123106
1        5        2  5.385165
2        6        3  6.708204
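As a quick sanity check, the hand-written vectorized function and np.hypot should agree on the sample data. This is only a sketch: it rebuilds the three-row frame from the first answer, and the names manual, builtin and same, as well as the use of np.allclose, are additions for illustration rather than part of the original answers:

```python
import numpy as np
import pandas as pd

# Sample frame with the same values as the three-row example.
df = pd.DataFrame({'lonwind': [1, 2, 3],
                   'latwind': [4, 5, 6]})

def wind_speed(u, v):
    # Vectorized: u and v are whole columns (Series), not scalars.
    return np.sqrt(u * u + v * v)

manual = wind_speed(df['lonwind'], df['latwind'])
builtin = np.hypot(df['lonwind'], df['latwind'])

# Compare the two results element by element, within float tolerance.
same = np.allclose(manual, builtin)
```

If same is True, either expression can be assigned to df['wspeed'].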

The same vectorized approach with the celsius() function:

def celsius(T):
    # using vectorized function: np.round()
    return np.round(T - 273, 1)
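With that version, the whole column can be passed in directly instead of going through .map(). A small usage sketch, assuming made-up temperatures in kelvin (the values below are hypothetical; the column names follow the question):

```python
import numpy as np
import pandas as pd

def celsius(T):
    # np.round works element-wise, so T can be a whole column.
    return np.round(T - 273, 1)

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({'t2m': [273.0, 283.5, 300.0]})

# Pass the Series itself; no .map() call is needed.
df['temp'] = celsius(df['t2m'])
```

The result is a Series aligned with the original index, so it can be assigned straight back to the frame.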
