import pandas as pd

prizes = ([1, 100], [2, 50], [3, 25])
prizes = pd.DataFrame(prizes, columns=['Rank', 'Payout'])

ranking = ([1, 3, 2], [2, 2, 1], [3, 1, 3])
ranking = pd.DataFrame(ranking, columns=[1, 2, 3])

payouts = pd.DataFrame(range(1, 4), columns=['Lineup'])
mapper = prizes.set_index('Rank')['Payout'].to_dict()
payouts = pd.concat([payouts, ranking[range(1, 4)].apply(lambda s: s.map(mapper)).fillna(-1)], axis=1)

print(ranking)
print(payouts)

   1  2  3
0  1  3  2
1  2  2  1
2  3  1  3
   Lineup    1    2    3
0       1  100   25   50
1       2   50   50  100
2       3   25  100   25

Is there a more efficient way to write the lambda function just above the print statements? This is a small example of what I run inside a large loop, and this one step takes roughly half of the loop's total time. Any help would be appreciated.

2 Answers

You don't need to convert mapper to a dict: setting the index and selecting the column gives a Series, which already behaves like a mapping from Rank to Payout. As for your question, you can use replace instead; it should be faster:

mapper = prizes.set_index('Rank')['Payout']

pd.concat([payouts, ranking.replace(mapper)], axis=1)

   Lineup    1    2    3
0       1  100   25   50
1       2   50   50  100
2       3   25  100   25

Your example doesn't show the need for fillna; consider adding data that covers that scenario. Also, since payouts is just a single column, you could create it as a Series instead, which may give a small performance gain.
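To make the fillna point concrete, here is a minimal sketch with hypothetical data in which rank 4 has no entry in prizes. It assumes the same column names as the question. Note that map turns unmatched ranks into NaN (hence the fillna), whereas replace leaves them untouched, and that replace returns a new DataFrame rather than modifying ranking in place.

```python
import pandas as pd

prizes = pd.DataFrame([[1, 100], [2, 50], [3, 25]], columns=['Rank', 'Payout'])
# Hypothetical data: rank 4 does not appear in prizes
ranking = pd.DataFrame([[1, 4, 2], [2, 2, 1], [4, 1, 3]], columns=[1, 2, 3])

mapper = prizes.set_index('Rank')['Payout']

# map: ranks missing from mapper become NaN, so fillna(-1) marks them
mapped = ranking.apply(lambda s: s.map(mapper)).fillna(-1)

# replace: ranks missing from mapper are left as they are (4 stays 4);
# the result is a new DataFrame, so it must be assigned to a name
replaced = ranking.replace(mapper)

print(mapped)
print(replaced)
```

If unmatched ranks can occur in the real data, the two approaches are therefore not interchangeable without an extra step.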


3 Comments

For some reason when I put this into my code, it doesn't replace the ranking values with the values from mapper. Could it be because the length of mapper doesn't match the length of each column in ranking? All it does is spit back out the original rankings.
Did you assign the new values to ranking?
I apologize, I'm not sure what you mean.

Here is an even faster (but less concise) solution using the underlying numpy array. There is a ~1.7x gain compared to replace.

a = prizes.set_index('Rank')['Payout'].values
b = ranking.values-1 # get index as 0/1/2
c = a.take(b.flatten()).reshape(b.shape) # index in 1D and reshape to 2D
pd.DataFrame(c, columns=ranking.columns)

NB. I broke the steps down for clarity, but this could be done without the intermediate variables

Output:

     1    2    3
0  100   25   50
1   50   50  100
2   25  100   25
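As a sketch of the version without intermediate variables, NumPy integer-array indexing can replace the take/flatten/reshape steps, since indexing a 1-D array with a 2-D integer array returns a result with the 2-D shape. This assumes the ranks are the contiguous integers 1..N; the sort_index() call is an addition here, so that the lookup does not depend on the row order of prizes.

```python
import pandas as pd

prizes = pd.DataFrame([[1, 100], [2, 50], [3, 25]], columns=['Rank', 'Payout'])
ranking = pd.DataFrame([[1, 3, 2], [2, 2, 1], [3, 1, 3]], columns=[1, 2, 3])

# Payouts ordered by rank; assumes ranks are exactly 1..N with no gaps
lookup = prizes.set_index('Rank')['Payout'].sort_index().to_numpy()

# Each rank r selects lookup[r - 1]; the 2-D shape of ranking is preserved
compact = pd.DataFrame(
    lookup[ranking.to_numpy() - 1],
    columns=ranking.columns,
    index=ranking.index,
)

print(compact)
```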

2 Comments

I end up getting "TypeError: Cannot cast array data from dtype('float64') to dtype('int64') according to the rule 'safe'" with this string of code.
This probably means you have float values in ranking; make sure the ranks are integers.
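One common reason for float ranks is a missing value, since a NaN forces an integer column to float64. The following is a rough sketch of one way to handle that, using hypothetical data; it assumes ranks are the contiguous integers 1..N and that a missing rank should pay -1, mirroring the fillna(-1) in the question. The sentinel slot at index 0 is an illustrative choice, not part of the answer above.

```python
import numpy as np
import pandas as pd

prizes = pd.DataFrame([[1, 100], [2, 50], [3, 25]], columns=['Rank', 'Payout'])
# Hypothetical data: the NaN makes column 2 float64
ranking = pd.DataFrame([[1, 3, 2], [2, np.nan, 1], [3, 1, 3]], columns=[1, 2, 3])

a = prizes.set_index('Rank')['Payout'].to_numpy()

# Prepend a sentinel so that index 0 means "no payout" (-1)
lookup = np.concatenate(([-1], a))

# Missing ranks become 0, then the frame is cast to int so it can index lookup
idx = ranking.fillna(0).astype(int).to_numpy()

payout_values = pd.DataFrame(lookup[idx], columns=ranking.columns)
print(payout_values)
```

Because rank r now sits at position r in lookup, no subtraction by one is needed in this variant.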
