I am trying to find the highest values in a column of my dataframe. However, because the values contain a % sign, they are stored as strings rather than integers, which prevents me from using nlargest. Is there a way to convert the strings to integers?
Here is an example of my code:
import pandas as pd
import re
test_data = {
    'Animal': ['Otter', 'Turtle', 'Chicken'],
    'Squeak Appeal': [12.8, 1.92, 11.4],
    'Richochet Chance': ['8%', '30%', '16%'],
}
test_df = pd.DataFrame(
    test_data,
    columns=['Animal', 'Squeak Appeal', 'Richochet Chance']
)
My attempts to use nlargest:
r_chance = test_df.nlargest(2, ['Richochet Chance'])
# TypeError: Column 'Richochet Chance' has dtype object, cannot use method 'nlargest' with this dtype
r_chance = test_df.nlargest(2, re.sub("[^0-9]", ""(['Richochet Chance'])))
# TypeError: 'str' object is not callable
If there is no sensible way to do this I shan't remain in denial. I just wondered if I could avoid looping through a large df and converting strings to integers for multiple columns.
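For reference, here is a minimal sketch of the kind of loop-free conversion being asked about. It assumes every value is a plain number followed by a single trailing %; the names numeric_chance, percent_cols and converted are hypothetical and only for illustration. It uses the vectorized Series.str.rstrip and pd.to_numeric rather than re.sub, which operates on a single string and not on a whole column.

```python
import pandas as pd

test_df = pd.DataFrame({
    'Animal': ['Otter', 'Turtle', 'Chicken'],
    'Squeak Appeal': [12.8, 1.92, 11.4],
    'Richochet Chance': ['8%', '30%', '16%'],
})

# Strip the trailing '%' from every value at once, then parse as numbers.
# Assumption: each entry looks like '<number>%' with nothing else in it.
numeric_chance = pd.to_numeric(test_df['Richochet Chance'].str.rstrip('%'))

# Rank on the numeric Series and pull the matching rows from the original
# frame, so the displayed column keeps its '%' strings.
r_chance = test_df.loc[numeric_chance.nlargest(2).index]
print(r_chance)

# The same conversion applied to several columns without an explicit loop.
# percent_cols is a hypothetical list of the columns that hold '%' strings.
percent_cols = ['Richochet Chance']
converted = test_df[percent_cols].apply(
    lambda col: pd.to_numeric(col.str.rstrip('%'))
)
```

If the numeric values should replace the strings permanently, assigning converted back with test_df[percent_cols] = converted would then let test_df.nlargest(2, 'Richochet Chance') be called directly.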