I have a dataframe with a column of dtype('int64'). The values in the column range from 0 to 10. The dataframe has 770K rows and 56 columns of different types. When I run the code below, the result is still dtype('int64'). I would have expected the column to be downcast to at least int32 or int16. Here's a reproducible example.

import pandas as pd

df = pd.DataFrame([x for x in range(10)]*77000, columns=['recommendation'])
df.dtypes
df.recommendation.apply(lambda x: pd.to_numeric(x, downcast='integer')).dtypes
Comment: try pd.to_numeric(df.recommendation, downcast='integer').dtypes (Commented Oct 26, 2018 at 13:51)

1 Answer

The apply method works cell by cell, so to_numeric only ever sees one value at a time and cannot figure out that the whole column can be downcast. You need to call to_numeric on the whole column, as Ben indicated in a comment:

pd.to_numeric(df.recommendation,downcast='integer')
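
To see the difference side by side, here is a minimal sketch comparing the two calls. It uses a smaller frame than the one in the question (an assumption, just to keep the run quick), and the names per_cell and whole are only illustrative.

```python
import pandas as pd

# Smaller frame than in the question (an assumption, to keep the run quick).
df = pd.DataFrame(list(range(10)) * 1000, columns=["recommendation"])

# Cell by cell: to_numeric only ever sees one scalar at a time,
# so the dtype of the resulting column is not reduced.
per_cell = df["recommendation"].apply(
    lambda x: pd.to_numeric(x, downcast="integer")
)

# Whole column: to_numeric sees every value and can pick the
# smallest integer dtype that holds all of them.
whole = pd.to_numeric(df["recommendation"], downcast="integer")

print(per_cell.dtype)
print(whole.dtype)  # int8, since 0-9 fits in a signed 8-bit integer

# Assign the result back to keep the smaller dtype in the frame.
df["recommendation"] = whole
```

Note that to_numeric returns a new Series, so the downcast only sticks if you assign the result back to the column.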