I wrote this Python function to convert each column of a DataFrame to the data type it is supposed to have.

def dataType(df):
    '''This function makes sure the data type for all 7 columns of the final dataframe are correct'''

    #Day as date type
    df["Day"] = pd.to_datetime(df["Day"])
    df["Day"] = df["Day"].dt.strftime("%m/%d/%Y")

    #Channel, Partner as string type
    df[["Channel", "Partner"]] = df[["Channel", "Partner"]].astype(str)

    #Spend, Clicks, Impressions, Platform Conversions as float type
    df['Platform Conversions'] = df['Platform Conversions'].where(df['Platform Conversions'] == "null", 0)
    df[["Spend", "Clicks", "Impressions", "Platform Conversions"]] = df[["Spend", "Clicks", "Impressions", "Platform Conversions"]].astype(float)

    return df

I then checked each column with code like the following, but every column is still reported as object type. What am I doing wrong?

print(df["Day"].dtypes)
  • Maybe you could try building a new df from scratch, adding columns (which will be astyped to their appropriate type) one at a time. Commented Oct 7, 2021 at 22:12
  • @MarkLavin, yes that's actually what I originally did. I'm attempting to do this (create a definition) because it would save at least 200 lines of code. So is there a way to create the desired function? Commented Oct 12, 2021 at 20:18

1 Answer

Check the documentation: https://pandas.pydata.org/docs/reference/api/pandas.Series.dt.strftime.html

df["Day"] = df["Day"].dt.strftime("%m/%d/%Y")

does not return datetime values. dt.strftime formats each date as a string, so the column's dtype goes back to object.

I tested dt.strftime on its own (outside a function, which behaves the same way):

import pandas as pd
from io import StringIO

d = '''Day
01/01/2001
01/02/2001'''
df = pd.read_csv(StringIO(d))
df.dtypes

Day    object
dtype: object

df["Day"] = pd.to_datetime(df["Day"])
df.dtypes

Day    datetime64[ns]
dtype: object

df["Day"] = df["Day"].dt.strftime("%m/%d/%Y")
df.dtypes

Day    object
dtype: object
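If the goal is a real datetime column that is only shown as MM/DD/YYYY, one option is to skip the strftime step and apply the format when the data is written out. This is a minimal sketch on the same two sample dates, assuming the frame ends up in a CSV; the variable names `is_dt` and `csv_text` are just for illustration.

```python
from io import StringIO

import pandas as pd

df = pd.read_csv(StringIO("Day\n01/01/2001\n01/02/2001"))

# Convert once and keep the column as a datetime dtype.
df["Day"] = pd.to_datetime(df["Day"])
is_dt = pd.api.types.is_datetime64_any_dtype(df["Day"])
print(is_dt)

# Apply the display format only on export; the column itself is unchanged.
csv_text = df.to_csv(index=False, date_format="%m/%d/%Y")
print(csv_text)
```

The column stays usable for date arithmetic and sorting, and the text format is only a property of the exported file.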

3 Comments

@Claie Kindly check the 'tick' to accept the answer if it answers your question. Thanks.
Hi @EBDS, so how would I do the same for the other columns to their designated types then (string and float type)?
@Claie What you did is correct in converting to the respective type. The issue here is you expected dt.strftime("%m/%d/%Y") to be a datetime but this operation returns a string. That's why you see them as object.
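Putting the thread together, here is a rough sketch of what the whole function could look like. It rests on a few assumptions: the seven column names are the ones from the question, the intent of the "null" handling is to turn the literal text "null" into 0, and the name `coerce_types` plus the sample frame are made up for the example. Note that `Series.where(cond, other)` keeps values where the condition is True and replaces the rest, so the `where(... == "null", 0)` call in the question keeps "null" and zeroes everything else; `mask` does the opposite. Also, in many pandas versions plain strings are stored as object dtype, so object is the expected result for Channel and Partner.

```python
import pandas as pd

NUMERIC_COLS = ["Spend", "Clicks", "Impressions", "Platform Conversions"]


def coerce_types(df):
    """Return a copy of df with each column cast to its intended type."""
    out = df.copy()

    # Keep Day as datetime; do not call strftime here.
    out["Day"] = pd.to_datetime(out["Day"])

    # Text columns.
    for col in ["Channel", "Partner"]:
        out[col] = out[col].astype(str)

    # Assumption: the literal text "null" should count as zero.
    # mask() replaces values where the condition is True.
    conv = out["Platform Conversions"]
    out["Platform Conversions"] = conv.mask(conv.astype(str) == "null", "0")

    # Numeric columns, cast one at a time.
    for col in NUMERIC_COLS:
        out[col] = out[col].astype(float)

    return out


# Hypothetical sample data, only for illustration.
sample = pd.DataFrame({
    "Day": ["01/01/2001", "01/02/2001"],
    "Channel": ["search", "social"],
    "Partner": ["A", "B"],
    "Spend": ["1.5", "2"],
    "Clicks": [10, 20],
    "Impressions": ["100", "200"],
    "Platform Conversions": ["null", "3"],
})
result = coerce_types(sample)
print(result.dtypes)
```

Working on a copy leaves the caller's frame untouched, and casting column by column sidesteps any question about how a multi-column assignment treats dtypes.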
