How do we detect date and datetime columns separately from a dataframe?

Question

I currently use pd.read_csv to read the dataframe. It does not detect both date and datetime columns, but instead checks it out as object. So, I use the below snippet to find date/datetime columns:

df = pd.read_csv(filename)
collist = []
for col in df.columns:
    if df[col].dtype == 'object':
        try:
            df[col] = pd.to_datetime(df[col])
            collist.append(col)
        except ValueError:
            pass
return collist

But my use case needs me to have date columns and datetime columns separately. Is there a way how we can filter out date and datetime columns separately?

import pandas as pd
df = pd.DataFrame({
    'date1':['4/10/2021', '4/11/2021','4/12/2021'],
    'date_time1': ['4/11/2021 13:23:45', '4/11/2021 13:23:45', '4/11/2021 13:23:45'],
    'Name': ['Bob', 'Jane', 'Alice'],
    'date_time2': ['4/12/2021 13:23:45', '4/13/2021 13:23:45', '4/14/2021 13:23:45']
})

So, date column list should give me [date1] and datetime column list should give me [date_time1,date_time2]

Can you please provide an example of input and expected output. Also, by the end of the function ^ everything will be datetime, since that's what overwriting columns with pd.to_datetime() does. — user3471881
– user3471881, Commented Sep 26, 2021 at 6:28
So, for example I have a dataframe with a column having value like: 4/11/2021 and another with 4/11/2021 12:30:34 so I want two separate list of columns. One with date and another with datetime. — LearnerJS
– LearnerJS, Commented Sep 26, 2021 at 6:35
Please follow the examples in this thread to create a good reproducible example in pandas: stackoverflow.com/questions/20109391/… — user3471881
– user3471881, Commented Sep 26, 2021 at 6:37
If you could prescribe a set of directives for parsing to date/time, I think this would be much more robust. Also note that pandas' datetime does not distinguish between date and datetime, as opposed to "native Python". — FObersteiner
– FObersteiner, Commented Sep 26, 2021 at 8:00

TSnake · Accepted Answer · 2021-09-26 09:09:09Z

1

Since you have already read the data and converted everything to datetime and were storing it in collist - (datecollist), use the below snippet to parse these timestamps and distinguish between date and datetime.

datetime_col_list = []
df = pd.read_csv(filename, delimiter=delimiter, encoding=encoding, parse_dates=date_collist)
for col in date_collist:
        first_index = df[col].first_valid_index()
        first_valid_value = df[col].loc[first_index]
        if (str(first_valid_value).split(' ')[1]) != '00:00:00':
            datetime_col_list.append(col)

date_list = list (set(date_collist) - set(datetime_col_list))
print(date_list)
print(datetime_col_list)

answered Sep 26, 2021 at 9:09

TSnake

4783 silver badges10 bronze badges

Sign up to request clarification or add additional context in comments.

4 Comments

LearnerJS Over a year ago

Yes, it works but Will it work all the time?

FObersteiner Over a year ago

I don't think this is reliable; what if the time just happens to be 00:00:00? you would incorrectly classify as date despite the possibility that the rest of the column could have different times

LearnerJS Over a year ago

I thought even you wrote the same logic @MrFuppes, didn't you?

FObersteiner Over a year ago

@LearnerJS, I check all elements (.all()), not just the first valid

Collectives™ on Stack Overflow

How do we detect date and datetime columns separately from a dataframe?

1 Answer 1

4 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

4 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related