
I currently use pd.read_csv to read the dataframe. It does not detect date or datetime columns; it reads them as object dtype instead. So I use the snippet below to find date/datetime columns:

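import pandas as pd

def find_datetime_columns(filename):
    df = pd.read_csv(filename)
    collist = []
    for col in df.columns:
        # Only string-like (object) columns are candidates
        if df[col].dtype == 'object':
            try:
                df[col] = pd.to_datetime(df[col])
                collist.append(col)
            except ValueError:
                # Not parseable as datetime; leave the column unchanged
                pass
    return collist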

But my use case needs date columns and datetime columns kept separately. Is there a way to identify date columns and datetime columns separately?

import pandas as pd
df = pd.DataFrame({
    'date1':['4/10/2021', '4/11/2021','4/12/2021'],
    'date_time1': ['4/11/2021 13:23:45', '4/11/2021 13:23:45', '4/11/2021 13:23:45'],
    'Name': ['Bob', 'Jane', 'Alice'],
    'date_time2': ['4/12/2021 13:23:45', '4/13/2021 13:23:45', '4/14/2021 13:23:45']
})

So the date column list should be ['date1'] and the datetime column list should be ['date_time1', 'date_time2'].

  • Can you please provide an example of input and expected output? Also, by the end of the function ^ every column that parses will be datetime, since that's what overwriting columns with pd.to_datetime() does. Commented Sep 26, 2021 at 6:28
  • So, for example, I have a dataframe with one column holding values like 4/11/2021 and another holding values like 4/11/2021 12:30:34, and I want two separate lists of columns: one with the date columns and another with the datetime columns. Commented Sep 26, 2021 at 6:35
  • Please follow the examples in this thread to create a good reproducible example in pandas: stackoverflow.com/questions/20109391/… Commented Sep 26, 2021 at 6:37
  • @user3471881 Edited! Thanks. Commented Sep 26, 2021 at 6:50
  • If you could prescribe a set of directives for parsing to date/time, I think this would be much more robust. Also note that pandas' datetime does not distinguish between date and datetime, as opposed to "native Python". Commented Sep 26, 2021 at 8:00
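The last comment's point can be checked on the question's own sample data. The snippet below is only a small sketch; the variable names converted, both_datetime and all_midnight are made up for illustration. It shows that after pd.to_datetime both columns get a datetime dtype, and that the date-only values simply carry a midnight time.

```python
import datetime

import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

df = pd.DataFrame({
    'date1': ['4/10/2021', '4/11/2021', '4/12/2021'],
    'date_time1': ['4/11/2021 13:23:45', '4/11/2021 13:23:45', '4/11/2021 13:23:45'],
})

# Parse every column; pandas has no separate "date" dtype
converted = df.apply(pd.to_datetime)

# Both columns end up with a datetime64 dtype
both_datetime = all(is_datetime64_any_dtype(converted[c]) for c in converted.columns)

# Date-only strings are stored as timestamps at midnight
all_midnight = bool((converted['date1'].dt.time == datetime.time(0)).all())

print(both_datetime, all_midnight)
```

So any date/datetime split has to be inferred from the values (or from the original strings), not from the dtype.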

1 Answer


Since you have already read the data, converted the matching columns to datetime, and stored their names in collist (called date_collist here), you can use the snippet below to inspect the parsed timestamps and distinguish between date and datetime columns.

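datetime_col_list = []
# Re-read the file, parsing the previously detected columns as datetimes
df = pd.read_csv(filename, delimiter=delimiter, encoding=encoding, parse_dates=date_collist)
for col in date_collist:
    # Look at the first non-null value in the column
    first_index = df[col].first_valid_index()
    first_valid_value = df[col].loc[first_index]
    # A parsed date without a time component prints as 'YYYY-MM-DD 00:00:00'
    if str(first_valid_value).split(' ')[1] != '00:00:00':
        datetime_col_list.append(col)

# Columns never flagged as datetime are treated as date-only
date_list = list(set(date_collist) - set(datetime_col_list))
print(date_list)
print(datetime_col_list)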

4 Comments

Yes, it works, but will it work all the time?
I don't think this is reliable. What if the time of the first value just happens to be 00:00:00? You would incorrectly classify the column as date, even though the rest of the column could have different times.
I thought you wrote the same logic too, @MrFuppes, didn't you?
@LearnerJS, I check all elements (.all()), not just the first valid
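A minimal sketch of the "check all elements" idea from the comments, run on the question's sample data. This is not @MrFuppes's actual code, which is not shown here. The function name split_date_and_datetime_columns and the extra column mixed are hypothetical, added only to show the midnight-first-value case. It compares each parsed column with its normalize()d (midnight) version and requires all values to match.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def split_date_and_datetime_columns(df):
    """Return (date_cols, datetime_cols) for text columns that parse as datetimes."""
    date_cols, datetime_cols = [], []
    for col in df.columns:
        # Only text columns are candidates for parsing
        if not (is_object_dtype(df[col]) or is_string_dtype(df[col])):
            continue
        try:
            parsed = pd.to_datetime(df[col])
        except (ValueError, TypeError):
            continue  # not a date-like column
        values = parsed.dropna()
        if values.empty:
            continue
        # Date-only column: every value equals itself truncated to midnight
        if (values == values.dt.normalize()).all():
            date_cols.append(col)
        else:
            datetime_cols.append(col)
    return date_cols, datetime_cols


df = pd.DataFrame({
    'date1': ['4/10/2021', '4/11/2021', '4/12/2021'],
    'date_time1': ['4/11/2021 13:23:45', '4/11/2021 13:23:45', '4/11/2021 13:23:45'],
    'Name': ['Bob', 'Jane', 'Alice'],
    'date_time2': ['4/12/2021 13:23:45', '4/13/2021 13:23:45', '4/14/2021 13:23:45'],
    # Hypothetical column: first value is midnight, later values are not
    'mixed': ['4/11/2021 00:00:00', '4/11/2021 13:23:45', '4/11/2021 08:00:00'],
})

date_cols, datetime_cols = split_date_and_datetime_columns(df)
print(date_cols)
print(datetime_cols)
```

Because every value is checked, the mixed column is classified as datetime even though its first value is midnight. A datetime column in which every time really is 00:00:00 would still be reported as a date column, since the two are indistinguishable after parsing.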
