0

Through the loc and iloc indexers, pandas allows us to slice DataFrames. Still, I am having trouble doing this when the column labels are datetime.date objects.

For instance, consider the DataFrame generated by the following code:

import pandas as pd

d = {'col1': [1], 'col2': [2],'col3': [3]}
df = pd.DataFrame(data=d)
dates = ['01-01-2001','02-02-2002','03-03-2003']
dates = pd.to_datetime(dates).date
df.columns= dates


Let us try to slice the first two columns of the DataFrame through df.loc:

df.loc[0,'01-01-2001':'02-02-2002']

We get the following error: TypeError: '<' not supported between instances of 'datetime.date' and 'str'

How could this be solved?

2
  • Changing dates = pd.to_datetime(dates).date to dates = pd.to_datetime(dates) should do it. Commented Mar 5, 2021 at 19:31
  • @Ch3steR, could you please explain why this works? Commented Mar 5, 2021 at 19:46

4 Answers

1


    df.iloc[0,[0,1]]


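Use iloc and pass the integer positions of the columns as the second argument. You are passing strings, but the column labels are datetime.date objects, so the lookup fails. With positions there is no type mismatch. (loc also works, but only if the labels you pass have the same type as the column labels.)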

2 Comments

What if I want to use column names in the code? For instance, when I have a dataframe of hundreds of dates and do not know the position of each?
First check that those columns exist in the DataFrame, for example with the set issubset() method. If they do, get each column's position with df.columns.get_loc(), and then use those positions in iloc.
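
The steps in the comment above can be sketched roughly as follows. This is only an illustration under a few assumptions: the names wanted, start, stop and first_two are made up for the example, and the column labels are taken to be unique datetime.date objects, so that get_loc() returns a single integer.

```python
import datetime as dt

import pandas as pd

# Rebuild the example frame with datetime.date column labels
df = pd.DataFrame({'col1': [1], 'col2': [2], 'col3': [3]})
df.columns = pd.to_datetime(
    ['01-01-2001', '02-02-2002', '03-03-2003'], format='%m-%d-%Y'
).date

# Labels we want to slice between (hypothetical name)
wanted = [dt.date(2001, 1, 1), dt.date(2002, 2, 2)]

# Check that both labels exist before looking up their positions
if not set(wanted).issubset(df.columns):
    raise KeyError('some requested dates are not column labels')

# get_loc returns the integer position of a unique label
start = df.columns.get_loc(wanted[0])
stop = df.columns.get_loc(wanted[1])

# iloc excludes the end position, so add 1 to include the last column
first_two = df.iloc[0, start:stop + 1]
print(first_two)
```

The lookup uses date objects rather than strings, so the labels and the keys have the same type and no comparison between datetime.date and str is needed.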
1

To piggyback on @Ch3steR's comment above, that change should work:

dates = pd.to_datetime(dates)

At that point the columns form a DatetimeIndex, which lets you select the columns that fall in a date range using date strings, as shown below. Just make sure the end of the slice is on or after the last date that you're trying to capture.

# Return all rows in columns between date range 1/1/2001 and 2/3/2002
df.loc[:, '1/1/2001':'2/3/2002']

   2001-01-01  2002-02-02
0           1           2
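
As a self-contained sketch of why this works: once the columns are a DatetimeIndex, pandas parses date strings used in a label slice and compares them against the index as timestamps. The variable names by_string and by_timestamp, and the explicit format argument, are assumptions added for the illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1], 'col2': [2], 'col3': [3]})

# Keep the result of to_datetime as a DatetimeIndex (no .date)
df.columns = pd.to_datetime(
    ['01-01-2001', '02-02-2002', '03-03-2003'], format='%m-%d-%Y'
)

# Date strings are parsed and matched against the DatetimeIndex
by_string = df.loc[:, '2001-01-01':'2002-02-02']

# Equivalent slice using explicit Timestamp objects
by_timestamp = df.loc[:, pd.Timestamp('2001-01-01'):pd.Timestamp('2002-02-02')]

print(by_string)
```

Both slices include the end label, so the end of the slice can be equal to the last column you want.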


1

You can slice with the date objects from the array you created earlier, and it doesn't raise an error.

import pandas as pd

d = {'col1': [1], 'col2': [2],'col3': [3]}
df = pd.DataFrame(data=d)
dates = ['01-01-2001','02-02-2002','03-03-2003']
dates = pd.to_datetime(dates).date
df.columns= dates

df.loc[0,dates[0]:dates[1]]

The two different types are shown below. What matters is that the slice bounds have the same type as the column labels. Slicing with elements of the dates array works because it guarantees that the types match. But as you said, you need to be able to use arbitrary dates, so the second form (without .date) is better for you.

>>> dates = pd.to_datetime(dates).date
>>> print("With .date")
With .date
>>> print(dates)
[datetime.date(2001, 1, 1) datetime.date(2002, 2, 2)
 datetime.date(2003, 3, 3)]

>>> dates = pd.to_datetime(dates)
>>> print("Without .date")
Without .date
>>> print(dates)
DatetimeIndex(['2001-01-01', '2002-02-02', '2003-03-03'], dtype='datetime64[ns]', freq=None)

3 Comments

Calling? Could you please explain in detail what you did?
Also, this was only an example. I actually want to use the date in df.loc. For instance, suppose that I have a dataframe with hundreds of columns, and I want to use dates '10-10-2010' and '12-12-2012', but I don't know their position in the dataframe.
As in @Ch3steR's comment, what matters is keeping the type the same. I've added examples of the two different types, one with .date and one without.
0

I am a little late to this, but you are basically slicing datetime.date labels using str values, which won't work because the two types cannot be compared. Make sure that you use the same type for slicing. You can use the following one-liner:

from datetime import datetime
df.loc[:,datetime(2001,1,1).date():datetime(2003,3,3).date()]
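
If the dates arrive as strings, one possible approach is to convert them to datetime.date before slicing, so the slice bounds match the type of the column labels. This is a minimal sketch, and to_date is a hypothetical helper name that is not part of pandas. It assumes month-day-year strings and that both bounds are existing column labels.

```python
import pandas as pd

# Example frame whose column labels are datetime.date objects
df = pd.DataFrame({'col1': [1], 'col2': [2], 'col3': [3]})
df.columns = pd.to_datetime(
    ['01-01-2001', '02-02-2002', '03-03-2003'], format='%m-%d-%Y'
).date


def to_date(text):
    """Hypothetical helper: parse a month-day-year string into datetime.date."""
    return pd.to_datetime(text, format='%m-%d-%Y').date()


# Both bounds are datetime.date, the same type as the column labels
subset = df.loc[:, to_date('01-01-2001'):to_date('02-02-2002')]
print(subset)
```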
