I currently have a dataset whose dates don't include Christmas. When I try to subset the data by a date range, I get a "not in index" error. How can I make pandas ignore the missing dates and select all dates in the range that are actually present?

error:

KeyError: "DatetimeIndex(['2015-12-25', '2016-12-25'], dtype='datetime64[ns]', freq=None) not in index"

example:

df = df[pd.date_range(date(2015,6,1), date(2017,8,15))]
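For reference, the error can be reproduced on a small frame. This is only a sketch: the frame, its values, and the names cols and wanted are made up for illustration, and the exact wording of the KeyError message varies between pandas versions.

```python
import pandas as pd

# Hypothetical frame: the column headers are dates, and 2015-12-25 is absent.
cols = pd.to_datetime(['2015-12-23', '2015-12-24', '2015-12-26'])
df = pd.DataFrame([[1, 2, 3]], columns=cols)

# The requested range includes 2015-12-25, which is not a column.
wanted = pd.date_range('2015-12-23', '2015-12-26')

try:
    df[wanted]          # selecting a list of labels requires every label to exist
    raised = False
except KeyError as err:
    raised = True
    print(err)
```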

1 Answer

You need to select only the columns that actually exist, for example by intersection:

rng = pd.date_range(date(2015,6,1), date(2017,8,15))
df = df[rng.intersection(df.columns)]

Or by label slicing (the columns must be sorted):

df = df.loc[:, '2015-06-01':'2017-08-15']

Or by conditions:

df = df.loc[:, (df.columns >= '2015-06-01') & (df.columns <= '2017-08-15')]

Or by isin:

rng = pd.date_range(date(2015,6,1), date(2017,8,15))
df = df.iloc[:, df.columns.isin(rng)]

Or by truncate:

df = df.truncate('2015-06-01','2017-08-15', axis=1)
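The label-slicing variant above does not need the endpoints to exist as columns, provided the columns are sorted. A minimal sketch on made-up data (the frame and the name sub are illustrative only):

```python
import pandas as pd

# Hypothetical frame with sorted date columns; 2015-12-25 and 2015-12-27 are absent.
cols = pd.to_datetime(['2015-12-23', '2015-12-24', '2015-12-26', '2015-12-28'])
df = pd.DataFrame([[1, 2, 3, 4]], columns=cols)

# Neither endpoint is a column, yet the slice is valid because the
# DatetimeIndex is monotonic: pandas selects everything between the bounds.
sub = df.loc[:, '2015-12-25':'2015-12-27']
print(sub)
```

Here only the 2015-12-26 column falls between the two bounds, so sub has a single column.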

Sample:

from datetime import datetime
import numpy as np
import pandas as pd

np.random.seed(452)
rng = pd.date_range('2015-06-01', periods=10)
df = pd.DataFrame(np.random.randint(10, size=(10,10)), columns=rng).iloc[:, np.r_[0:2, 5:9]]
print (df)
   2015-06-01  2015-06-02  2015-06-06  2015-06-07  2015-06-08  2015-06-09
0           0           7           0           3           0           7
1           8           9           8           1           0           2
2           5           2           2           0           0           9
3           2           9           3           8           0           6
4           8           8           7           4           9           8
5           9           9           0           4           0           4
6           2           1           4           1           0           1
7           4           1           9           5           6           7
8           5           9           8           1           4           6
9           6           5           2           5           3           1

rng = pd.date_range(datetime(2015,6,1),datetime(2015,6,7))
df1 = df[rng.intersection(df.columns)]

df2 = df.loc[:, '2015-06-01':'2015-06-07']

df3 = df.loc[:, (df.columns >= '2015-06-01') & (df.columns <= '2015-06-07')]

rng = pd.date_range(datetime(2015,6,1),datetime(2015,6,7))
df4 = df.iloc[:, df.columns.isin(rng)]

df5 = df.truncate('2015-06-01','2015-06-07', axis=1)

print (df1)
#print (df2)
#print (df3)
#print (df4)
#print (df5)

   2015-06-01  2015-06-02  2015-06-06  2015-06-07
0           0           7           0           3
1           8           9           8           1
2           5           2           2           0
3           2           9           3           8
4           8           8           7           4
5           9           9           0           4
6           2           1           4           1
7           4           1           9           5
8           5           9           8           1
9           6           5           2           5
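As a quick check that the five approaches select the same columns, they can be compared with DataFrame.equals. This sketch rebuilds a frame of the same shape with deterministic values instead of random ones; the names full, results and all_equal are introduced here for illustration, and the isin variant is written with .loc rather than .iloc.

```python
import numpy as np
import pandas as pd

# Same layout as the sample: ten daily columns, keeping positions 0-1 and 5-8.
full = pd.date_range('2015-06-01', periods=10)
df = pd.DataFrame(np.arange(100).reshape(10, 10), columns=full).iloc[:, np.r_[0:2, 5:9]]

rng = pd.date_range('2015-06-01', '2015-06-07')

results = [
    df[rng.intersection(df.columns)],                                         # intersection
    df.loc[:, '2015-06-01':'2015-06-07'],                                     # label slicing
    df.loc[:, (df.columns >= '2015-06-01') & (df.columns <= '2015-06-07')],   # conditions
    df.loc[:, df.columns.isin(rng)],                                          # isin
    df.truncate('2015-06-01', '2015-06-07', axis=1),                          # truncate
]

# Compare every result against the first one.
all_equal = all(results[0].equals(r) for r in results[1:])
print(all_equal)
```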
6 Comments

so there is no other way to subset without first setting the range as a list first?
Not 100% sure if I understand you, but the second one should work.
it does not work. I forgot to mention the dates are column headers
I got it to work using a long-winded variation: df = df.iloc[:, df.columns.isin(pd.date_range(date(2015,6,1), date(2017,8,15)))]
thank you for your samples. I was able to get the second method to work, which is the cleanest for my eyes.