Python pandas select rows based on datetime condition

Question

Here is the code for sample simulated data. Actual data can have varying start and end dates.

import pandas as pd
import numpy as np  

dates = pd.date_range("20100121", periods=3653)   
df = pd.DataFrame(np.random.randn(3653, 1), index=dates, columns=list("A"))    
dfb=df.resample('B').apply(lambda x:x[-1])

From the dfb, I want to select the rows that contain values for all the days of the month. In dfb, 2010 January and 2020 January have incomplete data. So I would like data from 2010 Feb till 2019 December.

For this particular dataset, I could do

df_out=dfb['2010-02':'2019-12']

But please help me with a better solution.

Edit: There seems to be plenty of confusion about the question. I want to omit the leading rows if the data does not begin on the first day of a month, and the trailing rows if it does not end on the last day of a month. Hope that's clear.

Could you elaborate on "contain values for all the days of the month"? Do you mean every day in a month has data?
– user7864386, Commented Feb 21, 2022 at 8:17
Yes, every day in a month has data. So if the data starts from 2013-3-13, the subset should start from the next month. It's assumed that the data is continuous after the start date.
– Macro Panda, Commented Feb 21, 2022 at 8:46
If by "incomplete data" you mean NaN, you can drop rows with NaN values. Doesn't that solve your problem?
– Zeynab Rostami, Commented Feb 21, 2022 at 8:51
There is no NaN. Someone gives me this data. It starts in the middle of some month 1 and ends in the middle of another month 12. I want the subset from the beginning of month 2 to the end of month 11.
– Macro Panda, Commented Feb 21, 2022 at 8:59

Pankaj Saini · Accepted Answer · 2022-02-21 11:15:30Z

When you say "better" solution - I assume you mean make the range dynamic based on input data.

OK, since you mention that your data is continuous after the start date - it is a safe assumption that dates are sorted in increasing order. With this in mind, consider the code:

import pandas as pd
import numpy as np  
from datetime import timedelta

dates = pd.date_range("20100121", periods=3653)
df = pd.DataFrame(np.random.randn(3653, 1), index=dates, columns=list("A"))
print(df)
dfb=df.resample('B').apply(lambda x:x[-1])

# fd is the first index in your dataframe
fd = df.index[0]
# checks if the first month data is incomplete, i.e. does not start with date = 1
if ( fd.day != 1 ):
   if ( fd.month == 12 ):
      # December rolls over into January of the following year
      first_day_of_next_month = fd.replace(year=fd.year + 1, month=1, day=1)
   else:
      first_day_of_next_month = fd.replace(month=fd.month + 1, day=1)
else:
   first_day_of_next_month = fd

# ld is the last index in your dataframe
ld = df.index[-1]
# computes the next day
next_day = ld + timedelta(days=1)
# != rather than > so that December 31 -> January 1 (month 12 -> 1) also counts as a month change
if ( next_day.month != ld.month ):
   last_day_of_prev_month = ld  # keeps the index if month is changed
else:
   last_day_of_prev_month = ld.replace(day=1) - timedelta(days=1)


df_out=dfb[first_day_of_next_month:last_day_of_prev_month]

Another way is to use dateutil.relativedelta from the python-dateutil package, which pandas already installs as a dependency. The solution above does the date arithmetic with the standard library only.
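For comparison, here is a minimal sketch of the relativedelta variant mentioned above. The helper name full_month_bounds is made up for illustration, and .resample("B").last() is swapped in for the lambda used in the question. As in the answer, it assumes a sorted daily index with midnight timestamps, and the bounds are computed from the daily df rather than from dfb.

```python
import numpy as np
import pandas as pd
from dateutil.relativedelta import relativedelta


def full_month_bounds(index):
    """Hypothetical helper: (start, end) of the complete calendar months
    covered by a sorted daily DatetimeIndex."""
    fd = index[0].to_pydatetime()
    ld = index[-1].to_pydatetime()
    # months=1 moves to the next month (rolling the year over after December),
    # day=1 then snaps to the first day of that month.
    start = fd if fd.day == 1 else fd + relativedelta(months=1, day=1)
    # day=31 is clipped to the real month length, giving the last day of ld's month.
    month_end = ld + relativedelta(day=31)
    # day=1 followed by days=-1 steps back to the last day of the previous month.
    end = ld if ld == month_end else ld + relativedelta(day=1, days=-1)
    return start, end


dates = pd.date_range("20100121", periods=3653)
df = pd.DataFrame(np.random.randn(3653, 1), index=dates, columns=["A"])
dfb = df.resample("B").last()

start, end = full_month_bounds(df.index)
df_out = dfb.loc[start:end]
```

Because relativedelta handles the December to January rollover itself, no special case for month 12 is needed.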

rehaqds · Accepted Answer · 2022-02-21 19:20:48Z

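I assume that in the general case the table is chronologically ordered (if not, use .sort_index first). The idea is to extract the year and month from the date and keep only the rows whose (year, month) differs from that of both the first row and the last row. Note that this drops the first and last months unconditionally, even if they happen to be complete.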

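dfb['year'] = dfb.index.year    # helper column, dropped again below
dfb['month'] = dfb.index.month  # helper column, dropped again below

# look the values up by column name so the code does not depend on column positions
first_month = (dfb['year']==dfb['year'].iloc[0])  & (dfb['month']==dfb['month'].iloc[0])
last_month  = (dfb['year']==dfb['year'].iloc[-1]) & (dfb['month']==dfb['month'].iloc[-1])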

dfb = dfb.loc[(~first_month) & (~last_month)]
dfb = dfb.drop(['year', 'month'], axis=1)
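The same idea can be sketched without the temporary columns by converting the index to monthly periods. This is only an illustration of the approach above, not a different method: it still removes the first and last months whether or not they are complete. The .resample("B").last() call is swapped in for the lambda in the question, and the name keep is made up for the example.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20100121", periods=3653)
df = pd.DataFrame(np.random.randn(3653, 1), index=dates, columns=["A"])
dfb = df.resample("B").last()

# One monthly Period per row; each Period carries year and month together.
months = dfb.index.to_period("M")

# Keep rows whose month is neither the first nor the last month in the data.
keep = (months != months[0]) & (months != months[-1])
df_out = dfb[keep]
```

Since dfb itself is never modified, there is nothing to drop afterwards.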
