0

I am having problems filtering a pandas dataframe - I want to filter all the dates in the "Date" column to return only dates in the last month, but python is mixing up the date and month and returning the wrong answer. Could you let me know how to fix thanks

import pandas as pd
import numpy as np
import datetime as dt
df = pd.DataFrame(pd.read_csv("Dates.csv"))
tday = dt.date.today()
tdelta = dt.timedelta(days=-30)
dt = tday + tdelta
dt = np.datetime64(dt)
print(dt)
df['Date'] = pd.to_datetime(df['Date'])
df = df.loc[df['Date'] >= dt]
print(df)

Dates.csv df is:

DF = 
     Met By        Date
0    David   06/07/2020
1    Philip  22/06/2020
2   Richard  02/04/2020
3   Richard  09/03/2020
4   Richard  17/02/2020
5    Philip  12/02/2020
6    Philip  29/01/2020
7      Jane  20/01/2020
8      Jane  13/01/2020
9   Richard  10/01/2020
10   Philip  06/01/2020

UPDATE: solved by VALDI - must include dayfirst = True on read_csv

import pandas as pd
df = pd.read_csv("Dates.csv", parse_dates=[1], dayfirst = True)
df = df[df['Date'] > pd.Timestamp.today() - pd.Timedelta('30D')]
print(df)
2

2 Answers 2

2

Let's start from the way how you read your DataFrame:

df = pd.DataFrame(pd.read_csv("Dates.csv"))

Note that:

  • pd.read_csv already returns a DataFrame,
  • so there is no need to create another DataFrame from the first one.

A simpler approach is: df = pd.read_csv("Dates.csv").

But this is not all. If you have a column containing a date then convert it to datetime type as early as when your read the DateFrame, so, assuming that your file contains only Met By and Date columns (no index column), the proper formula to read is:

df = pd.read_csv("Dates.csv", parse_dates=[1])

And now how to filter your DataFrame:

The first hint is not to use datetime module, as Pandas has its native today and Timedelta functions. As Date column is now of proper (datetime) type, you don't need any conversions. Just use:

df[df.Date > pd.Timestamp.today() - pd.Timedelta('30D')]

If you have also future dates and want to filter them out, run:

df[df.Date > (pd.Timestamp.today() - pd.Timedelta('30D'))
    and df.Date < pd.Timestamp.today()]
Sign up to request clarification or add additional context in comments.

5 Comments

hi i am using your code: df = pd.read_csv("Dates.csv", parse_dates=[1]) df = df[df['Instruction Date'] > pd.Timestamp.today() - pd.Timedelta('30D')] - but it is still returning "3 Richard 2020-09-03 5 Philip 2020-12-02 9 Richard 2020-10-01" as values - when it should not return any. lmk if there is a way to fix this thanks
also if i add this i get same problem - df['Instruction Date'] = pd.to_datetime(df['Instruction Date'], format='%Y/%m/%d')
Your data sample contains only past dates, so I didn't think about future dates. Now I corrected this detail.
Hi Valdi - this is not a future date - it is returning me "3 Richard 09/03/2020" - the third row in the data sample - and it is swapping the date with the month - which means it is not returning the right responses. Do you know how to fix this? Thanks
Pass also dayfirst=True to read_csv. As I remember, the default value is False, so day / month are swapped.
0

You can use pandas's to_timedelta function and datetime's datetime.now function on date column like

It seems you need to reformat date first

df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')     

then

import datetime
import pandas as pd 


df_filtered = df[df.Date > datetime.datetime.now() - pd.to_timedelta("30day")]

As suggested by @Henry, you can use pd.Timestamp.now() as well

df_filtered = df[df.Date > pd.Timestamp.now() - pd.to_timedelta("30day")]

9 Comments

Or just use pd.Timestamp.now().
@HenryYik, great, thanks for reminding, added to the answer
hi still have same problem after below - python still confusing date with month
can you eloborate more about what do you mean by confusing betwwen date with month, may be put desired and current output
say timestamp is : 2020-07-29, python will return dates including 2020-09-03 - however this is supposed to be ninth of march, not third of september
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.