I have the following dataframe and I want to sort it by two columns: first by id, then by date within each id. Both should be in ascending order, with the earliest date first:
import pandas as pd

df = pd.DataFrame(
{
"id": [9733, 9733, 9733, 9733, 9733, 9733, 9733, 9733, 2949, 2949, 2949, 2949, 2949, 2949, 2949, 9765, 9765, 9765, 9765, 9765, 9765],
"date": ["01/05/2008", "01/06/1968", "01/11/2010", "01/12/2016", "09/03/2011", "09/05/1975", "28/04/2011", "29/01/2005", "08/07/1974", "09/03/2021", "10/03/2021", "18/09/1986", "20/07/2021", "23/09/2017", "26/06/2020", "01/04/1963", "01/08/2012", "02/08/2012", "25/06/2021", "30/11/2020", "31/03/1986",],
"status": ["S", "A", "P", "C", "S", "D", "P", "P", "A", "P", "S", "D", "P", "P", "S", "A", "S", "P", "P", "S", "P"],
}
)
First, I converted the date column to datetime and applied the dd/mm/yyyy format:
df['date'] = pd.to_datetime(df['date'])
df['date'] = df['date'].dt.strftime('%d/%m/%Y')
Then, I tried using the sort_values function:
df.sort_values(by=['id','date'], ascending=[True,True], inplace=True)
The id column comes out sorted in ascending order. However, the dates within each id are not in ascending order, and I can't work out where I'm going wrong.
Ultimately, I want the dataframe sorted by the earliest P status only. I want the A status and D status rows separately (I figure I can use .filter to get those), and each will be output to a separate worksheet in an Excel file. For now, though, I'm stuck on the ordering of the id and dates.
Thanks for your help.
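
A likely cause, offered as a sketch rather than a confirmed diagnosis: .dt.strftime turns the column back into plain strings, so sort_values compares the dates character by character ("01/05/2008" sorts before "01/06/1968"). In addition, pd.to_datetime without an explicit format may read "01/05/2008" month-first, depending on the pandas version. The example below uses a ten-row subset of the data from the question, parses the dates with an explicit format, sorts while the column is still datetime, and only formats a copy for display. The names by_status, df_display and write_sheets are hypothetical, and the Excel helper is defined but not called because it assumes an Excel engine such as openpyxl is installed.

```python
import pandas as pd

# Subset of the rows from the question, kept small for readability.
df = pd.DataFrame(
    {
        "id": [9733, 9733, 9733, 9733, 2949, 2949, 2949, 9765, 9765, 9765],
        "date": ["01/05/2008", "01/06/1968", "01/11/2010", "09/05/1975",
                 "08/07/1974", "09/03/2021", "18/09/1986",
                 "01/04/1963", "01/08/2012", "02/08/2012"],
        "status": ["S", "A", "P", "D", "A", "P", "D", "A", "S", "P"],
    }
)

# Parse as day/month/year and keep the column as datetime64 for sorting.
df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")

# Sort by id, then chronologically within each id (ascending is the default).
df = df.sort_values(by=["id", "date"]).reset_index(drop=True)

# Split by status with a boolean mask; DataFrame.filter selects by
# row/column labels, not by cell values.
by_status = {s: df[df["status"] == s] for s in ["P", "A", "D"]}

# Format the dates as dd/mm/yyyy strings only on a copy meant for display.
df_display = df.assign(date=df["date"].dt.strftime("%d/%m/%Y"))


def write_sheets(frames, path):
    """Hypothetical helper: write each frame to its own worksheet.

    Assumes an Excel engine (for example openpyxl) is available.
    """
    with pd.ExcelWriter(path) as writer:
        for name, frame in frames.items():
            frame.to_excel(writer, sheet_name=name, index=False)


print(df_display)
```

If the dd/mm/yyyy text is needed in the Excel output, the same strftime step could be applied to each frame just before writing, after all sorting is done.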
