1

I have this dataframe

import pandas as pd

df = pd.DataFrame({'Found':['A','A','A','A','A','B','B','B'],
               'Date':['14/10/2021','19/10/2021','29/10/2021','30/09/2021','20/09/2021','20/10/2021','29/10/2021','15/10/2021'],
               'LastDayMonth':['29/10/2021','29/10/2021','29/10/2021','30/09/2021','30/09/2021','29/10/2021','29/10/2021','29/10/2021'],
               'Mark':[1,2,3,4,3,1,2,3]

              })
print(df)

Found        Date LastDayMonth  Mark
0     A  14/10/2021   29/10/2021     1
1     A  19/10/2021   29/10/2021     2
2     A  29/10/2021   29/10/2021     3
3     A  30/09/2021   30/09/2021     4
4     A  20/09/2021   30/09/2021     3
5     B  20/10/2021   29/10/2021     1
6     B  29/10/2021   29/10/2021     2
7     B  15/10/2021   29/10/2021     3

based on this dataframe I need to create a new column that is the "Mark" of the last day of the month to form this new column.

that is, I need the value of the 'Mark' column of the last day of the month of each Found

Result Example

Found        Date LastDayMonth  Mark  Mark_LastDayMonth
0     A  14/10/2021   29/10/2021     1                  3
1     A  19/10/2021   29/10/2021     2                  3
2     A  29/10/2021   29/10/2021     3                  3
3     A  30/09/2021   30/09/2021     4                  4
4     A  20/09/2021   30/09/2021     3                  4
5     B  20/10/2021   29/10/2021     1                  2
6     B  29/10/2021   29/10/2021     2                  2
7     B  15/10/2021   29/10/2021     3                  2

how could I do that? a function? I need to do this for a large amount of data

1
  • df.groupby(['Found', 'LastDayMonth'])['Mean'].transform('last')? Commented Nov 8, 2021 at 18:13

1 Answer 1

1

IIUC, for each Found, you check which is the Mark value at the last day of the month and you assign that value to Mark_LastDayMonth.

You can proceed the following:

# 1. Select last days
mark_last_day = df.loc[df.apply(lambda x: x['Date']==x['LastDayMonth'], 1)]

# 2. Merge them to the original dataset, renaming the right Mark value to Mark_LastDayMonth
df.merge(mark_last_day[['Found', 'LastDayMonth', 'Mark']],
     how='left',
     on=['Found', 'LastDayMonth'],
     suffixes=('', '_LastDayMonth'))

# Output
Found   Date    LastDayMonth    Mark    Mark_LastDayMonth
0   A   14/10/2021  29/10/2021  1       3
1   A   19/10/2021  29/10/2021  2       3
2   A   29/10/2021  29/10/2021  3       3
3   A   30/09/2021  30/09/2021  4       4
4   A   20/09/2021  30/09/2021  3       4 
5   B   20/10/2021  29/10/2021  1       2
6   B   29/10/2021  29/10/2021  2       2
7   B   15/10/2021  29/10/2021  3       2

The first step, that is this line of code df.loc[df.apply(lambda x: x['Date']==x['LastDayMonth'], 1)] will produce the following:

    Found   Date    LastDayMonth    Mark
2   A   29/10/2021  29/10/2021      3
3   A   30/09/2021  30/09/2021      4
6   B   29/10/2021  29/10/2021      2

This output is then merged to the orginal dataframe df

Sign up to request clarification or add additional context in comments.

1 Comment

Very good, taking advantage of how I do this for the previous month

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.