When non-key columns contain NaN/None values, groupby(...).last() appears to fill them in with earlier values from the same group:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 1, 2], 'b': [23, 43, np.nan, 12], 'c': ['x', 'y', 'z', None]})
   a     b     c
0  1  23.0     x
1  2  43.0     y
2  1   NaN     z
3  2  12.0  None
df.groupby(by='a', as_index=False, dropna=False).last()
   a     b  c
0  1  23.0  z
1  2  12.0  y

whereas the expected output is

   a     b     c
0  1   NaN     z
1  2  12.0  None

dropna=False doesn't help because it only applies to the groupby key column 'a'. Is there a way to make pandas keep the NaN/None values in the other columns without a hack?

1 Answer

last is designed to return the last non-NA value in each column independently, so the values in one output row can come from different input rows.

What you want (last row per group) is tail:

df.groupby(by='a', as_index=False).tail(1)

Output:

   a     b     c
2  1   NaN     z
3  2  12.0  None
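
To see the difference side by side, here is a small sketch using the DataFrame from the question. The variable names (per_column, last_rows, last_rows_alt, result) are only illustrative, and the drop_duplicates variant is an alternative I am adding, not part of the answer above; it should select the same rows here because each group's last row is the last occurrence of its key.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [1, 2, 1, 2],
    'b': [23, 43, np.nan, 12],
    'c': ['x', 'y', 'z', None],
})

# last(): last non-NA value per column, so values from different rows can be mixed
per_column = df.groupby('a', as_index=False).last()

# tail(1): the actual last row of each group, original index labels kept
last_rows = df.groupby('a').tail(1)

# Alternative (not from the answer): keep the last occurrence of each key
last_rows_alt = df.drop_duplicates(subset='a', keep='last')

# Optional: drop the original index to match the shape expected in the question
result = last_rows.reset_index(drop=True)
```

Note that tail(1) returns rows in their original order rather than sorted by key, so sort by 'a' afterwards if the key order matters.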