I want to find the most common value (the mode) for each group. UPDATE: If a group contains both real values and NaNs, I want the NaNs to be ignored. I only want NaN when every value in the group is NaN.

Some of my groups have all their data missing, and in those cases I would like the result to be missing data (NaN) as the most common value.

For such groups, df.groupby(...).agg(pd.Series.mode) returns an empty Categorical. What I want is NaN.

A toy example follows ...

data = """
Group, Value
A,      1
A,      1
A,      1
B,      2 
C,      3   
C, 
C, 
D,
D,
"""

from io import StringIO

import pandas as pd

df = (
    pd.read_csv(StringIO(data),
                skipinitialspace=True)
    .astype('category')
)

df.groupby('Group')['Value'].agg(pd.Series.mode)

Which yields ...

A                                             1.0
B                                             2.0
C                                             3.0
D    [], Categories (3, float64): [1.0, 2.0, 3.0]
Name: Value, dtype: object

My question: is there a way to get NaN directly, or to detect the empty Categorical and turn it into NaN? UPDATE: Note that I cannot use dropna=False, as that would give an incorrect answer for C above (NaN instead of 3.0).

By way of context, my original DataFrame has 27 million rows, and my grouped frame has 6 million rows. So, I want to avoid slow solutions.

  • Have you tried df.replace('', 'NaN').groupby('Group')['Value'].agg(pd.Series.mode)? Commented Apr 16, 2021 at 23:28
  • Where a group has non-NaN values, I want the NaNs to be ignored by the mode method, so mapping NaNs to something else doesn't work in this case. Commented Apr 16, 2021 at 23:30

1 Answer

You can apply pd.Series.mode and then pd.to_numeric with errors="coerce":

x = df.groupby("Group")["Value"].agg(pd.Series.mode)
print(pd.to_numeric(x, errors="coerce"))

Prints:

Group
A    1.0
B    2.0
C    3.0
D    NaN
Name: Value, dtype: float64
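
The coercion above relies on each group having a single mode; a group with tied values yields several modes, which that approach is not designed to handle. As a rough sketch of two alternatives, the code below first uses a small helper (first_mode, a hypothetical name, not a pandas function) that returns the first mode or NaN, and then a count-based variant that avoids calling Python code once per group, which may matter with millions of groups. It assumes plain object/float columns rather than the category dtype used in the question, and it breaks ties by taking the smallest value; neither choice comes from the original post, and no timing was measured here.

```python
import numpy as np
import pandas as pd

# Toy data mirroring the question (assumption: plain dtypes, not category).
df = pd.DataFrame({
    "Group": list("AAABCCCDD"),
    "Value": [1, 1, 1, 2, 3, np.nan, np.nan, np.nan, np.nan],
})


def first_mode(s):
    """Hypothetical helper: first mode of s ignoring NaN, or NaN if s has no values."""
    m = s.mode()  # Series.mode drops NaN by default
    return m.iloc[0] if len(m) else np.nan


# Variant 1: one Python call per group (simple, but can be slow for many groups).
by_apply = df.groupby("Group")["Value"].agg(first_mode)

# Variant 2: count (Group, Value) pairs, then keep the most frequent pair per group.
counts = (
    df.dropna(subset=["Value"])
      .groupby(["Group", "Value"])
      .size()
      .reset_index(name="n")
)
by_counts = (
    counts.sort_values("n", ascending=False, kind="stable")  # stable sort: ties keep smallest Value first
          .drop_duplicates("Group")                          # first row per group is its most frequent value
          .set_index("Group")["Value"]
          .reindex(sorted(df["Group"].unique()))             # groups with only NaN reappear as NaN
)

print(by_apply)
print(by_counts)
```

Both variants give 1.0, 2.0 and 3.0 for groups A, B and C, and NaN for D. With categorical columns, groupby may also need observed=True to avoid building every Group/Value combination.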