Toy example
Suppose I have the dataframe df shown below
C
L0 L1 L2
0 w P 11
Q 9
R 21
S 4
x P 3
Q 0
R 23
S 20
y P 19
Q 0
R 7
S 13
z P 17
Q 0
R 5
S 1
1 w P 8
Q 2
R 12
S 0
x P 22
Q 14
R 2
S 18
y P 6
Q 0
R 16
S 15
z P 10
Q 0
R 8
S 0
Note that the rows of df are indexed by 3-level multi-index.
I can find the minimum of the C column for each value of the L2 level, as follows:
In [58]: df.groupby(level='L2').min()
Out[58]:
C
L2
P 3
Q 0
R 2
S 0
Similarly, the following expression shows the values of L2 for which this minimum is greater than 0:
In [59]: df.groupby(level='L2').min() > 0
Out[59]:
C
L2
P True
Q False
R True
S False
Question: How can I select the rows of the original dataframe df corresponding to the values of L2 for which the minimum of C is greater than 0?
In this simple example the problem boils down to selecting the rows of df whose L2 value is either 'P' or 'R'. Therefore, it would not be difficult to brute-force the problem by concatenating the rows for L2='P' with those for L2='R'.
In the application I have in mind, however, such a solution becomes unwieldy, since here the L2 level has ~2000 values, and for about half of them, the minimum is greater than 0.
Therefore, I'm looking for a more scalable way to solve this selection problem.
DATA
L0 L1 L2 C
0 w P 11
0 w Q 9
0 w R 21
0 w S 4
0 x P 3
0 x Q 0
0 x R 23
0 x S 20
0 y P 19
0 y Q 0
0 y R 7
0 y S 13
0 z P 17
0 z Q 0
0 z R 5
0 z S 1
1 w P 8
1 w Q 2
1 w R 12
1 w S 0
1 x P 22
1 x Q 14
1 x R 2
1 x S 18
1 y P 6
1 y Q 0
1 y R 16
1 y S 15
1 z P 10
1 z Q 0
1 z R 8
1 z S 0