
I have a dataframe with two columns that represent coordinates and an additional column in a boolean format:

X    Y    PROB
2    4    False
3    5    False
3    2    False
4    4    True
3    7    True
2    4    False
2    3    False

What I'm trying to do is select the runs of consecutive False rows and consecutive True rows and produce two new dataframes, with PROB replaced by the number of the run each row belongs to, as follows:

For the False rows:

X   Y  PROB
2   4   1
3   5   1
3   2   1
2   4   2  
2   3   2

For the True rows:

X   Y  PROB
4   4   1
3   7   1

Right now my approach uses .isin, but I get a KeyError. Any ideas?

2 Answers


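Alternatively, you can try this. (P.S. You can drop the helper column Group afterwards with .drop(columns='Group'); the positional form .drop('Group', 1) no longer works in pandas 2.0 and later.)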

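import pandas as pd

# Give each run of consecutive equal PROB values its own id (0, 1, 2, ...)
df['Group'] = df.PROB.astype(int).diff().fillna(0).ne(0).cumsum()
df_True = df[df.PROB]
df_False = df[~df.PROB]
# Renumber the runs inside the False subset so they start at 1
df_False.assign(PROB=pd.factorize(df_False.Group)[0] + 1)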
Out[111]: 
   X  Y  PROB  Group
0  2  4     1      0
1  3  5     1      0
2  3  2     1      0
5  2  4     2      2
6  2  3     2      2

df_True.assign(PROB=pd.factorize(df_True.Group)[0]+1)
Out[112]: 
   X  Y  PROB  Group
3  4  4     1      1
4  3  7     1      1
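
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt from the question, `split_runs` is a hypothetical helper name (not a pandas function), and it uses `ne(shift())` instead of `diff()` to find where a new run starts, so treat it as one possible way to do it rather than the answer's exact code.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "X": [2, 3, 3, 4, 3, 2, 2],
    "Y": [4, 5, 2, 4, 7, 4, 3],
    "PROB": [False, False, False, True, True, False, False],
})


def split_runs(frame, flag="PROB"):
    """Return (false_rows, true_rows) with `flag` replaced by a 1-based run number.

    Hypothetical helper for illustration; assumes `flag` is a boolean column.
    """
    # A new run starts wherever the flag differs from the previous row
    run_id = frame[flag].ne(frame[flag].shift()).cumsum()
    parts = {}
    for value in (False, True):
        mask = frame[flag] == value
        subset = frame.loc[mask].copy()
        # factorize maps the surviving run ids to 0, 1, 2, ... in order of appearance
        subset[flag] = pd.factorize(run_id[mask])[0] + 1
        parts[value] = subset
    return parts[False], parts[True]


df_false, df_true = split_runs(df)
print(df_false)
print(df_true)
```

Because the run numbers are written straight back into PROB, there is no helper column to drop afterwards, and the original row index is kept in both results.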

1
d1 = df.assign(
    PROB=df.PROB.diff().fillna(False).cumsum()
).groupby(df.PROB).apply(
    lambda d: d.assign(PROB=d.PROB.factorize()[0] + 1)
)

d1

         X  Y  PROB
PROB               
False 0  2  4     1
      1  3  5     1
      2  3  2     1
      5  2  4     2
      6  2  3     2
True  3  4  4     1
      4  3  7     1

d1.xs(True)

   X  Y  PROB
3  4  4     1
4  3  7     1

d1.xs(False)

   X  Y  PROB
0  2  4     1
1  3  5     1
2  3  2     1
5  2  4     2
6  2  3     2

1 Comment

Awesome! But when I run it, the 'PROB' column shows a value of 1 for every row.
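
The thread doesn't establish why that happens. One thing that may be worth checking is `df.PROB.dtype`, since the `diff()`-based code assumes a real boolean column. As a rough sketch that avoids both `diff()` and `groupby(...).apply`, the run numbers can also be computed with a dense rank per PROB value. The column name `RUN` and the sample frame below are my own assumptions for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "X": [2, 3, 3, 4, 3, 2, 2],
    "Y": [4, 5, 2, 4, 7, 4, 3],
    "PROB": [False, False, False, True, True, False, False],
})

# Id that increases by one every time PROB changes from the previous row
run_id = df["PROB"].ne(df["PROB"].shift()).cumsum()

# Within each PROB value, rank the run ids densely: the first run gets 1,
# the second run gets 2, and so on. RUN is a hypothetical column name.
numbered = df.assign(
    RUN=run_id.groupby(df["PROB"]).rank(method="dense").astype(int)
)

d_false = numbered[~numbered["PROB"]]
d_true = numbered[numbered["PROB"]]
print(numbered)
```

This keeps the original PROB column and puts the run number next to it, which makes it easier to see at a glance whether the numbering came out as expected.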
