Pandas keeping certain rows based on strings in other rows

Question

I have the following dataframe:

+-------+------------+
| index |    keep    |
+-------+------------+
|     0 | not useful |
|     1 | start_1    |
|     2 | useful     |
|     3 | end_1      |
|     4 | not useful |
|     5 | start_2    |
|     6 | useful     |
|     7 | useful     |
|     8 | end_2      |
+-------+------------+

Two pairs of marker strings (start_1/end_1 and start_2/end_2) indicate that only the rows between each pair are relevant. For the dataframe above (reproduced in code below), the output should therefore contain only the rows at index 2, 6 and 7: 2 lies between start_1 and end_1, and 6 and 7 lie between start_2 and end_2.

import pandas as pd

d = {'keep': ["not useful", "start_1", "useful", "end_1", "not useful", "start_2", "useful", "useful", "end_2"]}
df = pd.DataFrame(data=d)

What is the most Pythonic/Pandas approach to this problem? Thanks

Roy2012 · Accepted Answer · 2020-07-22 11:25:50Z


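Here's one way to do that, split into a couple of steps for clarity (there may be others). Mark each start row with +1 and each end row with -1; the running sum is then 1 from a start row up to the row before its end marker, and 0 everywhere else: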

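# +1 on rows that open a section, -1 on rows that close one, 0 otherwise
df["sections"] = 0
df.loc[df.keep.str.startswith("start"), "sections"] = 1
df.loc[df.keep.str.startswith("end"), "sections"] = -1
# running total: 1 inside a section (start row included), 0 outside
df["in_section"] = df.sections.cumsum()
# keep rows inside a section, dropping the start markers themselves
res = df[(df.in_section == 1) & ~df.keep.str.startswith("start")]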

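Output (shown for a frame that also carries the index column from the question's table; the frame built from d has no such column):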

   index    keep  sections  in_section
2      2  useful         0           1
6      6  useful         0           1
7      7  useful         0           1
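
The same idea can also be sketched without adding helper columns to df. This is only a variant under assumptions, not a different method: it assumes every start marker has a matching end marker and that sections never nest or overlap, and the names is_start, is_end and depth are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"keep": ["not useful", "start_1", "useful", "end_1",
                            "not useful", "start_2", "useful", "useful", "end_2"]})

# Boolean masks for the marker rows (names are illustrative).
is_start = df["keep"].str.startswith("start")
is_end = df["keep"].str.startswith("end")

# +1 at each start marker, -1 at each end marker. The running total is 1
# from a start row up to the row before its end marker, and 0 elsewhere.
# Assumes well-paired, non-nested markers.
depth = is_start.astype(int).sub(is_end.astype(int)).cumsum()

# Keep rows inside a section, excluding the start markers themselves.
res = df[(depth == 1) & ~is_start]
print(res)
```

For the sample frame this selects the rows at index 2, 6 and 7, and df itself is left unchanged.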

edited Jul 22, 2020 at 11:25

answered Jul 22, 2020 at 11:19


1 Comment

Roy2012 Over a year ago

True. The result of a copy-paste. Will change.
