I'm not sure what the most efficient way to do this is, so I'll keep the question fairly broad. I want to shift a row up and combine it with the row above wherever a column is equal to a specific value. For the df below, I want to shift rows up where Value is equal to X, but I want to append the string to the one in the row above, not overwrite it.

Note: The row I want to shift up is every 14th row, so it may be easier to select every nth row and shift that up?

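import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Value' : ['Foo','X','00:00','00:00','29:00','30:00','00:00','02:00','15:00','20:00','10:00','15:00','20:00','25:00'],
    'Number' : [0,0,1,2,3,4,5,6,7,8,9,10,11,12],
    })

val = ['X']

# Keep only the cells that match val (everything else becomes NaN), then shift them up one row
a = df[df.isin(val)].shift(-1)

# Blank out the matching cells in the original frame
df[df.isin(val)] = np.nan

# Fill the gaps in a from df; this overwrites 'Foo' with 'X' instead of joining them
out_df = a.combine_first(df)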

Out:

    Value  Number
0       X     0.0
1     NaN     0.0
2   00:00     1.0
3   00:00     2.0
4   29:00     3.0
5   30:00     4.0
6   00:00     5.0
7   02:00     6.0
8   15:00     7.0
9   20:00     8.0
10  10:00     9.0
11  15:00    10.0
12  20:00    11.0
13  25:00    12.0

Intended Output:

    Value  Number
0   Foo X     0.0
2   00:00     1.0
3   00:00     2.0
4   29:00     3.0
5   30:00     4.0
6   00:00     5.0
7   02:00     6.0
8   15:00     7.0
9   20:00     8.0
10  10:00     9.0
11  15:00    10.0
12  20:00    11.0
13  25:00    12.0

1 Answer

You can try something like this:

df.groupby((df['Value'] != val[0]).cumsum())[['Value','Number']].agg({'Value':' '.join, 'Number':'sum'})

Output:

       Value  Number
Value               
1      Foo X       0
2      00:00       1
3      00:00       2
4      29:00       3
5      30:00       4
6      00:00       5
7      02:00       6
8      15:00       7
9      20:00       8
10     10:00       9
11     15:00      10
12     20:00      11
13     25:00      12
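To see why this works, it may help to build the grouping key step by step. The following is a minimal sketch on a shortened copy of the data; the names starts_group, group_id and out are only illustrative and not part of the answer above.

```python
import pandas as pd

# Shortened copy of the question's data, just enough to show the idea
df = pd.DataFrame({
    'Value': ['Foo', 'X', '00:00', '00:00', '29:00'],
    'Number': [0, 0, 1, 2, 3],
})
val = ['X']

# True for rows that start a new group, False for rows equal to the marker
starts_group = df['Value'] != val[0]

# Cumulative sum of the booleans: a marker row adds 0,
# so it keeps the same label as the row directly above it
group_id = starts_group.cumsum()
print(group_id.tolist())

# Rows sharing a label are collapsed: strings are joined, numbers are summed
out = df.groupby(group_id).agg({'Value': ' '.join, 'Number': 'sum'})
print(out)
```

Because the marker row shares a label with the row above it, ' '.join receives both strings for that group and returns 'Foo X'. Appending .reset_index(drop=True) would give a plain 0-based index if the group labels are not wanted.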

5 Comments

Wow, wonderful! Can you explain what is happening in .agg({'Value':' '.join, 'Number':'sum'})? What is this aggregate function doing?
It uses string join with a space character to aggregate the strings. Try this: ' '.join(['A','B','C'])
Got it. I printed out the group object and it is much clearer now. Thanks.
Thanks @ScottBoston. Is it also possible to pass more strings into val? E.g. ['X','Y','Z']
Yes. Use isin and the invert operator like this: df.groupby((~df['Value'].isin(['X','Y','Z'])).cumsum())[['Value','Number']].agg({'Value':' '.join, 'Number':'sum'})
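As a rough illustration of the isin variant, here is a small self-contained sketch. The sample rows ('Bar', 'Y', 'Z') and the names markers, is_marker and group_id are made up for this example and do not come from the question.

```python
import pandas as pd

# Made-up sample data with several marker values, including two in a row
df = pd.DataFrame({
    'Value': ['Foo', 'X', '00:00', 'Bar', 'Y', 'Z', '10:00'],
    'Number': [0, 0, 1, 2, 0, 0, 3],
})
markers = ['X', 'Y', 'Z']

# True where the row holds one of the marker strings
is_marker = df['Value'].isin(markers)

# Inverting the mask means only non-marker rows open a new group
group_id = (~is_marker).cumsum()

out = (
    df.groupby(group_id)
      .agg({'Value': ' '.join, 'Number': 'sum'})
      .reset_index(drop=True)
)
print(out)
```

Consecutive markers all fold into the nearest non-marker row above them, so 'Bar', 'Y', 'Z' becomes 'Bar Y Z'. If the very first row were a marker, it would have nothing above it and would end up in its own group labelled 0.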
