All,

I have a dataframe with repeated index values. I'm trying to update the values of every row that shares a given index value. Here is an example of what I have:

  name  x
t        
0    A  5
0    B  2
1    A  7
2    A  5
2    B  9
2    C  3

"A" is present at every time. I want to replace "x" with the current value of "x" minus the value of "x" for "A" at that time. The tricky part is getting an array or dataframe that is, in this case,

array([5, 5, 7, 5, 5, 5])

which is the value for "A", repeated for every row that shares its timestamp. I can then subtract this from df['x']. My working solution is below.

temp = df[df['name'] == 'A']
d = dict(zip(temp.index, temp['x']))
df['x'] = df['x'] - df.index.to_frame()['t'].replace(d)


  name  x
t        
0    A  0
0    B -3
1    A  0
2    A  0
2    B  4
2    C -2
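
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the example frame and computes the repeated "A" array with Index.map instead of the dict/replace step. The names a_values, baseline and result are only illustrative, and the lookup assumes "A" appears exactly once per timestamp.

```python
import pandas as pd

# Rebuild the example frame from the question (index "t" has repeated labels).
df = pd.DataFrame(
    {"name": ["A", "B", "A", "A", "B", "C"], "x": [5, 2, 7, 5, 9, 3]},
    index=pd.Index([0, 0, 1, 2, 2, 2], name="t"),
)

# The "A" values, indexed by t. Assumption: exactly one "A" row per timestamp.
a_values = df.loc[df["name"] == "A", "x"]

# Look up each row's timestamp in a_values, giving one baseline value per row.
baseline = df.index.map(a_values)

# Subtract positionally so the repeated index labels never need aligning.
result = df.assign(x=df["x"] - baseline.to_numpy())
print(result)
```

Here baseline holds the array described above, and result matches the output table.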

This works, but feels a bit hacky, and I can't help but think there is a better (and much faster) solution...

2 Answers

I would use reindex:

df.x-=df.loc[df.name=='A','x'].reindex(df.index).values
df
Out[362]: 
  name  x
t        
0    A  0
0    B -3
1    A  0
2    A  0
2    B  4
2    C -2
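
The same idea as a self-contained sketch, spelled out step by step. It relies on the "A" rows having a unique index (one "A" per timestamp), since reindexing from an index with duplicate labels is not allowed; the explicit check and the variable names are additions for illustration, not part of the one-liner above.

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame(
    {"name": ["A", "B", "A", "A", "B", "C"], "x": [5, 2, 7, 5, 9, 3]},
    index=pd.Index([0, 0, 1, 2, 2, 2], name="t"),
)

# Keep only the "A" rows; their index is the set of timestamps.
a_values = df.loc[df["name"] == "A", "x"]

# reindex needs a unique source index, so fail early if "A" repeats at some t.
if not a_values.index.is_unique:
    raise ValueError('expected exactly one "A" row per timestamp')

# Reindexing onto df.index repeats each "A" value for every row with that t.
baseline = a_values.reindex(df.index)

# Use the raw values so the subtraction is positional, not label-aligned.
df["x"] = df["x"] - baseline.to_numpy()
print(df)
```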
1 Comment

This is EXACTLY what I needed. Thank you! I had only ever used reindex to reset the index to 0..N, I never knew it could be used to essentially duplicate the rows

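Group by the cumulative sum of name == 'A', so that each "A" row starts a new group, then subtract the first value in each group from the rest. This relies on "A" being the first row at every timestamp.

df['x'] = df.groupby((df.name == 'A').cumsum())['x'].transform(lambda s: s.sub(s.iloc[0]))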


  name  x
t        
0    A  0
0    B -3
1    A  0
2    A  0
2    B  4
2    C -2
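
A runnable sketch of this grouping approach, using transform("first") in place of the lambda. It assumes, as the sample data does, that the "A" row comes first within each timestamp; group_id and first_in_group are illustrative names.

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame(
    {"name": ["A", "B", "A", "A", "B", "C"], "x": [5, 2, 7, 5, 9, 3]},
    index=pd.Index([0, 0, 1, 2, 2, 2], name="t"),
)

# Every "A" row bumps the counter, so rows up to the next "A" share a group id.
# Assumption: "A" is the first row of each timestamp.
group_id = (df["name"] == "A").cumsum().to_numpy()

# Broadcast the first x of each group (the "A" value) to all rows in the group.
first_in_group = df.groupby(group_id)["x"].transform("first")

# Positional subtraction avoids aligning on the repeated index labels.
df["x"] = df["x"] - first_in_group.to_numpy()
print(df)
```

If the rows are not ordered with "A" first, the groups no longer line up with timestamps, and a lookup by index is the safer choice.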
