In the following pandas DataFrame, I want to replace each "-1" value with the value from the previous row. This is the original df:

   position  
0     0        
1     -1        
2     1
3     1
4     -1
5     0    

And I want to transform it into:

   position  
0     0        
1     0        
2     1
3     1
4     1
5     0

I'm doing it in the following way, but I think there should be faster ways, probably by vectorizing it or something like that (although I wasn't able to do it).

for i, row in self.df.iterrows():
    if row["position"] == -1:
        self.df.loc[i, "position"] = self.df.loc[i-1, "position"]

So the code works, but it seems slow. Is there any way to speed it up?

1 Answer

Use replace + ffill:

df.replace(-1, np.nan).ffill()

   position
0       0.0
1       0.0
2       1.0
3       1.0
4       1.0
5       0.0

replace converts each -1 to NaN; ffill then replaces each NaN with the last valid value above it.

Use .astype for an integer result:

df.replace(-1, np.nan).ffill().astype(int)

   position
0         0
1         0
2         1
3         1
4         1
5         0 

Don't forget to assign the result back. You could also perform the same operation on just the position column if need be:

df['position'] = df['position'].replace(-1, np.nan).ffill().astype(int)
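
For reference, here is a minimal self-contained sketch of the replace + ffill approach, assuming the example data from the question. The variable name `filled` is only illustrative.

```python
import numpy as np
import pandas as pd

# Example data from the question.
df = pd.DataFrame({"position": [0, -1, 1, 1, -1, 0]})

# Turn every -1 into NaN, then forward-fill each NaN from the last valid row.
filled = df["position"].replace(-1, np.nan).ffill()

# Caveat: if the very first row were -1 there would be nothing to fill from,
# the NaN would remain, and astype(int) would raise an error.
df["position"] = filled.astype(int)

print(df["position"].tolist())  # [0, 0, 1, 1, 1, 0]
```

Because ffill carries the last valid value forward, a run of several consecutive -1 rows is filled as well.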

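Solution using np.where (this looks only one row back, so a -1 that directly follows another -1 is not fixed; the result is float because shift introduces a NaN in the first row):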

c = df['position'] 
df['position'] = np.where(c == -1, c.shift(), c)
df

   position
0       0.0
1       0.0
2       1.0
3       1.0
4       1.0
5       0.0
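
The two approaches agree on the question's data but can differ when -1 values occur back to back. The sketch below illustrates this on a small hypothetical series (not from the question); the names `shifted` and `ffilled` are only illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two consecutive -1 values.
s = pd.Series([0, -1, -1, 1], name="position")

# np.where + shift: each -1 takes the value one row up, even if that is also -1.
shifted = pd.Series(np.where(s == -1, s.shift(), s))

# replace + ffill: each -1 takes the last value that was not -1.
ffilled = s.replace(-1, np.nan).ffill()

print(shifted.tolist())  # [0.0, 0.0, -1.0, 1.0]
print(ffilled.tolist())  # [0.0, 0.0, 0.0, 1.0]
```

If consecutive -1 rows can occur in the data, replace + ffill is probably the safer choice.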

7 Comments

@coldspeed I'm not sure what's going on, but the provided solutions aren't changing my results. If I do print df, the column still shows the old values with -1.
@churreeero This was a demonstrative example. Please assign the values back like this: df = df.replace(-1, np.nan).ffill().astype(int)
Also, I guess the replace+ffill solution affects the whole df, not only df["position"], so the np.where solution is probably better (if it works; right now it doesn't seem to, which may be my fault).
@churreeero Okay, try: df['position'] = df['position'].replace(-1, np.nan).ffill().astype(int)?
Oh, I see, shame on me! I see why the first solution wasn't working: the result wasn't assigned back to the dataframe. But I still don't get why the np.where solution is not working for me, because it is already assigned: c=self.df["position"] self.df["position"] = np.where(c == -1, c.shift(), c)