Replace string value with previous row value based on conditionals - Pandas

Question

I am trying to replace the value in the current row based on the previous row given that certain conditions are met.

Conditions:

Current row is 0

Previous row is C

Within Group (preferred, but will likely work without)

Example dataframe similar to mine:

ID  Week value
 4    1     W
 4    2     C
 4    3     0
 4    4     0
24    1     W
24    2     W
24    3     0
24    4     A

Example of what I need it to look like:

ID  Week value
 4    1     W
 4    2     C
 4    3     C
 4    4     C
24    1     W
24    2     W
24    3     0
24    4     A

Questions by others that I can't seem to rework, or that don't quite fit my problem:

Code to build dataframe similar to mine

import pandas as pd

df = pd.DataFrame({'ID': {0:'4', 1:'4', 2:'4', 3:'4', 4:'24', 5:'24', 6:'24', 7:'24'}, 'Week': {0:'1', 1:'2', 2:'3', 3:'4', 4: '1', 5:'2', 6:'3', 7:'4'},  'value': {0:'W', 1:'C', 2:'0', 3:'0', 4: 'W', 5:'W', 6:'0', 7:'A'} })
df[['ID', 'Week']] = df[['ID', 'Week']].astype('int')

Poorly worked attempt to solve the problem (throws errors)

for i in range(1, len(df)):
    if df.value[i] == '0' and df.value[i-1] == 'C':
         df.value[i] = 'C'
     else:
         df.value[i] = df.value[i]

ParalysisByAnalysis · Accepted Answer · 2019-09-19 23:11:06Z

2

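Usually, I would use np.where to apply a conditional to a column. However, because the condition depends on .shift(), a single pass only fills a 0 that directly follows a C, so a run of consecutive 0s is not fully filled unless the step is repeated in a for loop. A quick method is using .replace():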

for row in range(0,len(df)):
    df['value'] = df['value'].replace('0',df['value'].shift(1))

If you wish to keep the condition that the previous row is C, you could still use np.where in a similar fashion:

import numpy as np

for row in range(0, len(df)):
    df['value'] = np.where((df['value'] == '0') & (df['value'].shift(1) == 'C'), 'C', df['value'])
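One pass of the np.where expression above only fills a 0 that sits directly after a C, which is why the answer repeats it. A minimal sketch, assuming the question's example data, that repeats the pass until nothing changes instead of running a fixed len(df) iterations. The helper name fill_after_c is hypothetical and not part of pandas.

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame({
    'ID':    [4, 4, 4, 4, 24, 24, 24, 24],
    'Week':  [1, 2, 3, 4, 1, 2, 3, 4],
    'value': ['W', 'C', '0', '0', 'W', 'W', '0', 'A'],
})

def fill_after_c(values):
    """Hypothetical helper: turn each '0' that directly follows a 'C' into 'C',
    repeating the pass until it changes nothing."""
    values = values.copy()
    while True:
        mask = (values == '0') & (values.shift(1) == 'C')
        if not mask.any():
            return values
        values[mask] = 'C'

df['value'] = fill_after_c(df['value'])
print(df)
```

The loop stops as soon as no 0 is preceded by a C, so the number of passes depends on the longest run of 0s rather than on the length of the frame.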

4 Comments

Umar.H Over a year ago

nice answer, got pretty much the same but the loop threw me, am trying to think of a vectorised solution but no cigar.

Devon Oliver Over a year ago

Is this possible to do while grouping by ID?

Devon Oliver Over a year ago

Works due to current data structure, but there may be a situation where I have to group by ID

Devon Oliver Over a year ago

I added df['ID'].shift() == df['ID'] to make sure replacement does not occur outside of the grouping
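The extra condition described in this comment could look roughly like the sketch below. The data here is hypothetical (not from the question) and is chosen so that ID 4 ends in C and ID 24 starts with 0, which is the case the grouping is meant to protect.

```python
import numpy as np
import pandas as pd

# Hypothetical data: ID 4 ends in 'C' and ID 24 starts with '0'
df = pd.DataFrame({
    'ID':    [4, 4, 4, 24, 24, 24],
    'Week':  [1, 2, 3, 1, 2, 3],
    'value': ['W', 'C', '0', '0', 'C', '0'],
})

for _ in range(len(df)):
    # True only where the previous row belongs to the same ID
    same_id = df['ID'].shift() == df['ID']
    fill = same_id & (df['value'] == '0') & (df['value'].shift() == 'C')
    df['value'] = np.where(fill, 'C', df['value'])

print(df)
```

The first row of ID 24 keeps its 0 even though the row above it holds a C, because that row belongs to ID 4.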

dmontaner · Accepted Answer · 2019-09-19 23:43:09Z

1

Not easy to generalize to other situations but for your specific case you can do:

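import numpy as np

is_0 = df['value'] == '0'
# pd.np and fillna(method='ffill') are no longer available in recent pandas; use np.nan and .ffill()
is_C_block = df['value'].replace('0', np.nan).ffill() == 'C'

df.loc[is_0 & is_C_block, 'value'] = 'C'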

5 Comments

Devon Oliver Over a year ago

Is this possible to do while grouping by ID?

Devon Oliver Over a year ago

Works due to current data structure, but there may be a situation where I have to group by ID

dmontaner Over a year ago

And what will the grouping do? Some different replacement?

Devon Oliver Over a year ago

Prevent the replacement from referencing a row outside the grouping, i.e., prevent replacing the first value of individual 24 if it is 0 when the last value for individual 4 is C

dmontaner Over a year ago

you could use df['ID'].shift() != df['ID'] to include an extra condition when a new ID is starting...
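As an alternative to the shift comparison suggested here, the forward fill itself can be restricted to each ID with groupby. This is a sketch of a different technique, not the answerer's code, and it uses hypothetical data in which ID 4 ends in C and ID 24 starts with 0.

```python
import pandas as pd

# Hypothetical data: ID 4 ends in 'C' and ID 24 starts with '0'
df = pd.DataFrame({
    'ID':    [4, 4, 4, 24, 24, 24],
    'Week':  [1, 2, 3, 1, 2, 3],
    'value': ['W', 'C', '0', '0', 'C', '0'],
})

is_0 = df['value'] == '0'
# Blank out the zeros, then forward-fill within each ID so a fill never crosses groups
last_seen = df['value'].mask(is_0).groupby(df['ID']).ffill()

# Overwrite only the zeros whose last non-zero value in the same ID is 'C'
df.loc[is_0 & (last_seen == 'C'), 'value'] = 'C'
print(df)
```

Because the fill is computed per ID, no loop is needed and a leading 0 in a new ID is left untouched.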
