I am trying to replace the value in the current row based on the previous row given that certain conditions are met.

Conditions:

Current row is 0

Previous row is C

Within Group (preferred, but will likely work without)

Example dataframe similar to mine:

ID  Week value
 4    1     W
 4    2     C
 4    3     0
 4    4     0
24    1     W
24    2     W
24    3     0
24    4     A

Example of what I need it to look like:

ID  Week value
 4    1     W
 4    2     C
 4    3     C
 4    4     C
24    1     W
24    2     W
24    3     0
24    4     A

Questions by others that I can't seem to rework, or that don't quite fit my problem:

  1. conditional replace based off prior value in same column of pandas dataframe python
  2. conditional change of a pandas row, with the previous row value

Code to build dataframe similar to mine

import pandas as pd

df = pd.DataFrame({'ID': {0:'4', 1:'4', 2:'4', 3:'4', 4:'24', 5:'24', 6:'24', 7:'24'}, 'Week': {0:'1', 1:'2', 2:'3', 3:'4', 4: '1', 5:'2', 6:'3', 7:'4'},  'value': {0:'W', 1:'C', 2:'0', 3:'0', 4: 'W', 5:'W', 6:'0', 7:'A'} })
df[['ID', 'Week']] = df[['ID', 'Week']].astype('int')

Poorly worked attempt to solve the problem (throws errors)

for i in range(1, len(df)):
    if df.value[i] == '0' and df.value[i-1] == 'C':
         df.value[i] = 'C'
     else:
         df.value[i] = df.value[i]

2 Answers

2

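Usually I would use np.where to apply a condition to a column. However, .shift() only looks one row back, so a single pass fills just the first '0' after a 'C'; to carry the value through consecutive '0' rows, the assignment has to be repeated in a for loop. A quick method is using .replace():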

for row in range(0,len(df)):
    df['value'] = df['value'].replace('0',df['value'].shift(1))

If you wish to keep the condition, you can still use np.where in a similar fashion:

import numpy as np

for _ in range(len(df)):  # repeat so 'C' carries through consecutive '0' rows
    df['value'] = np.where((df['value'] == '0') & (df['value'].shift(1) == 'C'), 'C', df['value'])

4 Comments

nice answer, got pretty much the same but the loop threw me, am trying to think of a vectorised solution but no cigar.
Is this possible to do while grouping by ID?
Works due to current data structure, but there may be a situation where I have to group by ID
I added df['ID'].shift() == df['ID'] to make sure replacement does not occur outside of the grouping
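
For reference, the ID check from the last comment can be combined with the condition from the answer roughly as follows. This is only a sketch: the helper name fill_zeros_after_c is made up for illustration, the sample frame is the question's data altered so that ID 24 starts with '0' right after ID 4 ends in 'C', and the loop stops as soon as a pass changes nothing instead of running len(df) times.

```python
import pandas as pd

def fill_zeros_after_c(frame):
    """Hypothetical helper: set '0' to 'C' when the previous row of the same ID is 'C'."""
    out = frame.copy()
    while True:
        prev_is_c = out['value'].shift(1) == 'C'
        same_id = out['ID'].shift(1) == out['ID']   # False on the first row of each ID
        fill = (out['value'] == '0') & prev_is_c & same_id
        if not fill.any():
            break                                   # nothing left to propagate
        out.loc[fill, 'value'] = 'C'
    return out

# Assumed sample: like the question's frame, but ID 24 now begins with '0'
df = pd.DataFrame({
    'ID':    [4, 4, 4, 4, 24, 24, 24, 24],
    'Week':  [1, 2, 3, 4, 1, 2, 3, 4],
    'value': ['W', 'C', '0', '0', '0', 'W', '0', 'A'],
})

result = fill_zeros_after_c(df)
print(result)
```

The same_id mask is what keeps the leading '0' of ID 24 from picking up the trailing 'C' of ID 4. It assumes the rows are already sorted by ID and Week.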
1

Not easy to generalize to other situations but for your specific case you can do:

is_0 = df['value'] == '0'
# Blank out the '0' rows and forward-fill, so each '0' sees the last non-'0' value above it
is_C_block = df['value'].mask(is_0).ffill() == 'C'

df.loc[is_0 & is_C_block, 'value'] = 'C'

5 Comments

Is this possible to do while grouping by ID?
Works due to current data structure, but there may be a situation where I have to group by ID
And what will the grouping do? Some different replacement?
It prevents replacement across group boundaries, i.e., it prevents replacing the first value of individual 24, if it is 0, when the last value for individual 4 is C.
You could use df['ID'].shift() != df['ID'] as an extra condition to detect where a new ID starts.
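
Another way to keep the forward-fill idea inside each ID, instead of adding a shift-based condition, might be to forward-fill per group. The sketch below assumes a small made-up frame in which ID 4 ends in 'C' and ID 24 starts with '0'. It uses Series.mask and a groupby forward-fill, which is a swapped-in technique and not the code from the answer above.

```python
import pandas as pd

# Assumed sample data: ID 4 ends in 'C', ID 24 starts with '0'
df = pd.DataFrame({
    'ID':    [4, 4, 4, 24, 24, 24],
    'Week':  [1, 2, 3, 1, 2, 3],
    'value': ['W', 'C', '0', '0', 'W', '0'],
})

is_0 = df['value'] == '0'

# Blank out the '0' rows, then forward-fill within each ID only,
# so a '0' never sees a value from a different ID.
last_seen = df['value'].mask(is_0).groupby(df['ID']).ffill()

# A '0' becomes 'C' only if the last non-'0' value in its own ID is 'C'
df.loc[is_0 & (last_seen == 'C'), 'value'] = 'C'
print(df)
```

Because the fill is done per ID, the leading '0' of ID 24 has nothing to inherit and stays '0', and no loop is needed for runs of consecutive '0' rows.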
