Replace row values with other row values from same df based on conditions

Question

I have the following dataset:

import pandas as pd

df = pd.DataFrame( {'user': {0: 1, 1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 2}, 
    'date': {0: '1995-09-01', 1: '1995-09-02', 2: '1995-10-03', 3: '1995-10-04', 4: '1995-10-05', 5: '1995-11-07', 6: '1995-11-08'}, 
    'x': {0: '1995-09-02', 1: '1995-09-02', 2: '1995-09-02', 3: '1995-10-05', 4: '1995-10-05', 5: '1995-10-05', 6: '1995-10-05'}, 
    'y': {0: '1995-10-03', 1: '1995-10-03', 2: '1995-10-03', 3: '1995-11-08', 4: '1995-11-08', 5: '1995-11-08', 6: '1995-11-08'}, 
    'c1': {0: '1', 1: '0', 2: '0', 3: '2', 4: '0', 5: '9', 6: '0'}, 
    'c2': {0: '1', 1: '0', 2: '0', 3: '2', 4: '0', 5: '9', 6: '0'}, 
    'c3': {0: '1', 1: '0', 2: '0', 3: '2', 4: '0', 5: '9', 6: '0'}, 
    'VTX1': {0: 1, 1: 0, 2: 0, 3: 1, 4: 0, 5: 0, 6: 0}, 
    'VTY1': {0: 0, 1: 1, 2: 0, 3: 0, 4: 0, 5: 1, 6: 0}} )

which gives me:

    user    date         x           y     c1   c2 c3 VTX1 VTY1
0   1   1995-09-01  1995-09-02  1995-10-03  1   1   1   1   0
1   1   1995-09-02  1995-09-02  1995-10-03  0   0   0   0   1
2   1   1995-10-03  1995-09-02  1995-10-03  0   0   0   0   0
3   2   1995-10-04  1995-10-05  1995-11-08  2   2   2   1   0
4   2   1995-10-05  1995-10-05  1995-11-08  0   0   0   0   0
5   2   1995-11-07  1995-10-05  1995-11-08  9   9   9   0   1
6   2   1995-11-08  1995-10-05  1995-11-08  0   0   0   0   0

I want to replace values in df['c1'] as follows:

- When df['date'] == df['x'],
       set df['c1'] to the value df['c1'] has on the row where df['VTX1'] == 1 (for the same user)

In this example, for user 1, df['date'] == df['x'] holds at index 1. There we want df['c1'] to be 1, because 1 is the value user 1 has in df['c1'] on the row where df['VTX1'] == 1.

So the end result would be:

   user    date          x         y       c1   c2 c3  VTX1 VTY1
0   1   1995-09-01  1995-09-02  1995-10-03  1   1   1   1   0
1   1   1995-09-02  1995-09-02  1995-10-03  1   0   0   0   1
2   1   1995-10-03  1995-09-02  1995-10-03  0   0   0   0   0
3   2   1995-10-04  1995-10-05  1995-11-08  2   2   2   1   0
4   2   1995-10-05  1995-10-05  1995-11-08  2   0   0   0   0
5   2   1995-11-07  1995-10-05  1995-11-08  9   9   9   0   1
6   2   1995-11-08  1995-10-05  1995-11-08  0   0   0   0   0

For the first condition, df['date'] == df['x'] does not seem to match any row. Can you explain how it matches the second row?
– anky, Commented Aug 24, 2021 at 16:16
In each group, can there be more than one VTX1 value equal to 1?
– Shubham Sharma, Commented Aug 25, 2021 at 8:04
@ShubhamSharma In each user group there is only one entry with VTX1 equal to 1.
– josepmaria, Commented Aug 25, 2021 at 8:44

Shubham Sharma · Accepted Answer · 2021-08-25 09:07:09Z

For each unique user, select the row where the column VTX1 has the value 1. This can be done by setting the index to user and using query to select the required rows, which gives a mapping series d from user to c1. Then mask the values in c1 where date equals x, and substitute the masked values with df['user'] mapped through d:

d = df.set_index('user').query('VTX1 == 1')['c1']
df['c1'] = df['c1'].mask(df['date'].eq(df['x']), df['user'].map(d))

   user        date           x           y c1 c2 c3  VTX1  VTY1
0     1  1995-09-01  1995-09-02  1995-10-03  1  1  1     1     0
1     1  1995-09-02  1995-09-02  1995-10-03  1  0  0     0     1
2     1  1995-10-03  1995-09-02  1995-10-03  0  0  0     0     0
3     2  1995-10-04  1995-10-05  1995-11-08  2  2  2     1     0
4     2  1995-10-05  1995-10-05  1995-11-08  2  0  0     0     0
5     2  1995-11-07  1995-10-05  1995-11-08  9  9  9     0     1
6     2  1995-11-08  1995-10-05  1995-11-08  0  0  0     0     0
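
To check the approach end to end, here is a minimal self-contained sketch on a reduced copy of the sample data (only the columns the replacement touches). The uniqueness assertion is an addition that is not part of the original answer. It guards the assumption, confirmed in the comments, that each user has exactly one row with VTX1 == 1, since the user-to-c1 mapping is only well defined in that case.

```python
import pandas as pd

# Reduced copy of the question's data: only the columns used by the replacement.
df = pd.DataFrame({
    "user": [1, 1, 1, 2, 2, 2, 2],
    "date": ["1995-09-01", "1995-09-02", "1995-10-03", "1995-10-04",
             "1995-10-05", "1995-11-07", "1995-11-08"],
    "x": ["1995-09-02"] * 3 + ["1995-10-05"] * 4,
    "c1": ["1", "0", "0", "2", "0", "9", "0"],
    "VTX1": [1, 0, 0, 1, 0, 0, 0],
})

# Mapping user -> c1, taken from the row where VTX1 == 1.
d = df.set_index("user").query("VTX1 == 1")["c1"]

# Added guard (assumption): exactly one VTX1 == 1 row per user,
# so each user maps to a single c1 value.
assert d.index.is_unique

# Replace c1 only on rows where date equals x; other rows keep their value.
df["c1"] = df["c1"].mask(df["date"].eq(df["x"]), df["user"].map(d))

print(df["c1"].tolist())  # ['1', '1', '0', '2', '2', '9', '0']
```

Rows 1 and 4 are the only ones where date equals x, so only those two c1 values change, matching the output above.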
