How to fill values in a Dataframe depending on values around it

Question

I have a dataframe that looks something like this:

1   2  3  'String'
''  4  X  ''
''  5  X  ''
''  6  7  'String'
''  1  Y  ''

And I want to change the Xs and Ys (put here just to visualize) to the value corresponding to the same column when the last column = 'String'. So, the Xs would become a 3, and the Y would be 7:

1  2  3 'String'
'' 4  3 ''
'' 5  3 ''
'' 6  7 'String'
'' 1  7 ''

The reference value stays the same until another 'parent' row comes around. So the first 3 remains in effect until the next 'String' parent row appears.

I tried generating another dataframe containing where there's 'String' and filling from idx to idx+1 with the value, but it's too slow.

This is really similar to a forward fill (DataFrame.ffill()), but not exactly, and I don't know whether it's feasible to turn my problem into an ffill() problem.

I updated my solution; it now relies on df['D'] being 'String'.
– Aadvik, Commented Jul 30 at 16:33

Aadvik · Accepted Answer · 2025-07-30 16:31:14Z

5

Updated solution:

This can be solved with .ffill(); you just have to replace the placeholder values in column C with NaN first:

import numpy as np
df.loc[df['D'] != 'String', 'C'] = np.nan

This finds the rows where df['D'] is not 'String' and sets column C to NaN in those rows.

The last step is simple: forward-fill the column with .ffill().

df['C'] = df['C'].ffill()

Here is the final result:

>>> df
   C    D
0  3.0  String
1  3.0        
2  3.0        
3  7.0  String
4  7.0
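
If the float promotion matters, a minimal end-to-end sketch of the same idea might look like the following. The toy data and the 0 placeholders are assumptions made for illustration, and the final cast assumes the first row is a 'String' row, so that no NaN survives the fill.

```python
import numpy as np
import pandas as pd

# Toy frame mirroring the question; the 0s stand in for the X/Y placeholders.
df = pd.DataFrame({
    "C": [3, 0, 0, 7, 0],
    "D": ["String", "", "", "String", ""],
})

# Work on a float copy so NaN can be stored without surprising the int column.
filled = df["C"].astype("float64")
filled.loc[df["D"] != "String"] = np.nan   # blank out every non-parent row
filled = filled.ffill()                    # carry the last parent value downward

# Assumption: row 0 is a parent row, so nothing is left as NaN and int is safe.
df["C"] = filled.astype("int64")
print(df)
```

If the frame could start with non-parent rows, the cast to int64 would fail on the leading NaN values, so they would need to be handled first.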

answered Jul 30 at 16:31

Aadvik


4 Comments

chrslg Jul 30 at 16:45

Works too, assuming floats are OK, since assigning NaN to an int column promotes it to float. Most of the time that should not be a problem, since float64 represents integers exactly up to 2^53 (about 9 quadrillion).

Aadvik Jul 30 at 16:48

If that is a big problem, it's possible to convert the floats back to int afterwards.

Lucas P Jul 30 at 17:44

I used this one and it was really quick, thank you!

Aadvik Aug 24 at 1:54

No problem! Glad to help!

Aadvik · Accepted Answer · 2025-07-30 16:20:14Z

Starting from this example:

import pandas as pd
df=pd.DataFrame({"C":[3, 'X', 'X', 7, 'Y'], 'D':['String', '', '', 'String', '']}) # my own [mre]. You should have included that line in your question ;-)

So df is

   C       D
0  3  String
1  X        
2  X        
3  7  String
4  Y

(don't worry, the X and Y have no influence on the result. I just included them to imitate your example)

What you are looking for is probably something like:

df['C'] = df.groupby((df['D']=='String').cumsum())['C'].transform('first')

Result:

   C       D
0  3  String
1  3        
2  3        
3  7  String
4  7

To understand it, it is worth looking at what (df['D']=='String').cumsum() does. df['D']=='String' is just a boolean series (True where the last column is 'String', False elsewhere). But if you apply .cumsum() to such a series, it behaves as if True is 1 and False is 0. So what you get is a counter that is incremented each time there is a 'String' and stays as is otherwise. So

>>> (df['D']=='String').cumsum()
0    1
1    1
2    1
3    2
4    2
Name: D, dtype: int64

Which is exactly what you need to group your rows by, to have one group for each row with 'String' and all following rows without (till the next 'String').

Now just transform C to take the first value of each group, and voilà:

df['C'] = df.groupby((df['D']=='String').cumsum())['C'].transform('first')

>>> df
   C       D
0  3  String
1  3        
2  3        
3  7  String
4  7
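
One edge case worth a quick sketch: rows that come before the first 'String' get group id 0 and form a group of their own, so they take the first value of that leading group instead of a parent value. The frame below is hypothetical and only meant to illustrate this behavior.

```python
import pandas as pd

# Hypothetical frame with one row *before* the first 'String' parent.
df = pd.DataFrame({
    "C": [9, 3, 0, 7, 0],
    "D": ["", "String", "", "String", ""],
})

# Counter that increases at every parent row; leading rows get 0.
group_id = (df["D"] == "String").cumsum()

# Broadcast the first value of each group to all rows of that group.
df["C"] = df.groupby(group_id)["C"].transform("first")
print(group_id.tolist())
print(df["C"].tolist())
```

Here the leading row keeps its own value because it is the first (and only) entry of group 0. Note also that 'first' picks the first non-null entry of each group, which matters if C can contain NaN.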

PaulS · Accepted Answer · 2025-07-30 17:00:40Z

2

Another possible solution:

df = df.assign(C = df['C'].where(df['D'].eq('String')).ffill().astype(int))

This creates a new version of df in which column C is rebuilt from the values in the 'String' rows only. df['D'].eq('String') identifies which entries in column D are 'String'. where() keeps C in those rows and replaces it with NaN everywhere else, and ffill() then propagates the last kept value downward, filling each non-'String' row with the value from the most recent 'String' row above it. Finally, astype(int) converts the result back to integers.

Output:

    A  B  C         D
0   1  2  3    String
1  ''  4  3        ''
2  ''  5  3        ''
3  ''  6  7    String
4  ''  1  7        ''
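
Note that astype(int) raises if any NaN is left, which would happen if the frame started with rows that precede the first 'String'. A possible variant for that situation, sketched here on a hypothetical frame, uses the nullable Int64 dtype so that the unfilled rows stay missing instead of causing an error.

```python
import pandas as pd

# Hypothetical frame: row 0 has no 'String' parent above it.
df = pd.DataFrame({
    "C": [9, 3, "X", 7, "Y"],
    "D": ["", "String", "", "String", ""],
})

kept = df["C"].where(df["D"].eq("String"))   # NaN everywhere except parent rows
filled = pd.to_numeric(kept).ffill()         # float64; row 0 has nothing to fill from
df["C"] = filled.astype("Int64")             # nullable integers keep the gap as <NA>
print(df)
```

Whether the leading rows should stay missing or receive some default is a design choice that depends on the data; this sketch simply leaves them as <NA>.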

edited Jul 30 at 17:00

answered Jul 30 at 16:16

PaulS


3 Comments

chrslg Jul 30 at 16:36

+1. The downvote seems severe. If C is not a string column it doesn't work, sure (though the OP's X and Y could be read as string values in an otherwise numeric column). Since it is not, the solution could still work with df['D']=='String' instead of df['C'].str.isnumeric()

PaulS Jul 30 at 16:40

Thanks, I am updating my solution accordingly.

Aadvik Jul 30 at 16:47

It works, but you should update the solution to assign the result back to df.

mozway · Accepted Answer · 2025-07-31 13:43:01Z

1

You can select the wanted rows with boolean indexing and reindex with method='ffill':

df['C'] = df.loc[df['D'].eq('String'), 'C'].reindex(df.index, method='ffill')
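
As a self-contained sketch of the reindex approach (the toy data and the 0 placeholders are assumptions): the parent rows are selected first, then stretched back over the full index. reindex with a fill method requires a monotonic index, which the default RangeIndex used here satisfies.

```python
import pandas as pd

# Toy frame; the 0s stand in for the values to be overwritten.
df = pd.DataFrame({
    "C": [3, 0, 0, 7, 0],
    "D": ["String", "", "", "String", ""],
})

# Keep only the parent rows: a Series indexed by 0 and 3.
parents = df.loc[df["D"].eq("String"), "C"]

# Expand back to every row label, filling each gap from the previous parent.
df["C"] = parents.reindex(df.index, method="ffill")
print(df)
```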

Alternatively, for fun, and assuming you have an ordered index, you could select the rows to propagate with boolean indexing and combine them with the original input using a merge_asof on the index:

df['C'] = pd.merge_asof(df[[]], df.loc[df['D'].eq('String'), 'C'],
                        left_index=True, right_index=True)

Or as a new DataFrame:

out = (pd.merge_asof(df.drop(columns='C'),
                     df.loc[df['D'].eq('String'), 'C'],
                     left_index=True, right_index=True)
         .reindex_like(df)
      )

Output:

   A  B  C       D
0  1  2  3  String
1     4  3        
2     5  3        
3     6  7  String
4     1  7
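
For reference, a runnable sketch of the merge_asof variant on a small frame could look like this. The data is made up, and the parent rows are passed as a one-column DataFrame here rather than a Series. With the default direction='backward', each row is matched to the last parent row whose index is less than or equal to its own.

```python
import pandas as pd

# Toy frame with a sorted default index, as merge_asof requires sorted keys.
df = pd.DataFrame({
    "B": [2, 4, 5, 6, 1],
    "C": [3, 0, 0, 7, 0],
    "D": ["String", "", "", "String", ""],
})

# One-column frame holding only the parent rows (index 0 and 3).
parents = df.loc[df["D"].eq("String"), ["C"]]

# As-of join on the index: every row picks up the most recent parent's C.
out = pd.merge_asof(df.drop(columns="C"), parents,
                    left_index=True, right_index=True)
print(out)
```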

edited Jul 31 at 13:43

answered Jul 31 at 7:27

mozway


1 Comment

Aadvik Aug 26 at 2:13

A very creative solution!
