How can I add an empty row before a specific row in a pandas DataFrame?

Question

I'm working with a huge DataFrame in Python, and sometimes I need to add one or several empty rows at a specific position. For this question I created a small DataFrame df to show what I want to achieve.

import pandas as pd

cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
        'Price': [22000,25000,27000,35000]
        }

df = pd.DataFrame(cars, columns = ['Brand', 'Price'])

If a row's Price value is 27000, I want to add an empty row before it. I can insert a row after it with concat, but I can't think of a way to add one before it.

anky · Accepted Answer · 2021-04-12 17:01:22Z

3

You can build a helper key with cumsum for groupby, append a blank row to the first group only, and then concat the groups:

out = pd.concat((g.append(pd.Series(),ignore_index=True) if i==0 else g 
       for i, g in df.groupby(df['Price'].eq(27000).cumsum())))

print(out)

            Brand    Price
0     Honda Civic  22000.0
1  Toyota Corolla  25000.0
2             NaN      NaN
2      Ford Focus  27000.0
3         Audi A4  35000.0
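
Note that DataFrame.append was removed in pandas 2.0, so the snippet above only runs on older versions. A minimal sketch of the same groupby/cumsum idea using only pd.concat might look like the following. The names key, blank and pieces are illustrative and not from the original answer, and this version renumbers the result 0..4 instead of keeping the duplicated label 2:

```python
import numpy as np
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])

# Group key: 0 for rows before the first 27000, 1 from that row onward.
key = df['Price'].eq(27000).cumsum()

# A single all-NaN row with the same columns as df (illustrative helper).
blank = pd.DataFrame(np.nan, index=[0], columns=df.columns)

pieces = []
for i, g in df.groupby(key):
    pieces.append(g)          # list.append, not DataFrame.append
    if i == 0:
        pieces.append(blank)  # blank row goes right after the first group

out = pd.concat(pieces, ignore_index=True)
print(out)
```

As in the original, a blank row is only added after group 0, so nothing is inserted if the very first row already matches.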

ALollz · Accepted Answer · 2021-04-12 17:26:51Z

2

Create an all-null DataFrame whose index labels are those of the rows matching your condition (this assumes df has a non-duplicated index). Then concat it with df and call sort_index, which places each blank row before its match because empty comes first in the concat. Finally, reset_index removes the duplicate index labels.

import pandas as pd

empty = pd.DataFrame(columns=df.columns, index=df[df.Price.eq(27000)].index)
df = pd.concat([empty, df]).sort_index().reset_index(drop=True)
#            Brand  Price
#0     Honda Civic  22000
#1  Toyota Corolla  25000
#2             NaN    NaN
#3      Ford Focus  27000
#4         Audi A4  35000

This adds a blank row before every row where Price is 27000:

cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4','Jeep'],
        'Price': [22000,25000,27000,35000,27000]}
df = pd.DataFrame(cars, columns = ['Brand', 'Price'])

empty = pd.DataFrame(columns=df.columns, index=df[df.Price.eq(27000)].index)
df = pd.concat([empty, df]).sort_index().reset_index(drop=True)
#            Brand  Price
#0     Honda Civic  22000
#1  Toyota Corolla  25000
#2             NaN    NaN
#3      Ford Focus  27000
#4         Audi A4  35000
#5             NaN    NaN
#6            Jeep  27000
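
The approach above relies on sort_index keeping rows with equal labels in their concat order. As far as I know that is only guaranteed for a stable sort, so a more defensive sketch passes kind='mergesort' explicitly. The helper below also covers the "several rows" case by repeating each matching label n times. The function name insert_blank_before and its mask and n parameters are hypothetical, not part of pandas or of this answer:

```python
import numpy as np
import pandas as pd


def insert_blank_before(df, mask, n=1):
    """Return a copy of df with n all-NaN rows before each row where mask is True.

    Assumes df has a non-duplicated index, as in the answer above.
    """
    # Labels of the matching rows, each repeated n times.
    labels = df.index[mask.to_numpy()].repeat(n)
    empty = pd.DataFrame(np.nan, index=labels, columns=df.columns)
    # empty comes first, so a stable sort keeps the blanks ahead of their match.
    out = pd.concat([empty, df]).sort_index(kind='mergesort')
    return out.reset_index(drop=True)


cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4', 'Jeep'],
        'Price': [22000, 25000, 27000, 35000, 27000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])

out = insert_blank_before(df, df['Price'].eq(27000), n=2)
print(out)
```

With n=2 this yields two blank rows before Ford Focus and two before Jeep; leaving n at its default gives one blank row per match.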

edited Apr 12, 2021 at 17:26

answered Apr 12, 2021 at 17:15

3 Comments

anky Over a year ago

Good one! Visionary (: I didn't think that the price could repeat :/

ALollz Over a year ago

@anky yeah I wasn't sure if "several rows" meant several rows in one position or several single rows in many positions. ¯_(ツ)_/¯

anky Over a year ago

Practically speaking, this makes more sense than my answer :)

Shubham Sharma · Accepted Answer · 2021-04-12 17:05:43Z

2

Let us try cummax with append:

m = df['Price'].eq(27000).cummax()
df[~m].append(pd.Series(), ignore_index=True).append(df[m])

            Brand    Price
0     Honda Civic  22000.0
1  Toyota Corolla  25000.0
2             NaN      NaN
2      Ford Focus  27000.0
3         Audi A4  35000.0
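
For readers on pandas 2.0 or later, where DataFrame.append no longer exists, here is a hedged sketch of the same cummax split written with pd.concat. It also prints the mask to show what cummax does: the mask turns True at the first match and stays True, so exactly one blank row is inserted even if 27000 appears again later. The blank frame is my own construction, not part of the answer, and the result is renumbered 0..4:

```python
import numpy as np
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])

# True from the first row where Price == 27000 through the end of the frame.
m = df['Price'].eq(27000).cummax()
print(m.tolist())  # [False, False, True, True]

# A single all-NaN row with the same columns as df (illustrative helper).
blank = pd.DataFrame(np.nan, index=[0], columns=df.columns)

# Rows before the match, then the blank row, then the match and everything after.
out = pd.concat([df[~m], blank, df[m]], ignore_index=True)
print(out)
```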

Anurag Dabas · Accepted Answer · 2021-04-12 17:58:18Z

You can also do this with concat() and apply():

import numpy as np
result = pd.concat((df.apply(lambda x: np.nan if x['Price'] == 27000 else x, axis=1), df))

Finally, chain sort_index(), drop_duplicates() and reset_index():

result=result.sort_index(na_position='first').drop_duplicates().reset_index(drop=True)

Now if you print result you will get your desired output:

    Brand           Price
0   Honda Civic     22000.0
1   Toyota Corolla  25000.0
2   NaN             NaN
3   Ford Focus      27000.0
4   Audi A4         35000.0

This will add a blank row before every row where Price is 27000 (written with pd.concat, since DataFrame.append was removed in pandas 2.0):

result = pd.concat((df.apply(lambda x: np.nan if x['Price'] == 27000 else x, axis=1), df))

result = pd.concat((result.drop_duplicates(), result[result.isna().all(axis=1)].iloc[1:])).sort_index(na_position='first').reset_index(drop=True)
