Insert rows into a pandas DataFrame based on a condition?

Question

I'm using a pandas DataFrame to read a .csv file. I would like to insert a row whenever the value in a specific column changes from one value to another. My data looks like this:

Id   type
 1    car
 1  track
 2  train
 2  plane
 3    car

I need to add a row with an empty Id and the number 4 as the type value after every change in the Id column. My desired output should look like this:

Id   type
 1    car
 1  track
        4
 2  train
 2  plane
        4
 3    car

How can I do this?

Hi, what was your attempt? Where did it fail (if at all)? If you provided that, maybe the answerers could also point out what's missing about it and/or build their solutions on top of that instead of coming up with possibly unreadably complex answers (at least half of the answers here are as such, for example).
– user17693816, Commented Jan 7, 2022 at 14:09

mozway · Accepted Answer · 2022-01-07 13:55:55Z

1

You could use groupby to split the data into groups, append the new row to each group in a list comprehension, and then merge everything again with concat:

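# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
row = pd.Series([None, 4], index=['Id', 'type']).to_frame().T
df2 = pd.concat([pd.concat([d, row], ignore_index=True)
                 for _, d in df.groupby('Id')], ignore_index=True).iloc[:-1]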

If the index is sorted, another option is to find the index of the last item per group and use it to generate the new rows:

# get index of last item per group (except last)
idx = df.index.to_series().groupby(df['Id']).last().values[:-1]

# craft a DataFrame with the new rows
d = pd.DataFrame([[None, 4]]*len(idx), columns=df.columns, index=idx)

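# concatenate and reorder; a stable sort keeps each original row
# ahead of the new row that shares its index label
pd.concat([df, d]).sort_index(kind='mergesort').reset_index(drop=True)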

output:

    Id   type
0  1.0    car
1  1.0  track
2  NaN    4.0
3  2.0  train
4  2.0  plane
5  NaN    4.0
6  3.0    car
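
As a self-contained sketch of the second approach, the snippet below rebuilds the question's sample data inline and wraps the steps in a function. The helper name `insert_separator_rows` and its parameters are hypothetical, chosen only for illustration, and a stable sort is requested explicitly so that rows sharing an index label keep their relative order.

```python
import pandas as pd

# Sample data from the question, built inline instead of read from a file
df = pd.DataFrame({'Id': [1, 1, 2, 2, 3],
                   'type': ['car', 'track', 'train', 'plane', 'car']})


def insert_separator_rows(frame, key='Id', value_col='type', value=4):
    """Hypothetical helper: add a (NaN, value) row after every group but the last."""
    # index label of the last row of each group, dropping the final group
    idx = frame.index.to_series().groupby(frame[key]).last().values[:-1]
    # the new rows reuse those labels so that sorting puts them right after
    new_rows = pd.DataFrame({key: float('nan'), value_col: value}, index=idx)
    # mergesort is stable: each original row stays ahead of its new row
    return (pd.concat([frame, new_rows])
              .sort_index(kind='mergesort')
              .reset_index(drop=True))


result = insert_separator_rows(df)
print(result)
```

On the sample data this places a row with a missing Id and type 4 after the second and fourth original rows. As in the answer, it assumes the index is sorted and that equal Ids are adjacent.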

edited Jan 7, 2022 at 13:55

answered Jan 7, 2022 at 13:42


1 Comment

فيصل عبد الله عزيز Over a year ago

Thanks for all answers

Serge de Gosson de Varennes · Accepted Answer · 2022-01-07 13:42:48Z

0

You can do this:

df = pd.read_csv('input.csv', sep=";")
   Id   type
0   1    car
1   1  track
2   2  train
3   2  plane
4   3    car

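# True on the last row of each run of equal Ids (and on the final row)
mask = df['Id'].ne(df['Id'].shift(-1))
# new rows get fractional labels so they sort between the existing rows
df1 = pd.DataFrame({'Id': float('nan'), 'type': 4}, index=mask.index[mask] + .5)
# merge, reorder, renumber, and drop the row added after the final group
df = pd.concat([df, df1]).sort_index().reset_index(drop=True).iloc[:-1]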

which gives:

    Id   type
0  1.0    car
1  1.0  track
2  NaN      4
3  2.0  train
4  2.0  plane
5  NaN      4
6  3.0    car
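
To see what the mask selects and why the trailing `.iloc[:-1]` is needed, here is a minimal sketch on the question's sample data. The frame is built inline rather than read from `input.csv`, which is an assumption made to keep the snippet runnable.

```python
import pandas as pd

df = pd.DataFrame({'Id': [1, 1, 2, 2, 3],
                   'type': ['car', 'track', 'train', 'plane', 'car']})

# True where the next row has a different Id; the final row is always True
# because shift(-1) leaves NaN in the last position
mask = df['Id'].ne(df['Id'].shift(-1))
print(mask.tolist())       # [False, True, False, True, True]

# fractional labels fall between the existing integer labels
new_index = mask.index[mask] + 0.5
print(new_index.tolist())  # [1.5, 3.5, 4.5]
```

The label 4.5 belongs to the final row, so one extra row ends up at the bottom after sorting; dropping the last row removes it.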

answered Jan 7, 2022 at 13:42


Mayank Porwal · Accepted Answer · 2022-01-07 13:44:52Z

0

You can do:

In [244]: grp = df.groupby('Id')
In [256]: res = pd.DataFrame()

In [257]: for x,y in grp:
     ...:     if y['type'].count() > 1:
     ...:         # DataFrame.append was removed in pandas 2.0; use pd.concat
     ...:         tmp = pd.concat([y, pd.DataFrame({'Id': [''], 'type':[4]})])
     ...:         res = pd.concat([res, tmp])
     ...:     else:
     ...:         res = pd.concat([res, y])
     ...: 

In [258]: res
Out[258]: 
  Id   type
0  1    car
1  1  track
0         4
2  2  train
3  2  plane
0         4
4  3    car

answered Jan 7, 2022 at 13:44


2 Comments

فيصل عبد الله عزيز Over a year ago

Please, can I do this without groupby? I need the data to stay in its original order (not sorted).

Mayank Porwal Over a year ago

But why do you not want to use groupby? In your sample dataframe, groupby looks mandatory.
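
On the ordering concern raised in these comments: `groupby` sorts the group keys by default, and passing `sort=False` keeps them in order of first appearance instead. The sketch below uses a hypothetical unsorted input, not the asker's real file, to show the difference.

```python
import pandas as pd

# Hypothetical unsorted input: Ids appear in the order 2, 1, 3
df = pd.DataFrame({'Id': [2, 2, 1, 1, 3],
                   'type': ['train', 'plane', 'car', 'track', 'car']})

# default: groups come back sorted by key
order_default = [int(key) for key, _ in df.groupby('Id')]
# sort=False: groups come back in order of first appearance
order_unsorted = [int(key) for key, _ in df.groupby('Id', sort=False)]

print(order_default)   # [1, 2, 3]
print(order_unsorted)  # [2, 1, 3]
```

Either way, rows that share an Id but are not adjacent still land in the same group, so a shift-based comparison of neighbouring rows is the safer choice when the row order must be left untouched.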

Ashwiniku918 · Accepted Answer · 2022-01-07 13:57:50Z

Here is a solution that uses the index:

# Create a shifted copy of Id to compare each row with the previous one
df['idshift'] = df['Id'].shift(1)
# Rows where the shifted Id differs from Id mark the start of a new group
change_index = df.index[df['idshift'] != df['Id']].tolist()
# Skip the first entry (row 0 always differs) and insert a row before each change
for i in change_index[1:]:
    line = pd.DataFrame({"Id": ' ', "rate": 4}, index=[(i - 1) + .5])
    # DataFrame.append was removed in pandas 2.0; use pd.concat
    df = pd.concat([df, line])
# Finally, sort the index once, after the loop, and renumber the rows
df = df.sort_index().reset_index(drop=True)

Input DataFrame:

df = pd.DataFrame({'Id': [1,1,2,2,3,3,3,4],'rate':[1,2,3,10,12,16,10,12]})

Output from the code:
