
I have the following dataframe:

df=pd.DataFrame({'seq':[0,1,2,3,4,5], 'location':['cal','cal','cal','il','il','il'],'lat':[29,29.1,28.2,15.2,15.6,14], 'lon':[-95,-98,-95.6,-88, -87.5,-88.9], 'name': ['mike', 'john', 'tyler', 'rob', 'ashley', 'john']})

I am wondering if there is a way to insert a new row at the beginning of the dataframe even though some fields may be missing in the new row.

I searched SO and found a related link: add a row at top in pandas dataframe

However, my situation is different in that I don't have values for all the fields in the new row I am inserting. The following link solves the same issue, but in R: Inserting rows into data frame when values missing in category

How may I insert the following row in the above df? {'location' : 'warehouse', 'lat': 22, 'lon': -50}

My desired output is the following:

   seq   location   lat   lon    name
0       warehouse  22.0 -50.0
1  0.0        cal  29.0 -95.0    mike
2  1.0        cal  29.1 -98.0    john
3  2.0        cal  28.2 -95.6   tyler
4  3.0         il  15.2 -88.0     rob
5  4.0         il  15.6 -87.5  ashley
6  5.0         il  14.0 -88.9    john

The number of columns in my actual dataframe is quite large, so it is not feasible to insert np.nan for each column manually. I am looking for a way to specify only the fields with their values and have the remaining fields populated with NaN.

2 Comments
  • Insert None or np.nan for the missing values.
  • Hello @G.Anderson, I am showing only a representative dataframe. My actual dataframe has over 300 columns, so I wanted to see if there is an easier way to add the row rather than specifying np.nan for each missing field.

2 Answers


Try this:

import pandas as pd
import numpy as np
df=pd.DataFrame({'seq':[0,1,2,3,4,5], 'location':['cal','cal','cal','il','il','il'],'lat':[29,29.1,28.2,15.2,15.6,14], 'lon':[-95,-98,-95.6,-88, -87.5,-88.9], 'name': ['mike', 'john', 'tyler', 'rob', 'ashley', 'john']})

df_new1 = pd.DataFrame({'location' : ['warehouse'], 'lat': [22], 'lon': [-50]}) # sample data row1
df = pd.concat([df_new1, df], sort=False).reset_index(drop = True)
print(df) 

df_new2 = pd.DataFrame({'location' : ['abc'], 'lat': [28], 'name': ['abcd']}) # sample data row2
df = pd.concat([df_new2, df], sort=False).reset_index(drop = True) 
print(df)

output:

    lat   location   lon    name  seq
0  22.0  warehouse -50.0     NaN  NaN
0  29.0        cal -95.0    mike  0.0
1  29.1        cal -98.0    john  1.0
2  28.2        cal -95.6   tyler  2.0
3  15.2         il -88.0     rob  3.0
4  15.6         il -87.5  ashley  4.0
5  14.0         il -88.9    john  5.0

    lat   location    name   lon  seq
0  28.0        abc    abcd   NaN  NaN
1  22.0  warehouse     NaN -50.0  NaN
2  29.0        cal    mike -95.0  0.0
3  29.1        cal    john -98.0  1.0
4  28.2        cal   tyler -95.6  2.0
5  15.2         il     rob -88.0  3.0
6  15.6         il  ashley -87.5  4.0
7  14.0         il    john -88.9  5.0
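
With this approach the missing fields (seq and name here) are filled with NaN automatically, because concat aligns the frames on their column names. Note that the column order of the result follows the frames passed to concat rather than the original df; if you need the original order back, you can reindex the columns afterwards. A minimal sketch (the explicit column list is just the example df's columns, my addition rather than part of the answer):

cols = ['seq', 'location', 'lat', 'lon', 'name']  # original column order of df
df = df[cols]  # reorder columns; the values are unchanged
print(df)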

1 Comment

Thank you Anubhav. Exactly what I was looking for :)

You can first transform your dict to a dict of lists:

dic = {k: [v] for k, v in dic.items()}

And then

df = pd.concat([pd.DataFrame(dic), df])
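
Putting both steps together for the row from the question; a minimal sketch (the new_row variable name and the reset_index call are my additions, not part of the original answer):

import pandas as pd

df = pd.DataFrame({'seq': [0, 1, 2, 3, 4, 5],
                   'location': ['cal', 'cal', 'cal', 'il', 'il', 'il'],
                   'lat': [29, 29.1, 28.2, 15.2, 15.6, 14],
                   'lon': [-95, -98, -95.6, -88, -87.5, -88.9],
                   'name': ['mike', 'john', 'tyler', 'rob', 'ashley', 'john']})

new_row = {'location': 'warehouse', 'lat': 22, 'lon': -50}

# Wrap each scalar in a one-element list so the dict can be used as DataFrame data.
dic = {k: [v] for k, v in new_row.items()}

# Columns missing from dic ('seq', 'name') are filled with NaN when the frames are aligned.
df = pd.concat([pd.DataFrame(dic), df], sort=False).reset_index(drop=True)
print(df)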
