40

I have an empty dataframe.

df=pd.DataFrame(columns=['a'])

For some reason I want to generate df2, another empty dataframe, with two columns 'a' and 'b'.

If I do

df.columns=df.columns+'b'

it does not work (I get the columns renamed to 'ab') and neither does the following

df.columns=df.columns.tolist()+['b']

How can I add a separate column 'b' to df so that df.empty stays True?

Using .loc is also not possible

   df.loc[:,'b']=None

as it returns

  Cannot set dataframe with no defined index and a scalar
5
  • 3
    df2=df.copy() followed by df2['b']="" ? Commented May 16, 2018 at 13:32
  • Actually it does. But why is '' not adding one element to the index then? An empty string is still a string. Commented May 16, 2018 at 13:33
  • This is something I have been wondering myself...sorry but I don't know the answer! Commented May 16, 2018 at 13:34
  • 1
    df['b'] = None ? Commented May 16, 2018 at 13:40
  • related: stackoverflow.com/questions/30926670/… Commented May 16, 2018 at 13:44

6 Answers

56

Here are a few ways to add an empty column to an empty dataframe:

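import pandas as pd

df = pd.DataFrame(columns=['a'])
df['b'] = None                      # scalar assignment
df = df.assign(c=None)              # assign returns a new frame
df = df.assign(d=df['a'])           # copy an existing (empty) column
df['e'] = pd.Series(index=df.index, dtype=object)  # explicit dtype for the empty Series
df = pd.concat([df, pd.DataFrame(columns=list('f'))])
print(df)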

Output:

Empty DataFrame
Columns: [a, b, c, d, e, f]
Index: []

I hope it helps.
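To confirm that adding columns this way leaves the frame empty, here is a minimal sketch. The helper name `add_empty_columns` is hypothetical (it is not part of pandas); it wraps the `df[name] = None` approach from the list above and works on a copy so the original frame is untouched.

```python
import pandas as pd


def add_empty_columns(df, names):
    """Hypothetical helper: return a copy of df with extra, row-less columns."""
    out = df.copy()
    for name in names:
        # Assigning a scalar adds the column label without adding any rows.
        out[name] = None
    return out


df = pd.DataFrame(columns=['a'])
df2 = add_empty_columns(df, ['b', 'c'])

print(df2.empty)          # still True: no rows were created
print(df2.shape)          # zero rows, three columns
print(list(df.columns))   # the original frame keeps only 'a'
```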

2 Comments

See also df2 = df.join(pd.DataFrame(columns=['b'])) as per answer below.
In case you're looking to add multiple columns, in place, in a single line - I enjoy df[['c', 'd', 'e', 'f', 'g']] = [None] * 5 (the list needs one value per new column).
21

If you just do df['b'] = None then df.empty is still True and df is:

Empty DataFrame
Columns: [a, b]
Index: []

EDIT: To create an empty df2 from the columns of df and add new columns, you can do:

df2 = pd.DataFrame(columns = df.columns.tolist() + ['b', 'c', 'd'])
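As a rough illustration of why the two attempts in the question fail while the constructor approach works, here is a small sketch. It assumes a recent pandas version; the exact error message may differ between versions, so only the exception type is relied on.

```python
import pandas as pd

df = pd.DataFrame(columns=['a'])

# Index + str is element-wise string concatenation, so 'a' becomes 'ab'.
renamed = df.columns + 'b'
print(list(renamed))

# Assigning two labels to a frame that has one column is rejected.
try:
    df.columns = df.columns.tolist() + ['b']
except ValueError as exc:
    print('ValueError:', exc)

# Building a fresh frame from the combined label list avoids both problems.
df2 = pd.DataFrame(columns=df.columns.tolist() + ['b', 'c', 'd'])
print(list(df2.columns), df2.empty)
```

Assigning to df.columns only relabels existing columns; it cannot create new ones, which is why a new DataFrame (or one of the column-adding methods) is needed.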

10

If you want to add multiple columns at the same time you can also reindex.

new_cols = ['c', 'd', 'e', 'f', 'g']
df2 = df.reindex(df.columns.union(new_cols), axis=1)

#Empty DataFrame
#Columns: [a, c, d, e, f, g]
#Index: []
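One thing to be aware of: union sorts the combined labels when it can, so the original column order is not necessarily kept. The sketch below contrasts it with plain list concatenation, which keeps the order but, unlike union, does not remove duplicate names. The column names here are made up for the illustration.

```python
import pandas as pd

df = pd.DataFrame(columns=['z'])
new_cols = ['b', 'a']

# union de-duplicates and sorts the labels where possible.
sorted_df = df.reindex(df.columns.union(new_cols), axis=1)

# List concatenation preserves the order in which labels were given.
ordered_df = df.reindex(columns=df.columns.tolist() + new_cols)

print(list(sorted_df.columns))
print(list(ordered_df.columns))
print(sorted_df.empty, ordered_df.empty)
```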

3 Comments

Yeah, I like union better. It avoids the possibility of having two similarly named columns in the df
@piRSquared I think maybe using concat can combine the reindex and union
@Wen I'm sure you're right. However, that requires constructing a new dataframe simply to concat. I tend to avoid constructing new pandas objects if it isn't necessary.
6

This is one way:

df2 = df.join(pd.DataFrame(columns=['b']))

The advantage of this method is you can add an arbitrary number of columns without explicit loops.

In addition, this satisfies your requirement of df.empty evaluating to True if no data exists.
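A short sketch of the same idea with several columns at once, assuming the new names do not overlap with the existing ones (the name `extra` is just for illustration):

```python
import pandas as pd

df = pd.DataFrame(columns=['a'])

# A second empty frame carries the labels to be added.
extra = pd.DataFrame(columns=['b', 'c', 'd'])

# join returns a new frame; df itself is not modified.
df2 = df.join(extra)

print(list(df2.columns))
print(df2.empty)
print(list(df.columns))
```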

4 Comments

Why do you have to copy?
@MrR, the question states: "for some reason I want to generate df2, another empty dataframe".
df2 = df.join(pd.DataFrame(columns=['b'])) is sufficient. No need for df2 = df.copy()
Upvoted. PS: This should be added to the first answer - it's missing from that nice compendium presented there, and it's one of the most elegant ways (if not the most elegant).
4

You can use concat:

df=pd.DataFrame(columns=['a'])
df
Out[568]: 
Empty DataFrame
Columns: [a]
Index: []

df2=pd.DataFrame(columns=['b', 'c', 'd'])
pd.concat([df,df2])
Out[571]: 
Empty DataFrame
Columns: [a, b, c, d]
Index: []

0

You can simply use the following syntax:

import pandas as pd
df = pd.DataFrame(columns=['A', 'B', 'C'])
df[['D', 'E', 'F']] = None
print(df)

This creates an empty dataframe with columns 'A' to 'F', with the result below:

Empty DataFrame
Columns: [A, B, C, D, E, F]
Index: []
