How to create new columns using .loc syntax?

Question

I have a list of names of columns (cols) that exist in one dataframe.

I want to insert columns by those names in another dataframe.

So I am using a for loop to iterate over the list and create the columns one by one:

cols = ['DEPTID', 'ACCOUNT', 'JRNL LINE DESCRIPTION', 'JRNL DATE', 'BASE AMOUNT', 'FOREIGN CURRENCY', 'FOREIGN AMOUNT', 'JRNL SOURCE']
for col in cols:
    # "summary" and "obiee" are dataframes
    summary.loc[obiee['mapid'], col] = obiee[col].tolist()

I would like to get rid of the for loop, however.

So I have tried multiple column assignment using the .loc syntax:

cols = ['DEPTID', 'ACCOUNT', 'JRNL LINE DESCRIPTION', 'JRNL DATE', 'BASE AMOUNT', 'FOREIGN CURRENCY', 'FOREIGN AMOUNT', 'JRNL SOURCE']
summary.loc[obiee['mapid'], cols] = obiee[cols]

but pandas raises an error:

KeyError: "['DEPTID' 'ACCOUNT' 'JRNL LINE DESCRIPTION' 'JRNL DATE' 'BASE AMOUNT'\n 'FOREIGN CURRENCY' 'FOREIGN AMOUNT' 'JRNL SOURCE'] not in index"

Is it not possible with this syntax? How can I do this otherwise?

I don't think you can create new columns with the .loc approach. pandas.pydata.org/pandas-docs/stable/reference/api/… says it is label based, so those columns must already be there. What you could do is pre-allocate those columns and you're good: `for col in cols: summary[col] = None`, or you can use the `assign` method. As long as the columns exist you should be able to use .loc.
– Buckeye14Guy, Commented Oct 11, 2019 at 14:19
@Buckeye14Guy: Thanks! Regarding "I don't think you can create new columns with the .loc approach": note that I am able to do this in my example with a single column label, but not with a list of labels (hence my question).
– barciewicz, Commented Oct 11, 2019 at 14:41
For many columns, even intuitively, you want to merge, concat or join the dataframes.
– rafaelc, Commented Oct 11, 2019 at 14:41
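
The pre-allocation idea from the first comment can be sketched roughly as follows. The two small dataframes are made-up stand-ins for summary and obiee, and the call to .to_numpy() is an assumption added here: without it, .loc would try to align obiee's own index with the 'mapid' labels instead of assigning row by row.

```python
import pandas as pd

# Hypothetical stand-ins for the question's dataframes.
summary = pd.DataFrame({'total': [100, 200, 300]}, index=['r1', 'r2', 'r3'])
obiee = pd.DataFrame({
    'mapid': ['r3', 'r1'],
    'DEPTID': [10, 20],
    'ACCOUNT': ['A-1', 'B-2'],
})
cols = ['DEPTID', 'ACCOUNT']

# Pre-allocate the target columns so that .loc can find the labels.
for col in cols:
    summary[col] = None

# Assign positionally: row i of obiee[cols] goes to the row labelled obiee['mapid'][i].
summary.loc[obiee['mapid'], cols] = obiee[cols].to_numpy()
print(summary)
```

Rows of summary that have no matching 'mapid' keep the placeholder value.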

piRSquared · Accepted Answer · 2019-10-11 14:33:17Z

`join`

You can create a new dataframe and then join. From your problem description and sample code, 'mapid' represents index values in the summary dataframe, and join is designed to merge on the index. So by setting obiee's index to 'mapid' and then taking the appropriate columns, we can just use join.

summary.join(obiee.set_index('mapid')[cols])
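
A minimal, self-contained sketch of the same idea, using small made-up dataframes in place of summary and obiee (the sample values and the 'total' column are assumptions for illustration only). Note that join returns a new dataframe, so the result has to be assigned back.

```python
import pandas as pd

# Hypothetical stand-ins for the question's dataframes.
summary = pd.DataFrame({'total': [100, 200, 300]}, index=['r1', 'r2', 'r3'])
obiee = pd.DataFrame({
    'mapid': ['r3', 'r1'],
    'DEPTID': [10, 20],
    'ACCOUNT': ['A-1', 'B-2'],
})
cols = ['DEPTID', 'ACCOUNT']

# join does not modify summary in place; keep the returned dataframe.
summary = summary.join(obiee.set_index('mapid')[cols])
print(summary)
```

Rows of summary without a matching 'mapid' get NaN in the new columns. If 'mapid' contains duplicates, the matching rows of summary are repeated in the result, so it may be worth checking uniqueness first.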


Yatish Kadam · Accepted Answer · 2019-10-11 14:24:23Z

You have a dataframe df1 with some columns, and you want those columns in df2. All you need to do is assign them, as shown below:

import numpy as np
import pandas as pd

df2 = pd.DataFrame({'A': 1.,
                    'B': pd.Timestamp('20130102'),
                    'C': pd.Series(1, index=list(range(4)), dtype='float32'),
                    'D': np.array([3] * 4, dtype='int32'),
                    'E': pd.Categorical(["test", "train", "test", "train"]),
                    'F': 'foo'})
df1 = pd.DataFrame({'G': 1.,
                    'H': pd.Timestamp('20130102'),
                    'I': pd.Series(1, index=list(range(4)), dtype='float32'),
                    'J': np.array([3] * 4, dtype='int32'),
                    'K': pd.Categorical(["test", "train", "test", "train"]),
                    'L': 'foo'})
# Copy columns G and H from df1 into df2 (values are aligned on the index)
df2['G'], df2['H'] = df1['G'], df1['H']
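
One caveat worth illustrating: assigning a column from another dataframe aligns on the index, so this only works as expected when both dataframes share index labels. A small sketch with made-up dataframes (left and right are hypothetical names, not from the question):

```python
import pandas as pd

# Two hypothetical dataframes whose indexes only partly overlap.
left = pd.DataFrame({'A': [1, 2, 3]}, index=['x', 'y', 'z'])
right = pd.DataFrame({'G': [10, 20, 30]}, index=['z', 'y', 'w'])

# Values are matched by index label, not by position.
left['G'] = right['G']
print(left)
```

Here 'x' has no counterpart in right and ends up as NaN, while 'w' is dropped. In the question's setup, where the target labels live in the 'mapid' column rather than in obiee's index, this is why obiee would first need set_index('mapid').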
