Converting dataframe into column of other dataframe in Python

Question

I have two datasets:

df1:

Name        Answers Questions People-reached Reputation  
Alex Gaynor   154        44          ~1.4m     8,871

df2:

 Project               Total-score Post     
 python                    337      93  
 django-templates          22       4  
 slug                      12       1  
 google-app-engine         8        1  
 django                    235      57  
 clang                     22       2

Is there a way in Python (pandas or another library) to merge the two dataframes so that df2 becomes a new column in df1?

Desired output would be:

Name       Answers     Questions   People-reached    Reputation   Project-details
Alex Gaynor   154        44          ~1.4m             8,871   python 337 93  
                                                              django-templates 22 4   
                                                               slug   12  1  
                                                              google-app-engine 8 1

You want the entire df as a string in a new column, all in the first row of df1?
– sundance, Commented Aug 17, 2018 at 3:22
@sundance Yes, you are right: all in a new column and in the first row of df1.
– user2293224, Commented Aug 17, 2018 at 3:24

andrew_reece · Accepted Answer · 2018-08-17 04:22:23Z

If you need to preserve the columnar structure of the added fields, you can create a column MultiIndex.

If you just need to store the information in df2 as a column in df1, you can make a column that contains a list of df2.values.
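To try either option, the two frames from the question can be rebuilt roughly as follows. This is a minimal sketch: the values are copied from the question, and the dtypes are an assumption (for example, Reputation is kept as the string "8,871"), since the question does not state them.

```python
import pandas as pd

# df1: a single row of user statistics (values copied from the question)
df1 = pd.DataFrame({
    "Name": ["Alex Gaynor"],
    "Answers": [154],
    "Questions": [44],
    "People-reached": ["~1.4m"],
    "Reputation": ["8,871"],
})

# df2: one row per project (values copied from the question)
df2 = pd.DataFrame({
    "Project": ["python", "django-templates", "slug",
                "google-app-engine", "django", "clang"],
    "Total-score": [337, 22, 12, 8, 235, 22],
    "Post": [93, 4, 1, 1, 57, 2],
})

print(df1.shape, df2.shape)
```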

Option 1: Preserve column structure

# first merge df1 and df2
df2.index = ["Alex Gaynor"] * len(df2)
merged = df1.merge(df2, left_on="Name", right_index=True)

# now create multi-index columns
top_lvl = df1.columns.tolist() + ["project_details"]*3
bottom_lvl = [" "]*len(df1.columns) + df2.columns.tolist()
merged.columns = [top_lvl, bottom_lvl]

merged

          Name Answers Questions People-reached Reputation    project_details  \
                                                                      Project   
0  Alex Gaynor     154        44          ~1.4m      8,871             python   
0  Alex Gaynor     154        44          ~1.4m      8,871   django-templates   
0  Alex Gaynor     154        44          ~1.4m      8,871               slug   
0  Alex Gaynor     154        44          ~1.4m      8,871  google-app-engine   
0  Alex Gaynor     154        44          ~1.4m      8,871             django   
0  Alex Gaynor     154        44          ~1.4m      8,871              clang   


  Total-score Post  
0         337   93  
0          22    4  
0          12    1  
0           8    1  
0         235   57  
0          22    2

If you really need all the df1 entries below the first row to be blank, you can just do:

# cast to object so the numeric columns can hold empty strings
merged = merged.astype(object)
merged.iloc[1:, :5] = ""
merged
          Name Answers Questions People-reached Reputation    project_details  \
                                                                      Project   
0  Alex Gaynor     154        44          ~1.4m      8,871             python   
0                                                            django-templates   
0                                                                        slug   
0                                                           google-app-engine   
0                                                                      django   
0                                                                       clang   


  Total-score Post  
0         337   93  
0          22    4  
0          12    1  
0           8    1  
0         235   57  
0          22    2
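Once the columns form a MultiIndex, selecting the top-level key gives back the project columns as an ordinary sub-frame. The sketch below rebuilds the frames from the question (dtypes are assumed) and builds the column index with pd.MultiIndex.from_arrays, a spelled-out variant of assigning the two lists to merged.columns directly. The names projects and scores are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name": ["Alex Gaynor"], "Answers": [154], "Questions": [44],
    "People-reached": ["~1.4m"], "Reputation": ["8,871"],
})
df2 = pd.DataFrame({
    "Project": ["python", "django-templates", "slug",
                "google-app-engine", "django", "clang"],
    "Total-score": [337, 22, 12, 8, 235, 22],
    "Post": [93, 4, 1, 1, 57, 2],
})

# Key every df2 row by the person's name, then merge on it.
df2_keyed = df2.copy()
df2_keyed.index = ["Alex Gaynor"] * len(df2_keyed)
merged = df1.merge(df2_keyed, left_on="Name", right_index=True)

# Two-level columns: df1 columns get a blank second level,
# df2 columns are grouped under "project_details".
top_lvl = df1.columns.tolist() + ["project_details"] * len(df2.columns)
bottom_lvl = [" "] * len(df1.columns) + df2.columns.tolist()
merged.columns = pd.MultiIndex.from_arrays([top_lvl, bottom_lvl])

# Selecting the top-level key returns the three project columns.
projects = merged["project_details"]
# A (top, bottom) tuple selects a single column as a Series.
scores = merged[("project_details", "Total-score")]

print(projects)
print(scores.sum())
```

Keeping the project fields as real columns means they can still be filtered, sorted, and aggregated, which is the main reason to prefer this layout over packing everything into one cell.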

Option 2: Just store the df2 information in a column

df1["project_details"] = [df2.values]
df1
          Name  Answers  Questions People-reached Reputation  \
0  Alex Gaynor      154         44          ~1.4m      8,871   

                                     project_details  
0  [[python, 337, 93], [django-templates, 22, 4],...
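If the stored cell later needs to become a table again, it can be passed back to the DataFrame constructor. The sketch below is a small variation on Option 2: it stores df2.values.tolist() (a plain nested list) instead of the NumPy array, which is an assumption made here so that the rebuilt frame gets per-column dtypes. The name restored is illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name": ["Alex Gaynor"], "Answers": [154], "Questions": [44],
    "People-reached": ["~1.4m"], "Reputation": ["8,871"],
})
df2 = pd.DataFrame({
    "Project": ["python", "django-templates", "slug",
                "google-app-engine", "django", "clang"],
    "Total-score": [337, 22, 12, 8, 235, 22],
    "Post": [93, 4, 1, 1, 57, 2],
})

# Wrap the nested list in an outer list so it fills df1's single row.
df1["project_details"] = [df2.values.tolist()]

# Rebuild a DataFrame from the cell, reusing df2's column names.
restored = pd.DataFrame(df1.at[0, "project_details"], columns=df2.columns)
print(restored)
```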

sundance · Answer · 2018-08-17 04:54:06Z

You can make the dataframe into a string and add the value to the first row in a new column:

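import numpy as np

# make df into string
df_string = df2.to_string(index=False, header=False)

# make new column (object dtype, so it can hold a string)
df1["project_details"] = np.nan
df1["project_details"] = df1["project_details"].astype(object)

# add df_string to first row in new column
df1.iloc[0, df1.columns.get_loc('project_details')] = df_string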
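For a quick check of what ends up in the cell, the string approach can be run end to end as below. This is a sketch with the frames rebuilt from the question (dtypes assumed); it initialises the new column with None rather than NaN so the column starts out with object dtype, and the names cell and rows are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name": ["Alex Gaynor"], "Answers": [154], "Questions": [44],
    "People-reached": ["~1.4m"], "Reputation": ["8,871"],
})
df2 = pd.DataFrame({
    "Project": ["python", "django-templates", "slug",
                "google-app-engine", "django", "clang"],
    "Total-score": [337, 22, 12, 8, 235, 22],
    "Post": [93, 4, 1, 1, 57, 2],
})

# Render df2 as plain text, one line per row, without index or header.
df_string = df2.to_string(index=False, header=False)

# Create an object-dtype column, then fill the first row.
df1["project_details"] = None
df1.iloc[0, df1.columns.get_loc("project_details")] = df_string

# The cell is one multi-line string; split it to get the rows back as text.
cell = df1.at[0, "project_details"]
rows = [line.split() for line in cell.splitlines()]
print(rows[0])
```

Note that everything in the cell is text at this point, so the numbers would need to be converted back before any arithmetic.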
