
I have two DataFrames, df_A and df_B, with the same four columns. I want to merge them so that, when the values in the first three columns (name, age, zip) match, the id values (each a jsonb array) are combined into a single array.

Sample data:

df_A

name     age    zip      id
abc      25     11111    ["2722", "2855", "3583"]

df_B

name     age    zip      id
abc      25     11111    ["123", "234"]

I want the final output to look like this:

name     age    zip      id
----------------------------------------------------------------
abc      25     11111    ["2722", "2855", "3583", "123", "234"]

2 Answers


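One quick solution is to index both frames by the three key columns and add them; the addition aligns rows on that index and concatenates the id lists:

l = ['name', 'age', 'zip']
df = (df_A.set_index(l) + df_B.set_index(l)).reset_index()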
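
For reference, here is a minimal self-contained sketch of the index-and-add approach using the sample data from the question. It assumes the id values are already Python lists of strings (not raw JSON text); the names keys and combined are only illustrative. Because the addition aligns on the index, it relies on each key combination being present in both frames; a key that appears in only one frame has nothing to be concatenated with.

```python
import pandas as pd

# Sample frames from the question; ids are assumed to be Python lists of strings.
df_A = pd.DataFrame({"name": ["abc"], "age": [25], "zip": [11111],
                     "id": [["2722", "2855", "3583"]]})
df_B = pd.DataFrame({"name": ["abc"], "age": [25], "zip": [11111],
                     "id": [["123", "234"]]})

keys = ["name", "age", "zip"]

# Index both frames by the key columns. `+` then aligns rows on that index
# and applies list concatenation to the remaining object column ("id").
combined = (df_A.set_index(keys) + df_B.set_index(keys)).reset_index()

print(combined)
```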


Another option is to merge, then use a list comprehension to handle the "id" columns.

output = df_A.merge(df_B, on=['name', 'age', 'zip'])
output['id'] = [[*x, *y] for x, y in zip(output.pop('id_x'), output.pop('id_y'))] 

output
  name  age    zip                            id
0  abc   25  11111  [2722, 2855, 3583, 123, 234]
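
The merge-based approach can be wrapped up as a runnable sketch like the one below. The helper names to_list and merge_ids are hypothetical, and the JSON decoding step is an assumption for the case where a jsonb column arrives as a string rather than a Python list. Note that merge defaults to an inner join, so rows whose keys exist in only one frame are dropped from the result.

```python
import json

import pandas as pd


def to_list(value):
    """Return `value` as a list, decoding it first if it is a JSON string."""
    return json.loads(value) if isinstance(value, str) else list(value)


def merge_ids(left, right, keys):
    """Inner-join on `keys` and concatenate the two `id` lists per row."""
    # Both frames have an "id" column, so merge renames them id_x and id_y.
    out = left.merge(right, on=keys)
    out["id"] = [to_list(x) + to_list(y)
                 for x, y in zip(out.pop("id_x"), out.pop("id_y"))]
    return out


df_A = pd.DataFrame({"name": ["abc", "xyz"], "age": [25, 30],
                     "zip": [11111, 22222],
                     "id": [["2722", "2855", "3583"], ["999"]]})
# Here the id is given as JSON text to exercise the decoding path.
df_B = pd.DataFrame({"name": ["abc"], "age": [25], "zip": [11111],
                     "id": ['["123", "234"]']})

result = merge_ids(df_A, df_B, ["name", "age", "zip"])
print(result)
```

The "xyz" row has no partner in df_B, so it does not appear in result; passing how="outer" to merge would keep such rows, but the missing side would then need to be filled with an empty list before concatenating.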
