Consider the sample df below:
import pandas as pd
d = {'id': ["A123", "A123", "A123"],
     'text1': ["this is a sample", "this is a sample", "this is a sample"],
     'text2': ["sing with me", "one two three", "sing with me"]}
df = pd.DataFrame(data=d)
I'm trying to group by the id column and concatenate the unique values of each of the text columns, so that the sample df:
id    text1             text2
A123  this is a sample  sing with me
A123  this is a sample  one two three
A123  this is a sample  sing with me
Will look like this:
id    combined_text
A123  this is a sample | sing with me | one two three
I tried all sorts of combinations of " | ".join(x) with agg and more. I could take df['text1'].unique() and df['text2'].unique() for each id and merge the results later, but there must be a more efficient way.
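For reference, here is a minimal sketch of one way this might be done in a single chain, without a separate merge step. It assumes that a value appearing in more than one text column should be listed only once per id, and that the order should be text1 values first, then text2 values, each in order of first appearance. The intermediate column name "text" and the variable name combined are illustrative choices, not part of the original data.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["A123", "A123", "A123"],
    "text1": ["this is a sample", "this is a sample", "this is a sample"],
    "text2": ["sing with me", "one two three", "sing with me"],
})

combined = (
    # Stack text1 and text2 into one long "text" column (text1 rows first).
    df.melt(id_vars="id", value_name="text")
      # Keep only the first occurrence of each (id, text) pair.
      .drop_duplicates(subset=["id", "text"])
      # Join the remaining values per id, preserving their order.
      .groupby("id", sort=False)["text"]
      .agg(" | ".join)
      .reset_index(name="combined_text")
)

print(combined)
```

If duplicates should instead be removed only within each column (so the same string could appear once for text1 and once for text2), the drop_duplicates call would need "variable" added to its subset.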