2

Say I have a dataframe

import pandas as pd

df = pd.DataFrame({'colA': ['ABC', 'JKL', 'STU', '123'],
                   'colB': ['DEF', 'MNO', 'VWX', '456'],
                   'colC': ['GHI', 'PQR', 'YZ', '789']}, index=[0, 0, 1, 1])
  colA colB colC
0  ABC  DEF  GHI
0  JKL  MNO  PQR
1  STU  VWX   YZ
1  123  456  789

It's guaranteed that each pair of rows shares the same index, so we would like the end result to be:

        colA       colB       colC
0  ABC_JKL_0  DEF_MNO_0  GHI_PQR_0
1  STU_123_1  VWX_456_1   YZ_789_1

where the suffix _number is the index of that group.

I tried doing this by iterating through the rows, but that takes a lot of time. I was thinking of something like .groupby(level=0), but I can't figure out the aggregation/apply step that should follow.

Comments

  • Try: df_out=df.groupby(level=0).agg('_'.join) to start. Commented Jul 28, 2022 at 17:37
  • Yeah, tried that, stuck there! Commented Jul 28, 2022 at 17:38
  • df_out=df.groupby(level=0).agg(lambda x: '_'.join(x)+'_'+str(x.index[0])) Commented Jul 28, 2022 at 17:40
  • @ScottBoston Is it possible to apply multiple functions in one aggregate? Say first list, then tuple. Obviously no one would want that, but I'm just wondering whether multiple functions can be applied or not. Commented Jul 28, 2022 at 17:40
  • @ScottBoston Please add that as an answer, I'll accept it. Commented Jul 28, 2022 at 17:41
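
On the question in the comments about applying several functions in one aggregation: .agg also accepts a list of functions and returns one sub-column per function under each original column. A minimal sketch, assuming the frame from the question; the choice of 'first' and 'last' is purely illustrative and not something the thread suggested.

```python
import pandas as pd

# Frame from the question: each pair of rows shares an index label.
df = pd.DataFrame(
    {"colA": ["ABC", "JKL", "STU", "123"],
     "colB": ["DEF", "MNO", "VWX", "456"],
     "colC": ["GHI", "PQR", "YZ", "789"]},
    index=[0, 0, 1, 1],
)

# Passing a list applies every function to every column;
# the result has two-level columns: (original column, function name).
multi = df.groupby(level=0).agg(["first", "last"])
print(multi)

# A single sub-column is selected with a tuple key.
print(multi[("colA", "first")])
```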

3 Answers

3

IIUC, you can do this with .agg and a lambda function, or you can add the suffix to the dataframe after the groupby:

df_out=df.groupby(level=0).agg(lambda x: '_'.join(x)+'_'+str(x.index[0]))

Output:

        colA       colB       colC
0  ABC_JKL_0  DEF_MNO_0  GHI_PQR_0
1  STU_123_1  VWX_456_1   YZ_789_1

Or

df_out=df.groupby(level=0).agg('_'.join)
df_out = df_out.add('_'+df_out.index.to_series().astype(str), axis=0)
print(df_out)
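
For anyone who wants to run both variants side by side, here is a self-contained sketch, assuming the frame from the question. The names via_lambda, joined, suffix and via_add are only illustrative.

```python
import pandas as pd

# Frame from the question: each pair of rows shares an index label.
df = pd.DataFrame(
    {"colA": ["ABC", "JKL", "STU", "123"],
     "colB": ["DEF", "MNO", "VWX", "456"],
     "colC": ["GHI", "PQR", "YZ", "789"]},
    index=[0, 0, 1, 1],
)

# Variant 1: join each group's strings and append the group label.
# Inside agg, x is one column of one group, so x.index[0] is that group's label.
via_lambda = df.groupby(level=0).agg(lambda x: "_".join(x) + "_" + str(x.index[0]))

# Variant 2: join first, then append the label to every column in one step.
joined = df.groupby(level=0).agg("_".join)
suffix = "_" + joined.index.to_series().astype(str)
via_add = joined.add(suffix, axis=0)  # axis=0 aligns the suffix on the row index

print(via_lambda)
print(via_add)
```

The second variant calls the Python lambda zero times, which may matter on large frames, though that is not measured here.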

2
df.groupby(level=0).agg(lambda x: f"{'_'.join(x)}_{x.index[0]}")

Output:

        colA       colB       colC
0  ABC_JKL_0  DEF_MNO_0  GHI_PQR_0
1  STU_123_1  VWX_456_1   YZ_789_1

2

You can do:

df.groupby(level=0).agg('_'.join).transform(lambda x: x + '_' + x.index.to_series().astype(str))
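
A self-contained check of this chained approach, assuming the frame from the question. One thing to watch: after the aggregation, transform receives a whole column at a time, so x.index[0] there would be the first label of the column (0) for every row. The sketch below therefore builds the suffix from each row's own label via index.to_series(); the names joined and labelled are illustrative.

```python
import pandas as pd

# Frame from the question: each pair of rows shares an index label.
df = pd.DataFrame(
    {"colA": ["ABC", "JKL", "STU", "123"],
     "colB": ["DEF", "MNO", "VWX", "456"],
     "colC": ["GHI", "PQR", "YZ", "789"]},
    index=[0, 0, 1, 1],
)

joined = df.groupby(level=0).agg("_".join)

# transform passes each column (indexed by group label) to the lambda.
# index.to_series() keeps the labels aligned row by row.
labelled = joined.transform(
    lambda col: col + "_" + col.index.to_series().astype(str)
)
print(labelled)
```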
