1

I have a DataFrame as follows. Both columns have Member_ID which indicates which Member_ID connected with other Member_ID

   col1  col2
    1     3
    1     4
    1     5
    2     3
    2     4
    3     1
    3     2
    3     5
    4     1
    4     2
    5     1
    5     3
    
and I have calculated each Member_ID connected with how many Member_ID. For example Member_ID 1 is connected with 3 Member_ID. If an Member_ID contains more or equal to 3 connections we have to put "a" in front of the Member_Id else we have to put "b" so the label we have to give the label as "a1" for Member_ID 1. Likewise I have calculated the labels for each Member_Id and the label array is below.
 member_ID   No_of_con  Label
    1          3         a1
    2          2         b2
    3          3         a3
    4          2         b4
    5          2         b5
    

Now I have to replace the first Dataframe's values referring from the label array. Dataframe is big for using for loops is not efficient So how can i achive this using Pandas in simpler way? I'm expecting the result as below

    col1     col2
    a1         a3
    a1         b4
    a1         b5
    b2         a3
    b2         b4
    a3         a1
    a3         b2
    a3         b5
    b4         a1
    b4         b2
    b5         a1
    b5         a3

3 Answers 3

5

we can stack, map and unstack:

In [9]: d1.stack().map(d2.set_index('member_ID')['Label']).unstack()
Out[9]:
   col1 col2
0    a1   a3
1    a1   b4
2    a1   b5
3    b2   a3
4    b2   b4
5    a3   a1
6    a3   b2
7    a3   b5
8    b4   a1
9    b4   b2
10   b5   a1
11   b5   a3
Sign up to request clarification or add additional context in comments.

1 Comment

@kumaranragunathan, glad i could help :)
2

Or you can try this

df2.set_index('member_ID',inplace=True)
df1.apply(lambda x: x.map(df2['Label']))


   col1 col2
0    a1   a3
1    a1   b4
2    a1   b5
3    b2   a3
4    b2   b4
5    a3   a1
6    a3   b2
7    a3   b5
8    b4   a1
9    b4   b2
10   b5   a1
11   b5   a3

1 Comment

we can also use .applymap() - it looks like it got faster in latest Pandas versions...
1

You can use pd.DataFrame.replace using a pd.Series in a dictionary context.

d1.replace(d2.set_index('member_ID').Label)

   col1 col2
0    a1   a3
1    a1   b4
2    a1   b5
3    b2   a3
4    b2   b4
5    a3   a1
6    a3   b2
7    a3   b5
8    b4   a1
9    b4   b2
10   b5   a1
11   b5   a3

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.