0

I have Dataframes like followings:

df = pd.DataFrame({'buck' : ['3', '3', '3', '2', '2', '1', '1', '1', '0', '0'], 
                   'rank' : [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})
df2 = pd.DataFrame({'col_no' : ['mr.A', 'mr.B', 'mr.C', 'mr.D', 'mr.E', 'mr.F', 'mr.G'],
                    'grade' : ['H', 'H', 'M', 'M', 'L', 'L', 'L']})

I want to create a column in df using 'col_no' from df2, based on conditions from both df and df2. The conditions are as follows:

  1. df['buck'] == 3 and df2['grade'] == 'H'
  2. df['buck'] == 2 and df2['grade'] == 'M'
  3. (df['buck'] == 1 or df['buck'] == 0) and df2['grade'] == 'L'

If the condition is met ->> loop through df2['col_no'] and put corresponding values repeatedly.

For example, the only values for the condition no.1 are mr.A and mr.B.

The desired output would be:

buck     rank     col_no        
3        1        mr.A        
3        2        mr.B       
3        3        mr.A
2        4        mr.C
2        5        mr.D
1        6        mr.E        
1        7        mr.F     
1        8        mr.G        
0        9        mr.E
0        10       mr.F

I am new to Python and do not know what key words to solve this problem are. The only option I could think of is to write a hard code looping through each row and not using Panda coding style.

Hence, any suggestion would be really appreciated. Thanks in advance !!

1 Answer 1

1

I would use a mapping and a transformation function

grades = {'3': 'H', '2': 'M', '1': 'L', '0': 'L'}

def trans(x):
    ln = len(x)
    grade = grades[x.name]         # find the corresponding grade
    data = df2[df2['grade'] == grade, 'col_no'].to_list() # extract matching data from df2
    data *= (ln // len(data)) + 1  # repeat as required
    return data[:ln]               # and return the expected len

It is then as simple as:

df.assign(col_no=df.groupby('buck')['buck'].transform(trans))

which gives as expected:

  buck  rank col_no
0    3     1   mr.A
1    3     2   mr.B
2    3     3   mr.A
3    2     4   mr.C
4    2     5   mr.D
5    1     6   mr.E
6    1     7   mr.F
7    1     8   mr.G
8    0     9   mr.E
9    0    10   mr.F

As above code requests one single column from the GroupBy object, the function called from transform receives each time a single Series where the name is the group identifier (here from '0' to '3') and the values the corresponding values from the column.

Sign up to request clarification or add additional context in comments.

2 Comments

That works !! Thank you!! Could you please explain what x.name does? Is it an object from groupby?
@eieiboom: I have edited my post to better explain it.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.