How to apply the same function with different input arguments to create new columns in pandas dataframe?

Question

I have this sample dataframe:

      x_mean    x_min    x_max     y_mean     y_min     y_max
 1      85.6        3      264       75.7         3       240
 2     105.5        6      243       76.4         3       191
 3      95.8       19      287       48.4         8       134
 4      85.5       50      166       64.8        32       103
 5      55.9       24      117       46.7        19        77 


x_range = [list(range(0,50)),list(range(51,100)),list(range(101,250)),list(range(251,350)),list(range(351,430)),list(range(431,1000))]
y_range = [list(range(0,30)),list(range(31,60)),list(range(61,90)),list(range(91,120)),list(range(121,250)),list(range(251,2000))]


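# x: a value from a mean column (e.g. x_mean or y_mean)
# y: the matching list of ranges (x_range or y_range)

def min_max_range(x, y):
    # Width of the range that contains int(x); returns None if no range matches
    for a in y:
        if int(x) in a:
            min_val = min(a)
            max_val = max(a) + 1
            return max_val - min_val

def min_range(x, y):
    # Lower bound of the range that contains int(x); returns None if no range matches
    for a in y:
        if int(x) in a:
            min_val = min(a)
            return min_val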

Now I want to apply the functions min_max_range() and min_range() to the columns x_mean and y_mean to get new columns.

For example, min_max_range() takes column x_mean and the range x_range as input to create column x_min_max_val; similarly, column y_mean and the range y_range are used for column y_min_max_val.

I can create the columns one at a time with these one-liners, but I want to apply the function to both x_mean and y_mean in one go, with a single one-liner.

df['x_min_max_val'] = df['x_mean'].apply(lambda x: min_max_range(x,x_range))
df['y_min_max_val'] = df['y_mean'].apply(lambda x: min_max_range(x,y_range))

The resultant dataframe should look like this:

      x_mean    x_min    x_max     y_mean     y_min     y_max    x_min_max_val   y_min_max_val        x_min_val   y_min_val
1      85.6        3      264       75.7         3       240                49              29               51          61
2     105.5        6      243       76.4         3       191               149              29              101          91
3      95.8       19      287       48.4         8       134                49              29               51          91
4      85.5       50      166       64.8        32       103                49              29               51          61
5      55.9       24      117       46.7        19        77                49              29               51          31

I want to create these columns in one go instead of one column at a time. How can I do this? Any suggestions? Or could something like this work?

df.filter(regex='mean').apply(lambda x: min_max_range(x,x+'_range'))

Are these the functions you are actually using, or are they just examples?
– Umar.H, Commented Jan 12, 2020 at 0:50
@astroluv It is hard to understand what you are after. So you want your function min_max_range to take x_mean and x_range as input and produce the columns specified?
– BICube, Commented Jan 12, 2020 at 0:58
Currently, your min_max_range would return None, as no values in y_mean are in x_mean? Also, passing in whole columns results in an error.
– Umar.H, Commented Jan 12, 2020 at 0:58
Also, how do you currently do it, and how are you hoping to do it 'in one go'?
– BICube, Commented Jan 12, 2020 at 0:59

BICube · Accepted Answer · 2020-01-12 05:21:39Z

This is the approach to follow. First, store your ranges in a dictionary so that each one can be looked up by name.

range_dict = {}
range_dict['x_range'] = x_range
range_dict['y_range'] = y_range

You also need a list of the columns to run the calculation on (or you can select them with a regex if their names follow a pattern):

mean_cols_list = ['x_mean', 'y_mean']
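If the mean columns follow a naming pattern, the list can be derived rather than typed out. A minimal sketch, assuming every column of interest ends in `_mean`; the small frame below is only a cut-down version of the question's sample data:

```python
import pandas as pd

# Cut-down sample: two mean columns plus two columns that should be ignored
df = pd.DataFrame({
    "x_mean": [85.6, 105.5],
    "x_min": [3, 6],
    "y_mean": [75.7, 76.4],
    "y_min": [3, 3],
})

# Keep only columns whose names end in "_mean" and collect the names as a list
mean_cols_list = df.filter(regex=r"_mean$").columns.tolist()
print(mean_cols_list)
```

Anchoring the pattern with `$` avoids picking up a column that merely contains "mean" somewhere in the middle of its name.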

Now, to apply your function to all of those columns, define a function like this:

def min_max_calculator(df, range_dictionary, mean_columns_list):
    # use the function's own parameters rather than the global names
    for current_column in mean_columns_list:
        # current_column is e.g. 'x_mean'
        # this returns 'x_min_max_val' (the column name asked for in the question)
        output_col_name = current_column.replace('mean', 'min_max_val')
        # this returns 'x_range'
        range_name = current_column.replace('mean', 'range')
        # this returns the list of ranges for x_range
        range_list = range_dictionary[range_name]
        # this adds the calculated column to the dataframe
        df[output_col_name] = df[current_column].apply(lambda x: min_max_range(x, range_list))
    return df

df_output = min_max_calculator(df, range_dict, mean_cols_list)
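For reference, here is a self-contained sketch of the same idea that runs end to end on the question's sample values and fills in both the `*_min_max_val` and `*_min_val` columns. The `funcs` mapping, the `x_bounds`/`y_bounds` helpers, and the final `one_go` expression are illustrative names, not part of the original post. Note that the question's ranges leave gaps (for example, 50 and 100 fall in no x range), so a mean landing on one of those values would yield `None`.

```python
import pandas as pd

def min_max_range(x, ranges):
    # Width of the range containing int(x); None if no range matches
    for r in ranges:
        if int(x) in r:
            return max(r) + 1 - min(r)

def min_range(x, ranges):
    # Lower bound of the range containing int(x); None if no range matches
    for r in ranges:
        if int(x) in r:
            return min(r)

# Same boundaries as x_range / y_range in the question
x_bounds = [(0, 50), (51, 100), (101, 250), (251, 350), (351, 430), (431, 1000)]
y_bounds = [(0, 30), (31, 60), (61, 90), (91, 120), (121, 250), (251, 2000)]
range_dict = {
    "x_range": [list(range(a, b)) for a, b in x_bounds],
    "y_range": [list(range(a, b)) for a, b in y_bounds],
}

df = pd.DataFrame({
    "x_mean": [85.6, 105.5, 95.8, 85.5, 55.9],
    "y_mean": [75.7, 76.4, 48.4, 64.8, 46.7],
})

# Hypothetical mapping: output-name suffix -> function to apply
funcs = {"min_max_val": min_max_range, "min_val": min_range}

for col in ["x_mean", "y_mean"]:
    ranges = range_dict[col.replace("mean", "range")]
    for suffix, func in funcs.items():
        # args=(ranges,) passes the range list as the second positional argument
        df[col.replace("mean", suffix)] = df[col].apply(func, args=(ranges,))

# Single-expression variant: DataFrame.apply hands each column over as a Series,
# and the Series' .name gives the key needed to look up the matching ranges
one_go = (
    df[["x_mean", "y_mean"]]
    .apply(lambda c: c.apply(min_max_range,
                             args=(range_dict[c.name.replace("mean", "range")],)))
    .rename(columns=lambda name: name.replace("mean", "min_max_val"))
)
print(df)
```

Passing the ranges through `args` rather than a lambda is only a design choice here; it sidesteps late-binding surprises when functions are created inside a loop.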


1 Comment

astroluv Over a year ago

How can I add another column that is computed from the other columns? x_new = df.x_min_max_val / ( df.x_max - df.x_min ) * (df.x_mean - df.x_min) + df.x_min_max_val
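The formula in this comment can be applied to whole columns at once, because pandas arithmetic on Series is element-wise. A minimal sketch, assuming the `*_min_max_val` columns already exist (their values are hard-coded here from the first two rows of the question's expected output); extending the same formula to the y columns is an assumption, since the comment only mentions x:

```python
import pandas as pd

df = pd.DataFrame({
    "x_mean": [85.6, 105.5], "x_min": [3, 6], "x_max": [264, 243],
    "y_mean": [75.7, 76.4], "y_min": [3, 3], "y_max": [240, 191],
    "x_min_max_val": [49, 149], "y_min_max_val": [29, 29],
})

for prefix in ["x", "y"]:
    width = df[f"{prefix}_min_max_val"]
    lo, hi, mean = df[f"{prefix}_min"], df[f"{prefix}_max"], df[f"{prefix}_mean"]
    # Same expression as in the comment, evaluated row by row without apply()
    df[f"{prefix}_new"] = width / (hi - lo) * (mean - lo) + width

print(df[["x_new", "y_new"]])
```

If a row has equal min and max, the division produces `inf` or `NaN` rather than raising, so such rows may need separate handling.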
