Transform all rows of data frame into arrays and pass to function

Question

I want to transform all rows of a data frame to arrays and use the arrays in a function. The function should create a new column with the results of the function for every row.

def harmonicMean(arr):
    # Sum the reciprocals of all items, then divide the item count by that sum
    total = 0.0
    for item in arr:
        total += 1.0 / item
        print("inside " + str(1.0 / item))
    print(total)
    return len(arr) / total

The function computes the harmonic mean of every row in the data frame. These values should be stored in a new column of the data frame. (The data frame also contains NaN values.)

Can you provide more information, such as a data sample (df.head() is enough), what you have tried, and your desired output?
– Terry, Commented Apr 8, 2019 at 16:18

ALollz · Accepted Answer · 2019-04-08 16:36:18Z

3

You can calculate without iterating over the rows:

df['hmean'] = df.notnull().sum(axis=1)/(1/df).sum(axis=1)

   a    b    c     d   e     hmean
0  4  5.0  2.0   5.0  10  4.000000
1  2  8.0  1.0   8.0   6  2.608696
2  7  NaN  1.0   1.0   8  1.763780
3  7  1.0  9.0   4.0   9  3.095823
4  8  5.0  8.0   NaN   3  5.106383
5  3  8.0  6.0  10.0   6  5.607477
6  3  7.0  3.0   9.0   9  4.846154
7  8  NaN  NaN   NaN   6  6.857143
8  2  4.0  1.0   5.0   2  2.040816
9  5  7.0  5.0   3.0   1  2.664975
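The one-liner above skips NaN values because notnull() and sum() both ignore them. If the data can also contain zeros, 1/df would involve a division by zero. The following is a minimal sketch, not part of the original answer, that assumes zeros should be treated like missing values; the helper name row_harmonic_mean and the sample frame are hypothetical.

```python
import numpy as np
import pandas as pd


def row_harmonic_mean(df):
    """Row-wise harmonic mean that ignores NaN and zero entries."""
    # Replace zeros with NaN so that 1/x never divides by zero
    valid = df.where(df != 0)
    # Number of usable values per row
    counts = valid.notnull().sum(axis=1)
    # Sum of reciprocals per row; sum() skips NaN by default
    reciprocal_sum = (1.0 / valid).sum(axis=1)
    return counts / reciprocal_sum


# Rows 0 and 7 from the table above, plus a hypothetical row containing a zero
sample = pd.DataFrame({
    "a": [4, 8, 0],
    "b": [5.0, np.nan, 2.0],
    "c": [2.0, np.nan, 2.0],
    "d": [5.0, np.nan, np.nan],
    "e": [10, 6, 2],
})
sample["hmean"] = row_harmonic_mean(sample)
print(sample)
```

A row with no usable values at all ends up as NaN in the new column, since both the count and the reciprocal sum are zero.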

edited Apr 8, 2019 at 16:36

answered Apr 8, 2019 at 16:29


2 Comments

Jagruthi C Over a year ago

Hi! Thank you for the answer. I get one error that I do not understand. It says: Could not operate 1 with block values float division by zero. Do you know what it means?

ALollz Over a year ago

@JagruthiC I am not quite sure. It may be a divide-by-zero issue, though I can't replicate it on my end; for me this handles all-NaN rows as well as 0/NaN and #/0.

nickyfot · Accepted Answer · 2019-04-08 16:19:07Z

0

You can use the built-in .iloc and .to_list() methods to get each row as a list and pass it to your function.

rows = df.shape[0]
for i in range(rows):
    row_lst = df.iloc[i].to_list()
    print(harmonicMean(row_lst))
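The loop above prints the results but does not store them. Below is a minimal sketch, not from the original answer, of one way to write the per-row result into a new column while skipping zeros and NaN. The names harmonic_mean and add_hmean_column are hypothetical, and skipping zeros is an assumption about the desired behaviour. It uses .values.tolist(), which is also available in older pandas versions where Series.to_list() is missing.

```python
import math

import numpy as np
import pandas as pd


def harmonic_mean(values):
    """Harmonic mean of the usable entries; zeros and NaN are skipped."""
    usable = [v for v in values if v != 0 and not math.isnan(v)]
    if not usable:
        # No usable values in this row: there is nothing to average
        return float("nan")
    return len(usable) / sum(1.0 / v for v in usable)


def add_hmean_column(df, column="hmean"):
    """Return a copy of df with the row-wise harmonic mean in a new column."""
    out = df.copy()
    # df.values is a NumPy array; tolist() turns it into one list per row
    out[column] = [harmonic_mean(row) for row in df.values.tolist()]
    return out


# Hypothetical sample data containing a NaN and a zero
data = pd.DataFrame({
    "a": [4, 8, 0],
    "b": [5.0, np.nan, 2.0],
    "c": [2.0, 6.0, 2.0],
})
result = add_hmean_column(data)
print(result)
```

This sketch assumes every column is numeric; non-numeric entries would make math.isnan raise a TypeError.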

answered Apr 8, 2019 at 16:19


4 Comments

Bhanu Tez Over a year ago

df.values gives a NumPy ndarray, which can be iterated row by row. This way is faster.

Jagruthi C Over a year ago

@nickthefreak Thank you, it worked! However, I get this error: ZeroDivisionError: ('float division by zero', 'occurred at index 0'), as the rows contain zeros as well. Any idea how I can ignore zero and NaN values while computing the harmonic mean?

nickyfot Over a year ago

I cannot tell whether the zero division error occurs when you divide by item or when you divide by the sum; it can probably happen in either division. You probably need to add an if statement checking that item and sum are greater than 0 before dividing.

Jagruthi C Over a year ago

I did try that, and I get this error:

    row_lst = data1.iloc[i].to_list()
  File "C:\Users\Pinky\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\generic.py", line 4376, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'to_list'
