I have the data below, and I want to loop through the DataFrame, apply some conditions, and at the end save the results in a list. I am having trouble creating the list: I only get a single value in the list and not the two means I intend to get. If anybody has a more effective way to solve this problem, please share.


import numpy as np
import pandas as pd

data = {'PassengerId' : [0.0, 0.001, 0.002, 0.003, 0.004, 0.006, 0.007, 0.008, 0.009, 0.01],
'Survived' : [0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0],
'Pclass' : [1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.5],
'Age' : [0.271, 0.472, 0.321, 0.435, 0.435, np.nan, 0.673, 0.02, 0.334, 0.171],
'SibSp' : [0.125, 0.125, 0.0, 0.125, 0.0, 0.0, 0.0, 0.375, 0.0, 0.125],
'Parch' : [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.167, 0.333, 0.0],
'Fare' : [0.014, 0.139, 0.015, 0.104, 0.016, 0.017, 0.101, 0.041, 0.022, 0.059]}

dicts = pd.DataFrame(data, columns = data.keys())

def Mean():
    list_mean = []
    list_all = []
    for i, row in dicts.iterrows():
        if (row['Age'] > 0.2) & (row['Fare'] < 0.1):
            list_all.append(row['PassengerId'])
        elif (row['Age'] > 0.2) & (row['Fare'] > 0.1):
            list_all.clear()
            list_all.append(row['PassengerId'])
    return list_mean.append(np.mean(list_all))

Mean()

Help Please!!

  • If I understand the question correctly, you are getting only one item in the list because you return as soon as the if condition is satisfied for the first matching row in the DataFrame. I believe you should return the final value, i.e. return after the for loop completes. Commented Apr 22, 2021 at 11:57
  • @SomuSinhhaa Thank you for your reply. I was able to solve that problem, but I now have a new challenge. Could you help me check it out? I have modified the code. Commented Apr 22, 2021 at 14:02
  • Sorry, I still see your old code, where you return inside the if block. As mentioned in one of the answers, you should return only after all the elements have been stored in list_mean, i.e. after the for loop completes. Also, if you have a different question, I would suggest opening a new thread. Commented Apr 22, 2021 at 14:19
  • @SomuSinhhaa I have edited it, and you can check it now. Commented Apr 22, 2021 at 14:20
  • Could you elaborate on this line a bit more? It is not very clear: "I only get a single value in the list and not the two means which i intend to get". I am guessing, but am not sure, that you want to append to the list when either of your conditions matches. In that case, you have to combine the two conditions with a logical OR rather than if/elif. Commented Apr 22, 2021 at 14:30

2 Answers

1

You need to make a few changes to your solution to resolve this issue. For a vectorized answer, check out the CODE section further down.

1. The statement return list_mean should be placed at the function-block level, after the for loop, not inside the if block.

Change:

. . .
        if (row['Age'] > self.age) & (row['Fare'] < self.fare):
            list_mean.append(row['PassengerId'])
            return list_mean
. . .

To:

. . .
list_mean = []
for i, row in dicts.iterrows():
    if (row['Age'] > self.age) & (row['Fare'] < self.fare):
        list_mean.append(row['PassengerId'])
return list_mean
. . .
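
One more thing that may be worth checking, assuming the function still ends with return list_mean.append(np.mean(list_all)) as in the question: list.append changes the list in place and returns None, so that line hands back None rather than the list. A minimal sketch of the difference (the helper names broken and fixed are made up for illustration and are not part of the original code):

```python
def broken(values):
    result = []
    # list.append mutates result in place and returns None,
    # so this function returns None, not the list
    return result.append(sum(values) / len(values))


def fixed(values):
    result = []
    # append first, then return the list itself
    result.append(sum(values) / len(values))
    return result


broken_out = broken([1.0, 2.0, 3.0])
fixed_out = fixed([1.0, 2.0, 3.0])
print(broken_out)  # None
print(fixed_out)   # [2.0]
```

So appending on one line and returning the list on the next is probably what was intended.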

CODE (vectorized solution): there is no need to define an explicit class to perform this action.

import numpy as np
dict_ = {
    'PassengerId':
    [0.0, 0.001, 0.002, 0.003, 0.004, 0.006, 0.007, 0.008, 0.009, 0.01],
    'Survived': [0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0],
    'Pclass': [1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.5],
    'Age':
    [0.271, 0.472, 0.321, 0.435, 0.435, np.nan, 0.673, 0.02, 0.334, 0.171],
    'SibSp': [0.125, 0.125, 0.0, 0.125, 0.0, 0.0, 0.0, 0.375, 0.0, 0.125],
    'Parch': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.167, 0.333, 0.0],
    'Fare':
    [0.014, 0.139, 0.015, 0.104, 0.016, 0.017, 0.101, 0.041, 0.022, 0.059]
}

import pandas as pd
dicts = pd.DataFrame(dict_, columns=dict_.keys())

l1 = dicts['PassengerId'][np.logical_and(dicts['Age'] > 0.2, dicts['Fare'] < 0.1)]
l2 = dicts['PassengerId'][np.logical_and(dicts['Age'] > 0.2, dicts['Fare'] > 0.1)]

print( (sum(list(l1))/len(l1), sum(list(l2))/len(l2)) )

OUTPUT :

(0.00375, 0.0036666666666666666)
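
As a possible variation on the same idea, the masks can also be built with pandas' own & operator and the means taken with Series.mean. This is only a sketch on the question's data (trimmed to the three columns used); the names older, cheap and pricey are my own. Note that the row with a NaN age drops out of both groups, because a comparison against NaN evaluates to False.

```python
import numpy as np
import pandas as pd

# Same values as in the question, limited to the columns that are used
df = pd.DataFrame({
    'PassengerId': [0.0, 0.001, 0.002, 0.003, 0.004, 0.006, 0.007, 0.008, 0.009, 0.01],
    'Age': [0.271, 0.472, 0.321, 0.435, 0.435, np.nan, 0.673, 0.02, 0.334, 0.171],
    'Fare': [0.014, 0.139, 0.015, 0.104, 0.016, 0.017, 0.101, 0.041, 0.022, 0.059],
})

older = df['Age'] > 0.2                 # NaN > 0.2 is False, so that row is excluded
cheap = older & (df['Fare'] < 0.1)      # first condition from the question
pricey = older & (df['Fare'] > 0.1)     # second condition from the question

mean_cheap = df.loc[cheap, 'PassengerId'].mean()
mean_pricey = df.loc[pricey, 'PassengerId'].mean()
print(mean_cheap, mean_pricey)
```

The two values should agree with the output above up to floating-point rounding.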
3 Comments

I edited the question. Can you please help me with the most effective way to solve it?
@deeplearningEngineer Please check the solution and let me know if there is any issue.
Thanks for the reply, it looks alright. However, I was hoping there was a way to do it using a loop. This code is part of a larger program, and it would be more efficient to loop and not have chunky code. @Exploore X
0
import pandas as pd
import numpy as np

# Named data rather than dict so the built-in dict type is not shadowed
data = {'PassengerId' : [0.0, 0.001, 0.002, 0.003, 0.004, 0.006, 0.007, 0.008, 0.009, 0.01],
'Survived' : [0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0],
'Pclass' : [1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.5],
'Age' : [0.271, 0.472, 0.321, 0.435, 0.435, np.nan, 0.673, 0.02, 0.334, 0.171],
'SibSp' : [0.125, 0.125, 0.0, 0.125, 0.0, 0.0, 0.0, 0.375, 0.0, 0.125],
'Parch' : [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.167, 0.333, 0.0],
'Fare' : [0.014, 0.139, 0.015, 0.104, 0.016, 0.017, 0.101, 0.041, 0.022, 0.059]}

df = pd.DataFrame(data, columns = data.keys())

def calculate_mean():
    l1, l2 = [], []
    for i, row in df.iterrows():
        if row['Age'] > 0.2 and row['Fare'] < 0.1:
            l1.append(row['PassengerId'])
        elif row['Age'] > 0.2 and row['Fare'] > 0.1:
            l2.append(row['PassengerId'])
    return np.mean(l1), np.mean(l2)


print(calculate_mean()) # (0.00375, 0.0036666666666666666)
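
If more conditions get added later, one way to keep the loop from growing a new elif branch each time might be to move the conditions into a small table and loop over that as well. This is only a sketch under my own assumptions: the rules dict, the group names low_fare and high_fare, and the group_means helper are hypothetical, and the data is the question's, trimmed to the columns used.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'PassengerId': [0.0, 0.001, 0.002, 0.003, 0.004, 0.006, 0.007, 0.008, 0.009, 0.01],
    'Age': [0.271, 0.472, 0.321, 0.435, 0.435, np.nan, 0.673, 0.02, 0.334, 0.171],
    'Fare': [0.014, 0.139, 0.015, 0.104, 0.016, 0.017, 0.101, 0.041, 0.022, 0.059],
})

# Hypothetical rule table: group name -> test applied to a single row
rules = {
    'low_fare': lambda row: row['Age'] > 0.2 and row['Fare'] < 0.1,
    'high_fare': lambda row: row['Age'] > 0.2 and row['Fare'] > 0.1,
}


def group_means(frame, rules):
    # One list of PassengerId values per rule
    buckets = {name: [] for name in rules}
    for _, row in frame.iterrows():
        for name, test in rules.items():
            if test(row):
                buckets[name].append(row['PassengerId'])
                break  # first matching rule wins, mirroring if/elif
    # Mean per group; NaN when a group collected nothing
    return {name: float(np.mean(ids)) if ids else float('nan')
            for name, ids in buckets.items()}


means = group_means(df, rules)
print(means)
```

Adding a third group then only needs one more entry in rules, and the loop itself stays unchanged.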
