Python | How to make a program that calculates strings

Question

I'm trying to create a Python script using pandas that can import a .txt file and calculate the average of each subject.

I'm trying to turn this "file.txt":

code name subject1 subject2 subject3
1234 Ali 6 0 8
1235 Carl 4 7 7
1236 Jason 3 5 0

and turn it into this:

subject1 average is: 4.3
subject2 average is: 6
subject3 average is: 7.5

subject1 is calculated like this: (6 + 4 + 3) / 3
subject2 is calculated like this: (7 + 5) / 2 <-- a 0 means the person didn't participate, so it is not added to the sum and does not count toward the average
subject3 is calculated like this: (8 + 7) / 2 <-- same as above

I'm also trying to figure out a way for the script to be flexible, so that it can handle more subjects and more people (so 5 instead of 3).

This is my code until now:

import numpy as np
import pandas as pd

# read input file (columns are separated by whitespace)
df = pd.read_csv('file.txt', sep=r'\s+')

# calculate mean, ignoring 0 values
df['mean'] = df.iloc[:, 2:].astype(float).replace(0, np.nan).mean(1)

# iterate rows and print results
for name, mean in df.set_index('name')['mean'].items():
    print(f'{name} has average of {mean:.2f}')

It calculates the average of each person (horizontally)
but I can't figure out a way to do it vertically for each subject.

thanks for the help guys ^_^

What kind of help do you expect? Do you want us to write code for you? If so, we don't do that: we only help with specific issues in concrete code. Otherwise, please post the code you've written to solve this and explain what the issue is.
– ForceBru, Commented Oct 7, 2018 at 13:47
@ForceBru, I added more information. I already have some code; I hope it helps, thanks!
– Doughnut, Commented Oct 7, 2018 at 13:58

fuglede · Accepted Answer · 2018-10-07 14:00:35Z

2

The argument 1 that you pass to pd.DataFrame.mean is the axis along which the mean is calculated. The default, axis=0, gives one mean per column; by passing 1 you are explicitly telling it to calculate the row-wise mean. Remove that argument and you should be good.

In [155]: df.iloc[:, 2:].astype(float).replace(0, np.nan).mean()
Out[155]:
subject1    4.333333
subject2    6.000000
subject3    7.500000
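
For a self-contained check, here is a minimal sketch of the same calculation. It assumes the file is whitespace-separated (hence sep=r"\s+") and uses an in-memory copy of the sample data in place of file.txt; the variable names sample, scores and subject_means are illustrative and not from the original post.

```python
import io

import numpy as np
import pandas as pd

# In-memory stand-in for file.txt (assumption: columns are separated by whitespace).
sample = """code name subject1 subject2 subject3
1234 Ali 6 0 8
1235 Carl 4 7 7
1236 Jason 3 5 0
"""

df = pd.read_csv(io.StringIO(sample), sep=r"\s+")

# Everything after the first two columns (code, name) is treated as a subject,
# so extra subjects or extra people in the file need no code changes.
scores = df.iloc[:, 2:].astype(float).replace(0, np.nan)

# Default axis=0: one mean per column; NaN values (the former zeros) are skipped.
subject_means = scores.mean()

for subject, mean in subject_means.items():
    print(f"{subject} average is: {mean:.1f}")
```

Because the subject columns are selected by position rather than by name, this should keep working if more subject columns or more rows are added, as long as code and name stay in the first two columns.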

3 Comments

Doughnut Over a year ago

Is it possible to both calculate the horizontal and vertical to print both calculations?

fuglede Over a year ago

Well, that's what you're doing. If you let df_nan = df.iloc[:, 2:].astype(float).replace(0, np.nan), then you could print df_nan.mean() first, then df_nan.mean(1) afterwards.
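
The suggestion in this comment can be sketched as follows. This is only an illustration: it assumes whitespace-separated input, reads an in-memory copy of the sample data instead of file.txt, and labels the per-person result with the name column for readability (that relabelling is an addition, not part of the comment).

```python
import io

import numpy as np
import pandas as pd

# In-memory stand-in for file.txt (assumption: whitespace-separated columns).
sample = """code name subject1 subject2 subject3
1234 Ali 6 0 8
1235 Carl 4 7 7
1236 Jason 3 5 0
"""

df = pd.read_csv(io.StringIO(sample), sep=r"\s+")

# Replace zeros with NaN once, then reuse the result for both directions.
df_nan = df.iloc[:, 2:].astype(float).replace(0, np.nan)
df_nan.index = df["name"]  # label rows by person instead of 0, 1, 2

per_subject = df_nan.mean()        # axis=0 (default): one value per column
per_person = df_nan.mean(axis=1)   # axis=1: one value per row

print(per_subject)
print(per_person)
```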

fuglede Over a year ago

Great, you're welcome. If you found the answer helpful, you can accept it. Besides giving us mostly useless internet points, this helps to indicate which questions on StackOverflow are still in need of attention.

JesusR · Answer · 2018-10-07 14:14:16Z

0

If I understand you correctly, this is what you want to do:

import pandas as pd

# Read the space-separated file
data = pd.read_csv('data.csv', sep=' ')

# You can change the range to match the number of subjects
for i in range(1, 4):
    # Print the average for this subject
    print(data['subject' + str(i)].mean())
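
Note that the loop in this answer hard-codes three subjects and includes zeros in each mean. A possible variation, sketched under two assumptions (every subject column's name starts with "subject", and a 0 means the person did not participate, as the question states), is shown here. The in-memory sample stands in for the file, and the names subject_columns and averages are hypothetical.

```python
import io

import pandas as pd

# In-memory stand-in for the input file (assumption: single-space separated).
sample = """code name subject1 subject2 subject3
1234 Ali 6 0 8
1235 Carl 4 7 7
1236 Jason 3 5 0
"""

data = pd.read_csv(io.StringIO(sample), sep=" ")

# Discover the subject columns instead of hard-coding range(1, 4).
subject_columns = [col for col in data.columns if col.startswith("subject")]

averages = {}
for column in subject_columns:
    scores = data[column]
    # Keep only non-zero scores so absent students do not affect the mean.
    averages[column] = scores[scores != 0].mean()
    print(f"{column} average is: {averages[column]:.1f}")
```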
