I am trying to write a program that takes one or more files as input and summarizes the average values from two columns in each file.
For example, I have two files:
File1:
ID Feature Total Percent
1.2 ABC 300 75
1.4 CDE 129 68
File2:
ID Feature Total Percent
1.2 ABC 289 34
1.4 CDE 56 94
I want to iterate over each file and convert each percent back into a count (Percent/100 * Total) using:
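def ReadFile(File):
    # Read one tab-separated file and convert each percent back to a count.
    with open(File) as f:
        Header = f.readline()  # skip the header row
        for Line in f:
            Info = Line.strip("\n").split("\t")
            ID, Feature, Total, Percent = Info[0], Info[1], int(Info[2]), int(Info[3])
            # Number of items the percentage represents, e.g. 75% of 300 = 225
            Num = (Percent/100.0)*Total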
I'm not sure what the best way is to store the output so that I keep the ID, Feature, Total and Percent for each file. Ultimately, I would like to create an output file that contains the average percent over all files, weighted by Total. In the above example I would get:
ID Feature AveragePercent
1.2 ABC 54.9 # 100 * ((75/100.0)*300 + (34/100.0)*289) / (300+289)
1.4 CDE 75.9 # 100 * ((68/100.0)*129 + (94/100.0)*56) / (129+56)
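To make the goal concrete, here is a minimal sketch of one possible way to store and combine the values. It assumes the files are tab-separated with a single header row, and that the same ID/Feature pair should be merged across files. The names `read_file`, `average_percent` and `write_summary` are hypothetical helpers invented for this illustration, not part of my existing code. The idea is to keep, per (ID, Feature) key, a running sum of the converted counts and a running sum of Total, and divide at the end.

```python
from collections import defaultdict


def read_file(path):
    """Yield (ID, Feature, Total, Percent) rows from one tab-separated file.

    Hypothetical helper; assumes exactly four columns and one header row.
    """
    with open(path) as f:
        next(f, None)  # skip the header row
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # ignore blank lines
            id_, feature, total, percent = line.split("\t")
            yield id_, feature, int(total), int(percent)


def average_percent(paths):
    """Return {(ID, Feature): weighted average percent} over all files."""
    # key -> [sum of (Percent/100 * Total), sum of Total]
    sums = defaultdict(lambda: [0.0, 0])
    for path in paths:
        for id_, feature, total, percent in read_file(path):
            entry = sums[(id_, feature)]
            entry[0] += (percent / 100.0) * total
            entry[1] += total
    # Skip keys whose Total sums to zero to avoid dividing by zero.
    return {key: 100.0 * num / total
            for key, (num, total) in sums.items() if total}


def write_summary(averages, out_path):
    """Write the averages as a tab-separated file with one decimal place."""
    with open(out_path, "w") as out:
        out.write("ID\tFeature\tAveragePercent\n")
        for (id_, feature), avg in sorted(averages.items()):
            out.write("%s\t%s\t%.1f\n" % (id_, feature, avg))
```

With the two example files above, something like `write_summary(average_percent(["File1", "File2"]), "out.txt")` should give 54.9 for 1.2/ABC and 75.9 for 1.4/CDE. Storing only the two running sums keeps memory small. If the per-file Total and Percent also need to be kept, the dictionary values could instead be lists of (Total, Percent) tuples.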