I have a large CSV file. A sample looks like the following (2 columns and many rows):
date score
1/1/16 0
2/1/16 0
3/1/16 0.2732
3/1/16 -0.6486
4/1/16 0
5/1/16 0.4404
5/1/16 -0.2732
6/1/16 -0.5859
6/1/16 0.34
As you can see, the same date appears multiple times with different scores (the original file has hundreds of rows per date). I want to average the scores by date and save the result as a CSV file. The expected result should look like this (one average score per date):
date Avg_Score
1/1/16 0
2/1/16 0
3/1/16 -0.1877
4/1/16 0
5/1/16 0.0836
6/1/16 -0.12295
How can I do this with pandas in Python? I searched Stack Overflow for suggestions, and loc, iloc and groupby were all I found. I could not get them to work, though: this is what I have tried, and the output file is still identical to my original (nothing changes). I don't know why it is not working or how to fix it.
import pandas as pd
import csv
df = pd.read_csv('myfile.csv')
df.groupby('date').mean().reset_index()
df.to_csv('average.csv', encoding='utf-8', index=False)
I would appreciate any help, as I have been struggling with this for a while. Thank you.
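For reference, a minimal sketch of what is probably going wrong in the attempt above: groupby(...).mean() returns a new object and does not modify df in place, so the averaged result has to be assigned to a name before it is saved. The sample rows are typed in by hand here instead of being read from myfile.csv, and the names avg and csv_text are illustrative assumptions, not part of the original attempt.

```python
import pandas as pd

# Sample data from the question, entered by hand (assumption: the real
# file has the same two columns, "date" and "score").
df = pd.DataFrame({
    "date": ["1/1/16", "2/1/16", "3/1/16", "3/1/16", "4/1/16",
             "5/1/16", "5/1/16", "6/1/16", "6/1/16"],
    "score": [0, 0, 0.2732, -0.6486, 0, 0.4404, -0.2732, -0.5859, 0.34],
})

# groupby/mean returns a NEW object, so keep the result in a variable.
# sort=False keeps the dates in order of first appearance instead of
# sorting the date strings alphabetically.
avg = (
    df.groupby("date", sort=False)["score"]
      .mean()
      .reset_index(name="Avg_Score")
)

# With no path argument, to_csv returns the CSV text as a string.
# Passing a path, e.g. avg.to_csv("average.csv", index=False),
# writes the same content to disk instead.
csv_text = avg.to_csv(index=False)
print(csv_text)
```

The key difference from the attempt is that the averaged frame (avg), not the original df, is what gets written out.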