I need to create a .csv file and append subsets of multiple dataframes into it.
All the dataframes are structured identically. However, the output file needs a header row written once, with every subsequent dataframe appended without headers.
I know I could just create the output file using the headers from the first dataframe and then run an append loop with no headers from there, but I'd really like to learn a more efficient way to do this.
import glob

import numpy as np
import pandas as pd

path = '/Desktop/NYC TAXI/Green/*.csv'
allFiles = glob.glob(path)

for file in allFiles:
    # skiprows=[1, 2] drops the two lines right after the header; keep the first 20 columns
    df = pd.read_csv(file, skiprows=[1, 2], usecols=np.arange(20))
    metsdf = df.loc[df['Stadium_Code'] == 2]
    yankdf = df.loc[df['Stadium_Code'] == 1]
    # Append each subset to its output file without a header row
    with open('greenyankeetaxi.csv', 'a') as yankeetaxi:
        yankdf.to_csv(yankeetaxi, header=False)
    with open('greenmetstaxi.csv', 'a') as metstaxi:
        metsdf.to_csv(metstaxi, header=False)
    print(file + " done")
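
For reference, the header-once idea described above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not a definitive answer: `append_csv` is a hypothetical helper, the `fare` column and the two small frames are made-up stand-ins for the real taxi data, and the demo writes to a temporary directory. It also passes `index=False`, which the loop above does not, so that the header row lines up with the data columns.

```python
import os
import tempfile

import pandas as pd


def append_csv(df, out_path):
    """Append df to out_path; write the header only if the file is new or empty.

    Hypothetical helper for illustration, not part of pandas.
    """
    write_header = not os.path.exists(out_path) or os.path.getsize(out_path) == 0
    # mode='a' appends; index=False is an assumption so header and rows align
    df.to_csv(out_path, mode='a', header=write_header, index=False)


# Made-up stand-in frames; the real ones would come from pd.read_csv in the loop.
first = pd.DataFrame({'Stadium_Code': [1, 2], 'fare': [10.0, 12.5]})
second = pd.DataFrame({'Stadium_Code': [1, 1], 'fare': [7.0, 8.0]})

# Write to a temporary directory so the demo does not touch real output files.
out_path = os.path.join(tempfile.mkdtemp(), 'greenyankeetaxi_demo.csv')
for df in (first, second):
    yankdf = df.loc[df['Stadium_Code'] == 1]
    append_csv(yankdf, out_path)

# Read the combined file back to check that only one header row was written.
result = pd.read_csv(out_path)
```

Because the check is based on whether the output file already has content, rerunning the script would keep appending to the old files, so they would need to be deleted (or opened once with `mode='w'`) before a fresh run.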