I have a .csv file with over 50k rows. I would like to divide it into smaller chunks and save them as separate .csv files. I'm not sure if pandas is the best approach here (if not, I'm open to any suggestions).
My goal: read the file, determine the number of rows in the dataframe, divide the dataframe into chunks of 3000 rows each (including the header row), and save each chunk as a separate .csv file.
My code so far:
import os
import pandas as pd

i = 0
while os.path.exists("output/path/chunk%s.csv" % i):
    i += 1

size = 3000
df = pd.read_csv('/input/path/input.csv')
list_of_dfs = [df.loc[i:i+size-1, :] for i in range(0, len(df), size)]
for x in list_of_dfs:
    x.to_csv('/output/path/chunk%s.csv' % i, index=False)
The above code doesn't throw any error, but it creates only one file ('chunk0.csv') with 1439 rows instead of 3000.
Could someone help me with this? Thanks in advance!
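For reference, a minimal runnable sketch of the chunking I'm describing. The dummy 50,000-row DataFrame and the temp directory here are placeholders standing in for the real input/output paths; using enumerate is one way to give each output file its own index (in my code above, every chunk was written to the same 'chunk0.csv', which is why only the last, shorter chunk survived):

```python
import os
import tempfile
import pandas as pd

size = 3000

# Placeholder for pd.read_csv('/input/path/input.csv'):
# a dummy frame with 50,000 rows so the example is self-contained.
df = pd.DataFrame({'value': range(50_000)})

# iloc is position-based, so the half-open slice [start:start+size]
# yields consecutive blocks of at most `size` rows each.
chunks = [df.iloc[start:start + size] for start in range(0, len(df), size)]

# enumerate numbers each chunk, so every file gets a distinct name
# instead of all chunks overwriting the same file.
out_dir = tempfile.mkdtemp()  # placeholder for '/output/path'
for n, chunk in enumerate(chunks):
    chunk.to_csv(os.path.join(out_dir, 'chunk%s.csv' % n), index=False)
```

With 50,000 rows and size 3000 this produces 17 files: 16 full chunks of 3000 rows and one final chunk of 2000 rows, each written with its own header because to_csv includes the header by default.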