I have a DataFrame with a million rows, and I am trying to break it up into smaller DataFrames of 1000 rows each. I can split it into chunks of 1000 rows with the code below:

size = 1000
# iloc slices by position and the end is exclusive, so i:i+size yields exactly `size` rows
list_of_dfs = [df.iloc[i:i+size] for i in range(0, len(df), size)]

However, I am unsure how to access each of these smaller DataFrames in a loop so that I can read all 1000 of the small DataFrames that have been created.

1 Answer

The output is a list of DataFrames, so you can loop over it:

out = []
for df_small in list_of_dfs:
    print(df_small)
    # processing...
    out.append(df_small)

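Similarly, you can slice inside the loop without building the list first:

for i in range(0, len(df), size):
    # iloc slices by position; the end of the slice is exclusive
    df_small = df.iloc[i:i+size]
    print(df_small)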
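
To make the loop concrete, here is a minimal, self-contained sketch. The 2,500-row DataFrame, the `value` column, and the per-chunk sum are assumptions made up for illustration; they stand in for the real data and the real processing step.

```python
import pandas as pd

# Hypothetical stand-in data: 2,500 rows instead of a million.
df = pd.DataFrame({"value": range(2500)})
size = 1000

# Slice by position; the end of an iloc slice is exclusive.
list_of_dfs = [df.iloc[i:i + size] for i in range(0, len(df), size)]

# Process each chunk in turn; here the "processing" is just a column sum.
chunk_sums = []
for df_small in list_of_dfs:
    chunk_sums.append(int(df_small["value"].sum()))

# The last chunk is shorter when len(df) is not a multiple of size.
chunk_lengths = [len(chunk) for chunk in list_of_dfs]
```

The loop never needs to know how many chunks exist, because it simply visits every element of the list.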

If the data are in a CSV file, you can use the chunksize parameter of pd.read_csv, which yields one DataFrame per chunk:

for df_small in pd.read_csv(filename, chunksize=size):
    print (df_small)
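
A runnable sketch of the chunksize approach, assuming a small in-memory CSV (built with `io.StringIO`) in place of a real file; the `value` column and the row count are hypothetical:

```python
import io
import pandas as pd

# Hypothetical in-memory CSV with 2,500 data rows, standing in for a file on disk.
csv_text = "value\n" + "\n".join(str(n) for n in range(2500))

size = 1000
row_counts = []
# With chunksize set, read_csv returns an iterator that yields DataFrames
# of up to `size` rows each, so the whole file is never loaded at once.
for df_small in pd.read_csv(io.StringIO(csv_text), chunksize=size):
    row_counts.append(len(df_small))
```

This can be useful when the full data set is too large to hold in memory comfortably.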

3 Comments

Thank you for your response. What I am trying to do here is access each of these DataFrames and perform some action on it. I am trying to see if I can refer to each DataFrame by some name (variable). I am able to do so using list_of_dfs[0] and so on, but I am not sure how many such DataFrames exist. I hope that makes my request clearer. Thanks.
@KevinNash - I am not sure I understand. Is it not possible to use a loop and process each DataFrame?
@KevinNash - If you need an approach like list_of_dfs[0], list_of_dfs[1], ..., list_of_dfs[N], then N = len(list_of_dfs) - 1
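
One possible way to address the comments above, as a sketch only: `enumerate` gives each chunk's position without knowing the count in advance, and a dictionary can supply names. The `chunk_N` naming scheme and the sample data are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in data: 2,500 rows instead of a million.
df = pd.DataFrame({"value": range(2500)})
size = 1000
list_of_dfs = [df.iloc[i:i + size] for i in range(0, len(df), size)]

# Index of the last chunk: valid indices run from 0 to len(list_of_dfs) - 1.
last_index = len(list_of_dfs) - 1

# enumerate pairs each chunk with its position in the list.
for n, df_small in enumerate(list_of_dfs):
    print(n, df_small.shape)

# If names are really needed, a dict keyed by a made-up label is one option.
named = {f"chunk_{n}": df_small for n, df_small in enumerate(list_of_dfs)}
```

A lookup such as `named["chunk_0"]` then behaves like a named variable, without creating variables dynamically.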
