I have a really simple problem that I can't solve in Pandas. I start with a dataframe and want to apply some function to it. I want to repeat this many times and stack the results of each operation into a new, larger dataframe. I was thinking of doing this with a for loop. Here is a simplified example that I cannot get to work:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(3, 4), columns=list('ABCD'))

large_df = df*0

for i in range(1,10):
    df_new = df*i
    large_df= pd.concat(large_df,df_new)

large_df

Any ideas?

  • large_df = pd.concat([large_df,df_new]) Commented Mar 21, 2014 at 11:15
  • You could just append: large_df= large_df.append(df_new) Commented Mar 21, 2014 at 11:16

1 Answer

It is fastest to build all of the results first and concatenate once at the end. If you append one result at a time, pandas has to allocate a new frame and copy all of the rows accumulated so far on every iteration.
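For the toy example in the question, that idea might look like the sketch below. It keeps the question's loop but collects the intermediate frames in a plain Python list and calls pd.concat only once. The name frames and the use of ignore_index=True are my own choices for illustration, not something taken from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(3, 4), columns=list('ABCD'))

# Collect every intermediate result in a plain Python list.
# Starting with df * 0 mirrors the question's initial large_df.
frames = [df * 0]
for i in range(1, 10):
    frames.append(df * i)

# Concatenate once at the end. pd.concat expects a list (or other
# iterable) of frames as its first argument, not two separate frames.
# ignore_index=True is an assumption here: it renumbers the rows 0..n-1
# instead of repeating the original index 0, 1, 2 for every block.
large_df = pd.concat(frames, ignore_index=True)
```

Appending to a Python list is cheap, so the only large copy happens in the final pd.concat call.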

So, if you are applying some_function with a different parameter p on each pass through the loop (like i in your toy example above), I suggest:

pd.concat([df.apply(lambda x: some_function(x, p)) for p in parameters])
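To make the one-liner above concrete, here is a self-contained sketch. The body of some_function and the values in parameters are hypothetical stand-ins that mirror the question's toy example (scaling by i). Passing keys and names to pd.concat is an optional extra, added so that each block can be traced back to the parameter that produced it.

```python
import numpy as np
import pandas as pd


def some_function(column, p):
    # Hypothetical stand-in: df.apply passes one column (a Series)
    # at a time, and this simply scales it by p.
    return column * p


df = pd.DataFrame(np.random.randn(3, 4), columns=list('ABCD'))
parameters = list(range(1, 10))  # assumed values, mirroring range(1, 10)

# Build every result first, then concatenate once.
results = pd.concat(
    [df.apply(lambda x: some_function(x, p)) for p in parameters],
    keys=parameters,      # outer index level: the parameter value
    names=['p', 'row'],   # labels for the two index levels
)

# Select the block produced by a single parameter value.
block_for_3 = results.loc[3]
```

With the keys in place, the result has a two-level index, so results.loc[p] returns the rows computed for one parameter. Drop keys and names if a flat stack of rows is all you need.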