
I have some function that takes a DataFrame and an integer as arguments:

func(df, int)

The function returns a new DataFrame, e.g.:

df2 = func(df, 2)

I'd like to write a loop over the integers 2 through 10, where each call takes the previous result as its input, producing 9 DataFrames. Written out manually, it would look like this:

df2 = func(df, 2)
df3 = func(df2, 3)
df4 = func(df3, 4)
df5 = func(df4, 5)
df6 = func(df5, 6)
df7 = func(df6, 7)
df8 = func(df7, 8)
df9 = func(df8, 9)
df10 = func(df9, 10)

Is there a way to write a loop that does this?

3 Answers

4

This type of thing is what lists are for.

data_frames = [df]
for i in range(2, 11):
    data_frames.append(func(data_frames[-1], i))

It's a sign of brittle code when you see variable names like df1, df2, df3, etc. Use lists when you have a sequence of related objects to build.

To clarify, this data_frames is a list of DataFrames that can be concatenated with data_frames = pd.concat(data_frames, sort=False), resulting in one DataFrame that combines the original df with everything that results from the loop, correct?

Yup, that's right. If your goal is one final data frame, you can concatenate the entire list at the end to combine the information into a single frame.
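As a rough sketch of that concatenation step, the example below builds the list and then combines it. The func here is a hypothetical stand-in (it just adds the integer to every cell), and the sample values are made up for illustration. Passing keys= to pd.concat is optional, but it labels each block of rows with the step that produced it, which may make the combined frame easier to inspect.

```python
import pandas as pd

# Hypothetical stand-in for the asker's func: adds n to every cell.
def func(frame, n):
    return frame + n

# Made-up sample data for illustration only.
df = pd.DataFrame({"a": [75, 48], "b": [18, 56]})

data_frames = [df]
for i in range(2, 11):
    data_frames.append(func(data_frames[-1], i))

# keys= tags each block of rows with its step:
# 1 for the original df, 2..10 for the results of the loop.
combined = pd.concat(
    data_frames,
    keys=list(range(1, 11)),
    names=["step", "row"],
)

# Select the rows produced by the final call, func(..., 10).
print(combined.loc[10])
```

Without keys=, the combined frame simply repeats the original row labels for each block.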

Do you mind explaining why data_frames[-1], which takes the last item of the list, returns a DataFrame? Not clear on this.

Because every entry in the list is a DataFrame at every point while the list is being built: it starts as [df], and each append adds the DataFrame returned by func. data_frames[-1] evaluates to the last element of the list, which in this case is the DataFrame you most recently appended.
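To make that concrete, here is a small sketch that pulls data_frames[-1] into a named variable on each pass and checks its type. The func below is a hypothetical placeholder (it multiplies every cell by the integer), and the loop is shortened to 2 through 4 to keep the output small.

```python
import pandas as pd

# Hypothetical placeholder for the asker's func: multiplies every cell by n.
def func(frame, n):
    return frame * n

df = pd.DataFrame({"x": [1, 2]})
data_frames = [df]

for i in range(2, 5):
    # Index -1 is the last element of the list, i.e. the newest DataFrame.
    previous = data_frames[-1]
    assert isinstance(previous, pd.DataFrame)
    data_frames.append(func(previous, i))

print(len(data_frames))   # the original df plus three results
print(data_frames[-1])    # the result of the final call, func(..., 4)
```

On the first pass the list holds only df, so data_frames[-1] is df itself; after that it is whatever the previous iteration appended.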

@ConfusedDolphin I edited answers to your comments in my answer.
1

You may try using itertools.accumulate as follows:

sample data

df:
    a   b   c
0  75  18  17
1  48  56   3

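import itertools

# Example func: adds the integer y to every cell of the DataFrame x.
def func(x, y):
    return x + y

# accumulate feeds each result back in as x, so the sequence starts with df
# and then yields func(df, 2), func(func(df, 2), 3), and so on up to 10.
dfs = list(itertools.accumulate([df] + list(range(2, 11)), func))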

[    a   b   c
 0  75  18  17
 1  48  56   3,     a   b   c
 0  77  20  19
 1  50  58   5,     a   b   c
 0  80  23  22
 1  53  61   8,     a   b   c
 0  84  27  26
 1  57  65  12,     a   b   c
 0  89  32  31
 1  62  70  17,     a   b   c
 0  95  38  37
 1  68  76  23,      a   b   c
 0  102  45  44
 1   75  83  30,      a   b   c
 0  110  53  52
 1   83  91  38,      a    b   c
 0  119   62  61
 1   92  100  47,      a    b   c
 0  129   72  71
 1  102  110  57]

dfs is a list of 10 DataFrames: the original df followed by nine results, each obtained by adding the next integer from 2 to 10 to the previous result.
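A possible variation, assuming Python 3.8 or newer: itertools.accumulate accepts an initial keyword argument, so df can be supplied as the starting value instead of being prepended to the list of integers. The sketch below reuses the sample data and the example func from this answer.

```python
import itertools
import pandas as pd

# Same example func as above: adds the integer y to every cell of x.
def func(x, y):
    return x + y

# Same sample data as above.
df = pd.DataFrame({"a": [75, 48], "b": [18, 56], "c": [17, 3]})

# initial= (Python 3.8+) seeds the accumulation with df, so the iterable
# only needs to contain the integers 2 through 10.
dfs = list(itertools.accumulate(range(2, 11), func, initial=df))

print(len(dfs))
print(dfs[-1])
```

The result has the same shape as before: df first, then one DataFrame per integer.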


If you want to concatenate them all into one DataFrame, use pd.concat:

pd.concat(dfs)

Out[29]:
     a    b   c
0   75   18  17
1   48   56   3
0   77   20  19
1   50   58   5
0   80   23  22
1   53   61   8
0   84   27  26
1   57   65  12
0   89   32  31
1   62   70  17
0   95   38  37
1   68   76  23
0  102   45  44
1   75   83  30
0  110   53  52
1   83   91  38
0  119   62  61
1   92  100  47
0  129   72  71
1  102  110  57
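Note that the row labels 0 and 1 repeat in the output above, because pd.concat keeps each frame's original index by default. If unique row labels are preferred, one option is ignore_index=True. The sketch below shows the difference on two small frames built from made-up values.

```python
import pandas as pd

# Two small frames for illustration; the second mimics one func step.
first = pd.DataFrame({"a": [75, 48]})
second = first + 2

# Default behaviour: each frame keeps its own row labels (0, 1, 0, 1).
stacked = pd.concat([first, second])

# ignore_index=True discards the old labels and renumbers from 0.
renumbered = pd.concat([first, second], ignore_index=True)

print(stacked.index.tolist())
print(renumbered.index.tolist())
```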


0

You can use exec with a formatted string:

for i in range(2, 11):
    exec("df{0} = func(df{1}, {0})".format(i, i - 1 if i > 2 else ''))
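One caveat with exec: names it creates are invisible to linters and to readers of the code, and inside a function such assignments do not reliably become local variables. If the goal is mainly to look up each result by its number, a dictionary keyed by the integer is a different technique that avoids exec altogether. The sketch below assumes a hypothetical func that adds the integer to every cell, with made-up sample data.

```python
import pandas as pd

# Hypothetical stand-in for the asker's func: adds n to every cell.
def func(frame, n):
    return frame + n

# Made-up sample data for illustration only.
df = pd.DataFrame({"a": [75, 48]})

# Key 1 holds the original df; keys 2..10 hold the successive results.
frames = {1: df}
for i in range(2, 11):
    frames[i] = func(frames[i - 1], i)

# frames[10] plays the role of the variable df10 in the question.
print(frames[10])
```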
