I have a pandas DataFrame (df) with a column, say col3, that contains numbers. Each number appears in multiple rows, and I want to run a function separately on the rows belonging to each number.

So I wrote each number once into an array like this:

l = df.col3.unique()

Then a for loop is used to run a function for each number:

for i in l:
    a, b = func(df[df.col3 == i])

On each iteration, the function receives the rows where col3 equals the current value of i. It returns two data frames (a and b here), and I need to keep both of them from every iteration.

I want to be able to identify them properly, so I would like to save the returned data frames inside the loop like this:

First run: a123, b123; second run: a456, b456; third run: a789, b789

That is, the name of each data frame contains the current value of i.

I have already read that I should not use global variables for dynamic variable names, but I do not know how to do this instead.

Thank you :)

  • How do you use these data frames? Commented Mar 20, 2018 at 3:16
  • I will compare the results. Why is that important? I just want the current value of i within a run to be part of the names of the two data frames. Commented Mar 20, 2018 at 10:17
  • Why are the names of the variables so important? You can use a dict with the col3 value as the key to store the data frames. Commented Mar 20, 2018 at 10:48
  • I need it because if I just call them df1, df2, ... then I always have to look inside them to remember which one contains which data. With the names, I would immediately see which data is in data frame a123. Yes, I have read about dictionaries, but I can't manage to use them properly for my problem. That is why I'm asking this community. Commented Mar 20, 2018 at 14:26

1 Answer

Solution A (recommended):

dfs = {}

for i in l:
    dfs["a"+str(i)], dfs["b"+str(i)] = func(df[df.col3 == i])
...

You can then use the data frames like this:

func2(dfs["a1"]) # dfs["a1"] is the first data frame returned by func(df[df.col3 == i]) for i == 1.
...
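
To make Solution A concrete, here is a minimal, self-contained sketch. The sample data and the body of func are hypothetical stand-ins (the question does not show them), and keying the dict by the number itself rather than by a string like "a123" is just one possible variation:

```python
import pandas as pd

# Hypothetical sample data; col3 holds the number shared by several rows.
df = pd.DataFrame({
    "col1": [10, 20, 30, 40, 50],
    "col3": [123, 456, 123, 789, 456],
})


def func(sub):
    """Hypothetical stand-in for the question's func: returns two data frames."""
    a = sub[["col1"]].copy()
    b = sub.assign(doubled=sub["col1"] * 2)
    return a, b


# One entry per distinct col3 value, keyed by the value itself.
results = {}
for i in df["col3"].unique():
    a, b = func(df[df["col3"] == i])
    results[i] = {"a": a, "b": b}

# Look up a result by the number it belongs to.
a123 = results[123]["a"]
b456 = results[456]["b"]
```

With this layout, results[123]["a"] plays the role of the a123 variable from the question, and iterating over results.items() visits every pair without knowing the numbers in advance.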

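Solution B (not recommended):

If you absolutely want separate variables, you can write into the namespace dictionary. This only works at module level, where locals() is the same dictionary as globals(); inside a function, assigning through locals() does not create local variables.

for i in l:
    locals()["a"+str(i)], locals()["b"+str(i)] = func(df[df.col3 == i])

You can then use the data frames by their variable names: a1, b1, etc.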
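
As a small check of why Solution B is fragile, the sketch below writes through locals() inside a function and then tries to read the name back. The function names and the string "frame" are made up for illustration; the dict version is shown next to it for comparison:

```python
def try_dynamic_local():
    # Writing into the dict returned by locals() inside a function
    # does not create a real local variable.
    locals()["a123"] = "frame"
    try:
        return a123  # looked up as a global, which does not exist here
    except NameError:
        return None


def use_dict_instead():
    # A plain dict behaves the same way at module level and inside functions.
    dfs = {}
    dfs["a123"] = "frame"
    return dfs["a123"]


locals_result = try_dynamic_local()
dict_result = use_dict_instead()
```

Here try_dynamic_local() ends up in the NameError branch, while the dict lookup succeeds, which is one more reason to prefer Solution A.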

1 Comment

I used your first solution and split the list afterwards to get the separate data frames I need. Thanks a lot! :)