Problem:

I am using a function that takes a (global) variable as input, performs operations on it (locally), and then returns it. I only want the variable to change locally, but my function is changing the global variable as well.

Code to reproduce:

import pandas as pd

data = {'A' : [1,2,3],
        'B' : [4,5,6],
        'C' : [7,8,9]}
df = pd.DataFrame(data)

def func(df):
    df['D'] = df['A'] * df['B'] / df['C']
    return df

func(df) # running function, without assigning it to original variable

print(df)

Returns:

Running the code shows that the original dataframe has been changed and a column was added.

   A  B  C         D
0  1  4  7  0.571429
1  2  5  8  1.250000
2  3  6  9  2.000000

Expected behaviour:

I want to run the function without adding the column to the global variable; the column should only be added locally within the function.

   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9

Set-up:

  • Python 3.7
  • Pandas 0.25.3
  • Windows 10
  • df is a local variable, but it refers to the same object as the global variable df. All function arguments behave this way. The general rule is: don't use a mutating method on a function argument unless you intend to modify the object. Commented Mar 11, 2020 at 19:34
  • or do ret = df.copy() and work with the ret dataframe in your function. Commented Mar 11, 2020 at 19:38
  • You need to make your own local copy of df if you intend to add a new column temporarily, or (if other threads using the data frame aren't a concern) you can remove the new column with del df['D'] before returning. Commented Mar 11, 2020 at 19:40
  • Don't focus on the variables - variables are just ways to refer to objects, and objects are what really matter in a Python program. Commented Mar 11, 2020 at 19:45
  • Also see nedbatchelder.com/text/names.html Commented Mar 11, 2020 at 19:45
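
The point made in the comments above can be checked with plain Python objects. This is only a minimal sketch: add_item and add_item_copy are hypothetical names that do not appear in the question, and a list stands in for the DataFrame.

```python
def add_item(items):
    # `items` is a new local name, but it is bound to the caller's list object.
    items.append(4)          # mutates the shared object in place
    return items


def add_item_copy(items):
    local = list(items)      # shallow copy: a new list object
    local.append(4)          # only the copy changes
    return local


original = [1, 2, 3]
result = add_item(original)
same_object = result is original   # the function returned the very same object

untouched = [1, 2, 3]
copied = add_item_copy(untouched)

print(original)     # [1, 2, 3, 4]
print(same_object)  # True
print(untouched)    # [1, 2, 3]
print(copied)       # [1, 2, 3, 4]
```

The same reasoning applies to a DataFrame: assigning a new column is an in-place change to the object that both names refer to.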

1 Answer

You can make a local copy:

def func(d):
    df = d.copy()
    df['D'] = df['A'] * df['B'] / df['C']
    return df
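
A quick way to confirm that the copy-based version leaves the caller's DataFrame alone is sketched below, using the data from the question. func_assign is a hypothetical alternative not mentioned in the thread; it assumes DataFrame.assign is acceptable here, which returns a new DataFrame instead of modifying the one it is called on.

```python
import pandas as pd


def func(d):
    df = d.copy()                          # work on a local copy
    df['D'] = df['A'] * df['B'] / df['C']
    return df


def func_assign(d):
    # Hypothetical alternative: assign() builds and returns a new DataFrame,
    # so `d` itself is never modified.
    return d.assign(D=d['A'] * d['B'] / d['C'])


data = {'A': [1, 2, 3],
        'B': [4, 5, 6],
        'C': [7, 8, 9]}
df = pd.DataFrame(data)

result = func(df)
result_assign = func_assign(df)

print(list(df.columns))      # ['A', 'B', 'C']
print(list(result.columns))  # ['A', 'B', 'C', 'D']
```

Note that copy() duplicates the data, so for very large frames the extra memory may matter.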

3 Comments

Thanks, this works! I'm still wondering, why doesn't my code behave as I expected?
Because passing parameters (and in fact all assignments) in Python is done by object reference. When you do x = y, you're not copying the value of y - instead, you're making x refer to the same object as y does. Therefore df in your function refers to the same object as the actual argument (which you somewhat unfortunately also named df).
Okay, so to make sure I understand correctly, moving variables into a different namespace does not automatically create a copy of the variable into that new namespace. Thus changing the variable in one namespace (local) will also change the variable in the other namespace (global). Is that right?
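
The last question is left unanswered in the thread. One way to explore it is to compare rebinding a name with mutating an object; the sketch below uses hypothetical functions rebind and mutate and a plain list rather than a DataFrame. Each namespace holds its own name, but both names can be bound to one object, so only in-place changes to that object are visible through both.

```python
def rebind(values):
    # `values + [99]` builds a new list; the assignment rebinds only the
    # local name, so the caller's object is left alone.
    values = values + [99]
    return values


def mutate(values):
    # append() changes the object itself, which the caller's name also refers to.
    values.append(99)
    return values


shared = [1, 2]

rebound = rebind(shared)
after_rebind = list(shared)          # snapshot of the global after rebind()

mutated = mutate(shared)
after_mutate = list(shared)          # snapshot of the global after mutate()

print(after_rebind)        # [1, 2]
print(after_mutate)        # [1, 2, 99]
print(rebound is shared)   # False
print(mutated is shared)   # True
```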
