I have a very big dataset and I analyze it with R.

The problem is that I want to add some columns to my dataset using different treatments, AND I need some recursive functions which use global variables. Each function modifies some global variables and creates some new ones. So duplicating my dataset in memory is a big problem...

I read some documentation: if I understood it correctly, neither <<- nor assign() can help me...

What I want:

mydata <- list(read.table(), ...)
myfunction <- function(var1, var2) {
   #modification of global mydata
   mydata = ...
   #definition of another variable with the new mydata
   var3 <- ...
   #recursive function
   mydata = myfunction(var2, var3)
}

Do you have some suggestions for my problem?

1 Answer

Both <<- and assign will work:

myfunction <- function(var1, var2) {
   # Modification of global mydata
   mydata <<- ...
   # Alternatively:
   #assign('mydata', ..., globalenv())

   # Assign locally as well
   mydata <- mydata

   # Definition of another variable with the new mydata
   var3 <- ...

   # Recursive function
   mydata = myfunction(var2, var3)
}

That said, it’s almost always a bad idea to want to modify global data from a function, and there’s almost certainly a more elegant solution to this.
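One common alternative, sketched here under assumptions, is to pass the data in as an argument and return the updated version, so that the recursion never touches a global variable. The names update_rows, key, depth and value are hypothetical, the data frame is a toy stand-in for the real dataset, and the update rule is invented purely for illustration.

```r
# Hypothetical example: update_rows, key, depth and value are illustrative names,
# not part of the original question.
update_rows <- function(data, key, depth) {
  # Base case: stop recursing and hand back the current state of the data.
  if (depth == 0) return(data)

  # Modify the function's own copy of the data (rows where colx equals key).
  data[data[, "colx"] == key, "value"] <- depth

  # Recurse, passing the updated data along instead of relying on a global.
  update_rows(data, key + 1, depth - 1)
}

mydata <- data.frame(colx = 1:3, value = 0)
mydata <- update_rows(mydata, key = 1, depth = 3)
```

The caller decides where the result goes (here it overwrites mydata), which keeps the function free of side effects. Whether this costs more or less memory than the global approach depends on R's copy-on-modify behaviour for the particular dataset, so it is worth measuring rather than assuming.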

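Furthermore, note that <<- is actually not the same as assigning to a variable in globalenv(). Rather, it assigns to a variable in an enclosing scope: R walks up the environments that enclose the function until it finds an existing variable of that name, and only creates one in the global environment if none is found. For functions defined in the global environment, the enclosing scope is the global environment. For functions defined elsewhere (for example, inside another function), it’s not.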
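The difference can be seen in a small sketch. All names here (make_counter, counter, bump, bump_global) are hypothetical and chosen only to illustrate the point: a function defined inside another function updates the enclosing variable with <<-, while assign() with globalenv() always targets the global one.

```r
# Hypothetical names throughout; this only illustrates where each assignment lands.
make_counter <- function() {
  counter <- 0                  # lives in make_counter()'s environment
  function() {
    counter <<- counter + 1     # updates the enclosing counter, not the global one
    counter
  }
}

counter <- 100                  # a separate global variable with the same name
bump <- make_counter()
bump()
bump()

closure_counter <- environment(bump)$counter   # 2: the enclosing variable changed
global_counter  <- counter                     # 100: the global one did not

# assign() with an explicit environment always writes to the global variable.
bump_global <- function() {
  assign("counter", get("counter", envir = globalenv()) + 1, envir = globalenv())
}
bump_global()                   # the global counter is now 101
```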

3 Comments

I knew that, but when I coded it with assign() or <<-, I modified mydata in the global environment but not in the local environment of my function. So when I called the function recursively inside myfunction(), the modifications were not applied. The modifications look like: dataset[dataset[, "colx"]==var1, ] = anotherfunction(), so assign() is not really handy for me here...
@EaudeRoche Oh, got it. But actually the solution to this is straightforward: do two assigns, one local and one nonlocal. See my edit. You can also change the order of these operations around: assign first locally and then globally.
@konrad-rudolf OK! So I will have to duplicate the dataset for each iteration anyway.