
I am trying to write a custom function for batch analysis. It starts out like this:

> myfunction <- function(DATA, col1, col2, col3){
>   print(class(col3))  #"name"
>   print(is.object(col3))  #FALSE
>   library(plyr) 
>   output <- ddply(DATA, .(eval(col1), eval(col2)), summarize, N=sum(eval(col3)),...)
>   ...
> }

> myfunction(DATA=df, col1=quote(colA), col2=quote(colB), col3=quote(colC))  #colA[chr], colB[chr], colC[numeric] are column names in dataframe df

but this is what I get:

> Error in eval("col3") : object 'col3' not found
> 10. eval("col3")
> 9. eval(cols[[col]], .data, parent.frame())
> 8. eval(cols[[col]], .data, parent.frame())
> 7. .fun(piece, ...)
> 6. (function (i)  { piece <- pieces[[i]] if (.inform) { ...
> 5. loop_apply(n, do.ply)
> 4. llply(.data = .data, .fun = .fun, ..., .progress = .progress,  .inform = .inform, .parallel = .parallel, .paropts = .paropts)
> 3. ldply(.data = pieces, .fun = .fun, ..., .progress = .progress,  .inform = .inform, .parallel = .parallel, .paropts = .paropts)
> 2. ddply(DATA, .(eval(col1), eval(col2)), summarize, N = sum(eval(col3)))
> 1. myfunction(DATA = df, col1 = quote(colA), col2= quote(colB),  col3 = quote(colC))

I don't understand why the error only comes up at col3, while nothing goes wrong with col1 and col2.

Since class(col3) inside the custom function shows that col3 is a "name", I replaced eval() with get(), but that doesn't work either.

Can anyone tell me how to get the object behind the name col3?

Or have I been on the wrong track from the start, and do I need to rethink my approach altogether?

  • plyr is a somewhat outdated package. It may be easier to work with dplyr. Commented Nov 19, 2021 at 19:50
  • Because it is so outdated, and many of its functions have counterparts in dplyr, it is going to be a challenge to keep working in plyr. Commented Nov 19, 2021 at 20:12

1 Answer

It may be easier to use dplyr instead of plyr. If you want to stay with plyr, here is one way to change the function: pass the column names as strings and index the data with [[ ]].

myfunction <- function(DATA, col1, col2, col3){
    # col1, col2: grouping column names (strings); col3: name of the column to sum (string)
    plyr::ddply(DATA, c(col1, col2),
         .fun = function(.data) c(N = sum(.data[[col3]], na.rm = TRUE)))
}
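
As for why only col3 failed in the original: judging from the traceback, summarize evaluates its arguments with eval(cols[[col]], .data, parent.frame()), and that parent.frame() is a frame inside plyr rather than myfunction, so the local variable col3 is not visible there. The grouping variables are wrapped in .(), which, as far as I can tell, records the calling environment, so col1 and col2 resolve. The sketch below reproduces the effect in base R only; summarise_like, apply_like, wrapper_quoted and wrapper_fixed are hypothetical stand-ins, not plyr code.

```r
# Hypothetical stand-in for a summarize-style helper: it captures its
# arguments unevaluated and evaluates them against the data, falling back
# to the frame it was called from.
summarise_like <- function(.data, ...) {
  exprs <- as.list(substitute(list(...)))[-1]
  env <- parent.frame()
  out <- list()
  for (nm in names(exprs)) out[[nm]] <- eval(exprs[[nm]], .data, env)
  out
}

# Hypothetical stand-in for the apply machinery: the helper is called from
# here, so its parent.frame() is this frame, not the user's function.
apply_like <- function(data, fun, ...) fun(data, ...)

# Mirrors the original approach: a quoted column name held in a local variable.
wrapper_quoted <- function(data, value_col) {
  apply_like(data, summarise_like, N = sum(eval(value_col)))
}

# Mirrors the approach above: a string column name used inside a closure.
wrapper_fixed <- function(data, value_col) {
  apply_like(data, function(piece) c(N = sum(piece[[value_col]])))
}

# The quoted version fails because value_col is looked up in the data and
# then in apply_like's frame, never in wrapper_quoted's frame.
err <- tryCatch(wrapper_quoted(mtcars, quote(mpg)),
                error = function(e) conditionMessage(e))
print(err)

# The closure version finds value_col through lexical scoping.
ok <- wrapper_fixed(mtcars, "mpg")
print(ok)
```

The closure version mirrors the function above: the anonymous function is defined inside the wrapper, so it sees the column name directly and no eval() is needed.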

Testing:

> myfunction(mtcars, "cyl", "vs", "mpg")
  cyl vs     N
1   4  0  26.0
2   4  1 267.3
3   6  0  61.7
4   6  1  76.5
5   8  0 211.4
# outside the function
> plyr::ddply(mtcars, c("cyl", "vs"), summarize, N = sum(mpg))
  cyl vs     N
1   4  0  26.0
2   4  1 267.3
3   6  0  61.7
4   6  1  76.5
5   8  0 211.4
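
For comparison, here is a minimal dplyr sketch of the same summary, assuming dplyr 1.0.0 or later is installed. myfunction_dplyr is a hypothetical name, and the column names are again passed as strings.

```r
library(dplyr)

# Hypothetical dplyr counterpart of myfunction: group by two columns given
# as strings and sum a third column, also given as a string.
myfunction_dplyr <- function(DATA, col1, col2, col3) {
  DATA %>%
    group_by(across(all_of(c(col1, col2)))) %>%
    summarise(N = sum(.data[[col3]], na.rm = TRUE), .groups = "drop")
}

res <- myfunction_dplyr(mtcars, "cyl", "vs", "mpg")
print(res)
```

all_of() selects the grouping columns named in the character vector, and the .data pronoun looks up col3 by name, so no quote() or eval() is involved.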

1 Comment

This answer elegantly solves the problem, although I don't know what the differences are between the two styles of coding. I will spend some time on dplyr to avoid problems like this one :)
