
I have a dataframe:

df <- data.frame(Category = c(rep("A", 3), rep("B", 3)), Value = rnorm(6))
df
  Category       Value
1        A -0.94968814
2        A  2.56687061
3        A -0.15665153
4        B -0.47647105
5        B  0.83015076
6        B -0.03744522

Now I want to add another column containing the mean per Category. This is easy to do with the dplyr package:

library(dplyr)

df %>% group_by(Category) %>% 
  summarize(mean = mean(Value))

In my actual code, however, I can't write mean(Value) directly, because the column name is stored in a variable: columnName = "Value". Unfortunately, this doesn't work:

columnName = "Value"

df %>% group_by(Category) %>% 
  summarize(mean = mean(columnName))

Warning messages:
1: In mean.default("Value") : argument is not numeric or logical: returning NA
2: In mean.default("Value") : argument is not numeric or logical: returning NA

How can I pass the column name to summarize via the variable?

  • mean(df[,columnName]) this code worked for me, when using the same variables as you did. Commented Dec 21, 2016 at 10:06
  • No, that doesn't work. It has to be the mean of each group, not the mean of the whole column. Commented Dec 21, 2016 at 10:08
  • It is not using the package dplyr but it works like this: tapply(df[,columnName],df$Category, mean) Commented Dec 21, 2016 at 10:14
  • please use set.seed when using such functions as rnorm to create data frames so we can double check results Commented Dec 21, 2016 at 10:16
  • This is called standard evaluation. There are hundreds of dupes regarding this on SO. Please read vignette("nse"). One way to achieve this is library(lazyeval) ; dots <- interp(~ mean(columnName), columnName = as.name("Value")) ; df %>% group_by(Category) %>% summarise_(.dots = dots) Commented Dec 21, 2016 at 10:17

1 Answer

We can use get with aggregate:

aggregate(get(columnName)~Category, df, mean)

#    Category get(columnName)
#1        A      -0.5490751
#2        B      -0.2594670

1 Comment

This works, thanks! But I was looking for a solution within the dplyr package. Do you know if that is possible too?
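
For a dplyr-only route, here is a minimal sketch rather than a definitive answer. It assumes a dplyr version with the tidy-evaluation interface (which postdates this thread) and that the rlang package is installed alongside it. The seed and the names res_pronoun and res_sym are made up for illustration. Two options that should work: look the column up by its string name through the .data pronoun, or convert the string to a symbol and unquote it with !!.

```r
library(dplyr)

set.seed(42)  # hypothetical seed, only so the example is reproducible
df <- data.frame(Category = c(rep("A", 3), rep("B", 3)), Value = rnorm(6))
columnName <- "Value"

# Option 1: the .data pronoun fetches the column by its string name,
# evaluated per group inside summarize()
res_pronoun <- df %>%
  group_by(Category) %>%
  summarize(mean = mean(.data[[columnName]]))

# Option 2: convert the string to a symbol and unquote it with !!
res_sym <- df %>%
  group_by(Category) %>%
  summarize(mean = mean(!!rlang::sym(columnName)))

res_pronoun
```

Both calls should return one row per Category with the group mean, matching what tapply(df[, columnName], df$Category, mean) from the comments gives.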
