I have a data frame that I'd like to aggregate over two variables, applying the function mean to each measurement. Here is the head of the data frame:

  Subject Activity         meassureA         meassureB         meassureC       meassureD
1       1  running         0.2820216      -0.037696218       -0.13489730       -0.3282802
2       1  running         0.2558408      -0.064550029       -0.09518634       -0.2292069
3       1  walking         0.2548672       0.003814723       -0.12365809       -0.2751579
4       2  running         0.3433705      -0.014446221       -0.16737697       -0.2299235

Now, I would like to get something like this:

  Subject Activity         meassureA         meassureB         meassureC       meassureD
1       1  running         mean(S1,A1)      mean(S1,A1)       mean(S1,A1)       mean(S1,A1)
2       1  walking         mean(S1,A2)      mean(S1,A2)       mean(S1,A2)       mean(S1,A2)
3       2  running         mean(S2,A1)      mean(S2,A1)       mean(S2,A1)       mean(S2,A1)
4       2  walking         mean(S2,A2)      mean(S2,A2)       mean(S2,A2)       mean(S2,A2)

Here the value of meassureA in the first row is the mean of all meassureA values for subject 1 (S1) performing the activity running (A1), and likewise for the other columns and rows.

I was thinking of using aggregate(), but I am not able to apply what I learned so far to my problem. Any help is highly appreciated.

  • Not sure but maybe aggregate(. ~ Subject + Activity, df, mean)? Commented Apr 26, 2015 at 14:33
  • @DavidArenburg, I think that's what they are looking for. Commented Apr 26, 2015 at 14:33
  • Also could probably take a look here Commented Apr 26, 2015 at 14:39

1 Answer

As mentioned by David in the comments, you could do:

aggregate(. ~ Subject + Activity, df, mean)

Or using data.table:

data.table::setDT(df)[, lapply(.SD, mean), by = .(Subject, Activity)]

Or using dplyr:

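library(dplyr)
# summarise_each() and funs() are deprecated in current dplyr; across() (dplyr >= 1.0.0) replaces them
df %>% group_by(Subject, Activity) %>% summarise(across(everything(), mean), .groups = "drop")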

Which gives:

#  Subject Activity meassureA    meassureB  meassureC  meassureD
#1       1  running 0.2689312 -0.051123123 -0.1150418 -0.2787436
#2       1  walking 0.2548672  0.003814723 -0.1236581 -0.2751579
#3       2  running 0.3433705 -0.014446221 -0.1673770 -0.2299235
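
To make the result above reproducible, here is a minimal sketch in base R. It assumes the data frame contains only the four rows shown in the question's head, so the name df and its contents are reconstructed for illustration rather than taken from the full data set. The explicit sort at the end is an addition, since aggregate() does not necessarily return the groups in the order printed above.

```r
# Rebuild the four rows shown in the question (values copied from its head).
df <- data.frame(
  Subject   = c(1, 1, 1, 2),
  Activity  = c("running", "running", "walking", "running"),
  meassureA = c(0.2820216, 0.2558408, 0.2548672, 0.3433705),
  meassureB = c(-0.037696218, -0.064550029, 0.003814723, -0.014446221),
  meassureC = c(-0.13489730, -0.09518634, -0.12365809, -0.16737697),
  meassureD = c(-0.3282802, -0.2292069, -0.2751579, -0.2299235)
)

# The "." on the left-hand side stands for every column not named on the
# right-hand side, so all four measurement columns are averaged per group.
res <- aggregate(. ~ Subject + Activity, data = df, FUN = mean)

# Sort by Subject, then Activity, to get a predictable row order.
res <- res[order(res$Subject, res$Activity), ]
rownames(res) <- NULL
print(res)
```

Only Subject/Activity combinations that actually occur in the data appear in the result, which is why there is no row for subject 2 walking here.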

1 Comment

Or data.table::setDT(df)[, lapply(.SD, mean), by = .(Subject, Activity)] if you know what you're doing :)
