5

Sorry for the simple question, but I can't think of a good way to apply functions across the elements of a list of data frames. I am sure there is something in the plyr/reshape2 packages, but I just can't think of it.

For example I have a list A as follows:

>A
[[1]]
        [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
   [1,]    1    1    1    1    1    1    1    1    1     1
   [2,]    1    1    1    1    1    1    1    1    1     1
   [3,]    1    1    1    1    1    1    1    1    1     1
   [4,]    1    1    1    1    1    1    1    1    1     1
   [5,]    1    1    1    1    1    1    1    1    1     1

[[2]]
       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    2    2    2    2    2    2    2    2    2     2
 [2,]    2    2    2    2    2    2    2    2    2     2
 [3,]    2    2    2    2    2    2    2    2    2     2
 [4,]    2    2    2    2    2    2    2    2    2     2
 [5,]    2    2    2    2    2    2    2    2    2     2

Say I want to take the mean across the corresponding elements of the matrices in the list. One way to do this would be

Reduce("+",A)/length(A)

I can't seem to feed Reduce() more complex functions, and I assume there is a better way in general.

5 Comments
  • two lists?... what do you mean by element?... the whole data frame? Commented Aug 16, 2011 at 21:38
  • Ok, I edited the question to better reflect what you're asking. Hopefully I captured what you're looking for... Commented Aug 16, 2011 at 21:42
  • sorry for the confusing language. I want to take the mean of each number in, for example, position [1,1] of each matrix in the list. Commented Aug 16, 2011 at 21:44
  • Any function that operates on two objects at a time will work with Reduce. mean accepts only one R object, and that's the reason it fails. I don't think there is a generic approach, but for a class of functions it would be possible to define a simple wrapper that makes it operate on two objects at a time and can be passed to Reduce. It would be useful if you could post what function you have in mind. Commented Aug 16, 2011 at 23:02
  • In general you can't convert a dataframe into an array unless all the data columns are homogeneous (e.g. all integer, as in your example, or all factor, or all string, or all Date). So you would be reduced to the do.call() approach. Commented May 16, 2012 at 15:18
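As a rough illustration of the comment about Reduce needing a function of two objects, here is a minimal sketch. The three-matrix list and the names sum_x, sum_x2, elem_mean and elem_var are made up for this example; the idea is only that any two-argument function can be folded over the list, and summaries such as a mean or variance can then be built from the running sums.

```r
# Hypothetical example data: three 5 x 10 matrices filled with 1, 2 and 6
A <- list(matrix(1, 5, 10), matrix(2, 5, 10), matrix(6, 5, 10))
n <- length(A)

# `+` takes two objects, so Reduce can fold it over the list
sum_x <- Reduce(`+`, A)

# Any two-argument function works; here the accumulator starts at 0
sum_x2 <- Reduce(function(acc, m) acc + m^2, A, 0)

# Element-wise mean and sample variance built from the two running sums
elem_mean <- sum_x / n
elem_var  <- (sum_x2 - sum_x^2 / n) / (n - 1)
```

This only covers statistics that can be written in terms of pairwise accumulation; something like a median cannot be expressed this way.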

2 Answers

7

In this case, maybe you're better off with your data in an array rather than a list?

#Recreate data
A <- list(a=matrix(1,5,10),b=matrix(2,5,10))

#Convert to array
A1 <- array(do.call(cbind,A),dim = c(5,10,2))

#Better way to convert to array
library(abind)  # third-party package; install.packages("abind") if needed
A1 <- abind(A,along = 3)

#Now we can simply use apply
apply(A1,c(1,2),mean)
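Once the data is in an array, the same pattern should extend to other summary functions. The following is a small sketch under assumed example data (three matrices rather than two, with names elem_median, elem_sd and elem_mean chosen for illustration), using only base R to build the array:

```r
# Hypothetical example data: three 5 x 10 matrices filled with 1, 2 and 6
A <- list(a = matrix(1, 5, 10), b = matrix(2, 5, 10), c = matrix(6, 5, 10))

# Stack into a 5 x 10 x 3 array using base R only
A1 <- array(unlist(A, use.names = FALSE), dim = c(dim(A[[1]]), length(A)))

# Any summary function can be applied over the third dimension
elem_median <- apply(A1, c(1, 2), median)
elem_sd     <- apply(A1, c(1, 2), sd)

# For the mean specifically, rowMeans() with dims = 2 avoids the apply loop
elem_mean <- rowMeans(A1, dims = 2)
```

This assumes every matrix in the list has the same dimensions; abind(A, along = 3) would give the same array.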

3 Comments

that's what @hadley suggested to me a while ago
It doesn't change the structure: apply(abind, A, along=3), 1:2, mean) does not affect A.
Oops, I think that's a minor typo by @DWin: apply(abind(A,along=3),c(1,2),mean).
2

Maybe do.call?

# `+` accepts at most two arguments, so this only works when A has exactly two elements
do.call(`+`, A)/length(A)

Or if you really don't want to abind it into a larger array,

array(sapply(seq_along(A[[1]]), function(i) mean(sapply(A,`[`,i))), 
      dim=dim(A[[1]]))
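The same position-by-position idea can be wrapped in a small helper. This is only a sketch: elementwise is a hypothetical name, not a function from any package, and it assumes that all matrices share the same dimensions and that f returns a single value.

```r
# Hypothetical helper: apply f to the values found at each position
elementwise <- function(mats, f) {
  # mapply walks the matrices in parallel, one element at a time
  vals <- do.call(mapply, c(list(FUN = function(...) f(c(...))), unname(mats)))
  # restore the original matrix shape
  array(vals, dim = dim(mats[[1]]))
}

# Hypothetical example data: three 5 x 10 matrices filled with 1, 2 and 6
A <- list(a = matrix(1, 5, 10), b = matrix(2, 5, 10), c = matrix(6, 5, 10))

elem_median <- elementwise(A, median)
elem_range  <- elementwise(A, function(v) max(v) - min(v))
```

Like the sapply version, this calls f once per cell, so it is likely to be slower than the array approach on large matrices.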

2 Comments

that is similar to Reduce() and I run into problems when I supply a more complex function to it (like 'mean')
Actually mean isn't complex enough; this will work on any function that takes multiple arguments and handles them in a vectorized way. Given the further information, though, joran's method of putting it in an array is what you want, though I've added a much clumsier sapply solution, which is basically just binding each spot together in turn.
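To make "takes multiple arguments and handles them in a vectorized way" concrete, here is a short sketch with assumed example data. pmax() and pmin() are base R functions that accept any number of arguments, so do.call can hand them the whole list at once, while a binary operator such as `+` still has to be folded with Reduce when the list has more than two elements.

```r
# Hypothetical example data: three 5 x 10 matrices filled with 1, 2 and 6
A <- list(a = matrix(1, 5, 10), b = matrix(2, 5, 10), c = matrix(6, 5, 10))

# pmax() and pmin() accept any number of arguments and work element by element
elem_max <- do.call(pmax, unname(A))
elem_min <- do.call(pmin, unname(A))

# `+` is binary, so with three matrices it is folded with Reduce instead
elem_sum <- Reduce(`+`, A)
```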
