New to R, and in over my head!

I am trying to write code that will combine the following steps:

a) Find the minimum values, per row, between two columns

b) Sum the minimum values found

c) Do this among many columns and construct a pairwise matrix of the results

Steps a & b are easy enough for two columns at a time. Like this:

column1 = c(0.08,   0.20,   0.09,   0.19,   0.25,   0.20,   0.00)
column2 = c(0.07,   0.19,   0.09,   0.21,   0.25,   0.19,   0.00)
ps = data.frame(column1, column2)

sum(pmin(ps$column1,ps$column2))

But for step c, I am having difficulty writing code that will perform this operation for every pairwise column comparison in a data frame of 7 rows and 32 columns. This is what I've come up with so far:

d <- replicate(32, rnorm(7))
c <- combn(seq_len(ncol(d)),2)
mat1 <-  matrix(0,ncol=32,nrow=32,dimnames=list(colnames(d),colnames(d)))
v1 <- unlist(lapply(seq_len(ncol(c)),function(i) {d1<-d[,c[,i]];    length(which(d1[,1]!=0 & d1[,2]!=0)) }))

mat1[lower.tri(mat1)]<-v1 

I am pretty sure my issues lie within the "function" command associated with "v1". But I'm stumped and could really use a bit of help!

Again, my goal is to have a 32x32 matrix of the summed minimum values between each pairwise column comparison.

Does this make sense?

Thank you so much.

2 Answers

The outer function will do this and keep track of the bookkeeping for you, but you have to pass it a vectorized function.

summin <- Vectorize(function(i, j) sum(pmin(ps[[i]], ps[[j]])))
outer(seq_len(ncol(ps)), seq_len(ncol(ps)), FUN=summin)
##      [,1] [,2]
## [1,] 1.01 0.98
## [2,] 0.98 1.00
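
The same call should scale to the 32-column case. Here is a minimal sketch, assuming the data sit in a numeric matrix d as in the question; the seed, the column names, and the helper name summin_d are made up for illustration:

```r
set.seed(1)                                      # arbitrary seed, only for reproducibility
d <- replicate(32, rnorm(7))                     # 7 x 32 matrix, as in the question
colnames(d) <- paste0("col", seq_len(ncol(d)))   # hypothetical column names

# outer() hands FUN whole vectors of indices, so the scalar function
# is wrapped in Vectorize() to be applied pair by pair
summin_d <- Vectorize(function(i, j) sum(pmin(d[, i], d[, j])))

idx <- seq_len(ncol(d))
res <- outer(idx, idx, FUN = summin_d)
dimnames(res) <- list(colnames(d), colnames(d))

dim(res)           # 32 32
isSymmetric(res)   # TRUE, because pmin(x, y) equals pmin(y, x)
```

Note that d[, i] is used here rather than d[[i]], since d is a matrix and not a data frame.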

I have no idea what's supposed to be going on in your v1 code; it doesn't look like you're summing the minimums anymore.

If I was going to loop myself, I'd use expand.grid instead of combn, as then I get the diagonals and don't have to figure out how to populate the two sides of the matrix, though at the expense of doing all the computations twice. (The computer can do it twice faster than I can figure out how to ask it to do only once, anyway.) I'd also just make it as a vector and then convert to a matrix afterwards.

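cc <- expand.grid(seq_len(ncol(d)), seq_len(ncol(d)))  # every (i, j) pair, diagonal included
out <- sapply(seq_len(nrow(cc)), function(k) {
    i <- cc[k,1]
    j <- cc[k,2]
    # d[,i] selects a whole column for both matrices and data frames;
    # d[[i]] on a matrix would return a single element
    sum(pmin(d[,i],d[,j]))
})
out <- matrix(out, ncol=ncol(d))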
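
For comparison, a version that computes each pair only once can be sketched with combn. This is just one possible way to do the bookkeeping, assuming d is the 7 x 32 numeric matrix from the question; the names pairs, vals, and out2 are made up here:

```r
set.seed(1)                        # arbitrary seed, only for reproducibility
d <- replicate(32, rnorm(7))       # 7 x 32 matrix, as in the question

pairs <- combn(ncol(d), 2)         # 2 x 496 matrix; each column is a pair (i, j) with i < j
vals <- apply(pairs, 2, function(p) sum(pmin(d[, p[1]], d[, p[2]])))

out2 <- matrix(0, ncol(d), ncol(d))
out2[t(pairs)] <- vals             # fill the upper triangle via (i, j) index pairs
out2[t(pairs)[, 2:1]] <- vals      # mirror into the lower triangle via (j, i)
diag(out2) <- colSums(d)           # pmin(x, x) is x, so the diagonal is the column sum
```

Indexing with a two-column matrix of (row, column) pairs avoids having to reason about the order in which lower.tri walks the matrix.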

I think you could try the following (it is a simplistic approach I have to admit):

column1 = c(0.08,   0.20,   0.09,   0.19,   0.25,   0.20,   0.00)
column2 = c(0.07,   0.19,   0.09,   0.21,   0.25,   0.19,   0.00)
column3 = c(0.05,   0.49,   0.39,   0.1,   0.5,   0.11,   0.01)
ps = data.frame(column1, column2, column3)

res <-matrix(nrow = ncol(ps), ncol = ncol(ps))

for (i in seq_len(ncol(ps))) {
  # j starts at i, so only the diagonal and upper triangle are filled
  for (j in i:ncol(ps)) {
    res[i,j] <- sum(pmin(ps[,i],ps[,j]))
  }
}

In order to make use of the fact that the matrix is symmetrical you can do:

res[lower.tri(res)] <- t(res)[lower.tri(res)]

(One thing to note, which I also learnt thanks to @Aaron and his comment, is that res[lower.tri(res)] <- res[upper.tri(res)] does not work in general: R reads and fills both triangles in column order, so the values do not land in the mirrored positions.)
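
To see the difference, here is a small sketch on a made-up 4 x 4 matrix (m, wrong, and right are illustrative names only). With three columns the two approaches happen to agree, which is why a fourth column is needed to expose the problem:

```r
m <- matrix(0, 4, 4)
m[upper.tri(m)] <- 1:6   # filled in the order (1,2), (1,3), (2,3), (1,4), (2,4), (3,4)

wrong <- m
# the lower triangle is walked as (2,1), (3,1), (4,1), (3,2), ...
# so the third value (from position (2,3)) ends up at (4,1) instead of (3,2)
wrong[lower.tri(wrong)] <- m[upper.tri(m)]
isSymmetric(wrong)       # FALSE

right <- m
# transposing first means both sides are indexed at the same positions
right[lower.tri(right)] <- t(m)[lower.tri(m)]
isSymmetric(right)       # TRUE
```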

Alternatively (again thanks to Aaron), you can fill both positions at once and skip the last step:

for (i in (1:ncol(ps))) {
  for (j in (i:ncol(ps))) {
    res[i,j] <- res[j,i] <- sum(pmin(ps[,i],ps[,j]))
  }
}

6 Comments

Watch out, lower.tri and upper.tri aren't symmetric in that way.
@Aaron sorry I did not get that could you explain?
Add a fourth column and try it, you'll see that the resulting matrix isn't symmetric, as R always fills by column. It's a good answer, though; I'd simply suggest just making your inner loop start at 1.
Alternatively, you can just fill both spots in at once: res[j,i] <- res[i,j] <- sum(...)
Glad to hear it, that's what SO is supposed to be all about. :)