
I have a fun challenge: I'm trying to construct a binary matrix from an integer vector. The matrix should have as many rows as the vector has elements, and as many columns as the maximum value in the vector. The ith row corresponds to the ith element of the vector: it contains a 1 at position j, where j is the value of the ith element, and zeros everywhere else. If the ith element is 0, the whole ith row should be 0.

To make this a whole lot simpler, here is a working reproducible example:

set.seed(1)
playv <- sample(0:5, 20, replace = TRUE)  # sample integer vector

# create matrix from vector
playmat <- matrix(playv, nrow = length(playv), ncol = max(playv))

for (i in seq_along(playv)) {
  pos <- as.integer(playmat[i, 1])
  playmat[i, pos] <- 1
  playmat[i, -pos] <- 0
}

head(playmat)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    0    0    0
[2,]    0    1    0    0    0
[3,]    0    0    1    0    0
[4,]    0    0    0    0    1
[5,]    1    0    0    0    0
[6,]    0    0    0    0    1

The above solution is correct; I'm just looking for something more robust.

2
  • 1
    Are you looking for model.matrix? Commented Aug 12, 2014 at 18:27
  • t(sapply(playv, function(i) {if (i!=0) {c(rep(0, i-1), 1, rep(0,max(playv)-i))} else rep(0, max(playv))})) Commented Aug 12, 2014 at 18:51

2 Answers

4
set.seed(1)
playv <- sample(0:5,20,replace=TRUE)
playv <- as.character(playv)
results <- model.matrix(~playv-1)

You may rename the columns in results.
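Note that the model.matrix call above also produces an indicator column for the value 0 (playv0), whereas the question asks for all-zero rows instead. A minimal sketch of one way to get that layout, assuming the vector holds non-negative integers with at least two distinct values; the factor `f` and the explicit `levels` argument are my additions, not part of the original answer:

```r
set.seed(1)
playv <- sample(0:5, 20, replace = TRUE)

# Explicit levels give every value from 0 to max(playv) its own column,
# even values that happen not to occur in the sample
f <- factor(playv, levels = 0:max(playv))
results <- model.matrix(~ f - 1)

# Drop the indicator column for 0 so that zero entries become all-zero rows
results <- results[, -1, drop = FALSE]

# Rename the remaining columns to the values they stand for
colnames(results) <- seq_len(max(playv))

head(results)
```

Using a factor with fixed levels rather than as.character also keeps the columns in numeric order.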

I like the solution provided by Ananda Mahto and compared it to model.matrix. Here is the code:

library(microbenchmark)

set.seed(1)
v <- sample(1:10,1e6,replace=TRUE)

f1 <- function(vec) {
  vec <- as.character(vec)
  model.matrix(~vec-1)
}

f2 <- function(vec) {
  table(sequence(length(vec)), vec)
}

microbenchmark(f1(v), f2(v), times=10)

model.matrix was a little bit faster than table:

Unit: seconds
  expr      min       lq   median       uq      max neval
 f1(v) 2.890084 3.147535 3.296186 3.377536 3.667843    10
 f2(v) 4.824832 5.625541 5.757534 5.918329 5.966332    10
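Before reading too much into the timings, it may be worth checking that f1 and f2 return the same values. The sketch below repeats both functions on a smaller vector (the size 1000 is an arbitrary choice of mine) and compares the results column by column. Because f1 converts the values to character first, its columns are likely ordered as strings (1, 10, 2, ...) rather than numerically, so the comparison matches columns by name instead of by position.

```r
set.seed(1)
v <- sample(1:10, 1000, replace = TRUE)  # smaller than the benchmark vector

f1 <- function(vec) {
  vec <- as.character(vec)
  model.matrix(~vec-1)
}

f2 <- function(vec) {
  table(sequence(length(vec)), vec)
}

m1 <- f1(v)
m2 <- unclass(f2(v))  # drop the "table" class, keep the counts

# model.matrix prefixes column names with the variable name ("vec1", "vec10", ...)
colnames(m1) <- sub("^vec", "", colnames(m1))

# Compare column by column, matching on the value each column stands for
cols <- colnames(m2)
same_values <- all(m1[, cols] == m2[, cols])
same_values
```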

1 Comment

One of those things that's hard to google for, but a very simple solution exists.
4

You can, of course, also just use table:

> table(sequence(length(playv)), playv)
    playv
     0 1 2 3 4 5
  1  0 1 0 0 0 0
  2  0 0 1 0 0 0
  3  0 0 0 1 0 0
  4  0 0 0 0 0 1
  5  0 1 0 0 0 0
  6  0 0 0 0 0 1
  7  0 0 0 0 0 1
  8  0 0 0 1 0 0
  9  0 0 0 1 0 0
  10 1 0 0 0 0 0
  11 0 1 0 0 0 0
  12 0 1 0 0 0 0
  13 0 0 0 0 1 0
  14 0 0 1 0 0 0
  15 0 0 0 0 1 0
  16 0 0 1 0 0 0
  17 0 0 0 0 1 0
  18 0 0 0 0 0 1
  19 0 0 1 0 0 0
  20 0 0 0 0 1 0
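The table above still carries a column for 0 and is an object of class table. If a plain matrix with columns 1 to max(playv) is wanted, as described in the question, one possible sketch is to restrict the factor levels before tabulating. The names `cols` and `tab` are mine, and the vector is assumed to contain non-negative integers only:

```r
set.seed(1)
playv <- sample(0:5, 20, replace = TRUE)

# Restrict the levels to 1..max so that 0 has no column of its own;
# zeros become NA in the factor and table() skips them by default
cols <- factor(playv, levels = seq_len(max(playv)))
tab <- table(seq_along(playv), cols)

# Drop the "table" class to get an ordinary integer matrix
playmat <- unclass(tab)
head(playmat)
```

Rows whose element is 0 are still present because the row factor keeps all of its levels; they simply contain no counts.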

If speed is a concern, I would suggest a manual approach. First, identify the unique values in your vector. Second, create an empty matrix to fill in. Third, use matrix indexing to identify the positions that should be filled in as 1.

Like this:

f3 <- function(vec) {
  U <- sort(unique(vec))
  M <- matrix(0, nrow = length(vec), 
              ncol = length(U), 
              dimnames = list(NULL, U))
  M[cbind(seq_len(length(vec)), match(vec, U))] <- 1L
  M
}

Usage would be f3(playv).

Adding that into the benchmarks, we get:

library(microbenchmark)
microbenchmark(f1(v), f2(v), f3(v), times = 10)
# Unit: milliseconds
#   expr       min        lq    median        uq       max neval
#  f1(v) 2104.4808 3151.4308 3314.8173 3344.6696 4023.5246    10
#  f2(v) 3956.5678 4782.7863 5994.4448 6320.1901 6646.0405    10
#  f3(v)  486.4406  574.1133  746.9112  927.3407  987.9121    10
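One caveat: f3 creates one column per unique value, so it includes a column for 0 and omits values that never occur in the vector. A variant that follows the layout described in the question (columns 1 to max(vec), all-zero rows for 0) could look like the sketch below. The name f4 is hypothetical, and the function assumes non-negative whole numbers without NAs:

```r
# f4 is a hypothetical name; assumes vec holds non-negative whole numbers, no NAs
f4 <- function(vec) {
  n_col <- max(vec)
  M <- matrix(0L, nrow = length(vec), ncol = n_col)

  # Only non-zero elements get a 1; the value itself is the column index
  nonzero <- which(vec > 0)
  M[cbind(nonzero, vec[nonzero])] <- 1L
  M
}

# Rows come out as (0,1,0), (0,0,0), (0,0,1), (1,0,0)
f4(c(2, 0, 3, 1))
```

This keeps the matrix-indexing idea from f3 but skips the unique/match step, since the values already serve as column positions.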

3 Comments

Nice solution! But a little bit slower than model.matrix.
I agree that in this case the speed is not important. But if one works with large data sets, this can become annoying. If you manage to overtake model.matrix, you may propose your solution as a replacement for the model.matrix function. :)
Excellent! I'll use it as needed. :)
