R - Data frame manipulation without a for loop

Question

I want to read a data frame and check whether the first column is T or F. Depending on that, I want to add an entry to a new column, using the value from the second column.

If z[,1] == true set z[,4] to 2*z[,2]
else set z[,4] to z[,2]

That is, if the value in column 1 is true for a row, set the new entry to 2 times the second column; otherwise just set it to the value of the second column at that index.

Let's create z:

set.seed(4)
z <- data.frame(first=c(T, F, F, T, F), second=sample(-2:2),
                third=letters[5:1], stringsAsFactors=FALSE)
z

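Here is my for loop:

z$newVar2 <- numeric(nrow(z))   # pre-allocate the result column
for(i in seq_len(nrow(z))){
  # test and assign row i only, not the whole column
  if(z$first[i]){
    z$newVar2[i] <- 2*z$second[i]
  }
  else{
    z$newVar2[i] <- z$second[i]
  }
}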

Here it is without a for loop:

z$newVar<-ifelse(z$first==TRUE, 2*z$second, z$second)

Is there a way to do this with apply? Is there a more efficient way to accomplish this task?

ifelse is already the right way to do this. You don't need the ==TRUE, though, as it's already Boolean.
– alistaire, Commented Mar 22, 2016 at 19:05
@Shekeine I don't particularly want to avoid it, I just wanted to know if there was another, more concise way.
– Kevin, Commented Mar 22, 2016 at 19:52
Actually, looking at your question again, I would skip the apply as well: put the data in a data.table, put everything I want done in a function, then run that function on the data.table. That would be fast and efficient.
– shekeine, Commented Mar 22, 2016 at 19:57
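
As a sketch of alistaire's point, the logical column can be passed to ifelse directly. The mapply variant is included only for comparison, assuming that is roughly what "apply" means in the question. It works element by element and is generally slower than the vectorized form. The values of second are fixed here (not the sampled ones from the question) so the example does not depend on the random seed.

```r
# Small stand-in for z with fixed values (not the sampled ones from the question)
z <- data.frame(first = c(TRUE, FALSE, FALSE, TRUE, FALSE),
                second = c(-2L, -1L, 0L, 1L, 2L),
                third = letters[5:1], stringsAsFactors = FALSE)

# Vectorized: 'first' is already logical, so no '== TRUE' is needed
z$newVar <- ifelse(z$first, 2 * z$second, z$second)

# Element-wise equivalent with mapply, shown for comparison only
z$newVarApply <- mapply(function(f, s) if (f) 2 * s else s, z$first, z$second)

z
```

Both new columns hold the same values; the ifelse line is the one to prefer.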

shekeine · Accepted Answer · 2016-03-22 20:10:28Z

2

Not exactly what you asked, but if you are working with a matrix-like data structure, you might as well explore the data.table way of going about it:

library(data.table)

#Make data.table (converts z by reference)
setDT(z)
#With no columns named, setkey() keys on all columns, which sorts the rows
setkey(z)

#Write function to do all the stuff
myfun <- function(first, second){ifelse(first, 2*second, second)}

#Do stuff
z[, newvar2:=myfun(first, second)]

#Printing z
z
#    first second third newvar2
# 1: FALSE     -2     d      -2
# 2: FALSE     -1     a      -1
# 3: FALSE      1     c       1
# 4:  TRUE      0     e       0
# 5:  TRUE      2     b       4


2 Comments

Kevin Over a year ago

interesting, I have never seen this before.

shekeine Over a year ago

Check this and this for a more complete walk-through of data.table. Be sure to understand the assignment-by-reference and key paradigms, because that's where data.table really shines. If you work with big data, only data.tables will do.
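
A minimal sketch of what assignment by reference and keys look like in practice, assuming the data.table package is installed. The table dt and the column names doubled, flag, and extra are made up for illustration and do not come from the thread.

```r
library(data.table)

dt <- data.table(first = c(TRUE, FALSE, TRUE), second = c(1L, 2L, 3L))

# ':=' adds the column in place; no 'dt <- ...' reassignment is needed
dt[, doubled := (first + 1L) * second]

# A plain assignment does not copy: 'alias' and 'dt' are the same table
alias <- dt
alias[, flag := "changed"]
"flag" %in% names(dt)    # TRUE, dt was modified through alias

# copy() makes an independent table
independent <- copy(dt)
independent[, extra := 0L]
"extra" %in% names(dt)   # FALSE, dt is untouched

# setkey() sorts the table by the key column(s), also by reference
setkey(dt, second)
key(dt)
```

Avoiding copies in this way is where the speed benefit on large tables is expected to come from. It also means copy() is needed whenever the original table must stay unchanged.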

SymbolixAU · Accepted Answer · 2016-03-22 21:12:11Z

2

We can use data.table more efficiently still, without defining a function, by making use of the fact that TRUE is treated as 1 in arithmetic:

## use set.seed because we are sampling
set.seed(123)
z <- data.frame(first=c(T, F, F, T, F), 
                second=sample(-2:2),
                third=letters[5:1], stringsAsFactors=FALSE)

library(data.table)

setDT(z)[, newvar2 := (first + 1) * second]
z

#     first second third newvar2
# 1:  TRUE     -1     e      -2
# 2: FALSE      1     d       1
# 3: FALSE      2     c       2
# 4:  TRUE      0     b       0
# 5: FALSE     -2     a      -2
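
The same arithmetic trick does not depend on data.table. As a rough base R sketch, using fixed values for second instead of sample() so the result does not depend on the seed:

```r
# Fixed values instead of sample(), so the result does not depend on the seed
z <- data.frame(first = c(TRUE, FALSE, FALSE, TRUE, FALSE),
                second = c(-1L, 1L, 2L, 0L, -2L))

# TRUE + 1 is 2 and FALSE + 1 is 1, so 'second' is doubled only where 'first' is TRUE
z$newvar2 <- (z$first + 1) * z$second

# Compare with the ifelse() version from the question
identical(z$newvar2, ifelse(z$first, 2 * z$second, z$second))
```

This relies on first containing no NA values; with an NA in first, both this form and ifelse return NA for that row.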

edited Mar 22, 2016 at 21:12

answered Mar 22, 2016 at 21:06
