Create new variable condition on multiple variables R code

Question

I have a data set named "dat".

TEAM1  TEAM2    WINNER

A       P       A
I       S       I
P       S       S
S       I       I
S       P       P
W       P       W
A       E       A
A       S       S
E       A       E

I want to create a variable "LOSER" using R code. This is what I have tried:

Loser <- NULL

    for (i in 1: nrow(dat)){
        if(match(dat$Team1[i],dat$Winner)==TRUE){
            Loser[i] <- cricket$Team2[i]
        }else if(match(dat$Team1[i],dat$Winner)==FALSE ){
            Loser[i] <- dat$Team1[i] 
        }
            }

But this does not give the expected result. What is wrong with this code?

Desired output:

TEAM1  TEAM2   WINNER LOSER 

A       P       A      P
I       S       I      S 
P       S       S      P
S       I       I      S
S       P       P      S
W       P       W      P
A       E       A      E
A       S       S      A
E       A       E      A

Try dat[1:2][cbind(1:nrow(dat), with(dat, TEAM1==WINNER)+1L)]
– akrun, Commented Aug 10, 2015 at 14:09
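
On the "What is wrong with this code?" part, which the answers do not address directly, a few likely causes can be read off the loop itself. R column names are case-sensitive, so dat$Team1 and dat$Winner are NULL when the columns are called TEAM1 and WINNER; cricket is a different object from dat; and match() returns a position (or NA), not TRUE/FALSE, and it searches the whole WINNER column instead of comparing within the same row. A minimal corrected loop, assuming the columns are character vectors named as in the table above (the name loser is just illustrative), might look like this:

```r
# Same data as in the question, built as plain character columns
dat <- data.frame(
  TEAM1  = c("A", "I", "P", "S", "S", "W", "A", "A", "E"),
  TEAM2  = c("P", "S", "S", "I", "P", "P", "E", "S", "A"),
  WINNER = c("A", "I", "S", "I", "P", "W", "A", "S", "E"),
  stringsAsFactors = FALSE
)

# Preallocate the result instead of growing it from NULL
loser <- character(nrow(dat))

for (i in seq_len(nrow(dat))) {
  # Compare TEAM1 with WINNER within the same row
  if (dat$TEAM1[i] == dat$WINNER[i]) {
    loser[i] <- dat$TEAM2[i]  # TEAM1 won, so TEAM2 lost
  } else {
    loser[i] <- dat$TEAM1[i]  # otherwise TEAM1 lost
  }
}

dat$LOSER <- loser
```

The loop is kept only to mirror the original attempt; a vectorised comparison does the same job without an explicit loop.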

akrun · Accepted Answer · 2015-08-10 14:37:46Z

We can get the desired output by comparing the 'TEAM1' column with the 'WINNER' column. Adding 1 to the logical result coerces FALSE/TRUE to 1/2, which can be used as a column index: 2 ('TEAM2') when TEAM1 won, and 1 ('TEAM1') otherwise. We then cbind the row numbers with this index and use the resulting matrix to extract the corresponding elements, which form the 'LOSER' column.

 dat$LOSER <- dat[cbind(1:nrow(dat), with(dat, TEAM1 == WINNER) + 1)]
 dat$LOSER
 #[1] "P" "S" "P" "S" "S" "P" "E" "A" "A"
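
To see why the one-liner works, it may help to build the index step by step. The sketch below uses the same data; the intermediate names col_idx, idx and loser are illustrative and not part of the answer.

```r
# Same data as in the "data" section of this answer
dat <- data.frame(
  TEAM1  = c("A", "I", "P", "S", "S", "W", "A", "A", "E"),
  TEAM2  = c("P", "S", "S", "I", "P", "P", "E", "S", "A"),
  WINNER = c("A", "I", "S", "I", "P", "W", "A", "S", "E"),
  stringsAsFactors = FALSE
)

# TRUE where TEAM1 won; adding 1 turns FALSE/TRUE into column numbers 1/2
col_idx <- (dat$TEAM1 == dat$WINNER) + 1

# Two-column matrix holding one (row, column) pair per match
idx <- cbind(seq_len(nrow(dat)), col_idx)

# Matrix indexing extracts exactly one cell per row from the two team columns
loser <- as.matrix(dat[c("TEAM1", "TEAM2")])[idx]
loser
```

Selecting the two team columns by name before indexing means the column numbers 1 and 2 always refer to TEAM1 and TEAM2, whatever their position in the full data frame.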

NOTE: Modified based on @David Arenburg's comments. The code above relies on 'TEAM1' and 'TEAM2' being the 1st and 2nd columns of the dataset. If the dataset has many columns and these two are not in the 1st and 2nd positions, we can first subset it to just those two columns, as I showed in the comments:

 dat$LOSER <- dat[paste0('TEAM', 1:2)][cbind(1:nrow(dat),
                                with(dat, TEAM1==WINNER)+1L)]

Another option uses data.table. For rows where TEAM1 == WINNER, we assign (:=) 'TEAM2' to 'LOSER'. Then we replace the remaining NA values in 'LOSER' with 'TEAM1'.

  library(data.table)
  setDT(dat)[TEAM1==WINNER, LOSER:= TEAM2][is.na(LOSER), LOSER:= TEAM1]
  dat
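
As a variation that is not part of the original answer, the two chained assignments can probably be collapsed into one with data.table's fifelse(), assuming a data.table version that provides it (1.12.4 or later). The sketch copies the data with as.data.table() rather than converting dat in place with setDT(); the name dt is illustrative.

```r
library(data.table)

# Same data as in the "data" section of this answer
dat <- data.frame(
  TEAM1  = c("A", "I", "P", "S", "S", "W", "A", "A", "E"),
  TEAM2  = c("P", "S", "S", "I", "P", "P", "E", "S", "A"),
  WINNER = c("A", "I", "S", "I", "P", "W", "A", "S", "E"),
  stringsAsFactors = FALSE
)

# Work on a data.table copy so the original data frame is left untouched
dt <- as.data.table(dat)

# One pass: TEAM2 lost where TEAM1 won, otherwise TEAM1 lost
dt[, LOSER := fifelse(TEAM1 == WINNER, TEAM2, TEAM1)]
dt
```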

data

 dat <- structure(list(TEAM1 = c("A", "I", "P", "S", "S", "W", "A", "A", 
 "E"), TEAM2 = c("P", "S", "S", "I", "P", "P", "E", "S", "A"), 
 WINNER = c("A", "I", "S", "I", "P", "W", "A", "S", "E")),
 .Names =   c("TEAM1", 
 "TEAM2", "WINNER"), class = "data.frame", row.names = c(NA, -9L))

SabDeM · Answer · answered 2015-08-10 14:24, edited by akrun 2015-08-10 14:27:28Z

2

I could not resist writing a dplyr version.

library(dplyr)
dat %>% 
     mutate(LOSER = ifelse(TEAM1 == WINNER, TEAM2, TEAM1))
  TEAM1 TEAM2 WINNER LOSER
1     A     P      A     P
2     I     S      I     S
3     P     S      S     P
4     S     I      I     S
5     S     P      P     S
6     W     P      W     P
7     A     E      A     E
8     A     S      S     A
9     E     A      E     A

answered Aug 10, 2015 at 14:24 by SabDeM; edited Aug 10, 2015 at 14:27 by akrun

4 Comments

David Arenburg Over a year ago

Nah, I wouldn't think this needs dplyr here, could just do transform(df, LOSER = ifelse(TEAM1 == WINNER, TEAM2, TEAM1))

SabDeM Over a year ago

@DavidArenburg I agree, I should think of a base R solution first.

David Arenburg Over a year ago

dplyr is nice when base R is getting ugly or for different grouping stuff, but transform is just a beauty and very efficient too.

SabDeM Over a year ago

@DavidArenburg I did not know that. I thought that transform was like base R reshape: ugly and slow as hell.
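
For completeness, the transform() suggestion from the comments can be sketched against the question's data as follows. The comment writes df; here it is assumed to stand for the dat data frame.

```r
# Same data as in the accepted answer
dat <- data.frame(
  TEAM1  = c("A", "I", "P", "S", "S", "W", "A", "A", "E"),
  TEAM2  = c("P", "S", "S", "I", "P", "P", "E", "S", "A"),
  WINNER = c("A", "I", "S", "I", "P", "W", "A", "S", "E"),
  stringsAsFactors = FALSE
)

# Base R only: add LOSER by choosing TEAM2 where TEAM1 won, else TEAM1
dat <- transform(dat, LOSER = ifelse(TEAM1 == WINNER, TEAM2, TEAM1))
dat
```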
