0

I need to create a new column in my data frame, with value value chosen from another column, with a different name per row.

E.g., I have something like this data frame, df, describing what happened (events like "a", "h" and so on) to some persons (id 10,12,13..23) on seven days in a week:

id  day mon tue wed thu fri sat sun
10  wed a   y   b   j   j   b   a
12  wed b   e   h   y   b   h   b
13  tue h   y   j   b   h   j   h
14  thu j   u   b   h   j   b   j
16  thu y   i   h   j   y   h   y
19  fri e   y   j   y   a   j   e
20  sun y   e   y   a   b   y   y
21  mon u   y   a   b   h   a   u
23  mon i   u   b   h   j   b   i

I need a new column "val", showing the value on the day mentioned in the "day" variable.

Hence, like this:

id  day val mon tue wed thu fri sat sun
10  wed b   a   y   b   j   j   b   a
12  wed h   b   e   h   y   b   h   b
13  tue y   h   y   j   b   h   j   h
14  thu h   j   u   b   h   j   b   j
16  thu j   y   i   h   j   y   h   y
19  fri a   e   y   j   y   a   j   e
20  sun y   y   e   y   a   b   y   y
21  mon u   u   y   a   b   h   a   u
23  mon i   i   u   b   h   j   b   i

I tried making a function I could apply on a column to produce a new column

lookupfunction <- function(x) {
  rownumberofx <- which(x=x)
  dayvalue <- df[rownumberofx,"day"]
  dayvalue
    rownumberofx <- NULL

}
df$val <- lookupfunction(df$day)

I hope to learn a code to produce the column "val"

2 Answers 2

1

You can use subsetting with an index matrix (see help("[")):

#make sure that factor levels are in the same order as the columns
DF$day <- factor(DF$day, levels = names(DF)[-(1:2)])

#index matrix (does as.integer(DF$day) automatically)
ind <- cbind(seq_len(nrow(DF)), DF$day)
#     [,1] [,2]
#[1,]    1    3
#[2,]    2    3
#[3,]    3    2
#[4,]    4    4
#[5,]    5    4
#[6,]    6    5
#[7,]    7    7
#[8,]    8    1
#[9,]    9    1

#subset
DF[,-(1:2)][ind]
#[1] "b" "h" "y" "h" "j" "a" "y" "u" "i"
Sign up to request clarification or add additional context in comments.

1 Comment

We can get rid of the restriction that the factor levels of day are in the same order as the columns by using match() to define the ind matrix as follows: ind <- cbind(seq_len(nrow(DF)), match(DF$day, names(DF))). Then we simply use DF[ind] to obtain the desired new val column.
0

Usually having weekdays, or dates etc as columns make the anlysis harder. Oftern converting the data frame to "long" helps. Try:

Code

library(dplyr)
library(tidyr)

df %>% 
  gather(weekday, letter, -id, -day) %>% 
  group_by(id) %>% 
  mutate(val = letter[day == weekday]) %>% 
  spread(weekday, letter)

Result

     id day   val   fri   mon   sat   sun   thu   tue   wed  
  <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1    10 wed   b     j     a     b     a     j     y     b    
2    12 wed   h     b     b     h     b     y     e     h    
3    13 tue   y     h     h     j     h     b     y     j    
4    14 thu   h     j     j     b     j     h     u     b    
5    16 thu   j     y     y     h     y     j     i     h    
6    19 fri   a     a     e     j     e     y     y     j    
7    20 sun   y     b     y     y     y     a     e     y    
8    21 mon   u     h     u     a     u     b     y     a    
9    23 mon   i     j     i     b     i     h     u     b   

Data

df <- structure(list(id = c(10L, 12L, 13L, 14L, 16L, 19L, 20L, 21L, 
                            23L), day = c("wed", "wed", "tue", "thu", "thu", "fri", "sun", 
                                          "mon", "mon"), mon = c("a", "b", "h", "j", "y", "e", "y", "u", 
                                                                 "i"), tue = c("y", "e", "y", "u", "i", "y", "e", "y", "u"), wed = c("b", 
                                                                                                                                     "h", "j", "b", "h", "j", "y", "a", "b"), thu = c("j", "y", "b", 
                                                                                                                                                                                      "h", "j", "y", "a", "b", "h"), fri = c("j", "b", "h", "j", "y", 
                                                                                                                                                                                                                             "a", "b", "h", "j"), sat = c("b", "h", "j", "b", "h", "j", "y", 
                                                                                                                                                                                                                                                          "a", "b"), sun = c("a", "b", "h", "j", "y", "e", "y", "u", "i"
                                                                                                                                                                                                                                                          )), .Names = c("id", "day", "mon", "tue", "wed", "thu", "fri", 
                                                                                                                                                                                                                                                                         "sat", "sun"), row.names = c(NA, -9L), class = c("data.table", 
                                                                                                                                                                                                                                                                                                                          "data.frame"))

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.