Using a loop to define values of case_when in R

Question

I'm currently using case_when to define a new variable in my data as such:

data[,46] <- NA

data[,46] <- case_when(
   data[,35] ==  1 ~ data[,36],
   data[,35] ==  2 ~ data[,37],
   data[,35] ==  3 ~ data[,38],
   data[,35] ==  4 ~ data[,39],
   data[,35] ==  5 ~ data[,40],
   data[,35] ==  6 ~ data[,41],
   data[,35] ==  7 ~ data[,42],
   data[,35] ==  8 ~ data[,43],
   data[,35] ==  9 ~ data[,44],
   data[,35] ==  10 ~ data[,45]
)

I'm trying to write a loop to make this function more efficient, but am running into some trouble. Here is what I have attempted:

for (j in 1:10) {
data[,46] <- case_when(
   data[,35] ==  j ~ data[,35+j]
)
}

However, this is returning NAs for all of my values of data[,46]. Any thoughts on what might be going wrong? I would be happy to provide sample data if necessary, but I'm thinking this is more related to me making a simple programming mistake. Thanks in advance!

This seems like a problem better solved by reshaping your data, perhaps with tidyr. It would be easier to help if you provided a simple reproducible example with sample input and desired output that can be used to test and verify possible solutions. Show what your real goal is rather than just the code you tried to write to solve it.
– MrFlick, Commented Oct 8, 2018 at 18:29

Rui Barradas · Accepted Answer · 2018-10-09 09:44:49Z

3

All you have to do is remember that R is vectorized.
You are comparing data[, 35] to the integers 1 to 10 and, for each of these, assigning data[, 35 + <1 to 10>] back to data[, 35]. So all you have to do is

data[, 35] <- data[, 35 + data[, 35]]

If there are values in data[, 35] not in 1:10 then an ifelse will be more appropriate.

data[, 35] <- ifelse(data[, 35] %in% 1:10, data[, 35 + data[, 35]], data[, 35])
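One caveat, offered as a hedged sketch rather than a definitive fix: `data[, 35 + data[, 35]]` selects whole columns, not one cell per row, so a per-row lookup is usually written with a two-column matrix index instead. The toy data frame and the names `pick`, `v1` to `v3`, `first_col`, `n_choices` and `result` below are hypothetical stand-ins for columns 35, 36 to 45 and 46 in the question.

```r
# Toy data (hypothetical): `pick` says which of v1..v3 to take in each row.
data <- data.frame(
  pick = c(1, 3, 2, NA, 5),
  v1   = c(10, 11, 12, 13, 14),
  v2   = c(20, 21, 22, 23, 24),
  v3   = c(30, 31, 32, 33, 34)
)

first_col <- 2   # position of v1; the question's analogue would be 36
n_choices <- 3   # the question's analogue would be 10
values <- as.matrix(data[, first_col:(first_col + n_choices - 1)])

# Rows whose selector is a valid choice; NA and out-of-range rows stay NA.
ok <- data$pick %in% seq_len(n_choices)

result <- rep(NA_real_, nrow(data))
# A two-column (row, column) matrix index picks exactly one cell per row.
result[ok] <- values[cbind(which(ok), data$pick[ok])]
data$result <- result
```

Rows with a missing or out-of-range selector are left as NA here, which mirrors what the original case_when call does when no condition matches.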

edited Oct 9, 2018 at 9:44

answered Oct 8, 2018 at 19:28

Rui Barradas


2 Comments

Julian Over a year ago

Not exactly. I'm checking to see whether data[,35] is equal to the values of 1-10 and depending on that, inputting data[,36] into data[,46] into the values where data[,35] == 1, data[,37] into data[,46] when data[,35]==2, etc. Doing data[, 35] <- data[, 35 + data[, 35]] gives me the following error: Error in [.data.frame(data, , 35 + data[, 35]) : undefined columns selected

Rui Barradas Over a year ago

@Zereg Then you must have values not in 1-10. See the edit.

e.matt · Accepted Answer · 2018-10-08 18:59:39Z

1

You may need [j] as shown below to store its iteration in data[,46]

for (j in 1:10) {
data[,46][j]<- case_when(
   data[,35] ==  j ~ data[,35+j]
)}
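For context, a hedged note on why the loops behave as they do: case_when returns NA wherever no condition matches, so the loop in the question overwrites the whole column on every pass and only the rows matching the last j survive. Indexing with `[j]` instead assigns a full-length vector to a single element, which R truncates with a warning. One possible loop, assuming a toy data frame with hypothetical names `pick`, `v1` to `v3` and `result`, keeps earlier results through a fallback branch:

```r
library(dplyr)

# Toy data (hypothetical): `pick` says which of v1..v3 to take in each row.
data <- data.frame(
  pick = c(1, 3, 2, NA),
  v1   = c(10, 11, 12, 13),
  v2   = c(20, 21, 22, 23),
  v3   = c(30, 31, 32, 33)
)

# Typed NA so the column matches the numeric value columns.
data$result <- NA_real_

for (j in 1:3) {
  data$result <- case_when(
    data$pick == j ~ data[, 1 + j],  # column 1 + j holds the values for choice j
    TRUE ~ data$result               # otherwise keep what earlier passes stored
  )
}
```

The fallback branch is the design choice that matters: without it, each pass resets every non-matching row to NA.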

answered Oct 8, 2018 at 18:59

e.matt


2 Comments

Julian Over a year ago

Thank you! Your solution worked for me about an hour ago... but now I feel like I'm going crazy because it's not replicating. I'm getting this error now:

for (j in 2:10) {
  data[,46][j] <- case_when(
    data$since == 1 ~ lag(data[,31], 1),
    data$since == j ~ data[,36+j]
  )
}

(I know the code is a bit different, I kept the example in the original post simple to make the question as easy to answer as possible). Any thoughts as to what's going on?

e.matt Over a year ago

It’s hard to understand without knowing your data fully. The lag function may be causing the result stored in data[,46] to be smaller than the dimensions of the data frame, i.e. you have one result fewer than the number of rows in your data frame.
