I'm converting Stata code to R. One snippet creates a new variable and adds each column's value to it when that value meets a condition. For example, if a cell is greater than 0 and less than or equal to 3, its value is added to newvar:

gen newvar=0
 
local list a b c
foreach x of local list{
    qui replace newvar=newvar+`x' if `x'>0 & `x'<=3 
}

Sample data in R:

set.seed(5)
dat <- data.frame(a = rnorm(5), b = rnorm(5), c = rnorm(5))


3 Answers

A tidyverse approach

library(dplyr)
set.seed(5)
dat <- data.frame(a = rnorm(5), b = rnorm(5), c = rnorm(5))

conditional_sum <- function(x,a = 0,b = 3){
  sum(x[x > a & x <= b],na.rm = TRUE)
}

dat %>% 
  rowwise() %>% 
  mutate(newvar = conditional_sum(c_across()))

# A tibble: 5 x 4
# Rowwise: 
        a      b      c newvar
    <dbl>  <dbl>  <dbl>  <dbl>
1 -0.841  -0.603  1.23  1.23  
2  1.38   -0.472 -0.802 1.38  
3 -1.26   -0.635 -1.08  0     
4  0.0701 -0.286 -0.158 0.0701
5  1.71    0.138 -1.07  1.85 

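Replace the elements that do not satisfy the condition with NA, then take the rowSums of the remaining elements to create 'newvar'. The question's condition includes 3 (x <= 3), so the values to exclude are those with x <= 0 or x > 3:

dat$newvar <- rowSums(NA^(dat <= 0 | dat > 3) * dat, na.rm = TRUE)

Output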

> dat
            a          b          c     newvar
1 -0.84085548 -0.6029080  1.2276303 1.22763034
2  1.38435934 -0.4721664 -0.8017795 1.38435934
3 -1.25549186 -0.6353713 -1.0803926 0.00000000
4  0.07014277 -0.2857736 -0.1575344 0.07014277
5  1.71144087  0.1381082 -1.0717600 1.84954910
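
To see why the NA^ trick works, here is a small sketch on a toy vector (the names x and mask are made up for illustration and are not part of the answer). In R, NA^TRUE is NA while NA^FALSE is 1, so multiplying by the mask blanks out the excluded values and leaves the rest unchanged:

```r
x <- c(-1, 0.5, 3, 4)

# TRUE marks values outside (0, 3]; NA^TRUE is NA, NA^FALSE is 1
mask <- NA^(x <= 0 | x > 3)

kept <- mask * x                  # excluded values become NA, the rest are unchanged
total <- sum(kept, na.rm = TRUE)  # NA entries are dropped, so only 0.5 and 3 are summed
```

Note that 3 itself is kept here, which matches the `x' <= 3 condition in the Stata loop.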

2 Comments

Thank you for your response! In this example, if I only wanted to conditionally sum columns a & b without dropping c, how would the code be updated?
@Marrrrie In that case, use dat[c("a", "b")] in place of dat in the code, i.e. rowSums(NA^(dat[c("a", "b")] <= 0 | dat[c("a", "b")] > 3) * dat[c("a", "b")], na.rm = TRUE)

A common way to perform rowwise operations is using the apply function. E.g.:

dat$newvar <- apply(dat, 1, \(r) sum(r[r > 0 & r <= 3]))

Read as: apply a function to every row of dat. The function takes a vector r and sums the elements of r that satisfy the criterion.

Results in

            a          b          c     newvar
1 -0.84085548 -0.6029080  1.2276303 1.22763034
2  1.38435934 -0.4721664 -0.8017795 1.38435934
3 -1.25549186 -0.6353713 -1.0803926 0.00000000
4  0.07014277 -0.2857736 -0.1575344 0.07014277
5  1.71144087  0.1381082 -1.0717600 1.84954910
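
If only some columns should count toward the sum, or the data may contain NA, the same idea can be wrapped in a small helper. The sketch below is one possible way to do it, not the answer's own code: the function name row_cond_sum and its arguments cols, lower and upper are hypothetical, and it assumes the selected columns are all numeric. It treats NA as failing the condition, whereas sum(r[r > 0 & r <= 3]) would return NA for a row containing NA.

```r
# Hypothetical helper: per-row sum of the values in (lower, upper] across the chosen columns
row_cond_sum <- function(df, cols, lower = 0, upper = 3) {
  m <- as.matrix(df[cols])                        # assumes the selected columns are numeric
  keep <- !is.na(m) & m > lower & m <= upper      # NA never satisfies the condition
  m[!keep] <- 0                                   # excluded cells contribute nothing
  rowSums(m)
}

# Toy data (made up for illustration); column c is left out of the sum but stays in the frame
toy <- data.frame(a = c(1, -1, 3), b = c(2, 4, NA), c = c(1, 1, 1))
toy$newvar <- row_cond_sum(toy, c("a", "b"))
```

With the question's data, the equivalent call would be dat$newvar <- row_cond_sum(dat, c("a", "b", "c")).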
