Replace values in set of columns based on condition

Question

I have a dataframe like this

id  v1  v2  v3  v4  v5  pos
1   11  12  11  10  10  3
2   17  11  22  40  23  4
1   11  22  50  10  10  2

I would like to change its values based on a condition related to pos to get:

id  v1  v2  v3  v4  v5  pos
1   11  12  12  12  12  3
2   17  11  22  22  22  4
1   11  11  11  11  11  2

So basically values get the previous value and the variable pos defines from where should we start.

Thx!

thelatemail · Accepted Answer · 2019-03-07 03:06:18Z

An approach using some indexing, which should be efficient in running time.
Not super efficient in terms of memory however, due to making a copy the same size as the input data:

vars <- paste0("v",1:5)
nv <- dat[vars][cbind(seq_len(nrow(dat)), dat$pos-1)]
ow <- col(dat[vars]) >= dat$pos
dat[vars][ow] <- nv[row(ow)[ow]]

#  id v1 v2 v3 v4 v5 pos
#1  1 11 12 12 12 12   3
#2  2 17 11 22 22 22   4
#3  1 11 11 11 11 11   2

Explanation:

Get the variables of interest:

vars <- paste0("v",1:5)

Get the new values to overwrite for each row:

nv <- dat[vars][cbind(seq_len(nrow(dat)), dat$pos-1)]

Make a logical matrix of the cells to overwrite

ow <- col(dat[vars]) >= dat$pos

Overwrite the cells using a row identifier to pick the appropriate value.

dat[vars][ow] <- nv[row(ow)[ow]]

Quick comparative timing using a larger dataset:

dat <- dat[rep(1:3,1e6),]

# indexing
#   user  system elapsed 
#   1.36    0.31    1.68 

# apply
#   user  system elapsed 
#  77.30    0.83   78.41 

# gather/spread
#   user  system elapsed 
# 293.43    3.64  299.10

www · Accepted Answer · 2019-03-07 02:45:47Z

1

Here is one idea with gather and spread.

library(tidyverse)

dat2 <- dat %>%
  rowid_to_column() %>%
  gather(v, value, starts_with("v")) %>%
  group_by(rowid) %>%
  mutate(value = ifelse(row_number() >= (pos - 1), nth(value, (pos - 1)[[1]]), value)) %>%
  spread(v, value) %>%
  ungroup() %>%
  select(names(dat))

dat2
# # A tibble: 3 x 7
#      id    v1    v2    v3    v4    v5   pos
#   <int> <int> <int> <int> <int> <int> <int>
# 1     1    11    12    12    12    12     3
# 2     2    17    11    22    22    22     4
# 3     1    11    11    11    11    11     2

DATA

dat <- read.table(text = "id  v1  v2  v3  v4  v5  pos
1   11  12  11  10  10  3
2   17  11  22  40  23  4
1   11  22  50  10  10  2",
                  header = TRUE)
library(tidyverse)

edited Mar 7, 2019 at 2:45

answered Mar 7, 2019 at 2:41

www

39.3k12 gold badges52 silver badges94 bronze badges

Comments

Ronak Shah · Accepted Answer · 2019-03-07 03:19:09Z

1

Using apply from base R

data.frame(t(apply(df, 1, function(x) 
     c(x[1:x["pos"]], rep(x[x["pos"]], ncol(df) - x["pos"] - 2), x['pos']))))

#  X1 X2 X3 X4 X5 X6
#1  1 11 12 12 12  3
#2  2 17 11 22 22  4
#3  1 11 11 11 11  2

edited Mar 7, 2019 at 3:19

answered Mar 7, 2019 at 2:47

Ronak Shah

391k20 gold badges173 silver badges237 bronze badges

Collectives™ on Stack Overflow

Replace values in set of columns based on condition

3 Answers 3

Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related