1

I have a dataframe like this

id  v1  v2  v3  v4  v5  pos
1   11  12  11  10  10  3
2   17  11  22  40  23  4
1   11  22  50  10  10  2

I would like to change its values based on a condition related to pos to get:

id  v1  v2  v3  v4  v5  pos
1   11  12  12  12  12  3
2   17  11  22  22  22  4
1   11  11  11  11  11  2

So basically values get the previous value and the variable pos defines from where should we start.

Thx!

3 Answers 3

2

An approach using some indexing, which should be efficient in running time.
Not super efficient in terms of memory however, due to making a copy the same size as the input data:

vars <- paste0("v",1:5)
nv <- dat[vars][cbind(seq_len(nrow(dat)), dat$pos-1)]
ow <- col(dat[vars]) >= dat$pos
dat[vars][ow] <- nv[row(ow)[ow]]

#  id v1 v2 v3 v4 v5 pos
#1  1 11 12 12 12 12   3
#2  2 17 11 22 22 22   4
#3  1 11 11 11 11 11   2

Explanation:

Get the variables of interest:

vars <- paste0("v",1:5)

Get the new values to overwrite for each row:

nv <- dat[vars][cbind(seq_len(nrow(dat)), dat$pos-1)]

Make a logical matrix of the cells to overwrite

ow <- col(dat[vars]) >= dat$pos

Overwrite the cells using a row identifier to pick the appropriate value.

dat[vars][ow] <- nv[row(ow)[ow]]

Quick comparative timing using a larger dataset:

dat <- dat[rep(1:3,1e6),]

# indexing
#   user  system elapsed 
#   1.36    0.31    1.68 

# apply
#   user  system elapsed 
#  77.30    0.83   78.41 

# gather/spread
#   user  system elapsed 
# 293.43    3.64  299.10
Sign up to request clarification or add additional context in comments.

Comments

1

Here is one idea with gather and spread.

library(tidyverse)

dat2 <- dat %>%
  rowid_to_column() %>%
  gather(v, value, starts_with("v")) %>%
  group_by(rowid) %>%
  mutate(value = ifelse(row_number() >= (pos - 1), nth(value, (pos - 1)[[1]]), value)) %>%
  spread(v, value) %>%
  ungroup() %>%
  select(names(dat))

dat2
# # A tibble: 3 x 7
#      id    v1    v2    v3    v4    v5   pos
#   <int> <int> <int> <int> <int> <int> <int>
# 1     1    11    12    12    12    12     3
# 2     2    17    11    22    22    22     4
# 3     1    11    11    11    11    11     2

DATA

dat <- read.table(text = "id  v1  v2  v3  v4  v5  pos
1   11  12  11  10  10  3
2   17  11  22  40  23  4
1   11  22  50  10  10  2",
                  header = TRUE)
library(tidyverse)

Comments

1

Using apply from base R

data.frame(t(apply(df, 1, function(x) 
     c(x[1:x["pos"]], rep(x[x["pos"]], ncol(df) - x["pos"] - 2), x['pos']))))

#  X1 X2 X3 X4 X5 X6
#1  1 11 12 12 12  3
#2  2 17 11 22 22  4
#3  1 11 11 11 11  2

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.