-1

I'd like to build up on a question I asked earlier (Adding row for missing value in data.table). Actualy I don't need to add so many zero, I only need the first missing one to be 0.

So this time I would like to insert a 0 only if the last entry for the same ('ida', 'idb') is not already 0. We suppose there is no 0 in the original data.table.

So an example would be :

 ida       idb         value     date
  A         3          26600  2004-12-31
  A         3          19600  2005-03-31
  A         3          18200  2005-06-30
  A         4           9560  2004-12-31
  B         1           2600  2004-12-31
  B         1            100  2005-03-31
  B         1           8200  2005-06-30
  B         1           9560  2007-12-31
  B         9           1423  2004-12-31
  B         9           1235  2005-06-30
  C         2           8700  2005-12-31

Gives :

 ida       idb         value     date
  A         3          26600  2004-12-31
  A         3          19600  2005-03-31
  A         3          18200  2005-06-30
  A         4           9560  2004-12-31
  A         4              0  2005-03-31
  B         1           2600  2004-12-31
  B         1            100  2005-03-31
  B         1           8200  2005-06-30
  B         1           9560  2007-12-31
  B         9           1423  2004-12-31
  B         0              0  2005-03-31
  B         9           1235  2005-06-30
  B         0              0  2007-12-31
  C         2           8700  2005-12-31
4
  • It's not clear how is this question differs from the previous one. Commented Mar 25, 2015 at 22:49
  • In the previous one there would be more 0 Commented Mar 25, 2015 at 22:56
  • So what's the rule here? Why B would have additional two 0 while A will have only one additional? Commented Mar 25, 2015 at 22:57
  • There is only one 0 for A because if we added a second one (at the date 2005-06-30) there would be two consecutive 0. B has two additional 0 because the first one is separated by 2005-06-30 from the second one. The rule is : insert a 0 only if no 0 was inserted right before. Commented Mar 25, 2015 at 23:01

1 Answer 1

1

I found a way to do it. So we take the answer to my previous question :

setkey(dt, idb, date)
dt[, .SD[CJ(unique(idb), unique(date))], by = ida][is.na(value), value := 0]

And we add some more steps :

setkey(dt, idb, date)
dt[, .SD[CJ(unique(idb), unique(date))][is.na(value), value := 0][prev.value := c(0, value[-.N])][value > 0 | (value == 0 & prev.value > 0), ][, prev.value := NULL], by = ida]

So for each subset we replace NAs by 0s, and then add a new column corresponding to the previous values. Finally we keep only the rows != 0 and the first 0, and we delete the prev.value column.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.