
My data looks like this:

 quarter name  week  value
 17Q3    abc   1     0.7
 17Q3    abc   3     0.65
 17Q3    def   1     0.13
 17Q3    def   2     0.04

Can I insert rows with value = 0 for the weeks that are missing? That is, the output should look like this:

 quarter name  week  value
 17Q3    abc   1     0.7
 17Q3    abc   3     0.65
 17Q3    abc   2     0.0
 17Q3    def   1     0.13
 17Q3    def   2     0.04
 17Q3    def   3     0.0

I need to fill through week 13 (i.e., check every week from 1 to 13).

  • Try library(dplyr);df1 %>% complete(quarter, name, week = full_seq(week, 1), fill = list(value = 0)) %>% arrange(quarter, name, id) %>% mutate(id = row_number()) %>% select(names(df1)) Commented Feb 19, 2018 at 10:32
  • Thanks. I tried library(dplyr);df1 %>% complete(quarter, name, week = full_seq(week, 1), fill = list(value = 0)) but got an error saying the function complete could not be found. Commented Feb 19, 2018 at 11:24
  • Sorry, complete is from tidyr. Commented Feb 19, 2018 at 11:25
  • Thanks a lot, that worked. Solved. Commented Feb 19, 2018 at 11:27
  • Actually, I don't have an id column, and I am getting repeated rows with the code above: library(tidyr);df1 %>% complete(quarter, name, week = full_seq(week, 1), fill = list(value = 0)) Commented Feb 19, 2018 at 11:36

2 Answers


How about using expand within complete?

library(tidyverse)
complete(df, expand(df, quarter, name, week), fill = list(value=0))

#   quarter name   week  value
#   <fct>   <fct> <int>  <dbl>
# 1 17Q3    abc       1 0.700 
# 2 17Q3    abc       2 0     
# 3 17Q3    abc       3 0.650 
# 4 17Q3    def       1 0.130 
# 5 17Q3    def       2 0.0400
# 6 17Q3    def       3 0   

Or, maybe easier to understand:

df %>% expand(quarter, name, week) %>% left_join(df) %>% replace_na(list(value=0))

2 Comments

This gives repeated rows.
Add %>% distinct() at the end.
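
To make the distinct() suggestion concrete, here is a minimal sketch. It assumes the repeated rows come from duplicate rows in the input, which the thread does not confirm; the data frame below is the question's sample with one row duplicated on purpose for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data from the question, plus one deliberately duplicated row
# (an assumption, added only to reproduce the repeated-rows symptom).
df <- tibble(
  quarter = c("17Q3", "17Q3", "17Q3", "17Q3", "17Q3"),
  name    = c("abc", "abc", "def", "def", "def"),
  week    = c(1L, 3L, 1L, 2L, 2L),
  value   = c(0.7, 0.65, 0.13, 0.04, 0.04)
)

result <- df %>%
  expand(quarter, name, week) %>%                      # all combinations of the values present
  left_join(df, by = c("quarter", "name", "week")) %>% # bring back the observed values
  replace_na(list(value = 0)) %>%                      # missing combinations get value 0
  distinct()                                           # drop the repeated row

result
```

Naming the join keys in by also avoids the "Joining with by = ..." message that left_join(df) prints when the keys are left implicit.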

Here is one option with tidyverse. We get the missing combinations of rows with complete, arrange the rows by 'quarter', 'name' and 'id', then mutate 'id' to row_number() and select the columns so that they have the same order as in the original dataset.

library(tidyverse)
df1 %>%
  complete(quarter, name, week = full_seq(week, 1), fill = list(value = 0)) %>%
  arrange(quarter, name, id) %>%
  mutate(id = row_number()) %>% 
  select(names(df1))
# A tibble: 6 x 5
#     id quarter name   week  value
#  <int> <chr>   <chr> <dbl>  <dbl>
#1     1 17Q3    abc    1.00 0.700 
#2     2 17Q3    abc    3.00 0.650 
#3     3 17Q3    abc    2.00 0     
#4     4 17Q3    def    1.00 0.130 
#5     5 17Q3    def    2.00 0.0400
#6     6 17Q3    def    3.00 0     

5 Comments

Thanks, Akrun. Is there a way to do this if we don't have an id column? I am getting repeated rows. One workaround is to remove duplicates.
Another update: I need to fill through week 13. Adding full_seq(week,1,13) didn't help.
@buntysahoo Please update your input data and expected output in the post
I have updated the post itself. I need to check through week 13 and fill through week 13.
Solved. I used the unique function and it gave exactly the result I needed. Thanks a lot, Akrun.
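
For the week 13 requirement raised in the comments, a possible sketch follows. The third argument of full_seq() is a tolerance, not an upper bound, so full_seq(week, 1, 13) does not extend the range; passing the weeks explicitly as week = 1:13 should. The data frame is rebuilt from the question's sample without an id column, and nesting(quarter, name) is an assumption that only quarter/name pairs already present in the data should be filled.

```r
library(dplyr)
library(tidyr)

# Sample data from the question (no id column).
df1 <- tibble(
  quarter = c("17Q3", "17Q3", "17Q3", "17Q3"),
  name    = c("abc", "abc", "def", "def"),
  week    = c(1L, 3L, 1L, 2L),
  value   = c(0.7, 0.65, 0.13, 0.04)
)

filled <- df1 %>%
  distinct() %>%                       # guard against duplicate input rows
  complete(
    nesting(quarter, name),            # only quarter/name pairs that occur in the data
    week = 1:13,                       # weeks 1 through 13, whether observed or not
    fill = list(value = 0)             # new rows get value 0
  )

filled
```

With this sample, each quarter/name pair ends up with 13 rows, sorted by week rather than in the original row order.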
