Adding rows depending on condition

Question

I need a little help with a very simple question:

Let's say I have this data frame:

data_new <- data.frame(section = c("1", "4", "5","6"),
                       density = c("0.2", "0.7", "0.8", "0.2"))
> data_new
  section density
1       1     0.2
2       4     0.7
3       5     0.8
4       6     0.2

I need to add rows because the full table is based on 6 sections, but I only have data for 4. In this case I have to add 2 rows (sections 2 and 3) with density 0, so that I have:

> data_desired
  section density
1       1     0.2
2       4     0.7
3       5     0.8
4       6     0.2
5       2       0
6       3       0

The point is that the combination of 0-density rows may vary. In this case sections 2 and 3 were empty, but next time it may be that no section has density 0, or that I have to add 5 sections, etc. It can vary a lot, from 1 section with data to all sections with data.

I'm sure there is an elegant, case-specific way to add the rows I need within my pipe. Thanks a lot for your help!!

one · Accepted Answer · 2023-02-09 22:18:04Z

Another option using rows_update:

library(dplyr)

# Create a data frame with zero density for every section
n <- 6
data_zero <- data.frame(section = as.character(1:n),
                        density = as.character(rep(0, n)))

data_new <- data.frame(section = c("1", "4", "5", "6"),
                       density = c("0.2", "0.7", "0.8", "0.2"))

# Overwrite the zero rows with the observed rows, matched on section
rows_update(data_zero, data_new, by = "section")
  section density
1       1     0.2
2       2       0
3       3       0
4       4     0.7
5       5     0.8
6       6     0.2
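
To use this inside a pipe, the same idea can be wrapped in a small helper. The following is only a sketch, assuming the full set of sections is always 1 to n and that both columns are character, as in the question; fill_sections is a hypothetical name, not a dplyr function.

```r
library(dplyr)

# Hypothetical helper: pad `data` with zero-density rows for sections 1..n.
# Assumes `section` and `density` are character columns, as in the question.
fill_sections <- function(data, n = 6) {
  data_zero <- data.frame(section = as.character(seq_len(n)),
                          density = rep("0", n))
  # Overwrite the zero rows with the observed rows, matched on `section`
  rows_update(data_zero, data, by = "section")
}

data_new <- data.frame(section = c("1", "4", "5", "6"),
                       density = c("0.2", "0.7", "0.8", "0.2"))

result <- data_new %>% fill_sections(n = 6)
```

Because the zero frame is rebuilt on every call, the helper covers any combination of missing sections, including the case where none are missing.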

For multiple columns:

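library(dplyr)

n <- 6
data_zero <- data.frame(section = as.character(1:n),
                        density = as.character(rep(0, n)))

data_new <- data.frame(section = c("1", "4", "5", "6"),
                       density = c("0.2", "0.7", "0.8", "0.2"),
                       potatoes = c("a", "n", "ed", "3"))

# Update only the shared columns, then merge the remaining columns back in;
# all.x = TRUE keeps the zero-density rows, which get NA for potatoes
rows_update(data_zero, data_new[, c("section", "density")], by = "section") %>%
  merge(data_new, all.x = TRUE)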

  section density potatoes
1       1     0.2        a
2       2       0     <NA>
3       3       0     <NA>
4       4     0.7        n
5       5     0.8       ed
6       6     0.2        3

edited Feb 9, 2023 at 22:18

answered Feb 9, 2023 at 21:33

4 Comments

Oriol Baena Crespo Over a year ago

I think I'm having problems because my real data_new has many columns and those are not the same as data_zero.

Oriol Baena Crespo Over a year ago

data_zero <- data.frame(section = as.character(c(1:n)), density = as.character(rep(0,n)))
data_new <- data.frame(section = c("1", "4", "5","6"), density = c("0.2", "0.7", "0.8", "0.2"), potatoes = c("a","n","ed","3"))
rows_update(data_zero ,data_new)

then:

> rows_update(data_zero ,data_new)
Matching, by = "section"
Error: All columns in y must exist in x.
Run rlang::last_error() to see where the error occurred.

Oriol Baena Crespo Over a year ago

All right, I can add all the final columns to my data_zero, but that can get really slow. Thanks anyway, it'll work this time!

one Over a year ago

Alternatively, if the main columns are section and density, you can do rows_update(data_zero, data_new[,c('section','density')]) and left join the result with your original data frame. This should be faster.
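
The left-join variant described in this comment could look roughly like the sketch below. It assumes section is the only key and that every extra column (here just potatoes, from the example in the comments) should come back as NA for the added rows.

```r
library(dplyr)

n <- 6
data_zero <- data.frame(section = as.character(seq_len(n)),
                        density = rep("0", n))

data_new <- data.frame(section = c("1", "4", "5", "6"),
                       density = c("0.2", "0.7", "0.8", "0.2"),
                       potatoes = c("a", "n", "ed", "3"))

result <- data_zero %>%
  # Pass only the columns that exist in data_zero, which avoids the
  # "All columns in y must exist in x" error
  rows_update(select(data_new, section, density), by = "section") %>%
  # Bring the remaining columns back; sections without data get NA
  left_join(select(data_new, -density), by = "section")
```

Dropping density before the join keeps the updated values from rows_update and avoids duplicated density.x / density.y columns.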
