I've got a data frame (df) with four variables, two of which are factors: var1 and var2. Each factor has three levels.

Some combinations of var1 and var2 are not present in the data frame, e.g. there is no var2 level "4 or 5" for var1 level "slow".

I'd like to add rows for those missing combinations to get a new data frame (dfgoal), with var3_freq and var4_n set to 0 in those rows.

I find adding rows tricky at the best of times, and have no idea how to achieve this. Any help would be much appreciated!

# Starting point 
df <- data.frame(var1=c("fast","fast","fast","medium","slow","slow"),
                 var2=c("1 or 2","3","4 or 5","3","1 or 2","3"),
                 var3_freq=c(22,56,22,100,36,64),
                 var4_n=c(10,26,10,2,5,9))
df$var1 <- as.factor(df$var1)
df$var2 <- as.factor(df$var2)

# Goal
dfgoal <- data.frame(var1=c("1 or 2","3","4 or 5","1 or 2","3","4 or 5","1 or 2","3","4 or 5"),
                 var2=c("fast","fast","fast","medium","medium","medium","slow","slow","slow"),
                 var3_freq=c(22,56,22,0,100,0,36,64,0),
                 var4_n=c(10,26,10,0,2,0,5,9,0))
  • Why not rbind()? Commented Oct 13, 2018 at 12:03
  • And: do your starting and target dataframes have a different structure on purpose? (var1 vs. var2) Commented Oct 13, 2018 at 12:05
  • No, that's my mistake, but it really doesn't matter Commented Oct 13, 2018 at 13:12

1 Answer

A simple solution without loading external libraries: build the missing rows by hand and append them with rbind(). The result:

    var1   var2 var3_freq var4_n
1   fast 1 or 2        22     10
2   fast      3        56     26
3   fast 4 or 5        22     10
4 medium      3       100      2
5   slow 1 or 2        36      5
6   slow      3        64      9
7 medium 1 or 2         0      0
8 medium 4 or 5         0      0
9   slow 4 or 5         0      0

Code

new <- data.frame(var1 = c("medium", "medium", "slow"),
                  var2 = c("1 or 2", "4 or 5", "4 or 5"),
                  var3_freq = c(0, 0, 0),
                  var4_n = c(0, 0, 0))
rbind(df, new)

Data

df <- data.frame(var1=c("fast","fast","fast","medium","slow","slow"),
                 var2=c("1 or 2","3","4 or 5","3","1 or 2","3"),
                 var3_freq=c(22,56,22,100,36,64),
                 var4_n=c(10,26,10,2,5,9))
df$var1 <- as.factor(df$var1)
df$var2 <- as.factor(df$var2)    
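
The rbind() approach above requires typing out each missing combination by hand. As a possible alternative that also stays in base R, the missing rows could be derived from the factor levels instead. The following is only a sketch under the assumption that every combination of the levels of var1 and var2 should appear exactly once; the names grid and full are illustrative and not part of the original question.

```r
# Data from the question
df <- data.frame(var1 = c("fast", "fast", "fast", "medium", "slow", "slow"),
                 var2 = c("1 or 2", "3", "4 or 5", "3", "1 or 2", "3"),
                 var3_freq = c(22, 56, 22, 100, 36, 64),
                 var4_n = c(10, 26, 10, 2, 5, 9))
df$var1 <- as.factor(df$var1)
df$var2 <- as.factor(df$var2)

# Every combination of the levels of the two factors (3 x 3 = 9 rows)
grid <- expand.grid(var1 = levels(df$var1), var2 = levels(df$var2))

# Left join: keep all combinations; those absent from df get NA
full <- merge(grid, df, by = c("var1", "var2"), all.x = TRUE)

# Replace the NAs introduced by the join with 0
full$var3_freq[is.na(full$var3_freq)] <- 0
full$var4_n[is.na(full$var4_n)] <- 0

# Order by var1, then var2, and reset the row names
full <- full[order(full$var1, full$var2), ]
rownames(full) <- NULL
full
```

Note that this treats any NA in var3_freq or var4_n as a missing combination, so it is only appropriate if df itself contains no NA values in those columns.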