I have a dataset with columns that each hold a code plus a name, which I would like to separate into 2 columns. As an example:

Column E5000_A contains values like `0080002. ALB - Democratic Party` in one cell. I would like two columns: one containing the code 0080002, and the other containing the rest of the text.

I have 8 such columns with very similar values (E5000_A through E5000_H). This is the code I have written:

cols2 <- c("E5000_A" , "E5000_B" , "E5000_C" , "E5000_D" , 
           "E5000_E" , "E5000_F" , "E5000_G" , "E5000_H" )

for(i in cols2){
  cses_imd_m <- cses_imd_m %>% mutate(substr(i, 1L, 7L))  
}

But for some reason it is only generating a new column for the E5000_A and the loop does not go to the other variables. What am I doing wrong? Let me know if you need more details about the code or data frame.

1 Answer

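dplyr approach

library(dplyr)
library(stringr)

# Extract the leading numeric code (this overwrites each column with its code).
# Note that str_extract() takes the string first and the pattern second.
df %>% 
  mutate_at(.vars = vars(c("E5000_A", "E5000_B", "E5000_C", "E5000_D", "E5000_E", 
                           "E5000_F", "E5000_G", "E5000_H")), 
            .funs = function(x) str_extract(x, "^\\d+")) 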

You can also use across() inside mutate(); as of dplyr 1.0.0 it supersedes mutate_at().
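A minimal sketch of the across() variant, assuming dplyr 1.0.0 or later and stringr. The toy `df` and the second party name are made up for illustration, and the `code_`/`party_` prefixes are just one possible naming choice:

library(dplyr)
library(stringr)

# Toy data standing in for the real data frame (second value is hypothetical)
df <- tibble(
  E5000_A = c("0080002. ALB - Democratic Party", "0080001. ALB - Example Party"),
  E5000_B = c("0080001. ALB - Example Party", NA)
)

result <- df %>%
  mutate(
    across(
      starts_with("E5000_"),
      list(
        code  = ~ str_extract(.x, "^\\d+"),               # leading digits
        party = ~ str_trim(str_remove(.x, "^\\d+\\."))    # drop "code." prefix, then trim
      ),
      .names = "{.fn}_{.col}"                             # e.g. code_E5000_A, party_E5000_A
    )
  )

Unlike the mutate_at() call above, this keeps the original columns and adds two new ones per source column.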

If you want to use a for loop

col_names <- c("E5000_A", "E5000_B", "E5000_C", "E5000_D", "E5000_E", "E5000_F", "E5000_G", "E5000_H")

for (i in col_names) {
  
  # df[[i]] returns the column as a vector for both data frames and tibbles
  df[[sprintf("code_%s", i)]] <- str_extract(df[[i]], "^\\d+")
  df[[sprintf("party_%s", i)]] <- gsub(".*\\.", "", df[[i]]) %>% str_trim() # remove everything up to the last dot (.)
  
}
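
As for why the loop in the question misbehaves: inside the loop, `i` is the column name as a string, so `substr(i, 1L, 7L)` cuts the name itself rather than the column's values, and because the new column is never given a name, every iteration writes to the same column. A minimal sketch of a corrected dplyr loop, assuming a recent dplyr (for `.data[[i]]` and glue-style names); the toy data, the second party name, and the `code_` prefix are made up for illustration:

library(dplyr)

# Toy stand-in for the asker's data frame (second value is hypothetical)
cses_imd_m <- tibble(
  E5000_A = "0080002. ALB - Democratic Party",
  E5000_B = "0080001. ALB - Example Party"
)
cols2 <- c("E5000_A", "E5000_B")

# What the original call actually computes: a substring of the column NAME
name_substr <- substr("E5000_A", 1L, 7L)

for (i in cols2) {
  cses_imd_m <- cses_imd_m %>%
    # .data[[i]] looks up the column called i; "code_{i}" := names the result
    mutate("code_{i}" := substr(.data[[i]], 1L, 7L))
}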