I have the following variable in a dataframe

test<-data.frame(x=c("", "1-7-9", "3", "2-4-6-8"))

I want to split it into separate variables, like this:

Var1 Var2 Var3 Var4
NA   NA   NA   NA
1    7    9    NA
3    NA   NA   NA
2    4    6   8

I have tried

 test2 <- strsplit(as.character(test$x), "-", fixed = TRUE)

but I get a list rather than a data frame.

Please help me

4 Answers

library(data.table)
setDT(test)[, tstrsplit(x, "-", type.convert = TRUE, fixed = TRUE)]
#    V1 V2 V3 V4
# 1: NA NA NA NA
# 2:  1  7  9 NA
# 3:  3 NA NA NA
# 4:  2  4  6  8

Note: this requires the data.table development version 1.9.5. The type.convert argument and the factor-to-character conversion were implemented in the latest dev version per #1094 (thanks, Arun!).

Or

splitstackshape::cSplit(test, "x", "-")
#    x_1 x_2 x_3 x_4
# 1:  NA  NA  NA  NA
# 2:   1   7   9  NA
# 3:   3  NA  NA  NA
# 4:   2   4   6   8

Both return data tables, which can be converted back to data frames by assigning the result and then calling setDF(). Both also convert the numeric strings to columns of class "integer".
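To make the setDF() step concrete, here is a minimal sketch. It assumes a data.table version in which tstrsplit() accepts type.convert (see the note above), and it uses as.data.table() instead of setDT() so that the original test data frame is left untouched; the name res is just illustrative.

```r
library(data.table)

# Same sample data as in the question
test <- data.frame(x = c("", "1-7-9", "3", "2-4-6-8"), stringsAsFactors = FALSE)

# Split into columns on a copy, so `test` itself stays a plain data frame
res <- as.data.table(test)[, tstrsplit(x, "-", fixed = TRUE, type.convert = TRUE)]

# Convert the result to a data frame by reference (no copy is made)
setDF(res)

class(res)        # check that it is now a plain data frame
sapply(res, class) # check the column types
```

setDF() modifies res in place, so there is no need to reassign its return value.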


And just for fun, a really difficult way to get a data frame back with scan()

x <- as.character(test$x)
v <- max(vapply(strsplit(x, "-", fixed = TRUE), length, 1L))
s <- scan(text = x, what = as.list(integer(v)), sep = "-", fill = TRUE, 
    na.strings = "", blank.lines.skip = FALSE)
setNames(data.frame(s), make.names(seq_along(s)))
#   X1 X2 X3 X4
# 1 NA NA NA NA
# 2  1  7  9 NA
# 3  3 NA NA NA
# 4  2  4  6  8

3 Comments

I have a funny feeling that you are going to keep adding solutions all night :)
Make a PR on type.convert for tstrsplit then :)
I think I might. I pinged Arun in the chat room

Some other options:

library(tidyr) 
separate(test, x, paste0("Var", 1:4), extra = "merge", convert = TRUE)
#   Var1 Var2 Var3 Var4
# 1   NA   NA   NA   NA
# 2    1    7    9   NA
# 3    3   NA   NA   NA
# 4    2    4    6    8

And (partially using your solution, though the column types are not guaranteed)

library(stringi)
data.frame(stri_list2matrix(strsplit(as.character(test$x), "-", fixed = TRUE), byrow = TRUE)) 
#    X1   X2   X3   X4
# 1 <NA> <NA> <NA> <NA>
# 2    1    7    9 <NA>
# 3    3 <NA> <NA> <NA>
# 4    2    4    6    8

Or (contributed by @Richard) a complete stringi version of the above

data.frame(stri_split_fixed(test$x, "-", simplify = NA, omit_empty = NA))
#     X1   X2   X3   X4
# 1 <NA> <NA> <NA> <NA>
# 2    1    7    9 <NA>
# 3    3 <NA> <NA> <NA>
# 4    2    4    6    8

4 Comments

Check out stringi::stri_split_fixed(test$x, "-", simplify = NA)
@RichardScriven that's a nice one
Also, stringi converts factors automatically so we don't need as.character(), which is really nice too
Yes, and it has a simplify argument too. And separate has a type.convert argument, apparently...

This is a base R attempt, although it fails to populate the first row with NAs; in my testing, the empty string was never converted to a row of NAs.

dat <- read.table(text = as.character(test$x), sep = "-",
                  fill = TRUE, col.names = paste0("Var", 1:4))
> dat
  Var1 Var2 Var3 Var4
1    1    7    9   NA
2    3   NA   NA   NA
3    2    4    6    8
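One possible explanation for the missing row is that read.table() skips blank lines by default. The sketch below is a variation on the call above, not something tested in the original answer: it passes blank.lines.skip = FALSE, the same switch the scan() answer relies on, which should keep the empty string as a row of NAs.

```r
# Same sample data as in the question
test <- data.frame(x = c("", "1-7-9", "3", "2-4-6-8"), stringsAsFactors = FALSE)

# Keep blank input lines instead of dropping them, and pad short lines with NA
dat <- read.table(text = as.character(test$x), sep = "-",
                  fill = TRUE, blank.lines.skip = FALSE,
                  col.names = paste0("Var", 1:4))

nrow(dat)  # compare with the number of rows in `test`
dat
```

Supplying col.names up front also fixes the number of columns at four, so read.table() does not have to infer it from the first few lines.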


Using base R:

x <- strsplit(as.character(test$x),"-")
nc <- max(sapply(x, length))
out <- data.frame(do.call(rbind, lapply(x, "[", 1:nc)))
names(out) <- paste("var", 1:nc, sep = "")

> out
  var1 var2 var3 var4
1 <NA> <NA> <NA> <NA>
2    1    7    9 <NA>
3    3 <NA> <NA> <NA>
4    2    4    6    8
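The columns above are character (the <NA> entries in the printout show that), so if numbers are needed, a conversion step can be added. This is a minimal sketch that repeats the approach with type.convert() applied to each column; the name parts is just illustrative, and stringsAsFactors = FALSE is set explicitly so it behaves the same on older R versions.

```r
# Same sample data as in the question
test <- data.frame(x = c("", "1-7-9", "3", "2-4-6-8"), stringsAsFactors = FALSE)

# Split on a literal "-" and pad each piece to the longest length with NA
parts <- strsplit(as.character(test$x), "-", fixed = TRUE)
nc <- max(lengths(parts))
out <- data.frame(do.call(rbind, lapply(parts, `[`, seq_len(nc))),
                  stringsAsFactors = FALSE)
names(out) <- paste0("var", seq_len(nc))

# Convert each character column to the most specific type it can hold
out[] <- lapply(out, type.convert, as.is = TRUE)

str(out)
```

Assigning into out[] keeps the data frame structure and the column names while replacing every column.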
