
I have a file like this:

1880.1.1    74
1881.1.1    74
1882.1.1    75
1883.1.1    79
1884.1.1    111
1885.1.1    145

and I want to create a data frame like this:

1880    1    1  74
1881    1    1  74
1882    1    1  75
1883    1    1  79
1884    1    1  111
1885    1    1  145

but when I try to do this with the gsub function, it fails. Many thanks!

  • You have to escape the period. Try: gsub("\\."," ","1880.1.1") Commented Sep 9, 2013 at 14:51
  • Since you didn't show us how your gsub is failing, I'm going to guess you aren't escaping the period. It should look like gsub('\\.', ...). However, I don't think gsub is the function you want. Instead, look at strsplit, and please share more of the code that you have tried. Commented Sep 9, 2013 at 14:52
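
To illustrate the escaping point from the comments above: in a regular expression an unescaped period matches any character, so the pattern needs either a backslash escape or fixed = TRUE. A minimal base R sketch (the variable name x is only illustrative):

```r
x <- c("1880.1.1", "1881.1.1")

# Unescaped: "." is a regex wildcard, so every character is replaced
wild <- gsub(".", " ", x)

# Escaped: only literal periods are replaced
escaped <- gsub("\\.", " ", x)

# Same result without regex escaping: treat the pattern as a fixed string
fixed <- gsub(".", " ", x, fixed = TRUE)

escaped
```

This only swaps the separator inside each string. Getting separate columns still needs a splitting step, which is what the answers below do.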

3 Answers


You can use concat.split from my "splitstackshape" package for a more convenient way to do what you're trying to do. Assuming your data.frame is called "mydf" and the first column is called "V1", you can do:

> library(splitstackshape)
> concat.split(mydf, "V1", sep = ".", drop = TRUE)
   V2 V1_1 V1_2 V1_3
1  74 1880    1    1
2  74 1881    1    1
3  75 1882    1    1
4  79 1883    1    1
5 111 1884    1    1
6 145 1885    1    1

Here, "mydf" is defined as:

mydf <- structure(list(V1 = c("1880.1.1", "1881.1.1", "1882.1.1", "1883.1.1", 
  "1884.1.1", "1885.1.1"), V2 = c(74L, 74L, 75L, 79L, 111L, 145L)), 
  .Names = c("V1", "V2"), class = "data.frame", row.names = c(NA, -6L))

The equivalent in base R is to use something like the following:

> cbind(read.table(text = as.character(mydf$V1), sep = "."), mydf[-1])
    V1 V2 V3  V2
1 1880  1  1  74
2 1881  1  1  74
3 1882  1  1  75
4 1883  1  1  79
5 1884  1  1 111
6 1885  1  1 145
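
The base R output above ends up with two columns named V2. One possible way around that, sketched here, is to name the split columns up front with col.names. The labels year, month, day and value are assumptions chosen for illustration; they do not come from the question.

```r
mydf <- data.frame(
  V1 = c("1880.1.1", "1881.1.1", "1882.1.1", "1883.1.1", "1884.1.1", "1885.1.1"),
  V2 = c(74L, 74L, 75L, 79L, 111L, 145L),
  stringsAsFactors = FALSE
)

# Split V1 on the period and name the pieces as they are read
# (year/month/day are assumed labels, not taken from the question)
parts <- read.table(text = mydf$V1, sep = ".",
                    col.names = c("year", "month", "day"))

# Attach the remaining column under a distinct name
result <- cbind(parts, value = mydf$V2)
result
```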


Although Ananda's base R solution is simpler and nicer, here's another approach using strsplit:

> data.frame(t(sapply(strsplit(mydf[,"V1"], "\\." ), as.numeric)), X4=mydf[, "V2"])
    X1 X2 X3  X4
1 1880  1  1  74
2 1881  1  1  74
3 1882  1  1  75
4 1883  1  1  79
5 1884  1  1 111
6 1885  1  1 145

4 Comments

I did not know as.numeric would coerce the data to a matrix. Thanks for the lesson!
@dayne, it's not the as.numeric that's coercing to a matrix. You can have almost anything there that won't change the values (c, as.vector, ...). It's just that sapply will simplify to a matrix whenever possible (as it was in this case).
@AnandaMahto Thanks! I really should have known that. In this case is sapply or mapply more appropriate? They both seem to behave identically, using either the as.numeric or cbind/rbind approach.
@dayne, Not sure, really. Probably depends on how you define "more appropriate" :). I don't know which function is more efficient. I haven't used mapply much.
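
A small sketch of the simplification behaviour discussed in these comments, assuming two sample strings from the question. It also shows vapply, which is not used in the answers but states the expected result shape explicitly.

```r
pieces <- strsplit(c("1880.1.1", "1881.1.1"), ".", fixed = TRUE)

# sapply() simplifies equal-length results into a matrix,
# one column per input element (here 3 rows by 2 columns)
m <- sapply(pieces, as.numeric)
dim(m)

# A function that leaves the values alone gives the same shape,
# so as.numeric is not what builds the matrix
dim(sapply(pieces, c))

# vapply() declares the expected length per element and stops with an
# error if some string has a different number of fields
v <- vapply(pieces, as.numeric, numeric(3))

# Transpose so that each input string becomes a row
t(v)
```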

Here is a strsplit approach. I used @Ananda's data.

> data.frame(t(mapply(cbind, strsplit(mydf[, 1], split = '[.]'))), mydf[, 2])
    X1 X2 X3 mydf...2.
1 1880  1  1        74
2 1881  1  1        74
3 1882  1  1        75
4 1883  1  1        79
5 1884  1  1       111
6 1885  1  1       145
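
A possible variation on the same strsplit idea, sketched in base R: bind the split pieces row-wise with do.call(rbind, ...), convert the character matrix to integers, and give the last column an explicit name. This is an alternative shown for comparison, not the answerer's code.

```r
mydf <- data.frame(
  V1 = c("1880.1.1", "1881.1.1", "1882.1.1", "1883.1.1", "1884.1.1", "1885.1.1"),
  V2 = c(74L, 74L, 75L, 79L, 111L, 145L),
  stringsAsFactors = FALSE
)

# One row per input string; fixed = TRUE treats "." as a literal period
parts <- do.call(rbind, strsplit(mydf$V1, ".", fixed = TRUE))

# strsplit() returns character data, so convert the matrix in place
storage.mode(parts) <- "integer"

# Naming the last argument avoids the auto-generated "mydf...2." column name
out <- data.frame(parts, V2 = mydf$V2)
out
```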
