Split data frame by delimiter rows in R

Question

I have a data frame that looks like this:

   X1   X2   X3
1 ### <NA> <NA>
2  aa   bb   cc
3  dd   ee   ff
4 ### <NA> <NA>
5  a1   a2   a3
6  b1   b2   b3
7  g3   h3   k5
8 ### <NA> <NA>
9  k1   k2   k3

Is there a way to split it at the ### rows into a list of 3 smaller data frames, like this:

[[1]]
   X1   X2   X3
1  aa   bb   cc
2  dd   ee   ff
[[2]]
1  a1   a2   a3
2  b1   b2   b3
3  g3   h3   k5  
[[3]]
1  k1   k2   k3

Thanks!

The code to generate the example df:

df <- data.frame(rbind(
  c("###", NA, NA),
  c("aa", "bb", "cc"),
  c("dd", "ee", "ff"),
  c("###", NA, NA),
  c("a1", "a2", "a3"),
  c("b1", "b2", "b3"),
  c("g3", "h3", "k5"),
  c("###", NA, NA),
  c("k1", "k2", "k3")
))

akrun · Accepted Answer · 2018-01-08 19:22:35Z

3

We can use split after creating a grouping variable from a logical vector:

i1 <- df$X1 == "###"
split(df[!i1,], cumsum(i1)[!i1])
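
To see why this works, it can help to look at the intermediate vectors. The following is only a sketch, assuming df is built as in the question; the names grp and res are illustrative and not part of the answer:

```r
df <- data.frame(rbind(c("###", NA, NA), c("aa", "bb", "cc"), c("dd", "ee", "ff"),
                       c("###", NA, NA), c("a1", "a2", "a3"), c("b1", "b2", "b3"),
                       c("g3", "h3", "k5"), c("###", NA, NA), c("k1", "k2", "k3")))

i1 <- df$X1 == "###"   # TRUE on the delimiter rows, FALSE elsewhere
grp <- cumsum(i1)      # running count of delimiters seen: 1 1 1 2 2 2 2 3 3

# Drop the delimiter rows from both the data and the grouping vector,
# then split what is left by group number.
res <- split(df[!i1, ], grp[!i1])
```

Each delimiter row bumps the running count by one, so every data row inherits the number of the most recent ### above it.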

answered Jan 8, 2018 at 19:22


1 Comment

l0110 Over a year ago

Beautiful answer, akrun!

G. Grothendieck · Accepted Answer · 2018-01-08 22:29:19Z

3

Group the rows using g, which is 1 for the rows in the first data frame, 2 for the rows in the second, and so on. Then split by g and remove the first row (the ### delimiter) of each component.

g <- cumsum(df$X1 == "###")
lapply(split(df, g), tail, -1)

giving:

$`1`
  X1 X2 X3
2 aa bb cc
3 dd ee ff

$`2`
  X1 X2 X3
5 a1 a2 a3
6 b1 b2 b3
7 g3 h3 k5

$`3`
  X1 X2 X3
9 k1 k2 k3

Alternatively, the last line of code could be replaced with the following, which produces a by list:

by(df, g, tail, -1)
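
In the output above, the row names (2, 3, 5, ...) are carried over from df. If the pieces should be renumbered from 1, as in the question's desired output, one option is to reset the row names afterwards. This is a minimal sketch, assuming df is built as in the question; the name pieces is illustrative:

```r
df <- data.frame(rbind(c("###", NA, NA), c("aa", "bb", "cc"), c("dd", "ee", "ff"),
                       c("###", NA, NA), c("a1", "a2", "a3"), c("b1", "b2", "b3"),
                       c("g3", "h3", "k5"), c("###", NA, NA), c("k1", "k2", "k3")))

g <- cumsum(df$X1 == "###")
pieces <- lapply(split(df, g), tail, -1)   # drop the leading delimiter row of each group

# Assigning NULL resets row names to the default 1..n within each piece.
pieces <- lapply(pieces, function(p) {
  rownames(p) <- NULL
  p
})
```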

edited Jan 8, 2018 at 22:29

answered Jan 8, 2018 at 19:23


Karsten W. · Accepted Answer · 2018-01-08 19:22:10Z

2

This may work:

from <- which(df[,"X1"]=="###")+1
to <- c(tail(from,-1)-2, nrow(df))
mapply(function(a,b) df[a:b,], from, to, SIMPLIFY=FALSE)

You would need to check for corner cases (e.g. what happens if the first row is not ### or if the last row is ###).
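
One way to sidestep those corner cases is to avoid index arithmetic and group by a running count of delimiters instead, as in the logical-vector approach from the first answer. The sketch below is an assumption-laden illustration, not part of any answer: the helper name split_by_delim, its arguments, and the small edge data frame are all hypothetical.

```r
# Hypothetical helper: split d at rows where column `col` equals `delim`.
split_by_delim <- function(d, col = "X1", delim = "###") {
  # Treat NA as "not a delimiter" so it cannot leak into the index.
  is_delim <- !is.na(d[[col]]) & d[[col]] == delim
  split(d[!is_delim, , drop = FALSE], cumsum(is_delim)[!is_delim])
}

# Edge case: no leading delimiter, plus a trailing delimiter with nothing after it.
edge <- data.frame(X1 = c("aa", "###", "bb", "cc", "###"),
                   X2 = c("x", NA, "y", "z", NA))
out <- split_by_delim(edge)
```

Rows before the first delimiter end up in a group named "0", and a trailing delimiter simply contributes no group, because only non-delimiter rows are passed to split.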

answered Jan 8, 2018 at 19:22
