I have a data frame that looks like this:

   X1   X2   X3
1 ### <NA> <NA>
2  aa   bb   cc
3  dd   ee   ff
4 ### <NA> <NA>
5  a1   a2   a3
6  b1   b2   b3
7  g3   h3   k5
8 ### <NA> <NA>
9  k1   k2   k3

Is there a way to split it at the ### rows into a list of 3 smaller data frames, like this:

[[1]]
   X1   X2   X3
1  aa   bb   cc
2  dd   ee   ff
[[2]]
1  a1   a2   a3
2  b1   b2   b3
3  g3   h3   k5  
[[3]]
1  k1   k2   k3

Thanks!

The code to generate the example df:

df <- data.frame(rbind(
  c("###", NA, NA),
  c("aa", "bb", "cc"),
  c("dd", "ee", "ff"),
  c("###", NA, NA),
  c("a1", "a2", "a3"),
  c("b1", "b2", "b3"),
  c("g3", "h3", "k5"),
  c("###", NA, NA),
  c("k1", "k2", "k3")
))

3 Answers

We can use split after creating a grouping variable from a logical vector that marks the ### rows:

i1 <- df$X1 == "###"
split(df[!i1,], cumsum(i1)[!i1])
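
To make the grouping step easier to follow, here is the same idea spelled out with the intermediate vector shown. This is only a sketch on the example df from the question; the names grp and res, and the optional row-name reset at the end, are additions for illustration and not part of the answer above.

```r
df <- data.frame(rbind(c("###", NA, NA), c("aa", "bb", "cc"), c("dd", "ee", "ff"),
                       c("###", NA, NA), c("a1", "a2", "a3"), c("b1", "b2", "b3"),
                       c("g3", "h3", "k5"), c("###", NA, NA), c("k1", "k2", "k3")))

i1  <- df$X1 == "###"   # TRUE on the separator rows
grp <- cumsum(i1)       # 1 1 1 2 2 2 2 3 3: goes up by one at each separator

# Drop the separator rows, then split what is left by group number
res <- split(df[!i1, ], grp[!i1])

# Optional (an addition): renumber the rows of each piece from 1,
# as in the output the question asks for
res <- lapply(res, function(d) { rownames(d) <- NULL; d })
```

Note that this relies on X1 having no NA values, as in the example; an NA in X1 would turn into an NA in the comparison and propagate through cumsum.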

1 Comment

Beautiful answer, akrun!

Group the rows using g which is 1 for the rows in the first data frame, 2 for the rows in the second and so on. Then split by g and remove the first row in each component.

g <- cumsum(df$X1 == "###")
lapply(split(df, g), tail, -1)

giving:

$`1`
  X1 X2 X3
2 aa bb cc
3 dd ee ff

$`2`
  X1 X2 X3
5 a1 a2 a3
6 b1 b2 b3
7 g3 h3 k5

$`3`
  X1 X2 X3
9 k1 k2 k3

Alternatively, the last line of code could be replaced with the following, which produces an object of class by instead of a plain list:

by(df, g, tail, -1)

This may work:

from <- which(df[,"X1"]=="###")+1
to <- c(tail(from,-1)-2, nrow(df))
mapply(function(a,b) df[a:b,], from, to, SIMPLIFY=FALSE)

You would need to check for corner cases (e.g. the first row is not ###, or the last row is ###).
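
One possible way to cover those corner cases while keeping the index-based approach is sketched below. The function name split_by_index and its marker argument are hypothetical, introduced only for this illustration. It assumes the marker lives in column X1, treats rows before the first marker as their own piece, and skips empty ranges (a trailing marker, or two markers in a row).

```r
# Hypothetical helper, not from the answer above: index-based split
# that tolerates a missing leading marker and a trailing marker.
split_by_index <- function(df, marker = "###") {
  sep  <- which(!is.na(df$X1) & df$X1 == marker)  # positions of marker rows
  from <- c(1, sep + 1)         # a piece starts at row 1 and after each marker
  to   <- c(sep - 1, nrow(df))  # a piece ends before each marker and at the last row
  keep <- from <= to            # drop empty ranges
  Map(function(a, b) df[a:b, , drop = FALSE], from[keep], to[keep])
}

df <- data.frame(rbind(c("###", NA, NA), c("aa", "bb", "cc"), c("dd", "ee", "ff"),
                       c("###", NA, NA), c("a1", "a2", "a3"), c("b1", "b2", "b3"),
                       c("g3", "h3", "k5"), c("###", NA, NA), c("k1", "k2", "k3")))
pieces <- split_by_index(df)

# A frame that starts with data and ends with a marker
df2 <- data.frame(X1 = c("a", "###", "b", "###"),
                  X2 = c("x", NA, "y", NA),
                  stringsAsFactors = FALSE)
pieces2 <- split_by_index(df2)
```

Here pieces holds the three data frames from the question, and pieces2 holds two one-row data frames.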
