data.frame: find last index of a value in each row

Question

I have a data.frame of 0s and 1s, which I generated using:

df <- data.frame(a = sample(0:1, 5, replace = TRUE),
                 b = sample(0:1, 5, replace = TRUE),
                 c = sample(0:1, 5, replace = TRUE),
                 d = sample(0:1, 5, replace = TRUE))

How can I write a function that, when I pass it 1, returns 4, 2, 2, 3, 1, i.e. the last column index at which 1 occurs in each row?

Yes, @akrun. I was using codebunk, which would not let me copy.
– Saksham, Commented Jul 30, 2015 at 12:18
Suppose a row contains only 0s. What should the index be for that row?
– akrun, Commented Jul 30, 2015 at 12:20

josliber · Accepted Answer · 2015-07-30 12:44:41Z

4

One approach would be:

apply(df, 1, function(x) max(which(x == 1)))

The call above returns -Inf with a warning for a row that contains no 1. To make the target value a parameter and return NA for rows where it is missing:

# Last column index at which `val` occurs in each row; NA if it never occurs.
max.row <- function(df, val) {
  unname(apply(df, 1, function(x) tail(c(NA, which(x == val)), 1)))
}
max.row(df, 0)
# [1] 3 4 4 4
max.row(df, 1)
# [1] 4 2 2 3
max.row(df, 2)
# [1] NA NA NA NA
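A minimal sketch of why the `c(NA, ...)` guard matters. The data frame is an assumption: its first five rows are reconstructed from the `col(df)*df` output shown elsewhere in this thread, and the sixth, all-zero row is a hypothetical edge case added for illustration.

```r
# Rows 1-5 reconstructed from the thread; row 6 (all zeros) is a hypothetical edge case.
df <- data.frame(a = c(1, 1, 0, 1, 1, 0),
                 b = c(0, 1, 1, 0, 0, 0),
                 c = c(0, 0, 0, 1, 0, 0),
                 d = c(1, 0, 0, 0, 0, 0))

# Same definition as in the answer: last column index holding `val`, NA if absent.
max.row <- function(df, val) {
  unname(apply(df, 1, function(x) tail(c(NA, which(x == val)), 1)))
}

res <- max.row(df, 1)
print(res)   # values: 4 2 2 3 1 NA

# The simpler max(which(...)) form gives -Inf (plus a warning) for the all-zero row.
naive <- suppressWarnings(unname(apply(df, 1, function(x) max(which(x == 1)))))
print(naive)
```

Prepending NA means `tail(..., 1)` always has something to return, so a row without the value degrades to NA instead of -Inf.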

edited Jul 30, 2015 at 12:44

answered Jul 30, 2015 at 12:07

josliber


Mamoun Benghezal · Answer · 2015-07-30 12:17:28Z

4

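You can try max.col, which is a little faster than apply. With ties.method "last" it returns, for each row, the last column holding the row maximum, so on 0/1 data it gives the last position of 1 (a row with no 1 at all returns the last column index instead).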

max.col(df, "last")
# [1] 2 4 4 2 4

Data

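# The output above comes from the pre-3.6.0 sampler; on R >= 3.6.0,
# call RNGkind(sample.kind = "Rounding") first to reproduce it.
set.seed(1)
df <- data.frame(a = sample(0:1, 5, replace = TRUE),
                 b = sample(0:1, 5, replace = TRUE),
                 c = sample(0:1, 5, replace = TRUE),
                 d = sample(0:1, 5, replace = TRUE))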
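As a hedged sketch, the `max.col` idea can be extended to an arbitrary value by running it on a logical matrix of matches. The helper name `last_col_of` is hypothetical (it is not from the thread), and the data frame is reconstructed from the values shown in the question's expected output.

```r
# Data reconstructed from the thread (an assumption, not the seeded data above).
df <- data.frame(a = c(1, 1, 0, 1, 1),
                 b = c(0, 1, 1, 0, 0),
                 c = c(0, 0, 0, 1, 0),
                 d = c(1, 0, 0, 0, 0))

# Hypothetical helper: last column index equal to `val` in each row, NA if none.
last_col_of <- function(df, val) {
  hit <- as.matrix(df) == val                 # TRUE where the value occurs
  idx <- max.col(hit, ties.method = "last")   # last TRUE per row (TRUE is the row maximum)
  idx[rowSums(hit) == 0] <- NA                # max.col still picks a column when nothing matched
  idx
}

r1 <- last_col_of(df, 1)
r0 <- last_col_of(df, 0)
r2 <- last_col_of(df, 2)
print(r1)
print(r0)
print(r2)
```

The explicit `rowSums` check is needed because `max.col` always returns some column, even for a row of all FALSE.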

edited Jul 30, 2015 at 12:17

answered Jul 30, 2015 at 12:12

Mamoun Benghezal


1 Comment

nicola Over a year ago

Set a seed and generate the df again; otherwise the results you show won't match anybody else's. +1, however.

akrun · Answer · 2015-07-30 13:26:58Z

4

Another option is pmax: multiply col(df) by df element-wise and take the row-wise maximum.

  do.call(pmax,col(df)*df)
  #[1] 4 2 2 3 1

col(df) returns a matrix with the same dimensions as df in which every entry is its own column index.

  col(df)
  #     [,1] [,2] [,3] [,4]
  #[1,]    1    2    3    4
  #[2,]    1    2    3    4
  #[3,]    1    2    3    4
  #[4,]    1    2    3    4
  #[5,]    1    2    3    4

Multiplying df by col(df), which has the same dimensions, leaves the 0 entries at 0 and replaces each 1 with its column index, i.e.

 col(df)*df
 #  a b c d
 #1 1 0 0 4
 #2 1 2 0 0
 #3 0 2 0 0
 #4 1 0 3 0
 #5 1 0 0 0

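Finally, do.call(pmax, ...) passes each column as a separate argument to pmax, which takes the element-wise maximum across them, i.e. the maximum of each row. A row with no 1 yields 0.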
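The multiplication trick relies on the data being 0/1. A rough sketch of one way to generalise it to any target value follows; the helper name `last_index_pmax` is hypothetical, and the data frame is the one implied by the `col(df)*df` output above.

```r
# Data frame implied by the col(df)*df output shown in this answer.
df <- data.frame(a = c(1, 1, 0, 1, 1),
                 b = c(0, 1, 1, 0, 0),
                 c = c(0, 0, 0, 1, 0),
                 d = c(1, 0, 0, 0, 0))

# Hypothetical helper: same pmax idea, but matching an arbitrary `val`.
last_index_pmax <- function(df, val) {
  m <- as.matrix(df)
  idx <- col(m) * (m == val)   # column index where matched, 0 elsewhere
  # unname() so that a column called "na.rm" could never be taken as pmax's argument
  res <- do.call(pmax, unname(as.list(as.data.frame(idx))))
  res[res == 0] <- NA          # 0 means the value never occurred in that row
  res
}

p1 <- last_index_pmax(df, 1)
p0 <- last_index_pmax(df, 0)
print(p1)
print(p0)
```

Replacing the logical match `m == val` for the raw values is the only change to the original idea; the 0-to-NA step is an added convention, not something the answer itself does.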

edited Jul 30, 2015 at 13:26

Saksham


answered Jul 30, 2015 at 12:32

akrun


Saksham · Answer · 2015-09-20 06:54:23Z

0

Having collected the proposed solutions, plus one of my own, here are the times taken by each when replicated 10,000 times:

apply(df,1,function(x){tail(which(x==1),1)})
   user  system elapsed
  2.978   0.010   2.988

apply(df*col(df),1,function(x){max(x)})
   user  system elapsed
  8.217   0.026   8.245

apply(df, 1, function(x) max(which(x == 1)))
   user  system elapsed
  1.621   0.005   1.627

max.col(df, "last")
   user  system elapsed
  1.348   0.004   1.352

Though @Mamoun Benghezal's answer is the fastest, it is not flexible about which value to search for, which is what I need. The accepted answer is.
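The timing code itself is not shown, so the following is only a sketch of how such a comparison might be set up with `system.time`. The repetition count `n` and the data frame are assumptions (the data is reconstructed from the question's expected output), and absolute timings will differ by machine.

```r
# Data reconstructed from the thread (an assumption).
df <- data.frame(a = c(1, 1, 0, 1, 1),
                 b = c(0, 1, 1, 0, 0),
                 c = c(0, 0, 0, 1, 0),
                 d = c(1, 0, 0, 0, 0))

n <- 1000  # hypothetical count, smaller than the 10,000 used above to keep the run short

# Time n repetitions of each approach, keeping the last result for comparison.
t_apply <- system.time(
  for (i in seq_len(n)) r_apply <- apply(df, 1, function(x) max(which(x == 1)))
)
t_maxcol <- system.time(
  for (i in seq_len(n)) r_maxcol <- max.col(df, "last")
)

print(t_apply)
print(t_maxcol)

# A speed comparison is only meaningful if both approaches agree on this data.
same <- all(unname(r_apply) == r_maxcol)
print(same)
```

With a data frame this small the numbers mostly reflect per-call overhead, so a larger data frame would give a fairer picture of how the approaches scale.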

edited Sep 20, 2015 at 6:54

answered Aug 19, 2015 at 14:12

Saksham
