Subsetting a data frame without using column names

Question

I'm wondering if there is a better way to do this, or if I might run into some unforeseen trouble. I need to subset a data frame, but I do not want to use the column names. I need to do it by referencing the column numbers.

data <- data.frame(col1= c(50, 20, NA, 100, 50), 
                   col2= c(NA, 25, 125, 50, NA),
                   col3= c(NA, 100, 15, 55, 25),
                   col4= c(NA, 30, 125, 100, NA),
                   col5= c(80, 25, 75, 40, NA))

Suppose I want to subset the data frame and keep only the rows that contain three consecutive NAs (columns 2 to 4) before a valid number in column 5. The best I can come up with without using column names is this:

sub <- data[(which(is.na(data[2]) & 
                   is.na(data[3]) & 
                   is.na(data[4]) & 
                   !is.na(data[5]))), ]

Does anyone see any trouble with this or know of a better way? I'm worried about using subsets within subsets, although everything appears to be working as it should.

David Arenburg · Accepted Answer · 2014-08-28 18:41:38Z

4

If you're looking to condense your code a little, you can do something like:

> data[rowSums(is.na(data[2:4])) == 3 & !is.na(data[5]), ]
  col1 col2 col3 col4 col5
1   50   NA   NA   NA   80
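
If it helps to see why the one-liner works, here is the same idea broken into steps. This is only a sketch on the example data from the question; the names `na_cols`, `value_col`, `na_count`, and `keep` are made up for illustration, and `[[` is used for column 5 simply so that the condition is a plain logical vector.

```r
# Same example data as in the question
data <- data.frame(col1 = c(50, 20, NA, 100, 50),
                   col2 = c(NA, 25, 125, 50, NA),
                   col3 = c(NA, 100, 15, 55, 25),
                   col4 = c(NA, 30, 125, 100, NA),
                   col5 = c(80, 25, 75, 40, NA))

na_cols   <- 2:4  # positions of the columns that must all be NA (illustrative name)
value_col <- 5    # position of the column that must hold a value (illustrative name)

# is.na() on a data frame gives a logical matrix; rowSums() counts the TRUEs per row
na_count <- rowSums(is.na(data[na_cols]))

# data[[i]] extracts a single column as a vector, so `keep` is a logical vector
keep <- na_count == length(na_cols) & !is.na(data[[value_col]])

# Logical row index; is.na() never returns NA, so which() is not needed here
sub <- data[keep, , drop = FALSE]
sub
```

Since `is.na()` only ever returns `TRUE` or `FALSE`, the logical index cannot contain `NA`, which is why dropping `which()` should be safe in this particular case.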

edited Aug 28, 2014 at 18:41

David Arenburg

answered Aug 28, 2014 at 18:25

A5C1D2H2I1M1N2O1R2T1

5 Comments

IRTFM Over a year ago

Why is column 1 being excluded from consideration?

A5C1D2H2I1M1N2O1R2T1 Over a year ago

@BondedDust, because it seems like that's what the OP is after based on their code (though their description is a little ambiguous).

Rich Scriven Over a year ago

(+1) I just finished this exact code 12 minutes late. By the way, soread() is a really useful function. :-)

A5C1D2H2I1M1N2O1R2T1 Over a year ago

@BondedDust, I think he's referring to github.com/sebastian-c/overflow/blob/master/R/soread.R

IRTFM Over a year ago

OK, although I thought I could see how spread might apply here. I already had a function I had whipped up to do that job, but I think I'll ask a question regarding a cognate problem with reading SO questions that are zoo or xts screen output. Maybe there could be automatic detection inside an expanded soread?
