22

I have the following data frame:

> str(df)
'data.frame':   3149 obs. of  9 variables:
 $ mkod : int  5029 5035 5036 5042 5048 5050 5065 5071 5072 5075 ...
 $ mad  : Factor w/ 65 levels "Akgün Kasetçilik         ",..: 58 29 59 40 56 11 33 34 19 20 ...
 $ yad  : Factor w/ 44 levels "BAKUGAN","BARBIE",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ donem: int  201101 201101 201101 201101 201101 201101 201101 201101 201101 201101 ...
 $ sayi : int  201101 201101 201101 201101 201101 201101 201101 201101 201101 201101 ...
 $ plan : int  2 2 3 2 2 2 7 3 2 7 ...
 $ sevk : int  2 2 3 2 2 2 6 3 2 7 ...
 $ iade : int  0 0 3 1 2 2 6 2 2 3 ...
 $ satis: int  2 2 0 1 0 0 0 1 0 4 ...

I want to remove 21 specific rows from this data frame.

> a <- df[df$plan==0 & df$sevk==0,]
> nrow(a)
[1] 21

So when I remove those 21 rows, I will have a new data frame with 3149 - 21 = 3128 rows. I found the following solution:

> b <- df[df$plan!=0 | df$sevk!=0,]
> nrow(b)
[1] 3128

My solution above uses a modified logical expression (!= instead of == and | instead of &). Other than rewriting the original logical expression, how can I obtain the new data frame without those 21 rows? I need something like this:

> df[-a,] #does not work

EDIT (especially for the downvoters, I hope this explains why I need an alternative solution): I asked for a different solution because I'm writing a long script with various variable assignments (like a in my example) in different places. When I need to remove rows later in the script, I don't want to go back and work out the inverse of the logical expression behind each a-like assignment. That's why something like df[-a,] is more usable for me.

5
  • -1 You have a solution contained within the question. There is no problem to solve (as the question is currently worded). Commented Oct 27, 2011 at 13:10
  • @RichieCotton: My solution uses a modified (different) logical expression which ends up with the result I need; but what I want to see is how to remove specific rows from a data frame. I included my solution in my question because I didn't want to see it in the answers. Commented Oct 27, 2011 at 13:16
  • I've added a few lines to my question to explain what I want to know. Commented Oct 27, 2011 at 13:22
  • I think there is confusion over why you want something like df[-a,], when df[df$plan!=0 | df$sevk!=0,] seems to be the correct approach. Could you comment why, in the bigger picture, something like df[-a,] is preferable? Perhaps, in the bigger picture, there is an approach which avoids this problem. Commented Oct 27, 2011 at 21:50
  • It's because I'm writing a long script with various variable assignments (like a in my example) in different places. When I need to remove rows later in the script, I don't want to go back and work out the inverse of the logical expression behind each a-like assignment. That's why something like df[-a,] is more usable for me. Commented Oct 28, 2011 at 6:53

5 Answers

15

Just negate your logical subscript:

a <- df[!(df$plan==0 & df$sevk==0),]
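A minimal sketch of the same idea on made-up data (the toy data frame and the name to_drop are assumptions for illustration, not part of the question): keep the condition itself in a variable, then reuse it for both the matching rows and their complement, so the expression never has to be inverted by hand.

```r
# Toy stand-in for the real data; the values are made up for illustration.
df <- data.frame(plan = c(2, 0, 3, 0, 7),
                 sevk = c(2, 0, 3, 1, 0))

# Store the condition once as a logical vector (one TRUE/FALSE per row).
to_drop <- df$plan == 0 & df$sevk == 0

a <- df[to_drop, ]    # rows that match the condition
b <- df[!to_drop, ]   # every other row, no rewritten expression needed

# !(A & B) is equivalent to (!A | !B), so b matches the hand-inverted form.
same_as_manual <- identical(b, df[df$plan != 0 | df$sevk != 0, ])
```

If the columns can contain NA, the logical vector will contain NA as well, and both subsets would then pick up NA rows; that case is not handled in this sketch.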

12

You can use the rownames to specify a "complementary" dataframe. Its easier if they are numerical rownames:

df[-as.numeric(rownames(a)),]

But more generally you can use:

df[setdiff(rownames(df),rownames(a)),]
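As a rough illustration of the setdiff() form with non-numeric row names (the data and the names r1 to r4 are made up for this sketch):

```r
# Toy data with character row names (made up for illustration).
df <- data.frame(plan = c(2, 0, 3, 0),
                 sevk = c(2, 0, 3, 0),
                 row.names = c("r1", "r2", "r3", "r4"))

a <- df[df$plan == 0 & df$sevk == 0, ]   # rows to remove: r2 and r4

# Keep the rows of df whose names do not occur in a.
kept <- df[setdiff(rownames(df), rownames(a)), ]

# Equivalent formulation using a logical mask built from the row names.
kept2 <- df[!(rownames(df) %in% rownames(a)), ]
```

One practical difference from the negative-index form: if a happens to have no rows, setdiff() returns all row names and df comes back unchanged, whereas an empty negative index selects zero rows.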

1 Comment

Of course this assumes you have rownames, which the OP did in this case, but it's not a general solution
9

Are you looking for subset()?

dat <- airquality
dat.sub <- subset(dat, Temp > 80 & Month < 10)

dim(dat)
dim(dat.sub)

Applied to your example:

df.sub <- subset(df, plan != 0 | sevk != 0)

2 Comments

This is the same as my solution: df[df$plan!=0 | df$sevk!=0,] which selects a subset; but thanks anyway.
BTW, the operator has to be | (OR) as above; with & (AND), subset(df, plan != 0 & sevk != 0) would also drop rows where only one of the two columns is 0.
2

You're almost there. 'a' needs to be the logical vector itself rather than the subsetted data frame, and it has to be negated with ! rather than -:

    df <- data.frame(plan=runif(10),sevk=runif(10))
    a <- df$plan<.1 | df$sevk < .1 # some logical thing
    df[!a,]

or, with your data:

    a <- df$plan==0 & df$sevk==0
    df[!a,]

2 Comments

I tried the earlier version of this code, which used df[-a,] with a logical a, with my data, but it gave the wrong result (3148 rows instead of 3128). (BTW, b[-a,] should be df[-a,] I guess)
Sorry about the slop. It works with my self-contained little example above, so I guess whatever is going on with your data is over my head.
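The 3148-row result reported in the comment is consistent with how R treats a minus sign on a logical vector: the values are coerced to numbers, so TRUE becomes -1 and FALSE becomes 0, and the only row a negative index of -1 can remove is row 1. A small sketch on made-up flags:

```r
a <- c(FALSE, TRUE, FALSE, TRUE)   # made-up logical flags, TRUE = remove

neg <- -a    # arithmetic negation coerces to numbers: 0 -1 0 -1
inv <- !a    # logical negation: TRUE FALSE TRUE FALSE

df <- data.frame(x = 1:4)
n_minus <- nrow(df[-a, , drop = FALSE])   # zeros are ignored, -1 drops row 1 only
n_bang  <- nrow(df[!a, , drop = FALSE])   # both flagged rows are dropped
```

So for a logical vector, ! is the complement operator; - is only meaningful for numeric row positions such as those returned by which().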
0

I don't see why you object to your solution, but here's another way.

killlist <- which(df$plan==0 & df$sevk==0)  # row numbers to remove
newdf <- df[-killlist,]  # assumes killlist is not empty
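One caveat with negative indices built from which(): when no row matches, which() returns an empty vector, and indexing with an empty vector selects zero rows instead of all of them. A hedged sketch of the pitfall and one possible guard (drop_rows is a hypothetical helper name, and the data are made up):

```r
# Made-up data in which no row satisfies the condition.
df <- data.frame(plan = c(2, 3, 7), sevk = c(2, 3, 6))

killlist <- which(df$plan == 0 & df$sevk == 0)   # integer(0) here

n_naive <- nrow(df[-killlist, ])   # an empty index selects nothing

# Hypothetical helper: only apply the negative index when there is
# something to remove, otherwise return the data frame unchanged.
drop_rows <- function(d, idx) {
  if (length(idx) > 0) d[-idx, , drop = FALSE] else d
}

n_guarded <- nrow(drop_rows(df, killlist))
```

Logical negation (df[!cond, ]) does not have this problem, which is one reason it is often the simpler choice when the condition is still available.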
