Deleting specific rows in a data frame

Question

I have a data frame that looks like this:

                type   created_at repository_name
1        IssuesEvent 3/11/12 6:48       bootstrap
2        IssuesEvent 3/11/12 6:48       bootstrap
3        IssuesEvent 3/11/12 6:48       bootstrap
4        IssuesEvent 3/11/12 6:52       bootstrap
5        IssuesEvent 3/11/12 6:52       bootstrap
6        IssuesEvent 3/11/12 6:52       bootstrap
7  IssueCommentEvent 3/11/12 7:03       bootstrap
8  IssueCommentEvent 3/11/12 7:03       bootstrap
9  IssueCommentEvent 3/11/12 7:03       bootstrap
10       IssuesEvent 3/11/12 7:03       bootstrap
11       IssuesEvent 3/11/12 7:03       bootstrap
12       IssuesEvent 3/11/12 7:03       bootstrap
13        WatchEvent 3/11/12 7:15       bootstrap
14        WatchEvent 3/11/12 7:15       bootstrap
15        WatchEvent 3/11/12 7:15       bootstrap
16        WatchEvent 3/11/12 7:18        hogan.js
17        WatchEvent 3/11/12 7:18        hogan.js
18        WatchEvent 3/11/12 7:18        hogan.js
19        WatchEvent 3/11/12 7:19       bootstrap

Here is a dput():

structure(list(type = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("IssueCommentEvent", 
"IssuesEvent", "WatchEvent"), class = "factor"), created_at = structure(c(1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 
5L, 6L), .Label = c("3/11/12 6:48", "3/11/12 6:52", "3/11/12 7:03", 
"3/11/12 7:15", "3/11/12 7:18", "3/11/12 7:19"), class = "factor"), 
    repository_name = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L), .Label = c("bootstrap", 
    "hogan.js"), class = "factor")), .Names = c("type", "created_at", 
"repository_name"), class = "data.frame", row.names = c(NA, -19L
))

I want to delete every row that contains the string "WatchEvent" in the column "type". How can I accomplish this in R?

R may not be the best tool if you just want to delete these lines from a CSV file. Do you really care what is in the CSV file, or just what is in the data.frame? Often it makes more sense to keep the original file unchanged and just subset the data within R.
– David LeBauer, Commented Aug 28, 2012 at 1:33
You are right. I want to change the data.frame. Do I need to do anything different just to change the data.frame?
– histelheim, Commented Aug 28, 2012 at 1:36
Nope, all you need is the answer by @AndyGarcia: df_a <- df[df$type!="WatchEvent",]. I will edit your question to reflect this. I would flag this as a duplicate, but a quick search did not turn one up, although many answers use this method.
– David LeBauer, Commented Aug 28, 2012 at 1:49

mnel · Accepted Answer · 2012-08-28 00:53:51Z

df <- structure(list(type = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("IssueCommentEvent", 
"IssuesEvent", "WatchEvent"), class = "factor"), created_at = structure(c(1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 
5L, 6L), .Label = c("3/11/12 6:48", "3/11/12 6:52", "3/11/12 7:03", 
"3/11/12 7:15", "3/11/12 7:18", "3/11/12 7:19"), class = "factor"), 
    repository_name = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L), .Label = c("bootstrap", 
    "hogan.js"), class = "factor")), .Names = c("type", "created_at", 
"repository_name"), class = "data.frame", row.names = c(NA, -19L
))

df_a <- df[df$type != "WatchEvent", ]

#                 type   created_at repository_name
# 1        IssuesEvent 3/11/12 6:48       bootstrap
# 2        IssuesEvent 3/11/12 6:48       bootstrap
# 3        IssuesEvent 3/11/12 6:48       bootstrap
# 4        IssuesEvent 3/11/12 6:52       bootstrap
# 5        IssuesEvent 3/11/12 6:52       bootstrap
# 6        IssuesEvent 3/11/12 6:52       bootstrap
# 7  IssueCommentEvent 3/11/12 7:03       bootstrap
# 8  IssueCommentEvent 3/11/12 7:03       bootstrap
# 9  IssueCommentEvent 3/11/12 7:03       bootstrap
# 10       IssuesEvent 3/11/12 7:03       bootstrap
# 11       IssuesEvent 3/11/12 7:03       bootstrap
# 12       IssuesEvent 3/11/12 7:03       bootstrap
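
Two caveats with the comparison above may be worth a quick sketch. The toy data frame below is an assumption made for illustration (it is not the question's data) and deliberately includes a missing type. With `!=`, a missing value produces an all-NA row in the result, whereas `%in%` never returns NA. Separately, because type is a factor, the unused "WatchEvent" level stays attached after subsetting until `droplevels()` removes it.

```r
# Toy stand-in for the question's data; the values are illustrative only.
df <- data.frame(
  type = factor(c("IssuesEvent", "WatchEvent", NA, "IssueCommentEvent")),
  repository_name = c("bootstrap", "hogan.js", "bootstrap", "bootstrap")
)

# `!=` evaluates to NA for the missing type, and indexing with NA
# yields a row in which every column is NA.
with_na_row <- df[df$type != "WatchEvent", ]

# `%in%` never returns NA, so the row with a missing type is kept intact.
kept <- df[!df$type %in% "WatchEvent", ]

# The factor still lists "WatchEvent" as a level until it is dropped.
levels(kept$type)
kept <- droplevels(kept)
levels(kept$type)
```

If the column has no missing values, as in the question's data, both forms select the same rows, so the choice is mostly a matter of habit.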

Deleting the rows is separate from anything CSV-related. If you also want the result on disk, write the filtered data frame to a new file:

write.csv(df_a, "no_WatchEvent.csv", row.names=FALSE)
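
As a rough end-to-end sketch of the workflow suggested in the comments (leave the source file untouched, filter in memory, write the result elsewhere), the following uses a temporary file with made-up rows. The file paths and the names `src`, `out`, `events` and `events_a` are assumptions for illustration, not part of the question.

```r
# Create a small illustrative source file in a temporary location.
src <- tempfile(fileext = ".csv")
write.csv(
  data.frame(
    type = c("IssuesEvent", "WatchEvent", "WatchEvent"),
    repository_name = c("bootstrap", "hogan.js", "bootstrap")
  ),
  src,
  row.names = FALSE
)

# Read the file, filter in memory, and write the result to a separate file.
events <- read.csv(src, stringsAsFactors = FALSE)
events_a <- events[events$type != "WatchEvent", ]
out <- file.path(tempdir(), "no_WatchEvent.csv")
write.csv(events_a, out, row.names = FALSE)

# The source file still holds every row; only the new file is filtered.
n_src <- nrow(read.csv(src))
n_out <- nrow(read.csv(out))
```

Writing to a different file name keeps the original data available if the filter later turns out to be wrong.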
