I have a data frame that looks like this:

            type   created_at repository_name
1        IssuesEvent 3/11/12 6:48       bootstrap
2        IssuesEvent 3/11/12 6:48       bootstrap
3        IssuesEvent 3/11/12 6:48       bootstrap
4        IssuesEvent 3/11/12 6:52       bootstrap
5        IssuesEvent 3/11/12 6:52       bootstrap
6        IssuesEvent 3/11/12 6:52       bootstrap
7  IssueCommentEvent 3/11/12 7:03       bootstrap
8  IssueCommentEvent 3/11/12 7:03       bootstrap
9  IssueCommentEvent 3/11/12 7:03       bootstrap
10       IssuesEvent 3/11/12 7:03       bootstrap
11       IssuesEvent 3/11/12 7:03       bootstrap
12       IssuesEvent 3/11/12 7:03       bootstrap
13        WatchEvent 3/11/12 7:15       bootstrap
14        WatchEvent 3/11/12 7:15       bootstrap
15        WatchEvent 3/11/12 7:15       bootstrap
16        WatchEvent 3/11/12 7:18        hogan.js
17        WatchEvent 3/11/12 7:18        hogan.js
18        WatchEvent 3/11/12 7:18        hogan.js
19        WatchEvent 3/11/12 7:19       bootstrap

Here is a dput():

structure(list(type = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("IssueCommentEvent", 
"IssuesEvent", "WatchEvent"), class = "factor"), created_at = structure(c(1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 
5L, 6L), .Label = c("3/11/12 6:48", "3/11/12 6:52", "3/11/12 7:03", 
"3/11/12 7:15", "3/11/12 7:18", "3/11/12 7:19"), class = "factor"), 
    repository_name = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L), .Label = c("bootstrap", 
    "hogan.js"), class = "factor")), .Names = c("type", "created_at", 
"repository_name"), class = "data.frame", row.names = c(NA, -19L
))

I want to delete every row that contains the string "WatchEvent" in the column 'type'. How can I accomplish this in R?

  • R may not be the best tool if you just want to delete these lines from a csv file. Do you really care what is in the csv file, or just what is in the data.frame? Often, it makes more sense to keep the original file unchanged and just subset the data within R. Commented Aug 28, 2012 at 1:33
  • You are right. I want to change the data.frame. Do I need to do anything different just to change the data.frame? Commented Aug 28, 2012 at 1:36
  • Nope, all you need is the answer by @AndyGarcia: df_a <- df[df$type!="WatchEvent",]. I will edit your question to reflect this. I would label this as a duplicate, but a quick search did not return any, although many answers use this method. Commented Aug 28, 2012 at 1:49

1 Answer

df <- structure(list(type = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("IssueCommentEvent", 
"IssuesEvent", "WatchEvent"), class = "factor"), created_at = structure(c(1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 
5L, 6L), .Label = c("3/11/12 6:48", "3/11/12 6:52", "3/11/12 7:03", 
"3/11/12 7:15", "3/11/12 7:18", "3/11/12 7:19"), class = "factor"), 
    repository_name = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L), .Label = c("bootstrap", 
    "hogan.js"), class = "factor")), .Names = c("type", "created_at", 
"repository_name"), class = "data.frame", row.names = c(NA, -19L
))

df_a <- df[df$type!="WatchEvent",]

#                 type   created_at repository_name
# 1        IssuesEvent 3/11/12 6:48       bootstrap
# 2        IssuesEvent 3/11/12 6:48       bootstrap
# 3        IssuesEvent 3/11/12 6:48       bootstrap
# 4        IssuesEvent 3/11/12 6:52       bootstrap
# 5        IssuesEvent 3/11/12 6:52       bootstrap
# 6        IssuesEvent 3/11/12 6:52       bootstrap
# 7  IssueCommentEvent 3/11/12 7:03       bootstrap
# 8  IssueCommentEvent 3/11/12 7:03       bootstrap
# 9  IssueCommentEvent 3/11/12 7:03       bootstrap
# 10       IssuesEvent 3/11/12 7:03       bootstrap
# 11       IssuesEvent 3/11/12 7:03       bootstrap
# 12       IssuesEvent 3/11/12 7:03       bootstrap
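
One thing to be aware of: because type is a factor, the filtered data frame still carries "WatchEvent" as an unused level. The sketch below uses a small stand-in data frame (made up for illustration, not the question's data) to show this, along with droplevels() to remove the unused level and subset() as an alternative way to write the same filter.

```r
# Small stand-in data frame (hypothetical, not the question's full data)
df <- data.frame(
  type = factor(c("IssuesEvent", "WatchEvent", "IssueCommentEvent", "WatchEvent")),
  repository_name = c("bootstrap", "bootstrap", "bootstrap", "hogan.js")
)

# Same filter as above; the unused "WatchEvent" level stays attached to the factor
df_a <- df[df$type != "WatchEvent", ]
levels(df_a$type)

# droplevels() removes factor levels that no longer occur in the data
df_b <- droplevels(df_a)
levels(df_b$type)

# subset() is another way to write the filter; unlike `[`, it also
# drops rows where the condition evaluates to NA
df_c <- subset(df, type != "WatchEvent")
```

Whether the leftover level matters depends on what comes next: it mostly shows up in table() counts and plot legends.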

Deleting the rows from the data frame is a separate step from anything CSV-related. To save the filtered result to a file:

write.csv(df_a, "no_WatchEvent.csv", row.names=FALSE)
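
If the data did start out in a CSV file, the full round trip might look like the sketch below. The file paths are placeholders (temporary files created just for this example), and the three-row input is made up so the code runs on its own.

```r
# Hypothetical input file, created here only so the sketch is self-contained
infile  <- tempfile(fileext = ".csv")
outfile <- tempfile(fileext = ".csv")
write.csv(
  data.frame(type = c("IssuesEvent", "WatchEvent", "IssueCommentEvent"),
             repository_name = c("bootstrap", "hogan.js", "bootstrap")),
  infile, row.names = FALSE
)

# Read, filter in memory, then write the result to a new file;
# the original file is left unchanged
df <- read.csv(infile)
df_a <- df[df$type != "WatchEvent", ]
write.csv(df_a, outfile, row.names = FALSE)

# Reading the new file back shows only the remaining rows
read.csv(outfile)
```

Writing to a new file rather than overwriting the input keeps the original data available if the filter needs to change later.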