2

I am trying to create a dummy variable from a column in an existing data set. The variable I am interested in is a title in this format:

CHEMICALS - Commission Delegated Directive (EU) 2015/863 of 31 March 2015 amending Annex II to Directive 2011/65/EU of the European Parliament and of the Council as regards the list of restricted substances (Text with EEA relevance)

or

Commission Implementing Directive (EU) 2015/2392...

I want to create a dummy variable indicating whether the Title is delegated or implementing. In other words, when the word "Delegated" appears in my Title variable, the row should be labeled 1, and everything else should be labeled 0.

Can anyone help me with this? It would be much appreciated. So far, I have used this code:

infringements$delegated <- ifelse(infringements$Title=="Delegated", 1, 0)
table(infringements$delegated, infringements$Title)  
summary(infringements$delegated)

When I run the code, I get 0 matches, even though I know that there are 41 matches.

4
  • Can you provide a minimal data example? Commented Mar 23, 2017 at 9:58
  • 1
    You can use str_detect() from the package stringr instead of == because == will only check if your string is equal to "Delegated" and what you're trying to do is to detect a pattern in your title. Commented Mar 23, 2017 at 10:00
  • 2
    Use grepl, i.e. as.integer(grepl('Delegated', infringements$Title)) Commented Mar 23, 2017 at 10:06
  • Great, thank you! I used the grepl suggestion because I have already been working with grep, and this worked. Commented Mar 23, 2017 at 10:28

3 Answers

4

We can do

+(grepl('Delegated', infringements$Title))
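As a minimal sketch of why the `==` comparison in the question finds nothing while `grepl` does, the following uses a small made-up data frame (the three titles are illustrative stand-ins, not the asker's data). The `ignore.case = TRUE` variant is an optional addition, assumed to be useful only if capitalization varies across titles.

```r
# Hypothetical sample data; the real `infringements` data set is not shown in the question.
infringements <- data.frame(
  Title = c(
    "CHEMICALS - Commission Delegated Directive (EU) 2015/863 of 31 March 2015",
    "Commission Implementing Directive (EU) 2015/2392",
    "Delegated"
  ),
  stringsAsFactors = FALSE
)

# `==` tests whole-string equality, so only a title that is exactly "Delegated" matches.
exact <- infringements$Title == "Delegated"

# grepl() tests whether the pattern occurs anywhere in each string.
infringements$delegated <- as.integer(grepl("Delegated", infringements$Title))

# Optional: match regardless of capitalization.
infringements$delegated_any_case <- as.integer(
  grepl("delegated", infringements$Title, ignore.case = TRUE)
)

table(infringements$delegated)
```

Wrapping the result in `as.integer()` gives the same 0/1 values as the unary `+`, but states the intent explicitly.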

2 Comments

What does the "+" sign stand for?
@AleksandrVoitov It is a hacky way to coerce the logical vector to 0/1 integers. A better approach is as.integer(grepl(...
3

Use str_detect() from the package stringr

library(stringr)

as.integer(str_detect(infringements$Title,"Delegated"))
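A short sketch of how this behaves on a few sample strings, assuming the stringr package is installed. The `titles` vector is hypothetical, and the case-insensitive variant via `regex(..., ignore_case = TRUE)` is an addition that may or may not be needed for the real data.

```r
library(stringr)

# Hypothetical titles, including a lowercase variant and a missing value.
titles <- c(
  "Commission Delegated Directive (EU) 2015/863",
  "Commission Implementing Directive (EU) 2015/2392",
  "commission delegated regulation",
  NA
)

# Case-sensitive match: the lowercase title is not flagged.
delegated <- as.integer(str_detect(titles, "Delegated"))

# Case-insensitive match using stringr's regex() modifier.
delegated_any_case <- as.integer(
  str_detect(titles, regex("delegated", ignore_case = TRUE))
)

print(delegated)
print(delegated_any_case)
```

One difference from base R worth knowing: `str_detect()` returns `NA` for a missing title, whereas `grepl()` returns `FALSE`, so the two approaches can produce different dummies when the column contains missing values.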


1
# Example data: three titles, two of which contain "Delegated".
infringements <- data.frame(Title = c("CHEMICALS - Commission Delegated Directive (EU) 2015/863 of 31 March 2015 amending Annex II to Directive 2011/65/EU of the European Parliament and of the Council as regards the list of restricted substances (Text with EEA relevance)", "No Text", "Text3Delegated"), stringsAsFactors = FALSE)
# sapply() returns a plain numeric vector; lapply() would store a list column instead.
infringements$delegated <- sapply(infringements$Title, function(x) ifelse(length(grep("Delegated", x)) != 0, 1, 0), USE.NAMES = FALSE)

