I have a main data frame containing many websites that I'm working with, and another data frame containing a list of bad websites. I want to check whether any of the bad websites appear in my main data frame. Since I'm very new to this, I'm not sure how to match the bad websites and replace them with "www.badwebsite.com". Thanks.
Here is an example of the data frames:
    site_list <- data.frame(
      "host" = c("www.companya.com", "www.companyb.com", "www.malwaresite.com",
                 "www.companyc.com", "www.companyd.com", "www.virussite.com",
                 "www.companye.com", "www.companyf.com", "www.phishingsite.com"),
      "URL" = c("www.companya.com/home", "www.companyb.com/home", "www.malwaresite.com/home",
                "www.companyc.com/home", "www.companyd.com/home", "www.virussite.com/home",
                "www.companye.com/home", "www.companyf.com/home", "www.phishingsite.com/home"))

    bad_site_list <- data.frame(
      "host" = c("www.malwaresite.com", "www.virussite.com", "www.phishingsite.com"))
I hope to achieve this result:
    host                URL
    www.companya.com    www.companya.com/home
    www.companyb.com    www.companyb.com/home
    www.badwebsite.com  www.badwebsite.com/home
    www.companyc.com    www.companyc.com/home
    www.companyd.com    www.companyd.com/home
    www.badwebsite.com  www.badwebsite.com/home
    www.companye.com    www.companye.com/home
    www.companyf.com    www.companyf.com/home
    www.badwebsite.com  www.badwebsite.com/home
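
For reference, one possible way to get this result in base R is sketched below. It is only a sketch under a few assumptions: both columns are character vectors (hence `stringsAsFactors = FALSE`), a host counts as bad only when it matches an entry in `bad_site_list$host` exactly, and every `URL` begins with its own `host`. The names `replacement` and `is_bad` are made up for illustration, and the example data is rebuilt compactly with `paste0()`.

```r
# Rebuild the example data; stringsAsFactors = FALSE keeps the columns as
# character vectors (the default before R 4.0.0 was to create factors).
hosts <- c("www.companya.com", "www.companyb.com", "www.malwaresite.com",
           "www.companyc.com", "www.companyd.com", "www.virussite.com",
           "www.companye.com", "www.companyf.com", "www.phishingsite.com")
site_list <- data.frame(host = hosts,
                        URL = paste0(hosts, "/home"),
                        stringsAsFactors = FALSE)
bad_site_list <- data.frame(
  host = c("www.malwaresite.com", "www.virussite.com", "www.phishingsite.com"),
  stringsAsFactors = FALSE)

replacement <- "www.badwebsite.com"  # hypothetical name for the placeholder

# TRUE for every row whose host appears exactly in the bad list.
is_bad <- site_list$host %in% bad_site_list$host

# Rewrite the URL first, while the original host is still available:
# keep everything after the host (e.g. "/home") and prepend the placeholder.
# This assumes each URL starts with its own host.
path <- substring(site_list$URL[is_bad], nchar(site_list$host[is_bad]) + 1)
site_list$URL[is_bad] <- paste0(replacement, path)

# Then overwrite the host itself.
site_list$host[is_bad] <- replacement

print(site_list)
```

The URL is updated before the host because the path is extracted using the length of the original host. If the hosts in the real data differ in case or carry extra whitespace, they would need to be normalised (for example with `tolower()` and `trimws()`) before the `%in%` comparison.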