I was doing some data wrangling of wine reviews in R and could not find an elegant way to do what I wanted.
My goal was to take the title column of the wine reviews, which usually contains the year of the wine, and put that year in a separate column. Kernel: https://www.kaggle.com/kieroneil/data-wrangling-wine-reviews-in-r

This is the code that did what I wanted but I'm hoping someone can show me a better way:

# Create the year columns and assign an arbitrary value.
library(tidyverse)
wine_04$year <- 1900
year_2000 <- unlist(str_detect(wine_04$title, "2000"))
year_2001 <- unlist(str_detect(wine_04$title, "2001"))
year_2002 <- unlist(str_detect(wine_04$title, "2002"))
year_2003 <- unlist(str_detect(wine_04$title, "2003"))
year_2004 <- unlist(str_detect(wine_04$title, "2004"))
year_2005 <- unlist(str_detect(wine_04$title, "2005"))
year_2006 <- unlist(str_detect(wine_04$title, "2006"))
year_2007 <- unlist(str_detect(wine_04$title, "2007"))
year_2008 <- unlist(str_detect(wine_04$title, "2008"))
year_2009 <- unlist(str_detect(wine_04$title, "2009"))
year_2010 <- unlist(str_detect(wine_04$title, "2010"))
year_2011 <- unlist(str_detect(wine_04$title, "2011"))
year_2012 <- unlist(str_detect(wine_04$title, "2012"))
year_2013 <- unlist(str_detect(wine_04$title, "2013"))
year_2014 <- unlist(str_detect(wine_04$title, "2014"))
year_2015 <- unlist(str_detect(wine_04$title, "2015"))
year_2016 <- unlist(str_detect(wine_04$title, "2016"))
year_2017 <- unlist(str_detect(wine_04$title, "2017"))

wine_04[year_2000 == TRUE, 15] <- 2000
wine_04[year_2001 == TRUE, 15] <- 2001
wine_04[year_2002 == TRUE, 15] <- 2002
wine_04[year_2003 == TRUE, 15] <- 2003
wine_04[year_2004 == TRUE, 15] <- 2004
wine_04[year_2005 == TRUE, 15] <- 2005
wine_04[year_2006 == TRUE, 15] <- 2006
wine_04[year_2007 == TRUE, 15] <- 2007
wine_04[year_2008 == TRUE, 15] <- 2008
wine_04[year_2009 == TRUE, 15] <- 2009
wine_04[year_2010 == TRUE, 15] <- 2010
wine_04[year_2011 == TRUE, 15] <- 2011
wine_04[year_2012 == TRUE, 15] <- 2012
wine_04[year_2013 == TRUE, 15] <- 2013
wine_04[year_2014 == TRUE, 15] <- 2014
wine_04[year_2015 == TRUE, 15] <- 2015
wine_04[year_2016 == TRUE, 15] <- 2016
wine_04[year_2017 == TRUE, 15] <- 2017

Thanks for the help.

1 Comment

You would like wine_04 to contain a column labelled year containing the year of the wine? Would this work?

wine_04$year <- sub('.*(\\d{4}).*', '\\1', wine_04$title)

Commented Feb 26, 2018 at 20:39
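
To see how the sub() pattern from this comment behaves, here is a small check on a few made-up titles (the sample strings are hypothetical, not taken from the wine data). Because the leading .* is greedy, the capture group ends up holding the last four digits in the string, and a title with no four-digit run comes back unchanged rather than as NA.

```r
# Hypothetical titles, for illustration only.
titles <- c("Quinta dos Avidagos 2011 Avidagos Red (Douro)",
            "Estate 1850 Reserve 2012 Red",
            "Non-vintage Brut (Champagne)")

# Same call as in the comment, applied to the sample vector.
year <- sub('.*(\\d{4}).*', '\\1', titles)

# The second title yields "2012" (the last four-digit run);
# the third has no match, so sub() returns it unmodified.
print(year)
```

If the column should be numeric, the unmatched titles would need separate handling before conversion.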

1 Answer

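This works: str_extract() from stringr returns the first run of four digits in each string. The example uses data.table for the := assignment.

library(data.table)
library(stringr)
df <- data.table(text = c('the wine is from 1898','the wine is since 2008'))
# Add a year column by reference, holding the first four-digit match in text.
df[, year := str_extract(string = text, pattern = '\\d{4}')]
df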

                     text year
1:  the wine is from 1898 1898
2: the wine is since 2008 2008
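
Applied to the data frame in the question, the same idea might look like the sketch below. It assumes wine_04 has a character column named title, as in the question. The two sample rows are invented, and the tighter pattern (a standalone number starting with 19 or 20) and the integer conversion are additions for illustration, not part of the answer above.

```r
library(dplyr)
library(stringr)

# Stand-in for the real data; these rows are invented for illustration.
wine_04 <- data.frame(
  title = c("Nicosia 2013 Bianco (Etna)",
            "Sparkling Brut NV (California)"),
  stringsAsFactors = FALSE
)

wine_04 <- wine_04 %>%
  # \\b(19|20)\\d{2}\\b matches a standalone four-digit number starting
  # with 19 or 20; titles without one get NA instead of a placeholder year.
  mutate(year = as.integer(str_extract(title, "\\b(19|20)\\d{2}\\b")))

print(wine_04$year)
```

Leaving unmatched titles as NA avoids the arbitrary 1900 placeholder, so they can be filtered with is.na(year) later.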

1 Comment

Worked perfectly. Thanks, Manish.
