
I'm working on a dataset that has a product description column, and I'm trying to extract information from that column into a new column. For instance, if 'Room Darkening' appears in the product description, the new column should have the value 'RD' for that row.

Here is the code I have:

for (i in 1:length(HD$Fabric.Description)){
  if (str_detect(HD$Fabric.Description[i],'RD')){
    HD$Type[i] == "RD"
  } 
  if (str_detect(HD$Fabric.Description[i],'Room Darkening')){
    HD$Type[i] == "RD"
  } 
  if (str_detect(HD$Fabric.Description[i],'LF')){
    HD$Type[i] == "LF"
  } 
  if (str_detect(HD$Fabric.Description[i],'Light Filtering')){
    HD$Type[i] == "LF"
  } else {
    HD$Type[i] == 'Other'
  }
}

The code runs without error but when I look at the HD dataset, the column doesn't appear.

head(HD)
X         Date Month Week Printer      FabricColor Fabric                    Product
1 1  1/8/19 0:00     1    2    5202 A1-321BOTTOMRAIL     A1 Silhouette Window Shadings
2 2 3/22/19 0:00     3   12    5201           A1-110     A1 Silhouette Window Shadings
3 3  4/3/19 0:00     4   14    5204        A1-266FCH     A1 Silhouette Window Shadings
4 4 4/18/19 0:00     4   16    5204        A1-168-BR     A1 Silhouette Window Shadings
5 5 1/11/19 0:00     1    2    5204         A1-107BR     A1 Silhouette Window Shadings
6 6 1/18/19 0:00     1    3    5204        A1-627FCH     A1 Silhouette Window Shadings
  Fabric.Description Blade.Size Color Other.Notes Setup.Time.In.Minutes Time.Over.3.Hr.Goal
1               #N/A          2   321          BR                   124                   0
2               #N/A          2   107                                90                   0
3               #N/A          2   627                               215                   1
4               #N/A          2   206                               436                   1
5 A1 - Originale 2in          2  1032                               105                   0
6 A1 - Originale 2in          2  1056                               116                   0
  Over60
1      1
2      1
3      1
4      1
5      1
6      1

But I know the str_detect function works:

str_detect(HD$Fabric.Description[2803], 'Room Darkening')
[1] TRUE

Any help would be awesome!

2 Answers


This seems like a good use case for case_when, which is vectorized, so you don't need a for loop.

library(dplyr)
library(stringr)

HD <- HD %>%
  mutate(Type = case_when(str_detect(Fabric.Description, 'RD|Room Darkening') ~ "RD", 
                          str_detect(Fabric.Description, 'LF|Light Filtering') ~ "LF", 
                          TRUE ~ 'Other'))
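To see the behaviour on something small, here is a minimal sketch on a made-up data frame. The `toy` data frame and its descriptions are hypothetical stand-ins for HD, not rows from the real dataset. Note that case_when takes the first condition that matches, so a description containing both 'RD' and 'LF' would be labelled "RD".

```r
library(dplyr)
library(stringr)

# Hypothetical toy data standing in for HD; these descriptions are invented.
toy <- data.frame(
  Fabric.Description = c("A1 - Originale 2in RD",
                         "B2 Light Filtering",
                         "#N/A",
                         "C3 Room Darkening"),
  stringsAsFactors = FALSE
)

# Conditions are checked in order; the first TRUE one decides the value,
# and the final TRUE ~ "Other" is the fallback.
toy <- toy %>%
  mutate(Type = case_when(
    str_detect(Fabric.Description, "RD|Room Darkening") ~ "RD",
    str_detect(Fabric.Description, "LF|Light Filtering") ~ "LF",
    TRUE ~ "Other"
  ))

print(toy$Type)
# [1] "RD"    "LF"    "Other" "RD"
```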

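As far as your code is concerned, the main problem is that HD$Type[i] == "RD" is a comparison, not an assignment, so nothing is ever stored; use <- instead of ==. The tests should also be chained with else if, because as written the final else belongs only to the 'Light Filtering' check and would overwrite an "RD" or "LF" set earlier with 'Other'. It is also safest to initialise the column before the loop: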

HD$Type <- NA
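For reference, a corrected version of the loop could look like the sketch below. It uses `<-` for assignment instead of `==`, and chains the tests with `else if` so that the final `else` only fires when nothing matched. The `HD_toy` data frame is a hypothetical stand-in for HD, and the sketch assumes the descriptions contain no real `NA` values (the string "#N/A" is fine), since `if` stops with an error on an `NA` condition.

```r
library(stringr)

# Hypothetical toy data standing in for HD; these descriptions are invented.
HD_toy <- data.frame(
  Fabric.Description = c("A1 Room Darkening", "B2 LF", "#N/A", "C3 RD"),
  stringsAsFactors = FALSE
)

# Initialise the new column before filling it row by row.
HD_toy$Type <- NA_character_

for (i in seq_along(HD_toy$Fabric.Description)) {
  desc <- HD_toy$Fabric.Description[i]
  if (str_detect(desc, "RD|Room Darkening")) {
    HD_toy$Type[i] <- "RD"      # assignment with <-, not comparison with ==
  } else if (str_detect(desc, "LF|Light Filtering")) {
    HD_toy$Type[i] <- "LF"
  } else {
    HD_toy$Type[i] <- "Other"   # reached only when neither pattern matched
  }
}

print(HD_toy$Type)
# [1] "RD"    "LF"    "Other" "RD"
```

The vectorized case_when version is still shorter and usually faster on large data, but the loop shows exactly where the original went wrong.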

1 Comment

Thank you @Ronak Shah and StatsStudent. I was also using the double '==' in my code. ugh

I think it's easier to use dplyr's mutate here. This should do the trick and is easier to read:

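library(dplyr)
library(stringr)
# str_detect() already returns TRUE/FALSE, so "== 1" is not needed; assign the result back to keep the column
HD <- HD %>% mutate(Type = case_when(str_detect(Fabric.Description, "Room Darkening") ~ "RD",
                                     str_detect(Fabric.Description, "Light Filtering") ~ "LF",
                                     TRUE ~ "Other"))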
