I'm trying to replace text in a column based on 2 other columns in my data using R.

I have this data with these columns:

Id     City        Street             Street_Type           Street_Category 
1     Dallas     State Route 315       Street               Street
2     Dallas     State Route 82        State Highways       Street
3     SF          State St             Street               Street
4     NY city      Corss St            Street               Street
5     SD          Steven Pkwy          Street               Street
6     LA          Harlem Pkwy          Parkway              Parkway

And I want my data to look like :

  Id     City          Street             Street_Type         Street_Category 
  1     Dallas     State Route 315         Street               State Highways
  2     Dallas     State Route 82          State Highways       State Highways
  3     SF          State St               Street               Street
  4     NY city     Corss St               Street               Street
  5     SD         Steven Pkwy             Street               Parkway
  6     LA         Harlem Pkwy             Parkway              Parkway

I want to update the existing column Street_Category: if Street contains the text "State Route" and Street_Type is "Street", replace the text in Street_Category with "State Highways". Likewise, if Street contains the text "Pkwy" and Street_Type is "Street", replace the text in Street_Category with "Parkway".

I have a large dataset with different values that need to be replaced similar to this example. How can I do it among all the datasets I have? Also, I want to take into consideration case sensitivity. For example, I don't want to change the Street_Category of "State St" to "State Highways" because it has the word "State" in it.

I used this code to create the Street_Type column, but it led to the wrong classification in Street_Category.

df$Street_Type <- g %>%
  mutate(Street = case_when(
    str_detect(Street, "St") ~ "Street",
    str_detect(Street, " State Route") ~ "State Highways",
    str_detect(Street, "Route") ~ "State Highways",
    str_detect(Street, "Pkwy") ~ "Parkway",
    TRUE ~ "No type"
  ))

But it gave me the first output above. I then tried this code to replace the column based on 2 other columns, following the answer in this link:

df[Street == " State Route" &  Street_Type == "Street", Street_Category == "State Highways"]
df[Street == " Pkwy" &  Street_Type == "Street", Street_Category == "Parkway"]

But I get the error message:

Error in `[.data.frame`(df, Street == " State Route" & Street_Type ==  : 
  object 'Street_Category' not found

What am I missing here? I'd be so thankful if you could point out the error I'm making.

  • First, you should start your pipe with df <- g, not df$Street_Type. The street type is what you want to create in the mutate. Second, in your case_when command you are only using one condition, i.e., one based on Street. Why don't you simply create a condition with the & operator, as you describe? Commented Oct 19, 2021 at 22:22
  • Thank you for your comment! I fixed the df <- g command and it created the Street_Type column successfully, but I'm still having issues with the case sensitivity, i.e., Steven Pkwy is classified as Street rather than Parkway because of the "St" in "Steven". I didn't understand what you meant by creating a condition with &. Do you mean making it one command instead of two, like: df[Street == " State Route" & Street_Type == "Street", Street_Category == "State Highways"] & df[Street == " Pkwy" & Street_Type == "Street", Street_Category == "Parkway"]? I really appreciate your comment. Commented Oct 20, 2021 at 1:34
  • Why has row 2 changed to 'State Highways'? It does not have Street_Type = 'Street'. Commented Oct 20, 2021 at 1:51
  • @RonakShah I only want to change rows where Street_Type = "Street". That was a typo. Commented Oct 20, 2021 at 13:16

1 Answer

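You can combine both conditions inside case_when(); rows that match neither condition keep their existing Street_Category: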

library(dplyr)
library(stringr)

df %>%
  mutate(Street_Category = case_when(
      str_detect(Street, 'State Route') & Street_Type == 'Street' ~ "State Highways", 
      str_detect(Street, 'Pkwy') &  Street_Type == 'Street' ~ "Parkway", 
      TRUE ~ Street_Category))

#  Id    City          Street    Street_Type Street_Category
#1  1  Dallas State Route 315         Street  State Highways
#2  2  Dallas  State Route 82 State Highways          Street
#3  3      SF        State St         Street          Street
#4  4 NY city        Corss St         Street          Street
#5  5      SD     Steven Pkwy         Street         Parkway
#6  6      LA     Harlem Pkwy        Parkway         Parkway
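
For reference, the error in the question likely arises because `[` on a plain data.frame does not look up bare column names such as Street_Category inside df, and == compares values rather than assigning them. A minimal base R sketch of the same replacement follows. It rebuilds the example data so it runs on its own, and the names is_route and is_pkwy are illustrative only.

```r
# Rebuild the example data so this sketch runs standalone
df <- data.frame(
  Id = 1:6,
  City = c("Dallas", "Dallas", "SF", "NY city", "SD", "LA"),
  Street = c("State Route 315", "State Route 82", "State St",
             "Corss St", "Steven Pkwy", "Harlem Pkwy"),
  Street_Type = c("Street", "State Highways", "Street",
                  "Street", "Street", "Parkway"),
  Street_Category = c("Street", "Street", "Street",
                      "Street", "Street", "Parkway"),
  stringsAsFactors = FALSE
)

# Logical masks: Street contains the literal text AND Street_Type is "Street".
# Columns are referenced explicitly with df$..., and matching is case-sensitive.
is_route <- grepl("State Route", df$Street, fixed = TRUE) & df$Street_Type == "Street"
is_pkwy  <- grepl("Pkwy", df$Street, fixed = TRUE) & df$Street_Type == "Street"

# Assign with <- on the selected rows (== would only compare)
df$Street_Category[is_route] <- "State Highways"
df$Street_Category[is_pkwy]  <- "Parkway"
```

Row 2 keeps "Street" here, as in the dplyr version, because its Street_Type is not "Street".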

data

It is easier to help if you provide data in a reproducible format:

df <- structure(list(Id = 1:6, City = c("Dallas", "Dallas", "SF", "NY city", 
"SD", "LA"), Street = c("State Route 315", "State Route 82", 
"State St", "Corss St", "Steven Pkwy", "Harlem Pkwy"), Street_Type = c("Street", 
"State Highways", "Street", "Street", "Street", "Parkway"), Street_Category = c("Street", 
"Street", "Street", "Street", "Street", "Parkway")), row.names = c(NA, -6L), class = "data.frame")
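
The comments also mention "Steven Pkwy" being typed as Street because "St" matches inside "Steven". One possible way around that, sketched here in base R, is to use word boundaries (\\b) and to try the most specific patterns first. The helper classify_street and the rules vector are hypothetical names for illustration, not part of the original answer, and the patterns would need adapting to the real data.

```r
# Hypothetical helper: assign each street the type of the FIRST rule it matches.
# `rules` is a named character vector: names are regex patterns, values are types.
classify_street <- function(street, rules, default = "No type") {
  type <- rep(default, length(street))
  unmatched <- rep(TRUE, length(street))
  for (i in seq_along(rules)) {
    # grepl() is case-sensitive by default; perl = TRUE for \b word boundaries
    hit <- unmatched & grepl(names(rules)[i], street, perl = TRUE)
    type[hit] <- rules[[i]]
    unmatched <- unmatched & !hit
  }
  type
}

# Most specific patterns first; the generic "St" rule goes last.
# \\bSt\\b matches "St" as a whole word, so not the "St" in "Steven" or "State".
rules <- c(
  "\\bState Route\\b" = "State Highways",
  "\\bRoute\\b"       = "State Highways",
  "\\bPkwy\\b"        = "Parkway",
  "\\bSt\\b"          = "Street"
)

streets <- c("State Route 315", "State Route 82", "State St",
             "Corss St", "Steven Pkwy", "Harlem Pkwy")
street_type <- classify_street(streets, rules)
```

Keeping the rules in one named vector means new street types can be added as extra entries rather than extra branches, which may help with a larger set of replacements.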

1 Comment

Thank you so much!!! this is what I wanted !! Really appreciate your help!
