I want to use regexp_replace to remove phrases that appear in either (or both) of two columns. Specifically, I want to strip from the product name the colour and brand stored in the Colour and Manufacturer_Brand_Name columns. I currently have the nested call below, but can it be done without nesting?

    regexp_replace(regexp_replace(lower(product_name), 
    lower(Manufacturer_Brand_Name), ''),lower(Colour), '')

So instead of the above, I would like something like this:

    regexp_replace(lower(product_name), 
    (lower(Manufacturer_Brand_Name)|lower(Colour)), '')

But the pattern has to be in quotation marks to be read as a regex, and then the column names are treated as literal text rather than as references to the columns:

    regexp_replace(lower(product_name), 
    '(lower(Manufacturer_Brand_Name)|lower(Colour))', '')

The expected result is that "Dell Laptop Blue" becomes just "Laptop".

1 Answer

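Build the pattern with CONCAT so that the column values, not the column names, end up in the regex. The (?i) prefix makes the match case-insensitive, so lower() is not needed:

    REGEXP_REPLACE(product_name, CONCAT(r'(?i)', Manufacturer_Brand_Name, '|', Colour), '')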

For example:

    #standardSQL
    WITH `project.dataset.table` AS (
      -- One sample row standing in for the real table
      SELECT 'Dell Laptop Blue' product_name, 'Dell' Manufacturer_Brand_Name, 'blue' Colour
    )
    SELECT 
      -- The pattern becomes '(?i)Dell|blue', built per row from the two columns
      REGEXP_REPLACE(product_name, CONCAT(r'(?i)', Manufacturer_Brand_Name, '|', Colour), '')
    FROM `project.dataset.table`

with result

    Row  f0_
    1    Laptop
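
One caveat: removing the brand and colour leaves the surrounding spaces behind (" Laptop " rather than "Laptop"), and an unanchored alternation can also match inside longer words. Below is a minimal sketch of one way to tidy this up, assuming the column values contain no regex metacharacters. The second sample row and the `cleaned_name` alias are made up for illustration.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  -- Hypothetical sample rows, for illustration only
  SELECT 'Dell Laptop Blue' AS product_name, 'Dell' AS Manufacturer_Brand_Name, 'blue' AS Colour
  UNION ALL
  SELECT 'HP Blue Printer', 'HP', 'Blue'
)
SELECT
  product_name,
  TRIM(                      -- drop leading and trailing spaces
    REGEXP_REPLACE(          -- collapse runs of whitespace into a single space
      REGEXP_REPLACE(
        product_name,
        -- (?i) = case-insensitive; \b = only match whole words
        CONCAT(r'(?i)\b(?:', Manufacturer_Brand_Name, '|', Colour, r')\b'),
        ''
      ),
      r'\s+', ' '
    )
  ) AS cleaned_name
FROM `project.dataset.table`
```

Under those assumptions, this should give `Laptop` and `Printer` for the two rows. The calls are nested again here, but only for whitespace cleanup; the brand and colour are still removed in a single pass.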