I am looking for a regular expression pattern that will remove articles (a, an, the), remove special characters (;, :, %, etc.) and expand abbreviations (inc. -> 'incorporation', & -> 'and', etc.) in Snowflake. I am able to do this in Snowflake, but the result is not completely correct. Below is my code. The issue is with the article pattern: for example, the output of 'a good book' should be 'good book', but the string 'give a book' should remain as it is.
```
select REGEXP_REPLACE(
         REGEXP_REPLACE(
           REGEXP_REPLACE(
             REGEXP_REPLACE(
               -- lowercase the input and pad it with a space on each side
               concat(' ', lower('a book of the great man'), ' '),
               -- step 1: remove articles
               '(^an )|(^the )|(^a )'),
             -- step 2: remove special characters
             '\\.|\\,|\\(|\\)|\\!|\\\\|/|£|\\$|%|\\^|\\*|-|\\+|=|_|{|}|\\[|\\]|#|~|;|:|''|`|@|<|>|\\?|¬|\\|'),
           -- step 3: expand abbreviations, one REGEXP_REPLACE per abbreviation
           ' & ', ' and '),
         ' ltd ', ' limited ');
-- further abbreviations are chained in the same way
```
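
To make the intended behaviour concrete, here is a minimal sketch of the rule I am after. It is not a confirmed solution. It assumes that an article should only be dropped when it is the first word of the string, that Snowflake's regex engine accepts `\\b` and `\\s` inside single-quoted patterns, and that any character other than a-z, 0-9 and space counts as "special". That whitelist is a swapped-in simplification of my explicit character list, and it would also strip accented letters. The `samples` CTE, its strings and the `raw_text` and `cleaned` names are made up for illustration.

```sql
-- Sketch only: sample strings and column names are illustrative, not real data.
with samples as (
    select column1 as raw_text
    from values ('a good book'), ('give a book'), ('The Smith & Sons Ltd.')
)
select raw_text,
       regexp_replace(
         regexp_replace(
           regexp_replace(
             regexp_replace(
               regexp_replace(
                 trim(lower(raw_text)),
                 '^(an|a|the)\\s+', ''),   -- drop an article only when it is the first word
               '\\s*&\\s*', ' and '),      -- expand & before punctuation is stripped
             '\\bltd\\b\\.?', 'limited'),  -- expand a whole-word abbreviation
           '[^a-z0-9 ]', ''),              -- assumption: everything else is a special character
         '\\s+', ' ') as cleaned           -- collapse repeated spaces
from samples;
```

The intent is that 'a good book' becomes 'good book' while 'give a book' is left untouched, because the article pattern is anchored with `^` on the trimmed string rather than on a space-padded one. Abbreviations are expanded before the special characters are removed so that `&` and the trailing `.` of `ltd.` are still present when their patterns run.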
