My string is as follows:
'NAME NAME NAME 400ML NAME CODE'
I need to identify ML, go left to catch all digits before that and stop on first space to get:
400ML
Current code I have:
SUBSTR(FIELD,CHARINDEX('ML',FIELD), 2)
I suggest using
regexp_substr(field, '\\d+\\s*ML\\b')
This regex will make sure the ML is matched as a whole word, and if there are any whitespaces between a number and ML, they will also be matched.
See the regex demo.
Regex details
\d+ - 1 or more digits\s* - 0 or more whitespacesML - a string ML\b - a word boundary.You can use regexp_substr():
select regexp_substr(field, '[^ ]+ML')
Or for specifically alphanumeric characters:
select regexp_substr(field, '[a-zA-Z0-9]+ML')
If Snowflake is not greedy (which seems unlikely but is possible), then you can do:
select trim(regexp_substr(' ' || field, ' [a-zA-Z0-9]*ML'))
WORD-M 400ML CODE returns ML'WORD-ML 400ML CODE' :)+ in the second method.To extract number with ML as suffix use
select regexp_substr(field, '[0-9]+ML')
EXPLANATION
NODE EXPLANATION
--------------------------------------------------------------------------------
[0-9]+ any character of: '0' to '9' (1 or more
times (matching the most amount possible))
--------------------------------------------------------------------------------
ML 'ML'
regexp_substr(field, '\\d+ML')? Are there any non-digit chars you need to match beforeML?MLstirng in it. That way number-ML will not be found in first place.NAMEML CODE MLNAME 400 ML CODE CODE. This can help:regexp_substr(REPLACE(field, ' ', ''), '\\d+ML')ML, you may match them with\s*. Also, what isMLis part of a longer word? Tryregexp_substr(field, '\\d+\\s*ML\\b')