3

I'm trying to extract the text following deviceSerialNumber= that occurs until either a & or the end of the string. Data looks like

someddata&=somedataagain&deviceSerialNumber=device12345&anotherField=moreData

someddata&=somedataagain&deviceSerialNumber=deviceabcd

I've tried this

REGEXP_SUBSTR(session_tags || '&', 'deviceSerialNumber(.*)&') from table

but this returns all of the text until the final & (which is the end of the string, since I'm appending a & for pattern matching purposes). How do I modify this regular expression to extract only the text until the first &?

4 Answers

8

I know it is late to respond, but here is a solution that worked for me.

select regexp_substr(
    'someddata&=somedataagain&deviceSerialNumber=device12345&anotherField=moreData ', 
    'deviceSerialNumber=(.*)&', 0, 1, 'e');
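
Note that (.*) is greedy, so the query above only stops at the right place because the sample string has a single & after the serial number. A minimal sketch of a variant that stops at the first & (or at the end of the string), assuming the same 'e' parameter and swapping the greedy group for a negated character class; the inline sample rows and the column alias are only for illustration:

```sql
-- [^&]* matches any run of characters that are not '&', so the capture
-- ends at the first '&' or at the end of the string, whichever comes first.
-- The 'e' parameter returns the parenthesized subexpression only.
select
    params,
    regexp_substr(params, 'deviceSerialNumber=([^&]*)', 1, 1, 'e') as device_serial_number
from (
    select 'someddata&=somedataagain&deviceSerialNumber=device12345&anotherField=moreData' as params
    union all
    select 'someddata&=somedataagain&deviceSerialNumber=deviceabcd' as params
) tmp;
-- expected: device12345 for the first row, deviceabcd for the second
```

With this pattern there is no need to append a trailing & to the input.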

2 Comments

This is the right solution (i.e. using the e param to match on the subexpression). Unfortunate that this answer is buried behind others.
Can you please let me know how to stop at the first '&' instead of letting the match run to the end?
1

Use the occurrence parameter

REGEXP_SUBSTR(session_tags, 'deviceSerialNumber=(.+?)&',1,1) from table

Or

REGEXP_SUBSTR(session_tags, '(?<=deviceSerialNumber=)(.*?)(?=&)') from table

.* will match till the last & and .*? will match till the first &
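
One caveat, as far as I can tell: Redshift interprets patterns as POSIX regular expressions by default, and that dialect does not have lazy quantifiers (.*?) or lookarounds, which would explain an "Invalid preceding regular expression prior to repetition operator" error. A hedged sketch, assuming your cluster supports the 'p' parameter (PCRE dialect) of REGEXP_SUBSTR; the sample row is made up for illustration:

```sql
-- 'p' asks Redshift to interpret the pattern as PCRE, which allows lookbehind.
-- (?<=deviceSerialNumber=) anchors the match right after the key,
-- and [^&]* takes everything up to the next '&' or the end of the string.
select
    regexp_substr(params, '(?<=deviceSerialNumber=)[^&]*', 1, 1, 'p') as device_serial_number
from (
    select 'someddata&=somedataagain&deviceSerialNumber=device12345&anotherField=moreData' as params
) tmp;
```

The negated character class sidesteps the greedy/lazy question entirely, so the lookahead for & is not needed.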

3 Comments

Hmm, this looked promising, but it still seems to be extracting all the text until the final occurrence of &
deviceSerialNumber=(.+?)& should do it. You might have to use replace to strip deviceSerialNumber= in the final select.
Now I'm getting an "Invalid preceding regular expression prior to repetition operator" error. I got it a bunch earlier when I was trying other solutions from the net. Wondering if it's something specific about the Redshift implementation.
0

I was running into the same Invalid preceding regular expression prior to repetition operator with regexp_substr.

The workaround I eventually settled on was two nested split_parts:

select
params, 
split_part(split_part(params, 'deviceSerialNumber=', 2), '&', 1)
from (
    select 'someddata&=somedataagain&deviceSerialNumber=device12345&anotherField=moreData' as params
    union all
    select 'someddata&=somedataagain&deviceSerialNumber=deviceabcd' as params
) tmp
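
One thing to be aware of: when the key is missing, split_part returns an empty string rather than NULL, because the requested part does not exist. A small sketch of how that could be turned into NULL, assuming that is what you want downstream; the second sample row and the column alias are made up for illustration:

```sql
-- nullif(x, '') converts the empty string that split_part returns
-- for a missing key into NULL.
select
    params,
    nullif(split_part(split_part(params, 'deviceSerialNumber=', 2), '&', 1), '') as device_serial_number
from (
    select 'someddata&=somedataagain&deviceSerialNumber=device12345' as params
    union all
    select 'someddata&=somedataagain' as params  -- no deviceSerialNumber key
) tmp;
```

Note that a key that is present with an empty value (deviceSerialNumber=&...) also comes back as NULL with this approach.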


-3

Found a hack solution that involves two levels of queries to get around having to use regexp_substr. The inner query uses substring and position to pull out all of the text after the deviceSerialNumber tag. The outer query uses the same two functions to cut off any text after the next &.

-- 'deviceSerialNumber=' is 19 characters long, so the value starts 19 characters after the match position
select substring(pre_device_id, 1, position('&' in pre_device_id || '&') - 1) as device_id
from
    (select substring(session_tags, position('deviceSerialNumber' in session_tags) + 19, 40) as pre_device_id
    from table) a

For example, the inner query takes

someddata&=somedataagain&deviceSerialNumber=device12345&anotherField=moreData

someddata&=somedataagain&deviceSerialNumber=deviceabcd

and strips text before the device serial number tag to give you

device12345&anotherField=moreData

deviceabcd

The outer query then strips everything from the next & onward to give you

device12345

deviceabcd
