I have several thousand URLs and want to extract some values from the URL parameters. Here are some examples from the DB:

["//xxx.com/se/something?SE_{ifmobile:MB}{ifnotmobile:DT}_A_B_C_D_E_F_G_H"]
["//www.xxx.com/se/car?p_color_car=White?SE_{ifmobile:MB}{ifnotmobile:DT}_A_B_C_D_E_F_G_H"]

I want to extract the values SE | A | B | C | D | E | F | G | H.

I have tried it with REGEXP_EXTRACT_ALL:

REGEXP_EXTRACT_ALL(Url,r'(?:\?|&)(?:([^_]+)_(?:[^&]*))') as Country

The problem is that the second URL contains two '?' characters, so the result is SE for the first URL but p for the second. How can I solve this with a single regexp so that I get SE, not p, for the second URL as well?

1 Answer

You can use

[?&]([^_]+)_[^&?]*$

See the regex demo. Details:

  • [?&] - a ? or & char
  • ([^_]+) - Group 1 (the actual output string): one or more chars other than _
  • _ - a _ char
  • [^&?]* - zero or more chars other than & and ?
  • $ - end of string.
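
To check the pattern outside BigQuery, here is a minimal sketch in Python. It assumes that Python's re module treats these two patterns the same way as BigQuery's RE2 engine, which should hold because neither uses engine-specific syntax. The names URLS, ORIGINAL, ANCHORED and first_tokens are made up for this illustration.

```python
import re

# Sample URLs copied from the question.
URLS = [
    "//xxx.com/se/something?SE_{ifmobile:MB}{ifnotmobile:DT}_A_B_C_D_E_F_G_H",
    "//www.xxx.com/se/car?p_color_car=White?SE_{ifmobile:MB}{ifnotmobile:DT}_A_B_C_D_E_F_G_H",
]

# Pattern from the question: [^&]* also swallows the second '?',
# so the match starts at the first '?' in the URL.
ORIGINAL = re.compile(r"(?:\?|&)(?:([^_]+)_(?:[^&]*))")

# Pattern from this answer: [^&?]*$ forbids any further '?' or '&'
# before the end of the string, so only the last parameter can match.
ANCHORED = re.compile(r"[?&]([^_]+)_[^&?]*$")


def first_tokens(pattern, url):
    """Return every Group 1 capture, similar to REGEXP_EXTRACT_ALL."""
    return pattern.findall(url)


for url in URLS:
    print(first_tokens(ORIGINAL, url), first_tokens(ANCHORED, url))
# prints:
# ['SE'] ['SE']
# ['p'] ['SE']
```

In the second URL the first ? cannot start a match because [^&?]* stops at the second ? and $ then fails, so the engine moves on and matches from the second ?.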

2 Comments

Is this what you meant: REGEXP_EXTRACT_ALL(Url, r'[?&]([^_]+)_[^&?]*$ as URL? I get an error about an unclosed string literal.
@user3052850 Sure, just close the string literal: REGEXP_EXTRACT_ALL(Url, r'[?&]([^_]+)_[^&?]*$')
