How would I use REGEXP_REPLACE to group URL paths based on just the first segment of the path (after the domain)? For example, given these paths:

/
/foo/
/foo/bar
/xyz/abc
/xyz

The URLs should group as follows:

/
foo
xyz

My biggest issue is how to name the groups without predefining them: each group should use the matched regex string as its name.

  • I understand you are using Google Spreadsheets? Could you please share what you have tried, so that we do not repeat it? Commented May 14, 2018 at 13:34
  • I am using Google Data Studio. I haven't got close to being able to rename URL paths with the string of the first segment in the path. Commented May 14, 2018 at 13:35
  • Ok, but what was your best attempt? What worked wrong? Commented May 14, 2018 at 13:41
  • It is hard to guess what you need; try ^/([^/]+).* and replace with $1. It would be easier to help if we could see some of your code. Commented May 14, 2018 at 13:55
  • Sorry, I have now sort of worked it out. I used REGEXP_EXTRACT: REGEXP_EXTRACT(URL, '/([^/]+)') Commented May 14, 2018 at 14:05

1 Answer

You can indeed use REGEXP_EXTRACT, with the pattern anchored at the start of the string:

REGEXP_EXTRACT(URL, '^/([^/]+)')

The regex means

  • ^ - start of string
  • / - a slash
  • ([^/]+) - Capturing group 1 (the part that will be returned): one or more characters other than /.
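
To see how the pattern treats the sample paths from the question, here is a minimal sketch using Python's re module rather than Data Studio. The helper name first_segment is hypothetical, and the None returned for the bare / path is Python's behaviour when nothing matches; how REGEXP_EXTRACT reports that case is worth checking separately.

```python
import re

# Same pattern as in the answer: start of string, a slash, then group 1.
FIRST_SEGMENT = re.compile(r"^/([^/]+)")


def first_segment(path):
    """Return the first path segment, or None when there is none (e.g. "/").

    Hypothetical helper for illustration; not part of Data Studio.
    """
    match = FIRST_SEGMENT.match(path)
    return match.group(1) if match else None


# Sample paths taken from the question.
for path in ["/", "/foo/", "/foo/bar", "/xyz/abc", "/xyz"]:
    print(path, "->", first_segment(path))
```

The bare / path has no characters after the slash, so [^/]+ cannot match and the extraction yields nothing for it.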

With a replace operation, you would use

REGEXP_REPLACE(URL, "^/([^/]+).*", "$1")

Here, .* matches the rest of the string, so the whole path is consumed, and $1 inserts the value of capturing group 1 into the result. The entire path is therefore replaced by its first segment.
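
The replace approach can be sketched the same way in Python, as an illustration under stated assumptions rather than a Data Studio formula. The function name group_name is hypothetical, and Python writes the backreference as \1 where the formula uses $1. In Python, a string that does not match is returned unchanged, which is why the bare / path keeps / as its group; whether Data Studio behaves identically for that case is an assumption to verify.

```python
import re


def group_name(path):
    """Replace the whole path with its first segment (hypothetical helper).

    re.sub leaves the input unchanged when the pattern does not match,
    so "/" maps to "/".
    """
    # \1 is Python's spelling of the $1 backreference used in the formula.
    return re.sub(r"^/([^/]+).*", r"\1", path)


# Sample paths taken from the question.
paths = ["/", "/foo/", "/foo/bar", "/xyz/abc", "/xyz"]
groups = sorted({group_name(p) for p in paths})
print(groups)  # ['/', 'foo', 'xyz']
```

These are the three groups the question asks for, with each group named after the matched text rather than a predefined label.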
