I wanted to see if it's possible to do this without a regex, although it ends up being a bit verbose :) This answer assumes that the address is always duplicated and ends with .com. Assuming that's the case, it should be possible to use SPLIT to extract the parts that you are interested in:
SELECT
CONCAT(
SPLIT(text, '.com')[OFFSET(0)],
'.com',
SPLIT(text, '.com')[OFFSET(2)]
) AS Landing_Page
FROM (
SELECT 'www.example.comwww.example.com/path/to/page' AS text UNION ALL
SELECT 'apply.example.comapply.example.com/eapp/'
);
If you want the query to be tolerant of non-duplicated addresses, you can make a slight modification:
SELECT
(
SELECT
CONCAT(
parts[OFFSET(0)],
'.com',
parts[OFFSET(ARRAY_LENGTH(parts) - 1)]
)
FROM (SELECT SPLIT(text, '.com') AS parts)
) AS Landing_Page
FROM (
SELECT 'www.example.comwww.example.com/path/to/page' AS text UNION ALL
SELECT 'apply.example.comapply.example.com/eapp/' UNION ALL
SELECT 'www.example.com/path/to/page'
);
Going one step further, you can extract the logic into a UDF:
CREATE TEMP FUNCTION GetLandingPage(text STRING) AS (
(
SELECT
CONCAT(
parts[OFFSET(0)],
'.com',
parts[OFFSET(ARRAY_LENGTH(parts) - 1)]
)
FROM (SELECT SPLIT(text, '.com') AS parts)
)
);
SELECT
GetLandingPage(text) AS Landing_Page
FROM (
SELECT 'www.example.comwww.example.com/path/to/page' AS text UNION ALL
SELECT 'apply.example.comapply.example.com/eapp/' UNION ALL
SELECT 'www.example.com/path/to/page'
);