1

text that I am trying to parse

Text: sometext herex0D Name: Davidx0D some more text: x0D - value1x0D - google maps to find x0D x0D What you did: x0D - i did some stuff x0D

The text I want to extract is anything after

What you did:

So the result should be

x0D - i did some stuff x0D

This is what I have but it is not returning what I want.

select REGEXP_EXTRACT('Text: sometext herex0D Name: Davidx0D some more text: x0D - value1x0D - google maps to find x0D x0D What you did: x0D - i did some stuff x0D', r"What you did:(.*)")

But, if you run this as-is it will return the right result because I pasted the string in query. But if I replace the string with the column name in the table then it doesn't work. I only get

x0D

back

8
  • 1
    works for me with column. something with the data itself - can you compare length of typed string vs. that string in table? Commented Sep 14, 2018 at 18:16
  • Length is different by 2. Length by using column name comes out to 2 more. Commented Sep 14, 2018 at 18:48
  • so this would be a key for addressing the issue I think. can you determine what those are? this would explain effect you see Commented Sep 14, 2018 at 18:53
  • 1
    or even more easy - see my answer below Commented Sep 14, 2018 at 19:24
  • 1
    see update in my answer Commented Sep 14, 2018 at 19:54

1 Answer 1

1

Below example for BigQuery Standard SQL

#standardSQL
WITH `project.dataset.table` AS (
  SELECT ''''Text: sometext herex0D Name: Davidx0D some more text: x0D - value1x0D - google maps to find x0D x0D What you did: x0D
  i did some stuff x0D''' str
)
SELECT 
  REGEXP_EXTRACT(str, r"What you did:(.*)") result,
  SUBSTR(str, STRPOS(str, "What you did:") + LENGTH("What you did:")) adjusted_result
FROM `project.dataset.table`

Row     result      adjusted_result  
1       x0D         x0D i did some stuff x0D     

Update

So, in your table you have strings with new line that needs to be considered during the regexp parsing

By default . does not match \n
To make it work you need to use flag s like below

#standardSQL
WITH `project.dataset.table` AS (
  SELECT ''''Text: sometext herex0D Name: Davidx0D some more text: x0D - value1x0D - google maps to find x0D x0D What you did: x0D
  i did some stuff x0D''' str
)
SELECT 
  REGEXP_EXTRACT(str, r"What you did:(.*)") result,
  REGEXP_EXTRACT(str, r"(?s)What you did:(.*)") multiline_result
FROM `project.dataset.table`    

with result

Row     result      multiline_result     
1       x0D         x0D i did some stuff x0D     
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.