2

I want to split the field in big query data table.

I give you sample case..

==case 1== Source Filed= "idx1-cnt1-name1,idx2-cnt2-name2... same pattern"

Result table
idx | cnt | name |
idx1,cnt1,name1
idx2,cnt2,name2,....

in query:

select                      
    regexp_extract(split_col, r'([\d]*)-') as ItemIdx,  
    regexp_extract(split_col, r'-([\d]*)-') as Cnt,
    regexp_extract(split_col, r'-([\d]*)$') as TitleIdx
From (
Select pid,now, split(source field, ',') split_col from (       
SELECT * FROM table ))

but I can't make a code this case

this case string has multiple saparator.

==case 2==

Source String = "item1-name1-type1-value1,.... same pattern"

Result Table
name | type
name1, type1
name2,type2

field count is different.

but just I need second, third field value.

how can I make a query..

2
  • Which dbms are you using? (That function is product specific.) Commented Feb 22, 2018 at 10:06
  • it is hard reverse engineer what exactly you need - please provide more representative example and expected result for both cases Commented Feb 22, 2018 at 13:16

1 Answer 1

3

I see you are using BigQuery legacy sql - so below example in it (note - it is quite recommended to use BigQuery Standard SQL whenever you can - so consider migrating) - below is simplified to make logic better readable for you so you can easily extend it to potentially more similar cases

Case 1 / example

#legacySQL
SELECT                      
    REGEXP_EXTRACT(split_col, r'^(.*?)-.*?-.*?$') AS idx,  
    REGEXP_EXTRACT(split_col, r'^.*?-(.*?)-.*?$') AS cnt,
    REGEXP_EXTRACT(split_col, r'^.*?-.*?-(.*?$)') AS name
FROM (
  SELECT SPLIT(source_field, ',') split_col 
  FROM (SELECT "idx1-cnt1-name1,idx2-cnt2-name2" source_field)
)

result:

Row idx     cnt     name     
1   idx1    cnt1    name1    
2   idx2    cnt2    name2    

Case 2 / example

#legacySQL
SELECT                      
    REGEXP_EXTRACT(split_col, r'^.*?-(.*?)-.*?') AS name,  
    REGEXP_EXTRACT(split_col, r'^.*?-.*?-(.*?)-') AS type
FROM (
  SELECT SPLIT(source_string, ',') split_col 
  FROM (SELECT "item1-name1-type1-value1, item2-name2-type2-value2" source_string)
)  

result:

Row name    type     
1   name1   type1    
2   name2   type2    

Below is example of same but for BigQuery Standard SQL (just case 2 as they really similar)

#standardSQL
WITH `project.dataset.table` AS (
  SELECT "item1-name1-type1-value1, item2-name2-type2-value2" source_string
)
SELECT 
  REGEXP_EXTRACT(split_col, r'^.*?-(.*?)-.*?') AS name,  
  REGEXP_EXTRACT(split_col, r'^.*?-.*?-(.*?)-') AS type
FROM `project.dataset.table`, UNNEST(SPLIT(source_string, ',')) split_col

obviously - same result

Row name    type     
1   name1   type1    
2   name2   type2    

Yet another option would be -

#standardSQL
WITH `project.dataset.table` AS (
  SELECT "item1-name1-type1-value1, item2-name2-type2-value2" source_string
)
SELECT 
  SPLIT(split_col, '-')[SAFE_OFFSET(1)] AS name,  
  SPLIT(split_col, '-')[SAFE_OFFSET(2)] AS type
FROM `project.dataset.table`, UNNEST(SPLIT(source_string, ',')) split_col

and so on ...

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.