I have a table with the following structure.

ID | json
bla | [{"user": "[email protected]", "timestamp": 0, "status": 1}, {"user": "[email protected]", "timestamp": 1, "status": 2}];

etc.

Now I want to read them so that I get the following structure in BigQuery.

ID | USER | TIMESTAMP | STATUS
bla | [email protected] | 0 | 1
bla | [email protected] | 1 | 2

When doing this:

CREATE TEMPORARY FUNCTION CUSTOM_JSON_EXTRACT(json STRING, json_path STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
        return jsonPath(JSON.parse(json), json_path);
"""
OPTIONS (
    library="gs://json_path/jsonpath-0.8.0.js"
);
WITH t AS (
SELECT id, history AS json_column FROM TABLE WHERE history IS NOT NULL
)
SELECT 
id,
  CUSTOM_JSON_EXTRACT(json_column , '$[*].user') AS user,
  CUSTOM_JSON_EXTRACT(json_column , '$[*].status') AS status,
  CUSTOM_JSON_EXTRACT(json_column , '$[*].timestamp') AS timestamp
FROM t 

I don't get one row per array element; instead I get a single row with sub-rows (one array per column)...

  • [{user: '[email protected]', timestamp: 0, status: 1}, {user: '[email protected]', timestamp: 1, status: 2}] is simply an invalid JSON structure -> jsonlint.com -> "Error: Parse error on line 1: [{ user: '[email protected]', tim ---^ Expecting 'STRING', '}', got 'undefined'". So don't expect JSON parsing functions to work correctly. Commented Oct 2, 2019 at 11:36
  • This is valid JSON -> [{ "user": "[email protected]", "timestamp": 0, "status": 1 }, { "user": "[email protected]", "timestamp": 1, "status": 2 }] Commented Oct 2, 2019 at 11:37
  • I'm sorry, but this was just example JSON. I'll change it. But your comment doesn't solve my question. Commented Oct 2, 2019 at 11:38
  • "but this was just example json. I'll change it. But you comment doesn't solve my question " In that case the JSON parser you use is very loose about validation rules; I don't think that's a good thing. Commented Oct 2, 2019 at 11:40
  • Again, this was example data for Stack Overflow. I did not run a linter on it; it was only meant to explain what I have as input. Commented Oct 2, 2019 at 11:41

1 Answer

The query below is for BigQuery Standard SQL:

#standardSQL
-- Split a JSON array string into an array of JSON strings, one per element.
CREATE TEMP FUNCTION json2array(json STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
  return JSON.parse(json).map(x=>JSON.stringify(x));
""";
-- Sample data standing in for the real table.
WITH `project.dataset.table` AS (
  SELECT 'bla' id, '[{"user": "[email protected]", "timestamp": 0, "status": 1}, {"user": "[email protected]", "timestamp": 1, "status": 2}]' json
)
-- UNNEST yields one row per array element; JSON_EXTRACT_SCALAR returns each field as a STRING.
SELECT id,
  JSON_EXTRACT_SCALAR(x, '$.user') AS user,
  JSON_EXTRACT_SCALAR(x, '$.timestamp') AS `timestamp`,
  JSON_EXTRACT_SCALAR(x, '$.status') AS status
FROM `project.dataset.table`,
  UNNEST(json2array(JSON_EXTRACT(json, '$'))) x

with result

Row id  user    timestamp   status   
1   bla [email protected]  0           1    
2   bla [email protected]  1           2    
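
To see why this produces one row per element, the UDF body can be run outside BigQuery as plain JavaScript. The sketch below is only an illustration: the e-mail addresses are made-up placeholders, and `extractScalar` is a hypothetical helper that roughly imitates `JSON_EXTRACT_SCALAR` for scalar fields. It is not BigQuery's implementation.

```javascript
// Same body as the json2array UDF above, written as a plain function.
function json2array(json) {
  return JSON.parse(json).map(x => JSON.stringify(x));
}

// Placeholder input: the addresses are invented for this illustration.
const json = '[{"user": "a@example.com", "timestamp": 0, "status": 1}, ' +
             '{"user": "b@example.com", "timestamp": 1, "status": 2}]';

// One JSON string per array element; UNNEST turns each of these into its own row.
const elements = json2array(json);

// Hypothetical stand-in for JSON_EXTRACT_SCALAR(x, '$.field'):
// returns a scalar field as a string, or null when the field is missing.
function extractScalar(elementJson, field) {
  const value = JSON.parse(elementJson)[field];
  return value === undefined || value === null ? null : String(value);
}

// Build one record per element, mirroring the SELECT list of the query.
const rows = elements.map(x => ({
  id: 'bla',
  user: extractScalar(x, 'user'),
  timestamp: extractScalar(x, 'timestamp'),
  status: extractScalar(x, 'status'),
}));

console.log(rows);
```

Because each element is extracted as a string, `timestamp` and `status` arrive as strings here, just as `JSON_EXTRACT_SCALAR` returns STRING values in the query; a `CAST` would be needed if numeric columns are wanted.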