
I have a raw table with two columns:

id (INT64)  |  content (STRING)
------------|--------------------
1           | {"photos": [{"location": {"lat": 111, "lon": 222}, "ts": "2019-12-16", "uri": "aaa"}, {"location": {"lat": 333, "lon": 444}, "ts": "2019-12-17", "uri": "bbb"}]}
------------|--------------------
2           | ....

The first column is an integer id; the second is a JSON-formatted string. An example JSON value looks like this:

{
  "photos": [
    {
      "location": {
        "lat": 111, 
        "lon": 222
      }, 
      "ts": "2019-12-16", 
      "uri": "aaa"
    }, 
    {
      "location": {
        "lat": 333, 
        "lon": 444
      }, 
      "ts": "2019-12-17", 
      "uri": "bbb"
    }
  ]
}

Question

How can I format the photos from the raw table into an array of structs/records, i.e. resulting in something like this?

id     |  photos.ts    | photos.uri  |  photos.location.lat  | photos.location.lon
-------|---------------|-------------|-----------------------|--------------------
1      |  2019-12-16   | aaa         |                   111 |                222
       |  2019-12-17   | bbb         |                   333 |                444
-------|---------------|-------------|-----------------------|--------------------
2      | ...           | ...         |                   ... |                ...

Thoughts

  1. JSON_EXTRACT(content, "$.photos") seems like a good start, since it returns the photos array as a JSON-formatted string; I would then need a JS UDF to convert the result into a BigQuery STRUCT/RECORD type. I'm not sure exactly how to do that, so any help is appreciated!
  2. I'm not sure if this "cleanup" into STRUCT/RECORD is really necessary or worth it. It seems that I can just format photos into an array of STRING:
id (INT64)  |  photos (STRING)
------------|--------------------
1           | {"location": {"lat": 111, "lon": 222}, "ts": "2019-12-16", "uri": "aaa"}
            | {"location": {"lat": 333, "lon": 444}, "ts": "2019-12-17", "uri": "bbb"}
------------|--------------------
2           | ....

and then use JSON_EXTRACT/JSON_EXTRACT_SCALAR in my analytical queries. How big a performance penalty should I expect?

Thanks!

Answer

The example below is for BigQuery Standard SQL:

#standardSQL
CREATE TEMP FUNCTION json2array(json STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
  return JSON.parse(json).map(x=>JSON.stringify(x));
"""; 
WITH `project.dataset.table` AS (
  SELECT 1 id, '{"photos": [{"location": {"lat": 111, "lon": 222}, "ts": "2019-12-16", "uri": "aaa"}, {"location": {"lat": 333, "lon": 444}, "ts": "2019-12-17", "uri": "bbb"}]}' content
)
SELECT id, json2array(JSON_EXTRACT(content, "$.photos")) AS photos
FROM `project.dataset.table`

with output

Row id  photos   
1   1   {"location":{"lat":111,"lon":222},"ts":"2019-12-16","uri":"aaa"}     
        {"location":{"lat":333,"lon":444},"ts":"2019-12-17","uri":"bbb"}     
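
One thing to watch when running this over a whole table: JSON_EXTRACT returns NULL when a row has no photos key, and, as far as I know, that NULL reaches the UDF as JavaScript null, where JSON.parse(null) yields null and the .map call throws. The sketch below is a standalone, null-safe variant of the UDF body. The name json2arraySafe and the choice to return an empty array for missing or non-array input are assumptions for illustration, not part of the original answer.

```javascript
// Hypothetical null-safe variant of the json2array UDF body, written as a
// plain function so it can be tried outside BigQuery.
function json2arraySafe(json) {
  // Assumption: a SQL NULL argument arrives as null (or undefined) here.
  if (json === null || json === undefined) return [];
  const parsed = JSON.parse(json);
  // Anything that is not a JSON array (object, scalar) yields an empty result.
  if (!Array.isArray(parsed)) return [];
  // Re-serialize each element so every entry is itself a JSON string.
  return parsed.map(x => JSON.stringify(x));
}

// Same shape as the string JSON_EXTRACT(content, "$.photos") produces.
const photos = '[{"location":{"lat":111,"lon":222},"ts":"2019-12-16","uri":"aaa"},' +
  '{"location":{"lat":333,"lon":444},"ts":"2019-12-17","uri":"bbb"}]';

console.log(json2arraySafe(photos)); // two JSON strings, one per photo
console.log(json2arraySafe(null));   // empty array instead of a TypeError
```

Inside BigQuery, only the function body would go between the triple quotes of CREATE TEMP FUNCTION; the RETURNS ARRAY<STRING> declaration stays the same.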

Or you can go further and build an array of structs:

#standardSQL
CREATE TEMP FUNCTION json2array(json STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
  return JSON.parse(json).map(x=>JSON.stringify(x));
"""; 
WITH `project.dataset.table` AS (
  SELECT 1 id, '{"photos": [{"location": {"lat": 111, "lon": 222}, "ts": "2019-12-16", "uri": "aaa"}, {"location": {"lat": 333, "lon": 444}, "ts": "2019-12-17", "uri": "bbb"}]}' content
)
SELECT id,
  ARRAY(
    SELECT AS STRUCT
      JSON_EXTRACT_SCALAR(photo, "$.ts") ts,
      JSON_EXTRACT_SCALAR(photo, "$.uri") uri,
      -- lat and lon come back as STRING here; wrap them in CAST(... AS FLOAT64) if numeric values are needed
      STRUCT(JSON_EXTRACT(photo, "$.location.lat") AS lat, JSON_EXTRACT(photo, "$.location.lon") AS lon) AS location
    FROM UNNEST(json2array(JSON_EXTRACT(content, "$.photos"))) photo
  ) AS photos
FROM `project.dataset.table`

which returns

Row id  photos.ts       photos.uri  photos.location.lat photos.location.lon  
1   1   2019-12-16      aaa         111                 222  
        2019-12-17      bbb         333                 444  
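
An alternative, closer to the JS UDF idea in the question, is to let the UDF build the records itself: BigQuery JavaScript UDFs can return a STRUCT as a plain object and an ARRAY as a JavaScript array. The sketch below shows only the JavaScript side as a standalone function. The name photosToStructs is hypothetical, and it assumes the function would be declared with a return type along the lines of ARRAY<STRUCT<ts STRING, uri STRING, location STRUCT<lat FLOAT64, lon FLOAT64>>>; treat it as a starting point rather than a drop-in replacement for the query above.

```javascript
// Hypothetical UDF body: takes the whole content string and returns one
// object per photo, with field names matching the assumed STRUCT declaration.
function photosToStructs(content) {
  // Assumption: a SQL NULL argument arrives as null (or undefined) here.
  if (content === null || content === undefined) return [];
  const doc = JSON.parse(content) || {};
  const photos = Array.isArray(doc.photos) ? doc.photos : [];
  return photos.map(p => {
    const loc = p.location || {};
    return {
      ts: p.ts,
      uri: p.uri,
      location: {
        // Missing coordinates become null rather than NaN.
        lat: loc.lat === undefined ? null : Number(loc.lat),
        lon: loc.lon === undefined ? null : Number(loc.lon),
      },
    };
  });
}

const content = '{"photos": [{"location": {"lat": 111, "lon": 222}, "ts": "2019-12-16", "uri": "aaa"}, ' +
  '{"location": {"lat": 333, "lon": 444}, "ts": "2019-12-17", "uri": "bbb"}]}';

console.log(JSON.stringify(photosToStructs(content), null, 2));
```

With this shape, lat and lon come out as numbers instead of strings, at the cost of hard-coding the record layout in both the RETURNS clause and the function body.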

2 Comments

exactly what I was asking for!
Can you include an example on how to run this query on the entire table?
