I wonder whether someone could help me please.

I'm trying to put together a BigQuery script in Standard SQL that finds the values of a given field and then replaces a specific value.

This is the script that I've put together:

SELECT 
REGEXP_REPLACE(hits.eventInfo.eventLabel, r'.*\,','Apples') as eventLabel
FROM `bigquery.Test.ga_sessions_20181221`,
# hits is an ARRAY (REPEATED mode) in Standard SQL. UNNEST takes an ARRAY and returns a table with one row per element.
UNNEST(hits) hits
WHERE REGEXP_CONTAINS(page.pagePath, r'^/dashboard/.*\properties|^/dashboard/inbox') and REGEXP_CONTAINS(EventInfo.eventLabel, r'.*\,')

The problem I have is that I'm able to create a new column called 'eventLabel', but I can't figure out a way to overwrite the existing 'hits.eventInfo.eventLabel' column.

Could someone perhaps have a look at this please and offer some guidance on where I've gone wrong?

Many thanks and kind regards

Chris

  • How do you want the value of eventLabel to be changed? Please explain the substitution rule. Commented Jan 4, 2019 at 11:44
  • Have you tried selecting "as hits.eventInfo.eventLabel" instead of "as eventLabel"? Commented Jan 4, 2019 at 11:56
  • Do you want to re-create the original table but with different values in hits.eventInfo.eventLabel for some of the hits? Commented Jan 4, 2019 at 12:05
  • Hi all, my apologies for not making this clearer. I'd like to amend the data in the existing column, i.e. where appropriate, the value in the 'hits.eventInfo.eventLabel' column should say Apples. I have tried using hits.eventInfo.eventLabel but I receive a syntax error. Many thanks, Chris Commented Jan 4, 2019 at 13:00

3 Answers

I can't figure out a way to overwrite the existing 'hits.eventInfo.eventLabel' column ...

Below is an example for BigQuery Standard SQL:

#standardSQL
SELECT visitId, visitNumber, 
  ARRAY(
    SELECT y FROM (
      SELECT * REPLACE(
        IF(eventInfo IS NULL, 
          NULL, 
          STRUCT<eventCategory STRING, eventAction STRING, eventLabel STRING, eventValue INT64>
          (
            eventInfo.eventCategory, 
            eventInfo.eventAction, 
            IF(REGEXP_CONTAINS(page.pagePath, r'your regex here'), 
              REGEXP_REPLACE(eventInfo.eventLabel, r'your regex here','Apples'),
              eventInfo.eventLabel
            ), 
            eventInfo.eventValue
          )
        ) AS eventInfo) 
      FROM t.hits x
    ) y) hits
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20170801` t  

6 Comments

Hi @Mikhail Berlyant, thank you for this, but unfortunately the data isn't changed. Many thanks and kind regards
What do you mean? Of course the original data will not change if you just run a SELECT. You need to set a destination table to save the "fixed" data into that new table. Or you can set the same table as the destination and set Overwrite Table as the write preference.
Hi. I've been back to the script and set this to overwrite the existing table. But unfortunately the 'hits_eventInfo_eventLabel' column is not overwritten. Many thanks
I tested the script in my answer, and it DOES work! So you should provide more details about your case and data. I suspect you are using the wrong regexp, but without an example of your data it is not possible to help you further.
Hi Mikhail, my apologies. I'm not sure whether I'd done my initial testing containing a typo, but having re-written it, it is now working. Thank you for all your help and patience. Kind Regards. Chris

I think you're looking for an UPDATE statement. See the DML syntax documentation, especially the part "UPDATE repeated records" in the examples section.

In this query I'm modifying the given hits array by sub-querying it and building my own new array from it using SELECT AS STRUCT and feeding the output into ARRAY().

If all your regexes are correct, this should work as expected.

UPDATE `project.dataset.ga_sessions_20190107`
SET hits =
  ARRAY(SELECT AS STRUCT 
       * REPLACE (
  -- correcting eventInfo here
  IF(REGEXP_CONTAINS(page.pagePath, r'^/dashboard/.*/properties|^/dashboard/inbox') and REGEXP_CONTAINS(EventInfo.eventLabel, r'.*\,')
    ,STRUCT(
      eventInfo.eventCategory,
      eventInfo.eventAction,
      REGEXP_REPLACE(eventInfo.eventLabel, r'.*\,','Apples') AS eventLabel,
      eventInfo.eventValue
    )
    ,eventInfo) AS eventInfo)
    FROM UNNEST(hits)
  ) 
WHERE ( -- only relevant sessions
  SELECT COUNT(1)>0 
  FROM UNNEST(hits) 
  WHERE REGEXP_CONTAINS(page.pagePath, r'^/dashboard/.*/properties|^/dashboard/inbox') 
    AND REGEXP_CONTAINS(EventInfo.eventLabel, r'.*\,')
    )

This is untested. Please test first.
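
The array-rebuilding pattern described above (sub-query the array, emit one struct per element with SELECT AS STRUCT, feed the result into ARRAY()) can be tried on inline data before touching a real table. This is only a minimal sketch: the `sessions` CTE and its `hitNumber` and `label` fields are made up for illustration and are not part of the GA export schema.

```sql
#standardSQL
-- Hypothetical inline data: one session with two hits.
WITH sessions AS (
  SELECT
    1 AS visitId,
    [STRUCT(1 AS hitNumber, 'Pears, Plums' AS label),
     STRUCT(2 AS hitNumber, 'Cherries' AS label)] AS hits
)
SELECT
  visitId,
  -- Rebuild the array: one output struct per input element,
  -- with only the label field replaced.
  ARRAY(
    SELECT AS STRUCT
      * REPLACE (REGEXP_REPLACE(label, r'.*,', 'Apples') AS label)
    FROM UNNEST(hits)
    ORDER BY hitNumber  -- keep the elements in a defined order
  ) AS hits
FROM sessions;
```

Here the first label should come back as 'Apples Plums' (everything up to and including the comma is replaced), while 'Cherries' has no comma and stays as it is.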

3 Comments

Hi Martin, thank you for this. I've run the script but unfortunately I'm receiving the following error: Error: No matching signature for function IF for argument types: BOOL, STRUCT<eventCategory STRING, eventAction STRING, eventLabel STRING, ...>, STRUCT<eventCategory STRING, eventAction STRING, eventValue INT64, ...>. Supported signature: IF(BOOL, ANY, ANY) at [6:3] Many thanks and kind regards
Can you try on a fresh test table? Seems like your test table is confused with the eventInfo schema. It should be STRUCT<eventCategory STRING, eventAction STRING, eventLabel STRING, eventValue INT64> - but the last two are swapped for some reason ...
Hi Martin. Thank you for coming back to me with this. I have used a fresh table but I'm still encountering the same issue. Kind Regards Chris

Hope this is not too late. I struggled with the same struct error in the IF that you hit, and then figured out this simpler update:

UPDATE `xxxxxx.test_ga_sessions_20190728`
SET hits =
  ARRAY(
    SELECT AS STRUCT * REPLACE(
      (SELECT AS STRUCT eventInfo.* REPLACE(REGEXP_REPLACE(eventInfo.eventLabel, r'TESTING', 'Mandarins') AS eventLabel)) AS eventInfo)
    FROM UNNEST(hits)
  )
WHERE fullVisitorId = '3030555601660252942';
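
If the replacement should only apply on certain pages, as in the original question, the condition can probably go inside the nested REPLACE. The IF then compares two STRING values rather than two STRUCTs, which should sidestep the "No matching signature for function IF" error. The sketch below runs on hypothetical inline data (the `sessions` CTE is an assumption, with a cut-down `page` and `eventInfo`), so treat it as a starting point rather than a tested fix for a real export table.

```sql
#standardSQL
-- Hypothetical inline data shaped loosely like the GA export.
WITH sessions AS (
  SELECT
    1 AS visitId,
    [
      STRUCT(
        STRUCT('/dashboard/inbox' AS pagePath) AS page,
        STRUCT('cat' AS eventCategory, 'click' AS eventAction,
               'Pears, Plums' AS eventLabel, 1 AS eventValue) AS eventInfo
      ),
      STRUCT(
        STRUCT('/home' AS pagePath) AS page,
        STRUCT('cat' AS eventCategory, 'click' AS eventAction,
               'Pears, Plums' AS eventLabel, 2 AS eventValue) AS eventInfo
      )
    ] AS hits
)
SELECT
  visitId,
  ARRAY(
    SELECT AS STRUCT
      * REPLACE (
        -- Rebuild eventInfo field by field; only eventLabel changes.
        (SELECT AS STRUCT eventInfo.* REPLACE (
           -- Both IF branches are STRING, so the types always match.
           IF(REGEXP_CONTAINS(page.pagePath, r'^/dashboard/inbox'),
              REGEXP_REPLACE(eventInfo.eventLabel, r'.*,', 'Apples'),
              eventInfo.eventLabel) AS eventLabel)
        ) AS eventInfo)
    FROM UNNEST(hits)
  ) AS hits
FROM sessions;
```

Only the hit on /dashboard/inbox should have its label changed. One thing to check on real data: as far as I can tell, a hit whose eventInfo is NULL would come back with an eventInfo struct of NULL fields instead of a NULL struct, so guard for that if it matters.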
