
We have production databases (PostgreSQL and MySQL) on Cloud SQL.

How could I export the data from the production databases and then append it to BigQuery datasets?

I DO NOT want to sync or replicate the data into BigQuery, because we purge the production databases (after backing them up) on a regular basis.

The only method I could think of is:

  1. Export to CSV and then drop into Google Cloud Storage
  2. Python script to append the CSV files into BigQuery.
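
For reference, step 2 doesn't strictly need Python: BigQuery's `LOAD DATA` statement can append GCS files to a table directly from SQL. A minimal sketch (dataset, table, and bucket names below are placeholders):

```sql
-- LOAD DATA INTO appends to the target table
-- (LOAD DATA OVERWRITE would replace its contents instead).
LOAD DATA INTO my_dataset.my_table
FROM FILES (
  format = 'CSV',
  skip_leading_rows = 1,  -- skip the header row in each export
  uris = ['gs://my-bucket/exports/*.csv']
);
```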

Are there any other more optimal ways?

1 Answer

BigQuery supports external data sources, specifically federated queries which allow you to read data directly from a Cloud SQL instance.

You can use this feature to select from all the relevant tables in your Postgres/MySQL instances and copy them into BigQuery without any extra ETL process. You can append the data to your existing tables, create a new table every time, or use some other organization that works for you.

BigQuery also supports scheduled queries so you can automate this.

The actual SQL will depend on your data sources, but a federated query uses the `EXTERNAL_QUERY` function and isn't much more than (connection and table names below are placeholders):

INSERT INTO `your_bq_dataset.your_bq_table`
SELECT *
FROM EXTERNAL_QUERY(
  'your-project.us.your_cloudsql_connection',
  'SELECT * FROM tablename;'
);