I have some datasets (27 semicolon-delimited CSV files, totaling 150+ GB) that get uploaded to my Cloud Storage bucket every week.

Currently, I load that data manually through the BigQuery console, declaring the variables and changing the filename 27 times. The first file replaces all of the previously loaded data, then the other 26 get appended to it. The filenames are always the same.

How can I automate this using Python?

  • Did you consider Workflow to achieve that? Commented Nov 17, 2021 at 17:20
  • No. I didn't even know it existed! (shame on me) Commented Nov 18, 2021 at 13:38

1 Answer

Take a look at Cloud Functions, which lets you run Python code. Once the function is deployed, you can create a cron job to trigger it on a schedule. Here is a related question: Run a python script on schedule on Google App Engine
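As a rough illustration of the shape such a function could take, here is a minimal sketch of an HTTP-triggered entry point that a scheduled job could call by URL. The names `weekly_load` and `load_weekly_files` are hypothetical, and the placeholder body stands in for whatever load logic you end up writing; newer Cloud Functions setups may also expect the entry point to be registered through the Functions Framework.

```python
def load_weekly_files() -> int:
    """Hypothetical placeholder for the real load logic.

    Returns the number of files processed so the caller can report it.
    """
    return 27  # assumption: the 27 weekly files from the question


def weekly_load(request):
    """HTTP entry point (hypothetical name).

    `request` is the incoming HTTP request object; it is unused here
    because the job takes no parameters.
    """
    loaded = load_weekly_files()
    # A (body, status) tuple is a valid response for a Python HTTP function.
    return (f"Loaded {loaded} files", 200)
```

A cron-style scheduler would then only need to send a request to the deployed function's URL once a week.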

Also, here is an article that describes how to load data from Cloud Storage: Loading CSV data from Cloud Storage
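For the load itself, a minimal sketch with the `google-cloud-bigquery` client library could look like the following. It is not a definitive implementation: the bucket name, table ID, and filenames are placeholders, and the header row (`skip_leading_rows=1`) and schema autodetection are assumptions about your files. The first file is loaded with `WRITE_TRUNCATE` (replacing the existing table contents) and the rest with `WRITE_APPEND`, one after another so the truncate finishes first.

```python
from typing import List, Tuple


def build_load_plan(bucket: str, filenames: List[str]) -> List[Tuple[str, str]]:
    """Pair each file's gs:// URI with the write disposition to use.

    The first file overwrites the table; every later file is appended.
    """
    plan = []
    for index, name in enumerate(filenames):
        disposition = "WRITE_TRUNCATE" if index == 0 else "WRITE_APPEND"
        plan.append((f"gs://{bucket}/{name}", disposition))
    return plan


def run_weekly_load(table_id: str, bucket: str, filenames: List[str]) -> None:
    """Run the load jobs sequentially, in the order given."""
    # Imported here so build_load_plan can be used without the client library.
    from google.cloud import bigquery  # pip install google-cloud-bigquery

    client = bigquery.Client()
    for uri, disposition in build_load_plan(bucket, filenames):
        job_config = bigquery.LoadJobConfig(
            source_format=bigquery.SourceFormat.CSV,
            field_delimiter=";",      # the files are semicolon-separated
            skip_leading_rows=1,      # assumption: each file has a header row
            autodetect=True,          # assumption: let BigQuery infer the schema
            write_disposition=disposition,
        )
        load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
        load_job.result()  # block until the job finishes; raises on failure


if __name__ == "__main__":
    # Placeholder names; replace with your own project, dataset, bucket and files.
    files = [f"file_{n:02d}.csv" for n in range(1, 28)]
    run_weekly_load("my-project.my_dataset.my_table", "my-bucket", files)
```

If the inferred schema differs between files, the append jobs may fail; in that case, replacing `autodetect=True` with an explicit `schema=[bigquery.SchemaField(...), ...]` is likely the safer choice.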
