sungchun12/schedule-python-script-using-Google-Cloud

schedule-python-script-using-Google-Cloud

Use Case: Automates the ingestion of live Chicago traffic data into BigQuery for interactive, near-real-time analysis.

Technical Concept: Schedules a Python script to append data into BigQuery using Google Cloud's App Engine with a cron job.
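As a rough illustration of the concept, the sketch below shows a minimal WSGI endpoint that an App Engine cron job could ping. It is not the code in this repository: the `/run` path and the `run_job` callable (standing in for whatever appends rows to BigQuery) are hypothetical names chosen for the example. The check relies on the `X-Appengine-Cron: true` header that App Engine sets on cron requests.

```python
def make_app(run_job):
    """Build a WSGI app around `run_job`.

    `run_job` is a hypothetical callable that performs the BigQuery append
    and returns the number of rows it wrote.
    """
    def app(environ, start_response):
        if environ.get("PATH_INFO") != "/run":  # "/run" is an assumed path
            status, body = "404 Not Found", b"not found"
        elif environ.get("HTTP_X_APPENGINE_CRON") != "true":
            # Reject callers that are not the App Engine cron service.
            status, body = "403 Forbidden", b"cron only"
        else:
            rows = run_job()
            status = "200 OK"
            body = ("appended %d rows" % rows).encode("utf-8")
        headers = [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ]
        start_response(status, headers)
        return [body]
    return app
```

Passing the job in as an argument keeps the endpoint testable without Google Cloud credentials; in a real deployment `run_job` would wrap the actual data load.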

Reference: http://zablo.net/blog/post/python-apache-beam-google-dataflow-cron

Shout out to Mylin Ackerman for all his help. His personal touch saved me weeks of research. https://www.linkedin.com/in/mylin-ackermann-25a00445/

Order of Operations:

  1. Develop the scripts in Google Cloud Shell or with the Cloud SDK
  2. Deploy the app on App Engine
  3. Deploy the cron job
  4. Check that the data is arriving in BigQuery
  5. Connect a data visualization tool such as Tableau

Setup Prerequisites:

Development Instructions:

  1. Clone the GitHub repository into Google Cloud Shell or a local SDK environment. Cloud Shell has persistent storage, so you don't have to recopy the folder structure.

Deploy Instructions:

  1. Install all required packages into the local lib folder: pip install -r requirements.txt -t lib
  2. To deploy the App Engine app, run: gcloud app deploy app.yaml
  3. To deploy the App Engine cron job, run: gcloud app deploy cron.yaml
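If you prefer to run the three steps above from one place, they can be wrapped in a small Python helper. This is only a convenience sketch, not part of the repository; the `deploy` function and `DEPLOY_STEPS` list are names invented for the example, and the commands themselves are copied from the steps above.

```python
import subprocess

# The three commands from the deploy steps, in the order they should run.
DEPLOY_STEPS = [
    ["pip", "install", "-r", "requirements.txt", "-t", "lib"],
    ["gcloud", "app", "deploy", "app.yaml"],
    ["gcloud", "app", "deploy", "cron.yaml"],
]


def deploy(run=subprocess.check_call):
    """Run each deploy step in order, stopping at the first failure.

    `run` defaults to subprocess.check_call, which raises
    CalledProcessError if a command exits with a non-zero status.
    """
    for argv in DEPLOY_STEPS:
        print("Running: " + " ".join(argv))
        run(argv)
```

Call `deploy()` from the repository root so that requirements.txt, app.yaml and cron.yaml are found. `gcloud app deploy` may still ask for confirmation in the terminal.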

Document Context:

  1. app.yaml contains the definition of the App Engine app, which spawns the Dataflow pipeline
  2. cron.yaml contains the definition of the App Engine cron job, which pings one of the app's endpoints in order to spawn the Dataflow pipeline
  3. appengine_config.py makes the locally installed packages (from the lib folder) importable by the app
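For readers unfamiliar with the lib folder pattern, here is a sketch of what an appengine_config.py typically does. It is an assumption about the general approach, not a copy of this repository's file. On the first-generation App Engine standard runtime, `google.appengine.ext.vendor` registers the folder; the fallback to `site.addsitedir` is added here only so the sketch also runs locally, and `add_vendored_packages` is a hypothetical helper name.

```python
import os
import site


def add_vendored_packages(lib_dir="lib"):
    """Make packages installed with `pip install -t lib` importable."""
    lib_path = os.path.abspath(lib_dir)
    try:
        # Only importable inside the App Engine SDK / standard runtime.
        from google.appengine.ext import vendor
    except ImportError:
        # Local fallback (an addition for this sketch): append the folder
        # to sys.path using the standard library.
        if os.path.isdir(lib_path):
            site.addsitedir(lib_path)
        return lib_path
    vendor.add(lib_path)
    return lib_path
```

Calling `add_vendored_packages()` at import time, before any third-party import, is what lets the deployed app find the packages that were installed into lib.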
