I'd like to find a simple way to:
- run a query (defined in a string variable) once
- loop over the result in chunks of n rows (let's say 200), where each iteration reuses the cached data from that single query instead of running a new query
Here's the idea so far:
from google.cloud import bigquery
from google.oauth2 import service_account
import pandas
"""
part where jsonPath, project_id and query_string are defined
"""
credentials = service_account.Credentials.from_service_account_file(jsonPath)
# allow BigQuery to serve results from its query cache when possible
job_config = bigquery.QueryJobConfig()
job_config.use_query_cache = True
client = bigquery.Client(credentials=credentials, project=project_id)
query_job = client.query(query_string, job_config=job_config)
"""
run the query once, without importing data locally
"""
"""
looping over query_job to get, for each iteration,
a dataframe that can be appended to a csv file
stored locally.
"""
Could you kindly provide any tips?
Thanks in advance,