
I'd like to find a simple way to:

  1. run a query (defined in a string variable) once
  2. use a loop to extract chunks of n rows (let's say 200) from that query; the loop should use the cached data from the first run instead of running a new query on each iteration

Here's the idea so far:

from google.cloud import bigquery
from google.oauth2 import service_account 
import pandas

"""
part where jsonPath, project_id and query_string are defined
"""


credentials = service_account.Credentials.from_service_account_file(jsonPath)
job_config = bigquery.QueryJobConfig()
job_config.use_query_cache = True
client = bigquery.Client(credentials=credentials, project=project_id)

query_job = client.query(query_string, job_config=job_config)


"""
run the query once, without importing data locally 
"""


"""
looping over query_job to get, for each iteration, 
a dataframe that can be appended to a csv file
stored locally.
"""

Could you kindly provide any tips?

Thanks in advance,

1 Answer


Unfortunately, you can't read the query cache from BigQuery through the API. There are also several cases where the cache isn't used at all; see the exceptions in the cached query results documentation, for example:

When a destination table is specified in the job configuration, the Cloud Console, the bq command-line tool, or the API

Another option is to save the query results in a BigQuery table (permanent or temporary). From that table, you can read the data through the API.
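
Here is a rough sketch of that approach, reusing credentials, project_id and query_string from your question. The table id (my_dataset.query_results) and the csv file name are placeholders, and it assumes a reasonably recent google-cloud-bigquery client that supports to_dataframe_iterable():

from google.cloud import bigquery

client = bigquery.Client(credentials=credentials, project=project_id)
table_id = f"{project_id}.my_dataset.query_results"  # placeholder destination table

# run the query once and write its results to the destination table
job_config = bigquery.QueryJobConfig(
    destination=table_id,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)
client.query(query_string, job_config=job_config).result()

# read the stored results back through the API in pages of 200 rows,
# appending each page to a local csv file
rows = client.list_rows(table_id, page_size=200)
for i, chunk in enumerate(rows.to_dataframe_iterable()):
    chunk.to_csv("results.csv", mode="a", index=False, header=(i == 0))

If you don't want to manage a table yourself, query_job.destination points at the temporary table BigQuery creates for the job once it has run, and client.list_rows() accepts it in the same way.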
