
I'd like to find a simple way to:

  1. run a query (defined in a string variable) once
  2. use a loop to extract chunks of n rows (let's say 200) from that query; the loop should use the cached data from the first run instead of running a new query on each iteration

Here's the idea so far:

from google.cloud import bigquery
from google.oauth2 import service_account 
import pandas

"""
part where jsonPath, project_id and query_string are defined
"""


credentials = service_account.Credentials.from_service_account_file(jsonPath)
job_config = bigquery.QueryJobConfig()
job_config.use_query_cache = True
client = bigquery.Client(credentials=credentials, project=project_id)

query_job = client.query(query_string, job_config=job_config)


"""
run the query once, without importing data locally 
"""


"""
looping over query_job to get, for each iteration, 
a dataframe that can be appended to a csv file
stored locally.
"""

Could you kindly provide any tips?

Thanks in advance,

1 Answer


Unfortunately, you can't read the query cache from BigQuery through the API. There are also several cases where the cache isn't used at all; see the exceptions in the cached query results documentation, for example:

When a destination table is specified in the job configuration, the Cloud Console, the bq command-line tool, or the API

Another option is to save the query results in a BigQuery table (permanent or temporary). From that table, you can read the data through the API.
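
Here is a rough sketch of that approach, reusing credentials, project_id and query_string from your question. The table id (my_dataset.query_results) and the csv file name are placeholders, and it assumes a reasonably recent google-cloud-bigquery client that supports to_dataframe_iterable():

from google.cloud import bigquery

client = bigquery.Client(credentials=credentials, project=project_id)
table_id = f"{project_id}.my_dataset.query_results"  # placeholder destination table

# run the query once and write its results to the destination table
job_config = bigquery.QueryJobConfig(
    destination=table_id,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)
client.query(query_string, job_config=job_config).result()

# read the stored results back through the API in pages of 200 rows,
# appending each page to a local csv file
rows = client.list_rows(table_id, page_size=200)
for i, chunk in enumerate(rows.to_dataframe_iterable()):
    chunk.to_csv("results.csv", mode="a", index=False, header=(i == 0))

If you don't want to manage a table yourself, query_job.destination points at the temporary table BigQuery creates for the job once it has run, and client.list_rows() accepts it in the same way.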
