I'm using Python 2.7 and the Google Cloud Client Library for Python (v0.27.0) to insert data into a BigQuery table (using table.insert_data()).
One of the fields in my table has type 'DATE'.
In my Python script I've formatted the date-data as 'YYYY-MM-DD', but unfortunately the Google Cloud library returns an 'Invalid date:' error for that field.
I've tried formatting the date-field in many ways (i.e. 'YYYYMMDD', timestamp etc.), but no luck so far...
Unfortunately the API docs (https://googlecloudplatform.github.io/google-cloud-python/latest/) don't mention anything about the required date format/type/object in Python.
This is my code:
from google.cloud import bigquery
import pandas as pd
import json
from pprint import pprint
from collections import OrderedDict
# Using a pandas dataframe 'df' as input
# Converting date field to YYYY-MM-DD format
df['DATE_VALUE_LOCAL'] = df['DATE_VALUE_LOCAL'].apply(lambda x: x.strftime('%Y-%m-%d'))
# Converting pandas dataframe to json
json_data = df.to_json(orient='records',date_format='iso')
# Instantiates a client
bigquery_client = bigquery.Client(project="xxx")
# The name for the new dataset
dataset_name = 'dataset_name'
table_name = 'table_name'
def stream_data(dataset_name, table_name, json_data):
dataset = bigquery_client.dataset(dataset_name)
table = dataset.table(table_name)
data = json.loads(json_data, object_pairs_hook=OrderedDict)
# Reload the table to get the schema.
table.reload()
errors = table.insert_data(data)
if not errors:
print('Loaded 1 row into {}:{}'.format(dataset_name, table_name))
else:
print('Errors:')
pprint(errors)
stream_data(dataset_name, table_name, json_data)
What is the required Python date format/type/object to insert my dates into a BigQuery DATE field?
insert_datano longer exists