I have a pandas data frame that looks like this:

[image: screenshot of the data frame]

It has 6 columns. I tried appending it to an existing BigQuery table that has the same schema, using this code:

import os
from google.cloud import bigquery

# Login credentials
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="secret.json"

# Initialize big query
client = bigquery.Client()

# Table information
project = "xxxxxxxx"
dataset = "Vahan"
table = "rto_data"
table_id = '{}.{}.{}'.format(project, dataset, table)

# Setup for upload
job_config = bigquery.LoadJobConfig()

# Define the table schema
schema = [bigquery.SchemaField(name='State', field_type='STRING', mode='NULLABLE'),
          bigquery.SchemaField(name='RTO', field_type='STRING', mode='NULLABLE'),
          bigquery.SchemaField(name='Registration_Number', field_type='STRING', mode='NULLABLE'),
          bigquery.SchemaField(name='Maker', field_type='STRING', mode='NULLABLE'),
          bigquery.SchemaField(name='Date', field_type='DATE', mode='NULLABLE'),
          bigquery.SchemaField(name='Registrations', field_type='INTEGER', mode='NULLABLE')]

job_config.create_disposition = "CREATE_IF_NEEDED"


# Make the API request
load_result = client.load_table_from_dataframe(dataframe=df,
                                               destination=table_id, 
                                               job_config=job_config)

# Wait for the load job to finish
load_result.result()

# Make an API request.
table = client.get_table(table_id)

# Output
print("Loaded {} rows and {} columns to {}".format(table.num_rows, len(table.schema), table_id))

and I'm getting this error: BadRequest: 400 Provided Schema does not match Table advanced-analytics-123456:Vahan.rto_data. Cannot add fields (field: __index_level_0__)

I loaded the data into a new table instead, and it looks like the load job is adding an unexpected extra column called __index_level_0__:

[image: screenshot of the new table showing the __index_level_0__ column]

How do I fix this so that I can append the data to my existing table? Your help would be greatly appreciated!

  • Just a note to others: CREATE_IF_NEEDED is the default value of job_config.create_disposition, so there is no need to specify it. Importantly, you do need job_config.write_disposition = 'WRITE_APPEND' to actually append to an existing table, which for some reason is not present in this question. Commented Dec 29, 2022 at 22:55

1 Answer

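Your dataframe probably has a non-default index (for example, one left over from filtering or grouping). When the frame is uploaded, an unnamed index like that is written out as an extra __index_level_0__ column, which the existing table does not have. Try dropping the index before loading: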

df.reset_index(drop=True, inplace=True)
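
To see why the reset helps, here is a minimal pandas-only sketch. The sample data is made up (only the column names come from the question), and the filtering step is just one assumed way the index could have become non-default. It shows that filtering leaves gaps in the row labels, and that reset_index(drop=True) restores a plain 0..n-1 index without adding a column:

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
df = pd.DataFrame({
    "State": ["Delhi", "Goa", "Kerala", "Punjab"],
    "Registrations": [10, 0, 7, 3],
})

# Filtering keeps the original row labels, so the index is no longer 0..n-1.
filtered = df[df["Registrations"] > 0]
print(list(filtered.index))                       # labels of the surviving rows
print(isinstance(filtered.index, pd.RangeIndex))  # no longer a default RangeIndex

# drop=True discards the old labels instead of turning them into a new column.
clean = filtered.reset_index(drop=True)
print(isinstance(clean.index, pd.RangeIndex))     # default index again
print(list(clean.columns))                        # same columns as before
```

If the check on the index reports a default RangeIndex for your own df and the error still appears, the extra column is probably coming from somewhere else, so it is worth printing df.columns and df.index before the load call.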