
I was creating a test SQL database with the following code:

import sqlite3

connection = sqlite3.connect('tv.sqlite')

cursor = connection.cursor()

# Create the tvshows table
sql_query = """ CREATE TABLE tvshows (
    id integer PRIMARY KEY,
    producer text NOT NULL,
    language text NOT NULL,
    title text NOT NULL
)"""

#Execute the SQL query
cursor.execute(sql_query)

This way, I get to define the table columns myself.

However, I want to import a huge database with 50k+ entries from a CSV that I have. That CSV doesn't have headers (but even if it did, I'd still have the same question).

I found this code online:

import pandas
import sqlite3

connection = sqlite3.connect('tv.sqlite')

cursor = connection.cursor()

# Read the CSV and write it to the database as a new table
df = pandas.read_csv('tvshows_db.csv')
df.to_sql('tv', connection)

Here's what I get: [screenshot of the resulting table, where the first TV show entry from the CSV has become the column headers]

How did it create the table without me setting the column names (it uses the first TV show entry as the header values) or their properties (type, char limit, etc.)? And how can I do that with pandas.to_sql?

  • Can you show the header of your csv (df.head()) and add it to your question? Commented May 25, 2022 at 8:28
  • I added an image of the database I get. Commented May 25, 2022 at 8:30
  • @faindirnomainzein what db client are you using to browse your sqlite db? Commented Aug 17, 2023 at 23:28

1 Answer


pandas assumes the first row of the CSV contains the headers and creates the table columns from it. You can set the dataframe's column names yourself before you push it to the db:

# header=None because the CSV has no header row
df = pandas.read_csv('tvshows_db.csv', header=None)
df.columns = ['col1', 'col2']

For the column types: if you create the table first, as in your first code chunk, you can set the if_exists parameter to 'append' in to_sql() and it will respect the existing column types when writing to the db:

df.to_sql('tv', connection, if_exists='append')
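
If you'd rather set the column types from pandas instead of pre-creating the table, to_sql also accepts a dtype mapping (with a plain sqlite3 connection the values are SQL type strings). A minimal sketch, assuming for illustration that the CSV has no header row and three columns named producer, language and title:

import pandas
import sqlite3

connection = sqlite3.connect('tv.sqlite')

# header=None stops pandas from treating the first data row as column names;
# names= sets the column (and therefore table column) names directly
df = pandas.read_csv('tvshows_db.csv', header=None,
                     names=['producer', 'language', 'title'])

# dtype maps column names to SQLite types; index=False skips the DataFrame index
df.to_sql('tv', connection, if_exists='replace', index=False,
          dtype={'producer': 'TEXT', 'language': 'TEXT', 'title': 'TEXT'})

connection.close()

dtype only covers the column types, though; for constraints such as NOT NULL or PRIMARY KEY, pre-creating the table and appending, as above, is the more reliable route.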

4 Comments

So if I understand correctly, I merge the first and the second code: I execute the query creating the table columns, then I use your code. However, it gives me an error: "has no column named index".
Execute the first code, then in your second code, after reading the csv, add df.columns = ['col1', 'col2'] to set the column names on your dataframe, and write to the db with df.to_sql('tv', connection, if_exists='append').
It works, thanks. I had to remove the PRIMARY KEY for it to work.
No problem. Keep in mind that if you run the second code chunk again, it will add the same data to your table again.
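
For reference, a minimal end-to-end sketch of the merged approach discussed in these comments, assuming the CSV has three columns in the order producer, language, title (hypothetical names chosen to match the CREATE TABLE from the question). Passing index=False tells to_sql not to write the DataFrame index as an extra column, which is typically what triggers the "has no column named index" error, so the PRIMARY KEY can stay:

import pandas
import sqlite3

connection = sqlite3.connect('tv.sqlite')
cursor = connection.cursor()

# Create the target table with explicit column types and constraints
cursor.execute("""CREATE TABLE IF NOT EXISTS tvshows (
    id integer PRIMARY KEY,
    producer text NOT NULL,
    language text NOT NULL,
    title text NOT NULL
)""")

# The CSV has no header row, so supply column names that match the table
df = pandas.read_csv('tvshows_db.csv', header=None,
                     names=['producer', 'language', 'title'])

# Append into the pre-created table; id is filled in automatically because
# an INTEGER PRIMARY KEY column acts as the rowid in SQLite
df.to_sql('tvshows', connection, if_exists='append', index=False)

connection.commit()
connection.close()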
