
I was creating a test SQL database with the following code:

import sqlite3

connection = sqlite3.connect('tv.sqlite')

cursor = connection.cursor()

# Create the tvshows table
sql_query = """ CREATE TABLE tvshows (
    id integer PRIMARY KEY,
    producer text NOT NULL,
    language text NOT NULL,
    title text NOT NULL
)"""

#Execute the SQL query
cursor.execute(sql_query)

This way, I get to define the table columns myself.

However, I want to import a huge database with 50k+ entries from a CSV that I have. That CSV doesn't have headers (but even if it did, I'd still have the same question).

I found this code online:

import pandas
import sqlite3

connection = sqlite3.connect('tv.sqlite')

cursor = connection.cursor()

# Read the CSV and write it to the database as a new table
df = pandas.read_csv('tvshows_db.csv')
df.to_sql('tv', connection)

Here's what I get: [screenshot of the resulting table, where the first TV show entry from the CSV has become the column headers]

How did it create the table without me setting the column names (it uses the first TV show entry as the header values) or their properties (type, char limit, etc.)? And how can I do that with pandas.to_sql?

  • Can you show the header of your csv (df.head()) and add it to your question? Commented May 25, 2022 at 8:28
  • I added an image of the database I get. Commented May 25, 2022 at 8:30
  • @faindirnomainzein what db client are you using to browse your sqlite db? Commented Aug 17, 2023 at 23:28

1 Answer


pandas assumes the first row of the CSV contains the headers and creates the table columns from it. You can set the dataframe's column names yourself before you push it to the db:

# header=None because the CSV has no header row
df = pandas.read_csv('tvshows_db.csv', header=None)
df.columns = ['col1', 'col2']

For the column types: if you create the table first, as in your first code chunk, you can set the if_exists parameter to 'append' in to_sql() and it will respect the existing column types when writing to the db:

df.to_sql('tv', connection, if_exists='append')
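
If you'd rather set the column types from pandas instead of pre-creating the table, to_sql also accepts a dtype mapping (with a plain sqlite3 connection the values are SQL type strings). A minimal sketch, assuming for illustration that the CSV has no header row and three columns named producer, language and title:

import pandas
import sqlite3

connection = sqlite3.connect('tv.sqlite')

# header=None stops pandas from treating the first data row as column names;
# names= sets the column (and therefore table column) names directly
df = pandas.read_csv('tvshows_db.csv', header=None,
                     names=['producer', 'language', 'title'])

# dtype maps column names to SQLite types; index=False skips the DataFrame index
df.to_sql('tv', connection, if_exists='replace', index=False,
          dtype={'producer': 'TEXT', 'language': 'TEXT', 'title': 'TEXT'})

connection.close()

dtype only covers the column types, though; for constraints such as NOT NULL or PRIMARY KEY, pre-creating the table and appending, as above, is the more reliable route.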

4 Comments

So if I understand correctly, I merge the first and the second code: I execute the query creating the table columns, then I use your code. However, it gives me an error: "has no column named index".
Execute the first code, then in your second code, after reading the csv, add df.columns = ['col1', 'col2'] to set the column names on your dataframe, and write to the db with df.to_sql('tv', connection, if_exists='append').
It works, thanks. I had to remove the PRIMARY KEY for it to work.
No problem. Keep in mind that if you run the second code chunk again, it will add the same data to your table again.
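
For reference, a minimal end-to-end sketch of the merged approach discussed in these comments, assuming the CSV has three columns in the order producer, language, title (hypothetical names chosen to match the CREATE TABLE from the question). Passing index=False tells to_sql not to write the DataFrame index as an extra column, which is typically what triggers the "has no column named index" error, so the PRIMARY KEY can stay:

import pandas
import sqlite3

connection = sqlite3.connect('tv.sqlite')
cursor = connection.cursor()

# Create the target table with explicit column types and constraints
cursor.execute("""CREATE TABLE IF NOT EXISTS tvshows (
    id integer PRIMARY KEY,
    producer text NOT NULL,
    language text NOT NULL,
    title text NOT NULL
)""")

# The CSV has no header row, so supply column names that match the table
df = pandas.read_csv('tvshows_db.csv', header=None,
                     names=['producer', 'language', 'title'])

# Append into the pre-created table; id is filled in automatically because
# an INTEGER PRIMARY KEY column acts as the rowid in SQLite
df.to_sql('tvshows', connection, if_exists='append', index=False)

connection.commit()
connection.close()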
