I'm having trouble with the index of a Pandas data frame. I'm trying to load data from a JSON file, create a Pandas data frame, select specific fields from that data frame, and send them to my database.
The following is a link to what's in the JSON file so you can see the fields actually exist: https://pastebin.com/Bzatkg4L
import pandas as pd
from pandas.io import sql
import MySQLdb
from sqlalchemy import create_engine
# Open and read the text file where all the Tweets are
with open('US_tweets.json') as f:
    tweets = f.readlines()
# Convert the list of Tweets into a structured dataframe
df = pd.DataFrame(tweets)
# Attributes needed should be here
df = df[['created_at', 'screen_name', 'id', 'country_code', 'full_name', 'lang', 'text']]
# To create connection and write table into MySQL
engine = create_engine("mysql+pymysql://{user}:{pw}@localhost/{db}"
                       .format(user="blah",
                               pw="blah",
                               db="blah"))
# The flavor argument is not needed with a SQLAlchemy engine (it was deprecated and later removed from pandas)
df.to_sql(con=engine, name='US_tweets_Table', if_exists='replace')
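One thing worth noting about the code above: f.readlines() returns raw strings, so pd.DataFrame(tweets) ends up with a single column named 0 and the column selection has nothing to match. Below is a minimal sketch of a parsing step, not a definitive fix. It assumes the file holds one JSON object per line and that screen_name sits under "user" while country_code and full_name sit under "place" (the usual Tweet layout, but an assumption about this particular file). The load_tweets helper, the sample lines, and the rename map are all hypothetical, and pd.json_normalize requires pandas 1.0 or later.

```python
import json

import pandas as pd


def load_tweets(lines):
    """Parse one JSON object per line and flatten nested fields (hypothetical helper)."""
    # Skip blank lines, then turn each remaining line into a dict
    records = [json.loads(line) for line in lines if line.strip()]
    # json_normalize flattens nested dicts into dotted column names,
    # e.g. {"user": {"screen_name": ...}} becomes "user.screen_name"
    return pd.json_normalize(records)


# Made-up sample data standing in for the lines of US_tweets.json
sample_lines = [
    '{"created_at": "Mon Jan 01 00:00:00 +0000 2018", "id": 1, "lang": "en", '
    '"text": "hello", "user": {"screen_name": "alice"}, '
    '"place": {"country_code": "US", "full_name": "Austin, TX"}}\n',
    '\n',
    '{"created_at": "Mon Jan 01 00:00:05 +0000 2018", "id": 2, "lang": "en", '
    '"text": "world", "user": {"screen_name": "bob"}, '
    '"place": {"country_code": "US", "full_name": "Boston, MA"}}\n',
]

df = load_tweets(sample_lines)

# Map flattened column names to the flat names wanted in the table
# (the nested paths are an assumption about the file's layout)
wanted = {
    'created_at': 'created_at',
    'user.screen_name': 'screen_name',
    'id': 'id',
    'place.country_code': 'country_code',
    'place.full_name': 'full_name',
    'lang': 'lang',
    'text': 'text',
}
df = df[list(wanted)].rename(columns=wanted)
```

With a real file, the sample list would be replaced by the lines read from US_tweets.json, and the resulting df could then be passed to to_sql as before.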
Thanks for your help!