1

My json data looks like this:

[
    {
        "fields": {
            "bkdate": null,
            "business_credit_card_total_balances": null,
            "business_credit_card_total_limits": null,
            "business_total_monthly_debt_payments": null,
            "business_total_mortgage_monthly_payments": null,
            "created_at": "2016-08-04T00:29:03.067Z",
            "detail_results": null,
            "error_reason": "no reasons",
            "fico_v2": "695",
            "fico_v3": null,
            "loanapp_id": 194,
            "personal_credit_card_total_balances": null,
            "personal_credit_card_total_limits": null,
            "personal_total_monthly_payments": null,
            "report_type": "CreditProfile",
            "result": true,
            "role": "applicant",
            "total_mortgage_monthly_payments": null,
            "username": "cho",
            "version": "CF Microloan",
            "xml_data": "<?xml version=\"1.0\" standalone=\"no\"?><NetConnectResponse xmlns=\"http://www.experian.com/NetConnectResponse\"><CompletionCode>0000</CompletionCode>"
        },
        "model": "common.prequalresult",
        "pk": 1
    }
]

I Want to convert it into SQL table. I tried this:

v = pd.DataFrame(data['fields'])
t = pd.io.json.json_normalize(data['fields'], ['model'], ['pk'], meta_prefix='parent_')

v.to_sql('fields', engine)
t.to_sql('fields', engine)

But it does not work. Can someone work it and output the SQL table?

My errors looks like this:

TypeError                                 Traceback (most recent call last)
<ipython-input-86-a186308b321b> in <module>()
      1 
----> 2 v = pd.DataFrame(data['fields'])
      3 t = pd.io.json.json_normalize(data['fields'], ['model'], ['pk'], meta_prefix='parent_')
      4 
      5 v.to_sql('fields', engine)

TypeError: list indices must be integers or slices, not str

I want to create two tables. One include 'fields', 'model' and 'pk'.And another table include all the values in 'fields'.

The schema looks like this: enter image description here

10
  • what error it is showing? Commented Feb 14, 2020 at 5:24
  • Please add more specific detail about your problem (for example what is not working? what is the error log ...) Commented Feb 14, 2020 at 5:24
  • I want to create two tables. One include 'fields', 'model' and 'pk'.And another table include all the values in 'fields'. The error is TypeError Traceback (most recent call last) <ipython-input-86-a186308b321b> in <module>() 1 ----> 2 v = pd.DataFrame(data['fields']) 3 t = pd.io.json.json_normalize(data['fields'], ['model'], ['pk'], meta_prefix='parent_') 4 5 v.to_sql('fields', engine) TypeError: list indices must be integers or slices, not str Commented Feb 14, 2020 at 5:31
  • Is data['fields'] string type? If it is, v = pd.DataFrame(data['fields']) will make error Commented Feb 14, 2020 at 5:58
  • question is not properly explained.... which are the fields you want to insert into table? Commented Feb 14, 2020 at 5:59

1 Answer 1

1

Considering your data which you want to store in Mysql DB. You can use a One-to-One relationship for storing in DB. Instead of using pandas DataFrame it can be handled properly using SQLAlchemy ORM. The following code gives a better understanding to solve this issue. Right now I have considered only few keys from fields key.

from sqlalchemy import Column, Integer, String, Text, DateTime, Float, Boolean, ForeignKey
from sqlalchemy.orm import relationship, sessionmaker
from sqlalchemy.ext.declarative import declarative_base

from sqlalchemy import create_engine

total_data = [
    {
        "fields": {
            "bkdate": None,
            "business_credit_card_total_balances": None,
            "created_at": "2016-08-04T00:29:03.067Z",
        },
        "model": "common.prequalresult",
        "pk": 1
    },
    {
        "fields": {
            "bkdate": "2016-08-04T00:29:03.067Z",
            "business_credit_card_total_balances": 23,
            "created_at": "2016-08-04T00:29:03.067Z",
        },
        "model": "common.prequalresult",
        "pk": 2
    },
    {
        "fields": {
            "bkdate": "asdfas",
            "business_credit_card_total_balances": 1111,
            "created_at": "2016-08-04T00:29:03.067Z",
        },
        "model": "common.prequalresult",
        "pk": 3
    }
]

engine = create_engine('mysql://user:password@localhost:5432/my_data', echo=False)

Base = declarative_base()


class Article(Base):
    __tablename__ = 'article'
    id = Column(Integer, primary_key=True, autoincrement=True)
    pk = Column(Integer, primary_key=False)
    model = Column(String(100), nullable=True)
    child = relationship('Comment', backref='article', uselist=False)


class Comment(Base):
    __tablename__ = 'comment'
    id = Column(Integer, primary_key=True, autoincrement=True)
    bkdate = Column(String(100), nullable=True)
    business_credit_card_total_balances = Column(Integer, nullable=True)
    created_at = Column(String(100), nullable=True)
    article_id = Column(Integer, ForeignKey('article.id'))


x = Base.metadata.create_all(engine)

Session = sessionmaker(bind=engine)
session = Session()

for temp_data in total_data:
    pk = temp_data['pk']
    model = temp_data['model']
    bkdate = temp_data['fields']['bkdate']
    business_credit_card_total_balances = temp_data['fields']['business_credit_card_total_balances']
    created_at = temp_data['fields']['created_at']
    parent1 = Article(pk=temp_data['pk'], model=model)
    child = Comment(bkdate=bkdate,
                    business_credit_card_total_balances=business_credit_card_total_balances,
                    created_at=created_at,
                    article=parent1)

    session.add(parent1)
    session.add(child)

session.commit()

As field pk given in each dict can be duplicated, so I have created id as a primary key.

Output:

mysql> select * from article;
+----+------+----------------------+
| id | pk   | model                |
+----+------+----------------------+
|  1 |    1 | common.prequalresult |
|  2 |    2 | common.prequalresult |
|  3 |    3 | common.prequalresult |
+----+------+----------------------+

mysql> select * from comment;
+----+--------------------------+-------------------------------------+--------------------------+------------+
| id | bkdate                   | business_credit_card_total_balances | created_at               | article_id |
+----+--------------------------+-------------------------------------+--------------------------+------------+
|  1 | NULL                     |                                NULL | 2016-08-04T00:29:03.067Z |          1 |
|  2 | 2016-08-04T00:29:03.067Z |                                  23 | 2016-08-04T00:29:03.067Z |          2 |
|  3 | asdfas                   |                                1111 | 2016-08-04T00:29:03.067Z |          3 |
+----+--------------------------+-------------------------------------+--------------------------+------------+
Sign up to request clarification or add additional context in comments.

2 Comments

I run the code. But it shows the erorr: ModuleNotFoundError: No module named 'MySQLdb'. I tried:pip install pymysql and I am using Google Colab.
Use pip install pymysql. Also install sqlalchemy in python

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.