3

I am trying to insert data of json type with the copy method into a postgresql db. However I am getting the error mentioned below

invalid input syntax for type json

DETAIL:  Expected end of input, but found ""aco"".
CONTEXT:  JSON data, line 1: "{""aco"...
COPY flights2016jsn04, line 1, column flight: ""{""aco"": [""AXE"", ""AXE"", ""AXE"", ""AXE""], ""dist2Org"": 984753, ""flight_att"": {""ypos"": [8..." 

I am not sure what the issue and a bit comfused on what I am reading on the internet. Below is my code with some sample data.

import json
import io
import pandas as pd
import psycopg2


dict_ = {"dist2Org": 984753, "aco": ["AXE", "AXE", "AXE", "AXE"],
     "flight_att": {"xpos": [823.08988, 6540.32231, 999, 33321],
                    "ypos": [823.08988, 6540.32231, 999, 33321], "zpos": [823.08988, 6540.32231, 999.33321]}}

json_ = json.dumps(dict_)

col_json = ["id", "flight"]
df = pd.DataFrame([65654, json_]).T

df.to_csv('test_df')
output = io.StringIO()
# ignore the index
df.to_csv(output, sep='\t', header=False, index=False)
output.getvalue()
# jump to start of stream
output.seek(0)

conn = None
try:
    # connection string
    conn_string = '{0}{1} {2}'.format("host='localhost' dbname=", 'postgres', "user='postgres' password='xxxx'")
    # conn_string = "host='localhost' dbname='postgres' user='postgres' password='xxxx'"

    # connect to the PostgreSQL database
    conn = psycopg2.connect(conn_string)

    # create a new cursor
    cur = conn.cursor()

    # load data

    cur.copy_from(output, 'flights2016jsn04', null="", columns=(col_json))

    # # commit the changes to the database
    conn.commit()

    # close communication with the database
    cur.close()

except (Exception, psycopg2.DatabaseError) as error:
    print(error)

finally:
    if conn is not None:
        conn.close()
3
  • Try this: cur.copy_from(output, 'flights2016jsn04', null='', columns=(col_json)). JSON doesnt like "" but '' Commented Feb 10, 2019 at 18:06
  • @Martin, Unfortunately that also doesn't work Commented Feb 10, 2019 at 20:18
  • I know this error and it always happens when my JSON has extra quotes or double one. I cant replicate the code, so I will try to shoot my second idea: try to put your json without dumping df = pd.DataFrame([65654, dict_ ]).T Commented Feb 10, 2019 at 20:26

1 Answer 1

4

Postgres seems to only support quoting for csv input. The \t-separated text format expects no quotes. Pandas on the other hand uses " as the default quote char, which occurs a lot in JSON and needs to be escaped.

There are two options:

  • If you know that there will be no \ts in your data you could try df.to_csv(sep='\t', quoting=csv.QUOTE_NONE) instead.
  • Use psycopg2's copy_expert to copy csv instead.
Sign up to request clarification or add additional context in comments.

2 Comments

Yes! df.to_csv(output, sep='\t', quoting=csv.QUOTE_NONE) works. Thanks for the explanation!
Thanks for this! Searched ages to get down to this :)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.