
I have an XML file, which has been broken down into smaller tables. I can load them into Navicat fine, except for one table. Here's the XML structure:

<Food> 
  <Id> 100 </Id>
  <Type> Meat </Type>
  <Expiry Date>
    <Chicken>
      2020/12/20
    </Chicken>
    <Beef>
      2020/12/25
    </Beef>
  </Expiry Date>
</Food>

<Food>
  <Id> 200 </Id>
  <Type> Vegetables </Type>
  <Nutrition> B1 </Nutrition>
</Food>

I have turned it into JSON, using xmltodict in Python:

[{
"Id": "100",
"Type": "Meat",
"Expiry Date": {
  "Chicken": "2020/12/20",
  "Beef": "2020/12/25"
  }
},

{
"Id": "200",
"Type": "Vegetables",
"Nutrition": "B1"
}]
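For reference, a conversion like the one above can be sketched with only the standard library (xmltodict was used originally, but it is a third-party package). This sketch assumes the invalid `<Expiry Date>` tags have been renamed to `<ExpiryDate>`, since a tag name cannot contain a space, and wraps the records in a single root element so they parse as one document:

```python
import json
import xml.etree.ElementTree as ET

# Sample data, wrapped in a root element so it parses as one document.
# Note: "Expiry Date" is not a valid tag name, so it appears here as
# "ExpiryDate" (a rename the original file would also need).
xml_text = """<Root>
<Food>
  <Id>100</Id>
  <Type>Meat</Type>
  <ExpiryDate>
    <Chicken>2020/12/20</Chicken>
    <Beef>2020/12/25</Beef>
  </ExpiryDate>
</Food>
<Food>
  <Id>200</Id>
  <Type>Vegetables</Type>
  <Nutrition>B1</Nutrition>
</Food>
</Root>"""

def element_to_dict(elem):
    """Leaves become their text; elements with children recurse into dicts."""
    children = list(elem)
    if not children:
        return elem.text.strip() if elem.text else None
    return {child.tag: element_to_dict(child) for child in children}

root = ET.fromstring(xml_text)
records = [element_to_dict(food) for food in root.findall("Food")]
print(json.dumps(records, indent=2))
```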

However, when I load this JSON file into Navicat (PostgreSQL connection), the resulting SQL table schema only has Id, Type, and Expiry Date. As you can see, some keys are missing from one object but appear in others. How can I create a SQL table that has all fields from the JSON file (Id, Type, Expiry Date, AND Nutrition)?

    Parse the JSON on the client side and issue the appropriate INSERT statements to the database. Alternatively, use json_array_elements and the ->> operator to do it in the database. Commented Dec 14, 2020 at 17:08
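The client-side approach from this comment can be sketched as follows: parse the JSON in Python and build one parameterized INSERT per object, so that missing keys simply become NULLs. The table and column names here (`food`, a jsonb `expiry_date`) are hypothetical, and the statements are only built, not executed:

```python
import json

data = json.loads("""[
  {"Id": "100", "Type": "Meat",
   "Expiry Date": {"Chicken": "2020/12/20", "Beef": "2020/12/25"}},
  {"Id": "200", "Type": "Vegetables", "Nutrition": "B1"}
]""")

# One statement, psycopg2-style placeholders; absent keys map to None (NULL).
sql = ("INSERT INTO food (id, type, expiry_date, nutrition) "
       "VALUES (%s, %s, %s, %s)")

rows = [
    (int(obj["Id"]),
     obj["Type"],
     json.dumps(obj["Expiry Date"]) if "Expiry Date" in obj else None,
     obj.get("Nutrition"))
    for obj in data
]
# With a real connection you would then run, e.g.:
# cur.executemany(sql, rows)
```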

2 Answers


If you do not have a special reason to turn it into JSON first, you can use XMLTABLE with the original embedded XML like this:

select * 
from xmltable ( '//Food' passing 
    xmlparse (document '<dummyRoot>
    <Food> 
      <Id> 100 </Id>
      <Type> Meat </Type>
      <ExpiryDate>
        <Chicken>
          2020/12/20
        </Chicken>
        <Beef>
          2020/12/25
        </Beef>
      </ExpiryDate>
    </Food>
    <Food>
      <Id> 200 </Id>
      <Type> Vegetables </Type>
      <Nutrition> B1 </Nutrition>
    </Food>
    </dummyRoot>')
  columns 
   "Id" integer,
   "Type" text, 
   "ExpiryDate.Chicken" date path 'ExpiryDate/Chicken',
   "ExpiryDate.Beef" date path 'ExpiryDate/Beef',
   "Nutrition" text
);

<Expiry Date> and </Expiry Date> need to be changed to <ExpiryDate> and </ExpiryDate>, respectively, to become valid tag names. This is the result:

Id |Type        |ExpiryDate.Chicken|ExpiryDate.Beef|Nutrition|
---|------------|------------------|---------------|---------|
100| Meat       |        2020-12-20|     2020-12-25|         |
200| Vegetables |                  |               | B1      |

Edit: simplified the XML query.

If JSON is needed then as suggested by Laurenz Albe:

select 
    (j->>'Id')::integer id, 
    j->>'Type' "type", 
    (j->'Expiry Date'->>'Chicken')::date xdate_chicken, 
    (j->'Expiry Date'->>'Beef')::date xdate_beef,
    j->>'Nutrition' nutrition
from jsonb_array_elements
('[{
"Id": "100",
"Type": "Meat",
"Expiry Date": {
  "Chicken": "2020/12/20",
  "Beef": "2020/12/25"
  }
},
{
"Id": "200",
"Type": "Vegetables",
"Nutrition": "B1"
}]') j;

5 Comments

Thank you very much. However, the XML file has many more elements than this, and the nodes within Expiry Date are also more numerous and unpredictable. Besides, there are more than 5 elements similar to Expiry Date. How can I achieve the same thing without having to manually list all the columns?
Well, I am afraid I do not know. You can, however, use json_object_keys to extract the list of element names and then do some work in a text editor to list all the columns more easily with fewer errors.
And by the way, if the structure of Expiry Date is dynamic, then why not store it as a JSON(B) column in the table? This may not be the most convenient thing to use, but it might be the best available; at least you will not lose data.
Oh, that's exactly what I came up with a few minutes ago, haha. However, if I just load the exact JSON file into Postgres, I will get a jsonb column but will lose the Nutrition field.
Well, why not a hybrid structure: regular columns (like id, type, nutrition) and structured ones as JSON (like expiry_date)? This gives you the best of both worlds, i.e. keys and details.

I have found a solution for this. Using Python, I read through the XML file to collect all possible column names; then, for each column that has subelements, I serialize them and write the value as jsonb in Postgres.
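A minimal stdlib sketch of this two-pass approach (assuming, as in the accepted answer, that the invalid "Expiry Date" tags were first renamed to "ExpiryDate"): the first pass collects every top-level element name as a column, the second builds rows, serializing nested elements to JSON strings destined for jsonb columns.

```python
import json
import xml.etree.ElementTree as ET

xml_text = """<Root>
<Food><Id>100</Id><Type>Meat</Type>
  <ExpiryDate><Chicken>2020/12/20</Chicken><Beef>2020/12/25</Beef></ExpiryDate>
</Food>
<Food><Id>200</Id><Type>Vegetables</Type><Nutrition>B1</Nutrition></Food>
</Root>"""

root = ET.fromstring(xml_text)

# First pass: collect every top-level element name across all records.
columns = []
for food in root.findall("Food"):
    for child in food:
        if child.tag not in columns:
            columns.append(child.tag)

def value_of(elem):
    """Nested elements -> JSON string (for a jsonb column); leaves -> text."""
    if elem is None:
        return None
    if list(elem):  # has subelements
        return json.dumps({c.tag: (c.text or "").strip() for c in elem})
    return (elem.text or "").strip()

# Second pass: one row per record; missing columns come out as None.
rows = [
    {col: value_of(food.find(col)) for col in columns}
    for food in root.findall("Food")
]
```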
