
I have an XML file, which has been broken down into smaller tables. I can load them into Navicat fine, except for one table. Here's the XML structure:

<Food> 
  <Id> 100 </Id>
  <Type> Meat </Type>
  <Expiry Date>
    <Chicken>
      2020/12/20
    </Chicken>
    <Beef>
      2020/12/25
    </Beef>
  </Expiry Date>
</Food>

<Food>
  <Id> 200 </Id>
  <Type> Vegetables </Type>
  <Nutrition> B1 </Nutrition>
</Food>

I have turned it into JSON, using xmltodict in Python:

[{
"Id": "100",
"Type": "Meat",
"Expiry Date": {
  "Chicken": "2020/12/20",
  "Beef": "2020/12/25"
  }
},

{
"Id": "200",
"Type": "Vegetables",
"Nutrition": "B1"
}]
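For reference, a conversion like the one above can be sketched with only the standard library (xmltodict was used originally, but it is a third-party package). This sketch assumes the invalid `<Expiry Date>` tags have been renamed to `<ExpiryDate>`, since a tag name cannot contain a space, and wraps the records in a single root element so they parse as one document:

```python
import json
import xml.etree.ElementTree as ET

# Sample data, wrapped in a root element so it parses as one document.
# Note: "Expiry Date" is not a valid tag name, so it appears here as
# "ExpiryDate" (a rename the original file would also need).
xml_text = """<Root>
<Food>
  <Id>100</Id>
  <Type>Meat</Type>
  <ExpiryDate>
    <Chicken>2020/12/20</Chicken>
    <Beef>2020/12/25</Beef>
  </ExpiryDate>
</Food>
<Food>
  <Id>200</Id>
  <Type>Vegetables</Type>
  <Nutrition>B1</Nutrition>
</Food>
</Root>"""

def element_to_dict(elem):
    """Leaves become their text; elements with children recurse into dicts."""
    children = list(elem)
    if not children:
        return elem.text.strip() if elem.text else None
    return {child.tag: element_to_dict(child) for child in children}

root = ET.fromstring(xml_text)
records = [element_to_dict(food) for food in root.findall("Food")]
print(json.dumps(records, indent=2))
```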

However, when I load this JSON file into Navicat (PostgreSQL connection), the resulting SQL table schema only has Id, Type, and Expiry Date. As you can see, some keys are missing from one object but appear in others. How can I create a SQL table that has all fields from the JSON file (Id, Type, Expiry Date, AND Nutrition)?

    Parse the JSON on the client side and issue the appropriate INSERT statements to the database. Alternatively, use json_array_elements and the ->> operator to do it in the database. Commented Dec 14, 2020 at 17:08
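The client-side approach from this comment can be sketched as follows: parse the JSON in Python and build one parameterized INSERT per object, so that missing keys simply become NULLs. The table and column names here (`food`, a jsonb `expiry_date`) are hypothetical, and the statements are only built, not executed:

```python
import json

data = json.loads("""[
  {"Id": "100", "Type": "Meat",
   "Expiry Date": {"Chicken": "2020/12/20", "Beef": "2020/12/25"}},
  {"Id": "200", "Type": "Vegetables", "Nutrition": "B1"}
]""")

# One statement, psycopg2-style placeholders; absent keys map to None (NULL).
sql = ("INSERT INTO food (id, type, expiry_date, nutrition) "
       "VALUES (%s, %s, %s, %s)")

rows = [
    (int(obj["Id"]),
     obj["Type"],
     json.dumps(obj["Expiry Date"]) if "Expiry Date" in obj else None,
     obj.get("Nutrition"))
    for obj in data
]
# With a real connection you would then run, e.g.:
# cur.executemany(sql, rows)
```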

2 Answers


If you do not have a special reason to turn it into JSON first, you can use XMLTABLE with the original embedded XML like this:

select * 
from xmltable ( '//Food' passing 
    xmlparse (document '<dummyRoot>
    <Food> 
      <Id> 100 </Id>
      <Type> Meat </Type>
      <ExpiryDate>
        <Chicken>
          2020/12/20
        </Chicken>
        <Beef>
          2020/12/25
        </Beef>
      </ExpiryDate>
    </Food>
    <Food>
      <Id> 200 </Id>
      <Type> Vegetables </Type>
      <Nutrition> B1 </Nutrition>
    </Food>
    </dummyRoot>')
  columns 
   "Id" integer,
   "Type" text, 
   "ExpiryDate.Chicken" date path 'ExpiryDate/Chicken',
   "ExpiryDate.Beef" date path 'ExpiryDate/Beef',
   "Nutrition" text
);

<Expiry Date> and </Expiry Date> need to be changed to <ExpiryDate> and </ExpiryDate>, respectively, to become valid tag names. This is the result:

Id |Type        |ExpiryDate.Chicken|ExpiryDate.Beef|Nutrition|
---|------------|------------------|---------------|---------|
100| Meat       |        2020-12-20|     2020-12-25|         |
200| Vegetables |                  |               | B1      |

Edit: simplified the XML query.

If JSON is needed then as suggested by Laurenz Albe:

select 
    (j->>'Id')::integer id, 
    j->>'Type' "type", 
    (j->'Expiry Date'->>'Chicken')::date xdate_chicken, 
    (j->'Expiry Date'->>'Beef')::date xdate_beef,
    j->>'Nutrition' nutrition
from jsonb_array_elements
('[{
"Id": "100",
"Type": "Meat",
"Expiry Date": {
  "Chicken": "2020/12/20",
  "Beef": "2020/12/25"
  }
},
{
"Id": "200",
"Type": "Vegetables",
"Nutrition": "B1"
}]') j;

5 Comments

Thank you very much. However, the XML file has many more elements than this, and the nodes within Expiry Date are also more numerous and unpredictable. Besides, there are more than 5 elements similar to Expiry Date. How can I achieve the same thing without having to manually list all the columns?
Well, I am afraid I do not know. You can, however, use json_object_keys to extract the list of element names and then do some work in a text editor to list all the columns more easily with fewer errors.
And by the way, if the structure of Expiry Date is dynamic, then why not store it as a JSON(B) column in the table? This may not be the most convenient thing to use, but it might be the best available; at least you will not lose data.
Oh, that's exactly what I came up with a few minutes ago, haha. However, if I just load the exact JSON file into Postgres, I will get a jsonb column but will lose the Nutrition field.
Well, why not a hybrid structure: regular columns (like id, type, nutrition) and structured ones as JSON (like expiry_date)? This gives you the best of both worlds, i.e. keys and details.

I have found a solution for this. Using Python, I read through the XML file to collect all possible column names; then, for each column that has subelements, I serialize them and write the value as jsonb in Postgres.
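A minimal stdlib sketch of this two-pass approach (assuming, as in the accepted answer, that the invalid "Expiry Date" tags were first renamed to "ExpiryDate"): the first pass collects every top-level element name as a column, the second builds rows, serializing nested elements to JSON strings destined for jsonb columns.

```python
import json
import xml.etree.ElementTree as ET

xml_text = """<Root>
<Food><Id>100</Id><Type>Meat</Type>
  <ExpiryDate><Chicken>2020/12/20</Chicken><Beef>2020/12/25</Beef></ExpiryDate>
</Food>
<Food><Id>200</Id><Type>Vegetables</Type><Nutrition>B1</Nutrition></Food>
</Root>"""

root = ET.fromstring(xml_text)

# First pass: collect every top-level element name across all records.
columns = []
for food in root.findall("Food"):
    for child in food:
        if child.tag not in columns:
            columns.append(child.tag)

def value_of(elem):
    """Nested elements -> JSON string (for a jsonb column); leaves -> text."""
    if elem is None:
        return None
    if list(elem):  # has subelements
        return json.dumps({c.tag: (c.text or "").strip() for c in elem})
    return (elem.text or "").strip()

# Second pass: one row per record; missing columns come out as None.
rows = [
    {col: value_of(food.find(col)) for col in columns}
    for food in root.findall("Food")
]
```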
