I'm trying to import data from an XML file to PostgreSQL. I would like to create columns from the attributes in the <row> tag. The problem is when I use xpath to get text, I get nothing.
Here is a sample input testinput.xml file
<tags>
<row Id="1" TagName=".net" Count="316293" ExcerptPostId="3624959" WikiPostId="3607476" />
<row Id="2" TagName="html" Count="1116853" ExcerptPostId="3673183" WikiPostId="3673182" />
<row Id="3" TagName="javascript" Count="2343663" ExcerptPostId="3624960" WikiPostId="3607052" />
</tags>
Here is my myquery.sql
DO $$
DECLARE xml_string xml;
BEGIN
xml_string := XMLPARSE(DOCUMENT convert_from(pg_read_binary_file('/home/me/data/testinput.xml'), 'UTF8'));
DROP TABLE IF EXISTS tags;
CREATE TABLE tags AS
SELECT
(xpath('//ID', x))[1]::text as id,
(xpath('//TagName', x))[1]::text as tagName,
(xpath('//Count', x))[1]::text as count,
(xpath('//ExcerptPostId', x))[1]::text as excerptPostId,
(xpath('//WikiPostId', x))[1]::text as wikiPostId
FROM unnest(xpath('//row', xml_string)) x;
END$$;
SELECT * FROM tags;
I got this
id | tagname | count | excerptpostid | wikipostid
----+---------+-------+---------------+------------
| | | |
| | | |
| | | |
(3 rows)
By the way, my .xml file is around 100GB, so any fast solution would help. Thank you!