I need to import huge XML files into a database and then transform the data into another format.
At the moment I am trying to do that with Postgres.
I've already imported a 250 MB file into a table using:
insert into test
(name, "element")
SELECT
(xpath('//title/text()', myTempTable.myXmlColumn))[1]::text AS name
,myTempTable.myXmlColumn as "element"
FROM unnest(
xpath
( '//test'
,XMLPARSE(DOCUMENT convert_from(pg_read_binary_file('test.xml'), 'UTF8'))
)
) AS myTempTable(myXmlColumn)
;
But with bigger files (I tried a file > 1 GB) I get:
SQL Error [22023]: ERROR: requested length too large
My goal is to import and transform files with a size of ~50 GB.
Any suggestions/alternatives?
Update:
The idea is not to import 1 GB files into one field. The code above was able to load AND unnest my 250 MB file into 1,773,844 rows in 3m 57s on my machine, which I think is not bad. After the file is imported, I can transform the data relatively fast because Postgres is good at that.
Any better ideas?
COPY ... FROM STDIN
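If COPY FROM STDIN is the route, one way to use it would be to parse the XML incrementally on the client and stream ready-made (name, element) rows into the table, so that no single value ever has to hold the whole file (a single field value in Postgres is capped at 1 GB anyway). The following is only a rough sketch, assuming Python with lxml and psycopg2 (neither is implied by the question) and reusing the test.xml file, the <test>/<title> element layout, and the test(name, "element") table from the query above.

    # Hypothetical sketch: stream-parse test.xml and feed rows to Postgres via COPY FROM STDIN.
    # File name, table/column names and the <test>/<title> layout are taken from the question.
    import io
    import psycopg2
    from lxml import etree

    def iter_rows(path):
        # iterparse keeps memory flat: each <test> element is handled and then discarded
        for _, elem in etree.iterparse(path, tag='test'):
            title = elem.findtext('.//title') or ''
            xml_text = etree.tostring(elem, encoding='unicode')
            yield title, xml_text
            elem.clear()  # free the element and its already-processed siblings
            while elem.getprevious() is not None:
                del elem.getparent()[0]

    def copy_rows(path, dsn='dbname=mydb'):
        # escape backslash, tab, newline and carriage return for COPY text format
        esc = lambda s: (s.replace('\\', '\\\\').replace('\t', '\\t')
                          .replace('\n', '\\n').replace('\r', '\\r'))
        conn = psycopg2.connect(dsn)
        with conn, conn.cursor() as cur:
            buf, count = io.StringIO(), 0
            for title, xml_text in iter_rows(path):
                buf.write(esc(title) + '\t' + esc(xml_text) + '\n')
                count += 1
                if count % 50000 == 0:  # flush in batches to keep memory bounded
                    buf.seek(0)
                    cur.copy_expert('COPY test (name, "element") FROM STDIN', buf)
                    buf = io.StringIO()
            buf.seek(0)
            cur.copy_expert('COPY test (name, "element") FROM STDIN', buf)
        conn.close()

    copy_rows('test.xml')

With this kind of streaming, memory use stays roughly constant regardless of input size, so 50 GB files become a question of runtime rather than of fitting anything into one datum.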