
I have 16,000 items in a Ruby hash (downloaded as XML and converted to a hash). I need to push these into the database. Once a week some of them change, but I can't tell which ones.

What I've seen suggested is to go straight to SQL, because ActiveRecord was (on that site) 70 times slower on a plain insert, before even considering the update/insert logic.

I'm wondering what approach would be best. Has anyone received a large (or, well, smallish) chunk of data that they had to repeatedly insert/update?
Could you offer suggestions?

2 Answers


The fastest way to load massive amounts of data into PostgreSQL is the COPY command.

Just generate a delimited file with all data, TRUNCATE your table, drop any indexes and constraints, and then load your data with COPY.

After this, run ANALYZE on the target table, and then create indexes and constraints.

http://www.postgresql.org/docs/9.1/static/sql-copy.html
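For illustration, here is a rough sketch of that flow from Ruby with the pg gem. The items table, its id/name columns, and the items_hash variable are made up for the example, so adjust them to your schema; it also streams the rows through STDIN rather than writing a delimited file to disk, which amounts to the same COPY load:

```ruby
require "csv"
require "pg"

conn = PG.connect(dbname: "mydb")          # hypothetical database name
items_hash = { 1 => "foo", 2 => "bar" }    # stand-in for your parsed XML

conn.transaction do |c|
  c.exec("TRUNCATE items")
  # Drop secondary indexes and constraints here if the table has any.

  # Stream the hash straight into COPY instead of writing a file first.
  c.copy_data("COPY items (id, name) FROM STDIN WITH (FORMAT csv)") do
    items_hash.each do |id, name|
      c.put_copy_data(CSV.generate_line([id, name]))
    end
  end

  # Recreate indexes and constraints here, then refresh planner statistics.
  c.exec("ANALYZE items")
end
```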



I have a very similar use case. I read the XML file into the database directly, parse it with xpath() into a temporary table and do all the checking and updating with good old SQL. Works very well (and fast) for me.

I posted the code for that recently in a related answer here.
In case you have to deal with non-unique items in XML nodes, here is some more.
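As a rough sketch of that kind of flow (the XML layout <items><item id="..."><name>...</name></item>, the items(id, name) table, and the file name are all made up for the example; the linked answers have the real details):

```ruby
require "pg"

conn = PG.connect(dbname: "mydb")   # hypothetical database name
xml  = File.read("items.xml")       # the raw XML you downloaded

conn.transaction do |c|
  # Explode the XML into a temporary staging table with xpath().
  c.exec("CREATE TEMP TABLE items_stage (id int, name text) ON COMMIT DROP")
  c.exec_params(<<~SQL, [xml])
    INSERT INTO items_stage (id, name)
    SELECT (xpath('/item/@id',         node))[1]::text::int,
           (xpath('/item/name/text()', node))[1]::text
    FROM   unnest(xpath('/items/item', $1::xml)) AS node
  SQL

  # Update rows that changed ...
  c.exec(<<~SQL)
    UPDATE items i
    SET    name = s.name
    FROM   items_stage s
    WHERE  i.id = s.id
    AND    i.name IS DISTINCT FROM s.name
  SQL

  # ... and insert rows that are new.
  c.exec(<<~SQL)
    INSERT INTO items (id, name)
    SELECT s.id, s.name
    FROM   items_stage s
    LEFT   JOIN items i ON i.id = s.id
    WHERE  i.id IS NULL
  SQL
end
```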
