
How do I import data from MySQL to HDFS? I can't use Sqoop, as this is a plain HDFS installation, not Cloudera. I used the link below to set up HDFS. My Hadoop version is 0.20.2: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

  • I don't see what's stopping you from using Sqoop, as it isn't in any way tied to Cloudera-specific software. Commented Jun 11, 2012 at 13:42
  • Can you please guide me on how to configure Sqoop? Commented Jun 13, 2012 at 7:06
  • I was able to do it by installing Hive and then importing txt files into HDFS using Hive. Thanks, all. Commented Jun 14, 2012 at 10:16

2 Answers


Not directly related to your question, but if you want to use the database as input to a MapReduce job and don't want to copy it to HDFS first, you could use DBInputFormat to read directly from the database.
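
For example, with the old mapred API that ships with Hadoop 0.20.2, a job can be pointed straight at a MySQL table roughly along these lines. This is a minimal sketch rather than a drop-in job: the table name employees, its columns, the connection URL, and the output path are placeholder assumptions, and the MySQL JDBC driver jar has to be on the job's classpath.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.db.DBConfiguration;
import org.apache.hadoop.mapred.lib.db.DBInputFormat;
import org.apache.hadoop.mapred.lib.db.DBWritable;

public class MySqlToHdfs {

  // One row of the (hypothetical) "employees" table.
  public static class EmployeeRecord implements Writable, DBWritable {
    int id;
    String name;

    // Populated by DBInputFormat from the JDBC ResultSet.
    public void readFields(ResultSet rs) throws SQLException {
      id = rs.getInt("id");
      name = rs.getString("name");
    }

    public void write(PreparedStatement ps) throws SQLException {
      ps.setInt(1, id);
      ps.setString(2, name);
    }

    // Hadoop serialization, needed so records can move between tasks.
    public void readFields(DataInput in) throws IOException {
      id = in.readInt();
      name = Text.readString(in);
    }

    public void write(DataOutput out) throws IOException {
      out.writeInt(id);
      Text.writeString(out, name);
    }

    // TextOutputFormat calls toString(), so rows land in HDFS as tab-separated text.
    public String toString() {
      return id + "\t" + name;
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(MySqlToHdfs.class);
    conf.setJobName("mysql-to-hdfs");

    // JDBC connection details (placeholders).
    DBConfiguration.configureDB(conf, "com.mysql.jdbc.Driver",
        "jdbc:mysql://dbhost:3306/mydb", "dbuser", "dbpassword");

    // Read the "employees" table, splitting and ordering on the "id" column.
    DBInputFormat.setInput(conf, EmployeeRecord.class,
        "employees", null /* conditions */, "id", "id", "name");

    // Map-only job: each row is written straight to the output files.
    conf.setMapperClass(IdentityMapper.class);
    conf.setNumReduceTasks(0);
    conf.setOutputKeyClass(LongWritable.class);
    conf.setOutputValueClass(EmployeeRecord.class);
    FileOutputFormat.setOutputPath(conf, new Path("/data/mysql/employees"));

    JobClient.runJob(conf);
  }
}
```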



Apart from Sqoop, you could try HIHO. I have heard good things about it (never used it myself, though).

But mostly what I have seen is that people end up writing their own flows to do this. If HIHO doesn't work out, you can dump data from MySQL using mysqldump (or SELECT ... INTO OUTFILE). Then load it into HDFS using a MapReduce job or Pig/Hive.
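
If you take the dump-and-load route, the HDFS side can be as simple as copying the dump file in. Below is a minimal sketch using the Hadoop FileSystem API (the programmatic equivalent of hadoop fs -put); the local and HDFS paths are placeholder assumptions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyDumpToHdfs {
  public static void main(String[] args) throws Exception {
    // Picks up fs.default.name from core-site.xml on the classpath.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Copy the local dump into HDFS -- the same thing `hadoop fs -put` does.
    // Both paths are placeholders; point them at your dump file and target directory.
    fs.copyFromLocalFile(new Path("/tmp/employees.tsv"),
                         new Path("/data/mysql/employees.tsv"));

    fs.close();
  }
}
```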

I have heard Sqoop is pretty good and is widely used (this is hearsay again; I have never used it myself). Now that it is an Apache Incubator project, it may have started supporting Apache releases of Hadoop, or at least become less painful to use with non-Cloudera versions. The docs do say it supports Apache Hadoop 0.21. Try to make it work with your Hadoop version; it might not be that difficult.

