How can I import data from MySQL into HDFS? I can't use Sqoop, because mine is a plain HDFS installation, not Cloudera. I used the link below to set up HDFS; my Hadoop version is 0.20.2. http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
-
I don't see what's stopping you using Sqoop, as it's not in any way tied to Cloudera-specific software. – Lars Francke, Jun 11, 2012 at 13:42
-
Can you please guide me on how to configure Sqoop? – Ahmad Osama, Jun 13, 2012 at 7:06
-
I was able to do it by installing Hive and then importing the text files into HDFS using Hive. Thanks, all. – Ahmad Osama, Jun 14, 2012 at 10:16
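For anyone following the same route, the Hive-based workaround described in the comment can be sketched roughly as below. The table name, column layout, delimiter, and file path are all assumptions for illustration; Hive copies the loaded file into HDFS under its warehouse directory as part of the load.

```shell
# Hypothetical table and file names -- adjust to your own data.
# Create a Hive table backed by files in HDFS.
hive -e "CREATE TABLE users (id INT, name STRING)
         ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';"

# Load a local tab-separated text file into the table;
# Hive moves a copy of the file into HDFS.
hive -e "LOAD DATA LOCAL INPATH '/tmp/users.txt' INTO TABLE users;"
```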
2 Answers
Apart from Sqoop, you could try hiho. I have heard good things about it, though I have never used it myself.
But mostly what I have seen is that people end up writing their own flows to do this. If hiho doesn't work out, you can dump the data from MySQL using mysqldump (or SELECT ... INTO OUTFILE), then load it into HDFS using a map-reduce job or Pig/Hive.
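A minimal sketch of that dump-then-load flow, assuming a database `mydb`, a table `users`, and the paths shown (all made up for illustration); `mysqldump --tab` writes one `.sql` and one `.txt` file per table into the given directory, which you can then copy into HDFS:

```shell
# Hypothetical database/table names and paths -- adjust to your setup.
# Dump the table's rows as tab-separated text (skip CREATE TABLE statements).
mysqldump --no-create-info --tab=/tmp/dump \
          --fields-terminated-by='\t' mydb users

# Copy the resulting text file into HDFS (Hadoop 0.20.x shell syntax).
hadoop fs -mkdir /user/hduser/users
hadoop fs -put /tmp/dump/users.txt /user/hduser/users/
```

Note that `mysqldump --tab` writes the data file on the MySQL server host and requires the FILE privilege, so it is easiest when you run it on the database machine itself.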
I have heard Sqoop is pretty good and is widely used (this is hearsay again; I have never used it myself). Now that it is an Apache Incubator project, I think it may have started supporting the Apache releases of Hadoop, or at least become less painful on non-Cloudera versions. The docs do say that it supports Apache Hadoop 0.21. Try to make it work with your Hadoop version; it might not be that difficult.
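If you do get Sqoop running against your cluster, a typical import invocation looks like the sketch below. The host, database, credentials, table, and target directory are placeholders, not values from the question; `-P` prompts for the password interactively.

```shell
# Hypothetical connection details -- replace host, database, user, and table.
sqoop import \
  --connect jdbc:mysql://dbhost:3306/mydb \
  --username dbuser -P \
  --table users \
  --target-dir /user/hduser/users
```

Sqoop runs the import as a map-reduce job, so the MySQL JDBC driver jar must be on Sqoop's classpath and the table is written to HDFS as a set of part files under the target directory.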