
Hi, my requirement is to build analytics on the data served at http://10.3.9.34:9900/messages: pull the data from that endpoint, store it in the HDFS location /user/cloudera/flume, and then create an analytics report from HDFS using Tableau or the HUE UI. I tried the code below in the Scala console of spark-shell on CDH 5.5, but it is unable to fetch data from the HTTP link.

import org.apache.spark.SparkContext
val dataRDD = sc.textFile("http://10.3.9.34:9900/messages")
dataRDD.collect().foreach(println)
dataRDD.count()
dataRDD.saveAsTextFile("/user/cloudera/flume")

I get the error below in the Scala console:

java.io.IOException: No FileSystem for scheme: http
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2623)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2637)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2680)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2662)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:379)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)

1 Answer


You can't use an HTTP endpoint as input to sc.textFile. The path needs to be on a file system that Hadoop supports, such as HDFS, S3, or the local file system.

You would need a separate process that pulls data from this endpoint, perhaps using something like Apache NiFi, and lands it on a file system where you can then use it as input to Spark.
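
As a minimal sketch of the "separate process" idea, assuming plain Scala is acceptable in place of a tool like NiFi, the following pulls the endpoint's response and lands it in a local file. The object name HttpLanding, its two helpers, and the path /tmp/messages.txt are hypothetical, chosen only for illustration. The whole response is read into memory, so this only suits small payloads.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, Paths}
import scala.io.Source

// Hypothetical helper object, not part of Spark or Hadoop.
object HttpLanding {
  // Read the response body from the URL and return its non-empty lines.
  // The URL must include a scheme such as "http://".
  def fetchLines(url: String): List[String] = {
    val source = Source.fromURL(url, "UTF-8")
    try source.getLines().filter(_.nonEmpty).toList
    finally source.close() // always release the underlying stream
  }

  // Write the lines to a local file, one per line, and return the path.
  def land(lines: Seq[String], target: Path): Path =
    Files.write(target, lines.mkString("\n").getBytes(StandardCharsets.UTF_8))
}

// Example usage (endpoint taken from the question, target path is an assumption):
// val lines = HttpLanding.fetchLines("http://10.3.9.34:9900/messages")
// HttpLanding.land(lines, Paths.get("/tmp/messages.txt"))
```

Once the file exists, it could be copied into HDFS with `hdfs dfs -put /tmp/messages.txt /user/cloudera/flume/`, after which sc.textFile("/user/cloudera/flume") should be able to read it, assuming HDFS is the cluster's default file system.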


1 Comment

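Thank you. I was able to pull data from the HTTP endpoint using Scala code. Here is the code:

import org.apache.spark.SparkContext
// Fetch the whole response body on the driver (fromURL needs the "http://" scheme)
val data = scala.io.Source.fromURL("http://10.3.9.34:9900/merged").mkString
// Split into lines and drop the empty ones
val list = data.split("\n").filter(_ != "")
// Distribute the lines as an RDD and write them to HDFS
val rdds = sc.parallelize(list)
rdds.saveAsTextFile("/user/cloudera/spark")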
