
I have an Elasticsearch Docker container listening on 127.0.0.1:9200. I tested it using Sense and Kibana, and it works fine: I am able to index and query documents. But when I try to write to it from a Spark app:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.elasticsearch.spark._ // brings saveToEs into scope

val sparkConf = new SparkConf().setAppName("ES").setMaster("local")
sparkConf.set("es.index.auto.create", "true")
sparkConf.set("es.nodes", "127.0.0.1")
sparkConf.set("es.port", "9200")
sparkConf.set("es.resource", "spark/docs")

val sc = new SparkContext(sparkConf)
val sqlContext = new SQLContext(sc)
val numbers = Map("one" -> 1, "two" -> 2, "three" -> 3)
val airports = Map("arrival" -> "Otopeni", "SFO" -> "San Fran")
val rdd = sc.parallelize(Seq(numbers, airports))

rdd.saveToEs("spark/docs")

It fails to connect and keeps retrying:

16/07/11 17:20:07 INFO HttpMethodDirector: I/O exception (java.net.ConnectException) caught when processing request: Operation timed out
16/07/11 17:20:07 INFO HttpMethodDirector: Retrying request

I tried using the IP address given by docker inspect for the Elasticsearch container, but that does not work either. However, when I use a native installation of Elasticsearch, the Spark app runs fine. Any ideas?

3 Answers


Also, set the config es.nodes.wan.only to true, as mentioned in this answer, if you are having issues writing to ES.
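For reference, a minimal sketch of setting that flag alongside the connection options from the question. With Docker port mapping, only 127.0.0.1:9200 is reachable from the host, while the node addresses the cluster advertises are container-internal; es.nodes.wan.only tells the connector to skip node discovery and talk only to the declared node:

val sparkConf = new SparkConf().setAppName("ES").setMaster("local")
sparkConf.set("es.nodes", "127.0.0.1")
sparkConf.set("es.port", "9200")
// Disable node discovery: connect only through the declared es.nodes,
// since the addresses the cluster advertises are not reachable from the host.
sparkConf.set("es.nodes.wan.only", "true")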



A couple of things I would check:

  • The elasticsearch-hadoop Spark connector version you are working with. Make sure it is not a beta; there was a since-fixed bug related to IP resolving.

  • Since 9200 is the default port, you can remove the line sparkConf.set("es.port", "9200") and check.

  • Check that there is no proxy configured in your Spark environment or config files.

  • I assume you run Elasticsearch and Spark on the same machine. Can you try to configure your machine's IP address instead of 127.0.0.1? (See the sketch after this list.)
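A minimal sketch of that last point, using a hypothetical LAN address of 192.168.1.10 (substitute whatever IP your machine actually has on the network):

// 192.168.1.10 is a placeholder; replace it with your machine's real IP.
sparkConf.set("es.nodes", "192.168.1.10")
sparkConf.set("es.port", "9200")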

Hope this helps! :)



I had the same problem, and a further issue was that the confs set using sparkConf.set() didn't take effect. Supplying the confs to the saving function worked, like this:

rdd.saveToEs("spark/docs", Map("es.nodes" -> "127.0.0.1", "es.nodes.wan.only" -> "true"))

