
I have been running a Spark application that integrates HBase and Elasticsearch (ES). When creating the index in ES and storing data from HBase, I received the error "the user is unauthorized or access denied" while connecting to the ES server.

I checked with the Operations team and bounced the ES server, then attempted to run the application again and got the exception below:

 Exception in thread "main" org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
     at org.elasticsearch.hadoop.rest.InitializationUtils.discoverEsVersion(InitializationUtils.java:327)
     at org.elasticsearch.spark.rdd.EsSpark$.doSaveToEs(EsSpark.scala:103)
     at org.elasticsearch.spark.rdd.EsSpark$.saveToEs(EsSpark.scala:79)
     at org.elasticsearch.spark.rdd.EsSpark$.saveToEs(EsSpark.scala:74)
     at org.elasticsearch.spark.package$SparkRDDFunctions.saveToEs(package.scala:55)

I'm using Elasticsearch v6.1.1. Please let me know if anyone has faced this issue and resolved the exception.

2 Answers


When loading data into Elasticsearch from a Spark application, you may encounter authentication issues, since Elasticsearch 6.x and above typically uses SSL certificates for authentication. To resolve the issue, follow these steps:

Prerequisites:

  1. Generate JKS File
  2. Generate PEM file

Resolution steps:

  1. Import the PEM certificate into the JKS trust store using the command below:
 keytool -keystore <jks-file> -import -file <pem-file> 
  2. Validate that the certificate was imported using the following command:
 keytool -list -v -keystore <jks-file> 
  3. Provide the trust store path using the Spark parameter --driver-java-options, as follows (a programmatic alternative is sketched after this list):
--driver-java-options="-Djavax.net.ssl.trustStore=<jks-file-location> -Djavax.net.ssl.trustStorePassword=<trust-store-pwd>" 
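
If you submit in cluster mode, the same JVM options can also be set programmatically when building the session. Below is a minimal sketch, assuming a hypothetical trust store path and password; note that spark.driver.extraJavaOptions only takes effect in cluster mode, because in client mode the driver JVM is already running when the configuration is read:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Hypothetical trust store location and password -- replace with your own.
val sslOpts = "-Djavax.net.ssl.trustStore=/etc/security/es-truststore.jks " +
  "-Djavax.net.ssl.trustStorePassword=changeit"
val sslConf = new SparkConf()
  .set("spark.driver.extraJavaOptions", sslOpts)   // honored in cluster mode only
  .set("spark.executor.extraJavaOptions", sslOpts) // executors also need the trust store
val spark = SparkSession.builder().config(sslConf).appName("es-ssl-example").getOrCreate()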

After these steps, your Spark application should be able to authenticate with the Elasticsearch cluster and load the data.


Thanks, all, for looking at this issue. I have identified the root cause, and it may be helpful if you face a similar issue.

The issue is that we were overwriting the Spark default configuration in MapR (/opt/mapr/spark/spark-2.1.0/conf), so the Spark configuration passed in our application was never bound to the SparkConf. As a result, the connector pointed to localhost (127.0.0.1:9200) during index creation; check your exception log for this if you hit the same error.

I changed the configuration details in the application, passed them while creating the SparkSession object, and tested the application.

Now the application is working fine, and I'm able to create the index in Elasticsearch and load the data.

SparkConf passed while creating the SparkSession:


val sparkConf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // elasticsearch-hadoop settings; the connector strips the "spark." prefix
  .set("spark.es.index.auto.create", "true")
  .set("spark.es.nodes", "yourESaddress")
  .set("spark.es.port", "9200")
  .set("spark.es.net.http.auth.user", "*******")
  .set("spark.es.net.http.auth.pass", "*******")
  .set("spark.es.resource", indexName)
  .set("spark.es.nodes.wan.only", "true")
val sparkSession = SparkSession.builder().config(sparkConf).appName("sourcedashboard").getOrCreate()

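For completeness, here is a minimal sketch of writing an RDD to the index through this session with the elasticsearch-spark connector; the sample records and the "sourcedashboard/doc" resource are illustrative placeholders:

import org.elasticsearch.spark._

// Illustrative sample data; in the real application the RDD is built from HBase.
val docs = sparkSession.sparkContext.makeRDD(Seq(
  Map("id" -> "1", "source" -> "hbase"),
  Map("id" -> "2", "source" -> "hbase")
))
// saveToEs is the call that previously failed with EsHadoopIllegalArgumentException;
// with the configuration above bound to the session, it resolves the real ES nodes.
docs.saveToEs("sourcedashboard/doc")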

Thank you.
