I am running an Apache Spark job on Amazon EMR that needs to connect to an Amazon MSK cluster configured with IAM authentication. The EMR cluster has an IAM role with full MSK permissions. I can reach the MSK bootstrap brokers via telnet, and Python Kafka clients running with the same permissions connect successfully.
However, when running my Spark Structured Streaming job on EMR, it fails with the error:
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: describeTopics
My spark-submit command includes all the necessary Kafka and AWS MSK IAM authentication jars, specifically:
- spark-sql-kafka-0-10_2.12-3.5.1.jar
- kafka-clients-3.5.1.jar
- spark-token-provider-kafka-0-10_2.12-3.5.6.jar
EMR version: emr-7.2.0, MSK version: 3.6.0
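As a sanity check on the jar list, here is a minimal sketch that parses the jar file names and flags two things: Spark connector jars whose versions differ, and the absence of an `aws-msk-iam-auth` jar. The helper names (`parse_jar`, `check_jars`) are hypothetical, and the sketch assumes jars follow the usual `artifact-version[-classifier].jar` naming, with the MSK IAM library shipped as something like `aws-msk-iam-auth-<version>-all.jar`. The list above may simply be incomplete, so a warning here is a prompt to double-check, not a diagnosis.

```python
import re

# Assumed naming convention: "<artifact>-<version>[-<classifier>].jar"
_JAR_RE = re.compile(
    r"^(?P<name>.+?)-(?P<version>\d+(?:\.\d+)*)(?:-[A-Za-z]+)?\.jar$"
)


def parse_jar(filename):
    """Split a jar file name into (artifact, version)."""
    match = _JAR_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognised jar name: {filename}")
    return match.group("name"), match.group("version")


def check_jars(jars):
    """Return a list of human-readable warnings for a list of jar names."""
    parsed = dict(parse_jar(jar) for jar in jars)
    warnings = []

    # Spark's own Kafka artifacts are normally released together,
    # so differing versions are worth a second look.
    spark_versions = {v for name, v in parsed.items() if name.startswith("spark-")}
    if len(spark_versions) > 1:
        warnings.append(
            f"Spark connector jars differ in version: {sorted(spark_versions)}"
        )

    # The IAM login module and callback handler classes live in this library.
    if not any(name.startswith("aws-msk-iam-auth") for name in parsed):
        warnings.append("no aws-msk-iam-auth jar in the list")

    return warnings


if __name__ == "__main__":
    listed = [
        "spark-sql-kafka-0-10_2.12-3.5.1.jar",
        "kafka-clients-3.5.1.jar",
        "spark-token-provider-kafka-0-10_2.12-3.5.6.jar",
    ]
    for warning in check_jars(listed):
        print("WARNING:", warning)
```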
The Spark streaming read is configured as follows:
```python
spark.readStream.format("kafka") \
    .option("kafka.bootstrap.servers", "<broker1:9098,broker2:9098,...>") \
    .option("subscribe", "my_topic") \
    .option("kafka.security.protocol", "SASL_SSL") \
    .option("kafka.sasl.mechanism", "AWS_MSK_IAM") \
    .option("kafka.sasl.jaas.config", "software.amazon.msk.auth.iam.IAMLoginModule required;") \
    .option("kafka.sasl.client.callback.handler.class", "software.amazon.msk.auth.iam.IAMClientCallbackHandler") \
    .load()
```
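The same configuration can also be kept in one place as a plain dict, which makes it easier to print, diff, and reuse across jobs. This is only a sketch: `msk_iam_kafka_options` and `read_topic` are hypothetical helper names, the option values are copied from the snippet above, and `spark` is assumed to be an existing `SparkSession`.

```python
def msk_iam_kafka_options(bootstrap_servers):
    """Build the Kafka source options for MSK IAM auth as a plain dict."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "AWS_MSK_IAM",
        "kafka.sasl.jaas.config": (
            "software.amazon.msk.auth.iam.IAMLoginModule required;"
        ),
        "kafka.sasl.client.callback.handler.class": (
            "software.amazon.msk.auth.iam.IAMClientCallbackHandler"
        ),
    }


def read_topic(spark, bootstrap_servers, topic):
    """Return a streaming DataFrame for `topic`.

    `spark` is an existing SparkSession; it is passed in so this module
    does not need to import pyspark itself.
    """
    return (
        spark.readStream.format("kafka")
        .options(**msk_iam_kafka_options(bootstrap_servers))
        .option("subscribe", topic)
        .load()
    )
```

With this in place, the read becomes `read_topic(spark, "<broker1:9098,broker2:9098,...>", "my_topic")`, and the dict can be logged at startup to confirm exactly which options reached the Kafka client.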
I have verified:
- The EMR IAM role has the required MSK permissions (Connect, DescribeCluster, DescribeTopic, etc.)
- The MSK brokers are reachable over the network on port 9098 (SASL_SSL)
- The Kafka client and IAM auth jars are at compatible versions
I do NOT want to manually manage or distribute custom truststore files, as I expected the EMR JVM to trust MSK's default certificates automatically.
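A telnet check only proves that the TCP port is open. A minimal sketch of taking the check one step further, to a verified TLS handshake, is shown here using only the Python standard library. The function names are hypothetical, and there is an important caveat: this validates the broker certificate against the CA bundle Python uses on the node, not against the JVM's `cacerts`, so a success narrows the problem down but does not prove that the Spark JVM trusts the same certificates.

```python
import socket
import ssl


def parse_bootstrap_servers(servers):
    """Split "host1:9098,host2:9098" into [("host1", 9098), ("host2", 9098)]."""
    pairs = []
    for entry in servers.split(","):
        host, _, port = entry.strip().rpartition(":")
        pairs.append((host, int(port)))
    return pairs


def tls_handshake_ok(host, port, timeout=5.0):
    """Return True if a certificate-verified TLS handshake succeeds."""
    # Default context: system CA bundle, hostname verification enabled.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except OSError:
        # Covers connection refusal, timeouts, and ssl.SSLError
        # (including certificate verification failures).
        return False


if __name__ == "__main__":
    # Placeholder value; substitute the real bootstrap broker string.
    servers = "broker1:9098,broker2:9098"
    for host, port in parse_bootstrap_servers(servers):
        print(host, port, "TLS ok" if tls_handshake_ok(host, port) else "TLS failed")
```

Running this on both the EMR primary node and a core node helps separate a driver-side problem from an executor-side one.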
What could be the cause of the TimeoutException waiting for node assignment from Kafka when all connectivity checks pass and IAM permissions are verified?
Are there any best practices or additional configurations needed specifically on EMR or Spark to authenticate successfully with MSK using IAM?
Any guidance or examples of a working Spark + MSK IAM auth setup on EMR would be highly appreciated!
Thank you.