
I need to know how to parse an XML file in Spark. I am receiving streaming data from Kafka and need to parse that streamed data.

Here is my Spark code to receive data:

directKafkaStream.foreachRDD(rdd -> {
    rdd.foreach(s -> {
        // s._2 is the value of the Kafka record (the XML payload)
        System.out.println("&&&&&&&&&&&&&&&&&" + s._2);
    });
});

And the results:

<root>
<student>
<name>john</name>
<marks>90</marks>
</student>
</root>

How do I parse these XML elements?

  • Have you searched for previous questions on this? Such as: stackoverflow.com/questions/33078221/xml-processing-in-spark Commented Sep 26, 2016 at 7:11
  • @Binary Nerd, thanks for the response. My Spark application reads the data line by line, so I need to parse it line by line without relying on a start element and/or end element. Commented Sep 26, 2016 at 8:43

2 Answers

Thanks, everyone. Problem solved; here is the solution.

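// Assumes Apache Xerces is on the classpath; DOMParser is not part of the public JDK API.
import org.apache.xerces.parsers.DOMParser;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

String xml = "<name>xyz</name>";
DOMParser parser = new DOMParser();
try {
    // Parse the XML string held in memory
    parser.parse(new InputSource(new java.io.StringReader(xml)));
    Document doc = parser.getDocument();
    // Text content of the root element, here "xyz"
    String message = doc.getDocumentElement().getTextContent();
    System.out.println(message);
} catch (Exception e) {
    // parse() can throw SAXException or IOException
    e.printStackTrace();
}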
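
For the <root>/<student> sample from the question, the same idea can be sketched with only the JDK's built-in javax.xml.parsers API. This is a minimal sketch, not the original poster's code: the class name StudentXmlParser and the helper parseStudent are hypothetical, and it assumes each Kafka record value (s._2) carries one complete XML document rather than a single line of it.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named here only for illustration.
public class StudentXmlParser {

    // Returns child element name -> text for the first <student> element.
    // Assumption: the input string is one complete, well-formed XML document.
    public static Map<String, String> parseStudent(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        Map<String, String> fields = new LinkedHashMap<>();
        NodeList students = doc.getElementsByTagName("student");
        if (students.getLength() == 0) {
            return fields; // no <student> element found
        }

        NodeList children = students.item(0).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip the whitespace text nodes between elements
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                fields.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root>\n<student>\n<name>john</name>\n<marks>90</marks>\n</student>\n</root>";
        System.out.println(parseStudent(xml)); // prints {name=john, marks=90}
    }
}
```

Inside the streaming job, a call such as StudentXmlParser.parseStudent(s._2) could replace the println in rdd.foreach, provided the assumption about complete documents holds. If a document is split across several records, the pieces would have to be reassembled before parsing.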

1 Comment

@MasudRahman, please look at the mentioned link stackoverflow.com/questions/33078221/xml-processing-in-spark/…

As you are processing streaming data, it would be helpful to use Databricks' spark-xml library for XML data processing.

Reference: https://github.com/databricks/spark-xml

2 Comments

Thanks for the response. My Spark application reads the data line by line, so I need to parse it line by line without relying on a start element and/or end element.
I spent a couple of hours on this, and then found that it does not read self-closing rows.
