1

I am using Kafka and Spark Structured Streaming. I am receiving kafka messages in following format.

{"deviceId":"001","sNo":1,"data":"aaaaa"}
{"deviceId":"002","sNo":1,"data":"bbbbb"}
{"deviceId":"001","sNo":2,"data":"ccccc"}
{"deviceId":"002","sNo":2,"data":"ddddd"}

I am reading it like below.

Dataset<String> data = spark
      .readStream()
      .format("kafka")
      .option("kafka.bootstrap.servers", bootstrapServers)
      .option(subscribeType, topics)
      .load()
      .selectExpr("CAST(value AS STRING)")
      .as(Encoders.STRING());
Dataset<DeviceData> ds = data.as(ExpressionEncoder.javaBean(DeviceData.class)).orderBy("deviceId","sNo"); 
ds.foreach(event -> 
      processData(event.getDeviceId(),event.getSNo(),event.getData().getBytes())
);}

private void processData(String deviceId,int SNo, byte[] data) 
{
  //How to check previous processed Dataset???
} 

In my json message "data" is String form of byte[]. I have a requirement where I need to process the binary "data" for given "deviceId" in order of "sNo". So for "deviceId"="001", I have to process the binary data for "sNo"=1 and then "sNo"=2 and so on. How can I check state of previous processed Dataset in Structured Streaming?

2
  • What did you try so far? Commented Feb 24, 2017 at 7:44
  • I have updated my code. Please check. I am doing orderBy and then forEach to process data. I am stuck at processData method how to handle current and previous data from Dataset received by streaming. Commented Feb 24, 2017 at 8:09

1 Answer 1

1

If you are looking for state management like DStream.mapWithState then it is not supported yet in Structured Streaming. Work is in progress. Please Check https://issues.apache.org/jira/browse/SPARK-19067.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.